paper-fetch
DOI in, PDF out. A Claude Code skill that resolves paper PDFs through Unpaywall, Semantic Scholar, arXiv, PubMed Central, bioRxiv/medRxiv, and finally Sci-Hub mirrors as a last resort — across every discipline, with zero dependencies. Disable the Sci-Hub fallback with PAPER_FETCH_NO_SCIHUB=1.
npx skills add Agents365-ai/365-skills -g
Why This Skill
A deterministic replacement for ad-hoc "can you find this PDF" requests.
6-Source Fallback Chain
Unpaywall → Semantic Scholar → arXiv → PubMed Central → bioRxiv/medRxiv → Sci-Hub mirrors. The chain stops at the first valid PDF and reports failure with metadata if no source returns one.
All Disciplines
Not just life sciences or CS. Unpaywall + Semantic Scholar cover humanities, social sciences, chemistry, physics, economics — any Crossref DOI.
OA First, Sci-Hub Last-Resort
Tries every legal open-access source before falling back to Sci-Hub. Mobile-UA fetches with 1 req/s pacing. Disable entirely with PAPER_FETCH_NO_SCIHUB=1; pin specific mirrors with PAPER_FETCH_SCIHUB_MIRRORS.
Zero Dependencies
Pure Python 3.8+ standard library. No pip install, no virtualenv, no Node.js — just clone and run.
Batch Mode
Pass a file of DOIs with --batch dois.txt. Output is auto-named author_year_title.pdf for consistent library organization.
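A minimal sketch of how an author_year_title.pdf name could be derived. The normalization rules here (accent stripping, lowercasing, a six-word title limit) and the function name `make_filename` are assumptions for illustration, not the skill's actual rules.

```python
import re
import unicodedata


def make_filename(surname: str, year: int, title: str, max_title_words: int = 6) -> str:
    """Build an author_year_title.pdf style name (hypothetical normalization)."""

    def slug(text: str) -> str:
        # Strip accents, lowercase, and collapse non-alphanumeric runs to "-".
        ascii_text = (
            unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
        )
        return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

    # Keep only the first few title words so names stay short.
    title_part = "-".join(slug(title).split("-")[:max_title_words])
    return f"{slug(surname)}_{year}_{title_part}.pdf"
```

Keeping the name to ASCII letters, digits, hyphens, and underscores avoids quoting problems when the files are later passed to shell tools.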
Agent-Native Output
Stable JSON envelope on stdout, NDJSON progress events on stderr, typed exit codes (0/1/3/4), machine-readable schema subcommand, ok: "partial" batches with next retry hints, --idempotency-key replay. Scored 28/28 on the agent-native CLI rubric.
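On the consuming side, an agent might split the two streams as sketched below. Only the `ok` field and the stdout/stderr split come from the description above; the `event` key in the example and the shape of the returned dict are hypothetical.

```python
import json
from typing import Any, Dict


def parse_run(stdout: str, stderr: str, exit_code: int) -> Dict[str, Any]:
    """Combine one JSON envelope (stdout) with NDJSON progress events (stderr)."""
    envelope = json.loads(stdout)
    # NDJSON: one JSON object per non-empty line.
    events = [json.loads(line) for line in stderr.splitlines() if line.strip()]
    return {
        "ok": envelope.get("ok"),  # may be True, False, or "partial" for batches
        "exit_code": exit_code,
        "events": events,
        "envelope": envelope,
    }


# Example with made-up output strings (the "event" key is an assumption):
result = parse_run('{"ok": "partial"}', '{"event": "start"}\n\n{"event": "done"}\n', 1)
```

The actual field names are best taken from the output of the schema subcommand rather than from this sketch.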
How It Resolves a DOI
The skill tries each source in order and stops at the first one that returns a valid PDF.
Unpaywall
Queries api.unpaywall.org/v2/{doi} and reads best_oa_location.url_for_pdf. Covers every publisher with an OA copy in any institutional repository. Uses UNPAYWALL_EMAIL as the contact address; the variable is optional, and this source is skipped when it is not set.
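The Unpaywall step can be sketched with the standard library alone. The function names are hypothetical, and passing the address as an `email` query parameter reflects the public Unpaywall API as I understand it; this is not the skill's own code.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL; Unpaywall expects a contact email as a query parameter."""
    return (
        "https://api.unpaywall.org/v2/"
        + urllib.parse.quote(doi, safe="/")
        + "?email="
        + urllib.parse.quote(email)
    )


def pdf_url_from_record(record: dict) -> Optional[str]:
    """Read best_oa_location.url_for_pdf, tolerating a null location."""
    best = record.get("best_oa_location") or {}
    return best.get("url_for_pdf") or None


def resolve_via_unpaywall(doi: str, timeout: float = 15.0) -> Optional[str]:
    """Return a PDF URL, or None when UNPAYWALL_EMAIL is unset or no OA copy is listed."""
    email = os.environ.get("UNPAYWALL_EMAIL")
    if not email:
        return None  # source skipped, as described above
    with urllib.request.urlopen(unpaywall_url(doi, email), timeout=timeout) as resp:
        return pdf_url_from_record(json.load(resp))
```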
Semantic Scholar
Queries api.semanticscholar.org/graph/v1/paper/DOI:{doi} for the openAccessPdf field and externalIds (arXiv, PMC). Cross-disciplinary academic graph.
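A rough sketch of what this lookup returns to later steps. The `fields` query parameter and the `ArXiv` / `PubMedCentral` key names follow the Semantic Scholar Graph API as I understand it and should be checked against a live response; the helper names are hypothetical.

```python
import urllib.parse
from typing import Dict, Optional


def s2_url(doi: str) -> str:
    """Build the Graph API lookup URL for a DOI, requesting only the needed fields."""
    return (
        "https://api.semanticscholar.org/graph/v1/paper/DOI:"
        + urllib.parse.quote(doi, safe="/")
        + "?fields=openAccessPdf,externalIds"
    )


def extract_hints(paper: dict) -> Dict[str, Optional[str]]:
    """Pull a direct PDF URL plus the IDs that feed the arXiv and PMC fallbacks."""
    oa = paper.get("openAccessPdf") or {}
    ids = paper.get("externalIds") or {}
    return {
        "pdf_url": oa.get("url") or None,
        "arxiv_id": ids.get("ArXiv"),          # assumed key name
        "pmcid": ids.get("PubMedCentral"),     # assumed key name
    }
```

Returning the external IDs even when no direct PDF is listed is what lets the arXiv and PubMed Central steps run without another metadata lookup.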
arXiv
If the paper has an arXiv ID, downloads from arxiv.org/pdf/{arxiv_id}.pdf. Covers physics, math, CS, stats, quantitative finance, economics, and EE.
PubMed Central OA
If the paper has a PMCID, downloads from ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/. Biomedical OA subset only.
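The arXiv and PubMed Central URL templates above can be written as two small helpers. The `https://` scheme, the `www.` host prefix, and the prefix normalization are assumptions added for illustration.

```python
def arxiv_pdf_url(arxiv_id: str) -> str:
    """Map an arXiv ID (with or without an 'arXiv:' prefix) to its PDF URL."""
    arxiv_id = arxiv_id.strip()
    if arxiv_id.lower().startswith("arxiv:"):
        arxiv_id = arxiv_id[len("arxiv:"):]
    return f"https://arxiv.org/pdf/{arxiv_id}.pdf"


def pmc_pdf_url(pmcid: str) -> str:
    """Map a PMCID (with or without the 'PMC' prefix) to its PDF path."""
    pmcid = pmcid.strip().upper()
    if not pmcid.startswith("PMC"):
        pmcid = "PMC" + pmcid
    return f"https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/"
```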
bioRxiv / medRxiv
If the DOI prefix is 10.1101, queries api.biorxiv.org/details/{server}/{doi} for the latest-version PDF URL. Biology and medicine preprints.
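A sketch of the prefix check and version selection for this step. It assumes the API response carries a `collection` list whose records include a `version` field, which should be verified against the bioRxiv API documentation; the helper names are hypothetical.

```python
from typing import List, Optional


def is_biorxiv_doi(doi: str) -> bool:
    """bioRxiv and medRxiv preprints share the 10.1101 DOI prefix."""
    return doi.lower().startswith("10.1101/")


def details_url(server: str, doi: str) -> str:
    """server is 'biorxiv' or 'medrxiv'."""
    return f"https://api.biorxiv.org/details/{server}/{doi}"


def latest_version(collection: List[dict]) -> Optional[dict]:
    """Pick the record with the highest version number, or None if the list is empty."""
    return max(collection, key=lambda rec: int(rec.get("version", 0)), default=None)
```

Because the prefix alone does not say which server hosts the preprint, a caller would likely try both `biorxiv` and `medrxiv` and keep whichever returns a non-empty collection.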
Sci-Hub mirrors (last resort)
Iterates the configured mirror list (sci-hub.ru, .st, .se, …) only when every OA source missed. Mobile-UA fetches throttled to 1 req/s; short-circuits when the page reports the paper is not in the corpus. Disable with PAPER_FETCH_NO_SCIHUB=1; override mirrors with PAPER_FETCH_SCIHUB_MIRRORS.
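The ordering described in this section amounts to a first-success loop. The sketch below assumes each source is a callable that returns PDF bytes or None; the names `Resolver`, `active_sources`, and `fetch_first`, the source label `scihub`, and the result dict are hypothetical and do not reflect the skill's real envelope.

```python
import os
from typing import Callable, List, Mapping, Optional, Tuple

Resolver = Callable[[str], Optional[bytes]]


def active_sources(
    sources: List[Tuple[str, Resolver]], env: Mapping[str, str] = os.environ
) -> List[Tuple[str, Resolver]]:
    """Drop the Sci-Hub step when PAPER_FETCH_NO_SCIHUB=1 is set."""
    if env.get("PAPER_FETCH_NO_SCIHUB") == "1":
        return [(name, fn) for name, fn in sources if name != "scihub"]
    return list(sources)


def fetch_first(
    doi: str,
    sources: List[Tuple[str, Resolver]],
    is_valid_pdf: Callable[[bytes], bool],
) -> dict:
    """Try each source in order and stop at the first one that yields a valid PDF."""
    tried: List[str] = []
    for name, resolve in sources:
        tried.append(name)
        try:
            data = resolve(doi)
        except Exception:
            continue  # one failing source should not abort the chain
        if data and is_valid_pdf(data):
            return {"ok": True, "doi": doi, "source": name, "tried": tried}
    return {"ok": False, "doi": doi, "source": None, "tried": tried}
```

Recording which sources were tried makes a failure report more useful, since the caller can see whether a step was skipped, errored, or returned something that was not a PDF.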
Discipline Coverage
Works for every field, not just life sciences or CS. Coverage depends on OA availability, not subject area.
| Source | Discipline Scope |
|---|---|
| Unpaywall | All disciplines — every Crossref DOI (humanities, social sciences, physics, chemistry, economics, …) |
| Semantic Scholar | All disciplines — cross-domain academic graph |
| arXiv | Physics, math, CS, statistics, quantitative finance, economics, EE |
| PubMed Central | Biomedical only |
| bioRxiv / medRxiv | Biology / medicine preprints only |
In practice, Unpaywall + Semantic Scholar alone cover OA papers in chemistry, materials, economics, psychology, and humanities via institutional repositories, SSRN, RePEc, and publisher-hosted OA copies. arXiv/PMC/bioRxiv are additional fallbacks for their specific domains. If no source yields a valid PDF, the skill reports failure honestly. With PAPER_FETCH_NO_SCIHUB=1 set, it is restricted to legal OA sources and will not bypass paywalls, regardless of discipline.
vs Native Agent
What you get with the skill vs prompting an LLM to "find this PDF."
| Feature | Native Agent | This Skill |
|---|---|---|
| DOI resolution strategy | Ad-hoc web search | ✓ Deterministic 6-source chain |
| Unpaywall integration | ✗ | ✓ Highest OA hit rate |
| arXiv / PMC / bioRxiv fallback | Manual | ✓ Automatic |
| Batch download | ✗ | ✓ --batch dois.txt or --batch - (stdin) |
| Consistent filenames | ✗ | ✓ author_year_title.pdf |
| Agent-native JSON output | ✗ | ✓ Stable envelope + NDJSON progress |
| Machine-readable schema | ✗ | ✓ fetch.py schema |
| Idempotent retries | ✗ | ✓ --idempotency-key replays original envelope |
| Typed exit codes | ✗ | ✓ 0/1/3/4 route failures deterministically |
| Dry-run preview | ✗ | ✓ --dry-run resolves without downloading |
| Host allowlist safety | ✗ | ✓ Restricted to known OA domains |
| 50 MB size cap | ✗ | ✓ Prevents runaway downloads |
| PDF header validation | ✗ | ✓ Rejects HTML landing pages |
| Sci-Hub last-resort fallback | None | ✓ Mobile-UA, 1 req/s pacing, opt-out via PAPER_FETCH_NO_SCIHUB=1 |
| Dependencies | Varies | ✓ Python stdlib only |
| Works across all disciplines | Varies | ✓ Any field |
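The three safety rows in the table (host allowlist, 50 MB cap, PDF header validation) could be combined roughly as follows. The allowlist entries and function names are placeholders for illustration; the skill's actual list is not given here.

```python
import io
from typing import BinaryIO, Iterable
from urllib.parse import urlparse

MAX_BYTES = 50 * 1024 * 1024  # 50 MB cap from the table above
ALLOWED_HOSTS = {"arxiv.org", "ncbi.nlm.nih.gov", "biorxiv.org"}  # placeholder list


def host_allowed(url: str, allowed: Iterable[str] = ALLOWED_HOSTS) -> bool:
    """Accept an exact host match or a true subdomain of an allowed host."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed)


def read_capped_pdf(stream: BinaryIO, max_bytes: int = MAX_BYTES, chunk_size: int = 65536) -> bytes:
    """Read in chunks, abort past the cap, and reject bodies without a PDF header."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf.extend(chunk)
        if len(buf) > max_bytes:
            raise ValueError("download exceeds size cap")
    if not buf.startswith(b"%PDF-"):
        raise ValueError("response is not a PDF (likely an HTML landing page)")
    return bytes(buf)


# Works on any file-like object, for example an in-memory stream:
sample = read_capped_pdf(io.BytesIO(b"%PDF-1.7\n..."))
```

Matching on `"." + h` rather than a bare suffix keeps look-alike hosts such as `evilarxiv.org` out of the allowlist.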
Install
Pick your platform. Takes 10 seconds. No pip install required.
# Add the 365-skills marketplace, then install paper-fetch
> /plugin marketplace add Agents365-ai/365-skills
> /plugin install paper-fetch
# Optional: set Unpaywall contact email for highest hit rate
export UNPAYWALL_EMAIL=you@example.com
# Via ClawHub registry
clawhub install paper-fetch-pro-skill
# Any agent that supports the Agent Skills format
npx skills add Agents365-ai/365-skills -g
# Via SkillsMP CLI
skills install paper-fetch
Usage
Call directly from the command line, or just ask your agent in natural language.
# Single DOI (auto-detects TTY: JSON when piped, text in a terminal)
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2
# Force human-readable output
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --format text
# Dry-run preview — resolve without downloading
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run
# Batch mode — one DOI per line
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers
# Pipe DOIs from another tool
echo 10.1038/s41586-021-03819-2 | python skills/paper-fetch/scripts/fetch.py --batch -
# Safely retriable batch (replay on retry, no network I/O)
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers \
--idempotency-key monday-review-batch
# Agent discovery — machine-readable CLI schema
python skills/paper-fetch/scripts/fetch.py schema --pretty
Or just ask your agent: "Download the AlphaFold2 paper PDF to my ~/papers folder."
Related Skills
Part of the Agents365-ai research-skill family — pick the right tool for the job.
semanticscholar-skill
Semantic Scholar API search — when you need to FIND papers before fetching.
asta-skill
Same corpus via Ai2 Asta MCP — when your host supports MCP and you have an Asta API key.
scholar-deep-research
8-phase literature review pipeline — when you want a structured cited report, not just PDFs.
zotero-research-assistant
Zotero library workflows — when references go into Zotero.