6-Source Fallback Chain · OA-First, Sci-Hub Last-Resort

paper-fetch

DOI in, PDF out. A Claude Code skill that resolves paper PDFs through Unpaywall, Semantic Scholar, arXiv, PubMed Central, bioRxiv/medRxiv, and finally Sci-Hub mirrors as a last resort — across every discipline, with zero dependencies. Disable the Sci-Hub fallback with PAPER_FETCH_NO_SCIHUB=1.

npx skills add Agents365-ai/365-skills -g

Why This Skill

A deterministic replacement for ad-hoc "can you find this PDF" requests.

🔗

6-Source Fallback Chain

Unpaywall → Semantic Scholar → arXiv → PubMed Central → bioRxiv/medRxiv → Sci-Hub mirrors. Stops at the first valid PDF and reports failure with metadata if none is found.

🌐

All Disciplines

Not just life sciences or CS. Unpaywall + Semantic Scholar cover humanities, social sciences, chemistry, physics, economics — any Crossref DOI.

🛡️

OA First, Sci-Hub Last-Resort

Tries every legal open-access source before falling back to Sci-Hub. Mobile-UA fetches with 1 req/s pacing. Disable entirely with PAPER_FETCH_NO_SCIHUB=1; pin specific mirrors with PAPER_FETCH_SCIHUB_MIRRORS.

Zero Dependencies

Pure Python 3.8+ standard library. No pip install, no virtualenv, no Node.js — just clone and run.

📦

Batch Mode

Pass a file of DOIs with --batch dois.txt. Output is auto-named author_year_title.pdf for consistent library organization.
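
A minimal sketch of how an author_year_title.pdf name could be derived from metadata. The helper names, the six-word title limit, and the hyphen separator inside the title are assumptions for illustration, not the skill's actual rules.

```python
import re
import unicodedata


def slug(text: str, max_words: int = 6) -> str:
    """Lowercase ASCII words joined by hyphens; punctuation is dropped."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


def pdf_filename(first_author: str, year: int, title: str) -> str:
    """Build an author_year_title.pdf name (hypothetical helper)."""
    tokens = first_author.split()
    # Treat the last whitespace-separated token as the family name.
    family = (slug(tokens[-1]) if tokens else "") or "unknown"
    return f"{family}_{year}_{slug(title)}.pdf"
```

For example, pdf_filename("José García", 2020, "Array programming with NumPy") gives garcia_2020_array-programming-with-numpy.pdf.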

🤖

Agent-Native Output

Stable JSON envelope on stdout, NDJSON progress events on stderr, typed exit codes (0/1/3/4), a machine-readable schema subcommand, ok: "partial" results for batches with next retry hints, and --idempotency-key replay. Scored 28/28 on the agent-native CLI rubric.

How It Resolves a DOI

The skill tries each source in order and stops at the first one that returns a valid PDF.
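
The ordering logic can be sketched as a plain loop over resolvers. This is an illustration only: resolve_first and the resolver signature are hypothetical, and the sketch stops at the first candidate URL rather than downloading and validating the PDF.

```python
from typing import Callable, Iterable, Optional, Tuple

# A resolver takes a DOI and returns a candidate PDF URL, or None on a miss.
Resolver = Callable[[str], Optional[str]]


def resolve_first(
    doi: str, resolvers: Iterable[Tuple[str, Resolver]]
) -> Optional[Tuple[str, str]]:
    """Return (source_name, pdf_url) from the first resolver that hits."""
    for name, resolver in resolvers:
        try:
            url = resolver(doi)
        except Exception:
            # One failing source should not abort the chain; try the next.
            continue
        if url:
            return name, url
    return None  # every source missed
```

Because the loop returns on the first hit, later sources are never contacted once an earlier one succeeds.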

1

Unpaywall

Queries api.unpaywall.org/v2/{doi} and reads best_oa_location.url_for_pdf. Covers every publisher with an OA copy in any institutional repository. Uses the contact email in UNPAYWALL_EMAIL; the step is skipped if the variable is not set.
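
A rough sketch of this step using only the standard library. The function names are hypothetical, and error handling (HTTP errors, rate limits) is left out.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL; Unpaywall expects the contact email as a query parameter."""
    return "https://api.unpaywall.org/v2/{}?email={}".format(
        urllib.parse.quote(doi, safe="/"), urllib.parse.quote(email)
    )


def pdf_url_from_record(record: dict) -> Optional[str]:
    """Read best_oa_location.url_for_pdf, tolerating null fields."""
    best = record.get("best_oa_location") or {}
    return best.get("url_for_pdf") or None


def resolve_unpaywall(doi: str, timeout: float = 15.0) -> Optional[str]:
    email = os.environ.get("UNPAYWALL_EMAIL")
    if not email:
        return None  # no contact email configured: skip this source
    with urllib.request.urlopen(unpaywall_url(doi, email), timeout=timeout) as resp:
        return pdf_url_from_record(json.load(resp))
```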

2

Semantic Scholar

Queries api.semanticscholar.org/graph/v1/paper/DOI:{doi} for the openAccessPdf field and externalIds (arXiv, PMC). Cross-disciplinary academic graph.
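
A sketch of the lookup URL and response parsing. The fields query parameter and the externalIds key names (ArXiv, PubMedCentral) reflect our understanding of the Graph API and should be checked against its documentation; the function names are hypothetical.

```python
import urllib.parse
from typing import Dict, Optional


def s2_url(doi: str) -> str:
    """Request only the two fields this step needs."""
    return (
        "https://api.semanticscholar.org/graph/v1/paper/DOI:"
        + urllib.parse.quote(doi, safe="/")
        + "?fields=openAccessPdf,externalIds"
    )


def parse_s2(record: dict) -> Dict[str, Optional[str]]:
    """Extract the OA PDF URL plus IDs that later steps (arXiv, PMC) can use."""
    oa = record.get("openAccessPdf") or {}
    ids = record.get("externalIds") or {}
    return {
        "pdf_url": oa.get("url") or None,
        "arxiv_id": ids.get("ArXiv"),        # assumed key name
        "pmcid": ids.get("PubMedCentral"),   # assumed key name
    }
```

Returning the external IDs alongside the PDF URL lets a miss here still feed the arXiv and PubMed Central steps.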

3

arXiv

If the paper has an arXiv ID, downloads from arxiv.org/pdf/{arxiv_id}.pdf. Covers physics, math, CS, stats, quantitative finance, economics, and EE.

4

PubMed Central OA

If the paper has a PMCID, downloads from ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/. Biomedical OA subset only.

5

bioRxiv / medRxiv

If the DOI prefix is 10.1101, queries api.biorxiv.org/details/{server}/{doi} for the latest-version PDF URL. Biology and medicine preprints.
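
A sketch of the prefix check and latest-version selection. It assumes the details response carries a collection list with one entry per version, and that the PDF lives at {doi}v{version}.full.pdf on the server's site; both are assumptions to verify against the bioRxiv API documentation. Function names are hypothetical.

```python
from typing import List, Optional

SERVERS = ("biorxiv", "medrxiv")


def is_rxiv_doi(doi: str) -> bool:
    """bioRxiv and medRxiv preprints share the 10.1101 DOI prefix."""
    return doi.startswith("10.1101/")


def details_urls(doi: str) -> List[str]:
    """The prefix does not say which server hosts the preprint, so list both."""
    return [f"https://api.biorxiv.org/details/{s}/{doi}" for s in SERVERS]


def latest_pdf_url(server: str, doi: str, record: dict) -> Optional[str]:
    """Pick the highest version in the (assumed) collection list."""
    versions = record.get("collection") or []
    if not versions:
        return None
    latest = max(int(v.get("version", 0)) for v in versions)
    # Assumed URL pattern for the full-text PDF.
    return f"https://www.{server}.org/content/{doi}v{latest}.full.pdf"
```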

6

Sci-Hub mirrors (last resort)

Iterates the configured mirror list (sci-hub.ru, .st, .se, …) only when every OA source missed. Mobile-UA fetches throttled to 1 req/s; short-circuits when the page reports the paper is not in the corpus. Disable with PAPER_FETCH_NO_SCIHUB=1; override mirrors with PAPER_FETCH_SCIHUB_MIRRORS.
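
A sketch of how the two environment variables and the 1 req/s pacing might be handled. The comma-separated format for PAPER_FETCH_SCIHUB_MIRRORS, the default list (only the three mirrors named above), and the Pacer class are assumptions for illustration.

```python
import os
import time
from typing import Callable, List, Mapping, Optional

# Only the mirrors named in the text; the real default list may be longer.
DEFAULT_MIRRORS = ["sci-hub.ru", "sci-hub.st", "sci-hub.se"]


def scihub_mirrors(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the mirrors to try, or [] when the fallback is disabled."""
    env = os.environ if env is None else env
    if env.get("PAPER_FETCH_NO_SCIHUB") == "1":
        return []
    raw = env.get("PAPER_FETCH_SCIHUB_MIRRORS", "")
    pinned = [m.strip() for m in raw.split(",") if m.strip()]  # assumed format
    return pinned or list(DEFAULT_MIRRORS)


class Pacer:
    """Block so that consecutive wait() calls are at least min_interval apart."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() before each request keeps the rate at or below one request per second; an empty mirror list means the step is skipped entirely.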

Discipline Coverage

Works for every field, not just life sciences or CS. Coverage depends on OA availability, not subject area.

Source | Discipline Scope
Unpaywall | All disciplines: every Crossref DOI (humanities, social sciences, physics, chemistry, economics, …)
Semantic Scholar | All disciplines: cross-domain academic graph
arXiv | Physics, math, CS, statistics, quantitative finance, economics, EE
PubMed Central | Biomedical only
bioRxiv / medRxiv | Biology / medicine preprints only

In practice, Unpaywall and Semantic Scholar alone cover OA papers in chemistry, materials, economics, psychology, and the humanities via institutional repositories, SSRN, RePEc, and publisher-hosted OA copies. arXiv, PMC, and bioRxiv are additional fallbacks for their specific domains. If no legal OA copy exists, the skill falls back to Sci-Hub mirrors unless PAPER_FETCH_NO_SCIHUB=1 is set; if every source misses, it reports failure with metadata.

vs Native Agent

What you get with the skill vs prompting an LLM to "find this PDF."

Feature | Native Agent | This Skill
DOI resolution strategy | Ad-hoc web search | ✓ Deterministic 6-source chain
Unpaywall integration | | ✓ Highest OA hit rate
arXiv / PMC / bioRxiv fallback | Manual | ✓ Automatic
Batch download | | --batch dois.txt or --batch - (stdin)
Consistent filenames | | author_year_title.pdf
Agent-native JSON output | | ✓ Stable envelope + NDJSON progress
Machine-readable schema | | fetch.py schema
Idempotent retries | | --idempotency-key replays original envelope
Typed exit codes | | 0/1/3/4 route failures deterministically
Dry-run preview | | --dry-run resolves without downloading
Host allowlist safety | | ✓ Restricted to known OA domains
50 MB size cap | | ✓ Prevents runaway downloads
PDF header validation | | ✓ Rejects HTML landing pages
Sci-Hub last-resort fallback | None | ✓ Mobile-UA, 1 req/s pacing, opt-out via PAPER_FETCH_NO_SCIHUB=1
Dependencies | Varies | ✓ Python stdlib only
Works across all disciplines | Varies | ✓ Any field
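
The three download guards in the table (host allowlist, size cap, PDF header check) could look roughly like this. The allowlist entries are an illustrative subset, the 50 MB figure is taken as 50 × 1024 × 1024 bytes, and all names are hypothetical.

```python
from typing import BinaryIO, Iterable
from urllib.parse import urlsplit

MAX_BYTES = 50 * 1024 * 1024  # assumed interpretation of "50 MB"
ALLOWED_HOSTS = ("arxiv.org", "ncbi.nlm.nih.gov")  # illustrative subset only


def host_allowed(url: str, allowed: Iterable[str] = ALLOWED_HOSTS) -> bool:
    """Accept a URL only if its host is an allowed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed)


def read_capped(stream: BinaryIO, limit: int = MAX_BYTES, chunk: int = 64 * 1024) -> bytes:
    """Read the stream in chunks and abort as soon as the cap is exceeded."""
    buf = bytearray()
    while True:
        block = stream.read(chunk)
        if not block:
            break
        buf += block
        if len(buf) > limit:
            raise ValueError("download exceeds size cap")
    return bytes(buf)


def looks_like_pdf(data: bytes) -> bool:
    """PDF files start with the %PDF- magic; HTML landing pages do not."""
    return data.startswith(b"%PDF-")
```

Matching on "." + domain rather than a bare suffix keeps look-alike hosts such as evilarxiv.org out of the allowlist.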

Install

Pick your platform. Takes 10 seconds. No pip install required.

# Add the 365-skills marketplace, then install paper-fetch
> /plugin marketplace add Agents365-ai/365-skills
> /plugin install paper-fetch

# Optional: set Unpaywall contact email for highest hit rate
export UNPAYWALL_EMAIL=you@example.com
# Via ClawHub registry
clawhub install paper-fetch-pro-skill
# Any agent that supports the Agent Skills format
npx skills add Agents365-ai/365-skills -g
# Via SkillsMP CLI
skills install paper-fetch

Usage

Call directly from the command line, or just ask your agent in natural language.

# Single DOI (auto-detects TTY: JSON when piped, text in a terminal)
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2

# Force human-readable output
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --format text

# Dry-run preview — resolve without downloading
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run

# Batch mode — one DOI per line
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers

# Pipe DOIs from another tool
echo 10.1038/s41586-021-03819-2 | python skills/paper-fetch/scripts/fetch.py --batch -

# Safely retriable batch (replay on retry, no network I/O)
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers \
    --idempotency-key monday-review-batch

# Agent discovery — machine-readable CLI schema
python skills/paper-fetch/scripts/fetch.py schema --pretty

Or just ask your agent: "Download the AlphaFold2 paper PDF to my ~/papers folder."
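
For scripted use, a caller might wrap the CLI along these lines. This is a sketch: it relies on the stated behavior that stdout carries JSON when piped, makes no assumption about envelope fields beyond it being a JSON object, and leaves the meaning of each exit code to the tool's schema output. The function names are hypothetical.

```python
import json
import subprocess
import sys
from typing import Tuple

FETCH = "skills/paper-fetch/scripts/fetch.py"  # path as used in the commands above


def parse_envelope(stdout: str) -> dict:
    """Parse the JSON envelope printed on stdout."""
    envelope = json.loads(stdout)
    if not isinstance(envelope, dict):
        raise ValueError("expected a JSON object envelope")
    return envelope


def fetch(doi: str) -> Tuple[int, dict]:
    """Run fetch.py for one DOI; return (exit_code, envelope)."""
    proc = subprocess.run(
        [sys.executable, FETCH, doi],
        capture_output=True,  # stdout is not a TTY, so the tool emits JSON
        text=True,
    )
    return proc.returncode, parse_envelope(proc.stdout)
```

Returning the exit code alongside the envelope lets the caller branch on the typed codes without parsing error text.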