6-Source Fallback Chain · OA-First, Sci-Hub Last-Resort

paper-fetch

DOI in, PDF out. A Claude Code skill that resolves paper PDFs through Unpaywall, Semantic Scholar, arXiv, PubMed Central, bioRxiv/medRxiv, and finally Sci-Hub mirrors as a last resort — across every discipline, with zero dependencies. Disable the Sci-Hub fallback with PAPER_FETCH_NO_SCIHUB=1.

npx skills add Agents365-ai/365-skills -g

Why This Skill

A deterministic replacement for ad-hoc "can you find this PDF" requests.

🔗

6-Source Fallback Chain

Unpaywall → Semantic Scholar → arXiv → PubMed Central → bioRxiv/medRxiv → Sci-Hub mirrors. Stops at the first valid PDF and reports failure with metadata if none is found.

🌐

All Disciplines

Not just life sciences or CS. Unpaywall + Semantic Scholar cover humanities, social sciences, chemistry, physics, economics — any Crossref DOI.

🛡️

OA First, Sci-Hub Last-Resort

Tries every legal open-access source before falling back to Sci-Hub. Mobile-UA fetches with 1 req/s pacing. Disable entirely with PAPER_FETCH_NO_SCIHUB=1; pin specific mirrors with PAPER_FETCH_SCIHUB_MIRRORS.
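The 1 req/s pacing mentioned above can be pictured as a minimum-interval gate in front of each request. This is a minimal sketch, not the skill's actual code: the `Pacer` class and its injectable `clock` and `sleep` parameters are hypothetical names introduced here so the behavior can be checked without real waiting.

```python
import time


class Pacer:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `pacer.wait()` before each request keeps the rate at or below one request per `min_interval` seconds; the first call never sleeps.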

⚡

Zero Dependencies

Pure Python 3.8+ standard library. No pip install, no virtualenv, no Node.js — just clone and run.

📦

Batch Mode

Pass a file of DOIs with --batch dois.txt. Output is auto-named author_year_title.pdf for consistent library organization.
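One way an author_year_title.pdf name could be derived from paper metadata is sketched below. The exact slug rules the skill applies are not documented here, so `slug`, `pdf_filename`, the six-word title limit, and the hyphen separator are all assumptions for illustration.

```python
import re
import unicodedata


def slug(text, max_words=6):
    """Lowercase ASCII words joined by hyphens (assumed convention)."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


def pdf_filename(author, year, title):
    """Build an author_year_title.pdf style name from metadata."""
    return f"{slug(author, 1)}_{year}_{slug(title)}.pdf"
```

For example, `pdf_filename("Doe", 2020, "An Example Title: With Punctuation!")` yields a name with the punctuation stripped and the title words hyphenated.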

🤖

Agent-Native Output

Stable JSON envelope on stdout, NDJSON progress events on stderr, typed exit codes (0/1/3/4), a machine-readable schema subcommand, batches that report ok: "partial" with next retry hints, and --idempotency-key replay. Scored 28/28 on the agent-native CLI rubric.
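A caller might consume these two streams roughly as follows. This is a hedged sketch: only the `ok` field and its `"partial"` value come from the description above. Treating `ok` as `true` on full success, and the helper names `parse_progress` and `classify`, are assumptions; `fetch.py schema` is the authoritative source for the real shape.

```python
import json


def parse_progress(stderr_text):
    """Parse NDJSON: one JSON object per non-empty line, skipping noise."""
    events = []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines such as stray warnings
        if isinstance(obj, dict):
            events.append(obj)
    return events


def classify(envelope):
    """Map the envelope's ok field to a coarse outcome (assumed values)."""
    ok = envelope.get("ok")
    if ok is True:
        return "success"
    if ok == "partial":
        return "partial"
    return "failure"
```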

How It Resolves a DOI

The skill tries each source in order and stops at the first one that returns a valid PDF.

Unpaywall

Queries api.unpaywall.org/v2/{doi} and reads best_oa_location.url_for_pdf. Covers every publisher with an OA copy in any institutional repository. Uses UNPAYWALL_EMAIL as the contact address; the variable is optional, and this source is skipped if it is not set.
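The Unpaywall step can be sketched as a URL builder plus a response reader. The endpoint and the `best_oa_location.url_for_pdf` field come from the description above, and the `email` query parameter is part of the public Unpaywall API. The function names and the injectable `env` mapping are hypothetical, and the network call itself is left out.

```python
import os
from urllib.parse import quote, urlencode


def unpaywall_url(doi, env=None):
    """Return the request URL, or None when UNPAYWALL_EMAIL is unset."""
    env = os.environ if env is None else env
    email = env.get("UNPAYWALL_EMAIL")
    if not email:
        return None  # this source is skipped without a contact email
    return (
        "https://api.unpaywall.org/v2/"
        + quote(doi, safe="/")
        + "?"
        + urlencode({"email": email})
    )


def pick_pdf_url(payload):
    """Read best_oa_location.url_for_pdf; both levels may be null."""
    best = payload.get("best_oa_location") or {}
    return best.get("url_for_pdf") or None
```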

Semantic Scholar

Queries api.semanticscholar.org/graph/v1/paper/DOI:{doi} for the openAccessPdf field and externalIds (arXiv, PMC). Cross-disciplinary academic graph.
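A sketch of the Semantic Scholar lookup, again without the network call. The `fields` query parameter and the `ArXiv` and `PubMedCentral` keys inside `externalIds` reflect the public Graph API as best understood here and should be read as assumptions; the helper names are hypothetical.

```python
from urllib.parse import quote


def s2_url(doi):
    """Build the Graph API URL requesting only the fields the chain needs."""
    return (
        "https://api.semanticscholar.org/graph/v1/paper/DOI:"
        + quote(doi, safe="/")
        + "?fields=openAccessPdf,externalIds"
    )


def extract_hints(payload):
    """Pull a direct PDF URL plus arXiv and PMC identifiers, if present."""
    oa = payload.get("openAccessPdf") or {}
    ids = payload.get("externalIds") or {}
    return {
        "pdf_url": oa.get("url") or None,
        "arxiv_id": ids.get("ArXiv"),
        "pmcid": ids.get("PubMedCentral"),
    }
```

The two identifiers are what let the later arXiv and PubMed Central steps run even when `openAccessPdf` is empty.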

arXiv

If the paper has an arXiv ID, downloads from arxiv.org/pdf/{arxiv_id}.pdf. Covers physics, math, CS, stats, quantitative finance, economics, and EE.

PubMed Central OA

If the paper has a PMCID, downloads from ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/. Biomedical OA subset only.
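The two ID-based sources above reduce to URL templates. The sketch below fills them in, assuming an `https://` scheme and assuming that a bare numeric PMC identifier should be given a `PMC` prefix; both function names are hypothetical.

```python
def arxiv_pdf_url(arxiv_id):
    """Fill the arxiv.org/pdf/{arxiv_id}.pdf template."""
    return f"https://arxiv.org/pdf/{arxiv_id}.pdf"


def pmc_pdf_url(pmcid):
    """Fill the ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/ template."""
    pmcid = str(pmcid).strip()
    if pmcid.isdigit():
        # Assumption: upstream metadata may omit the "PMC" prefix.
        pmcid = "PMC" + pmcid
    return f"https://ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/"
```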

bioRxiv / medRxiv

If the DOI prefix is 10.1101, queries api.biorxiv.org/details/{server}/{doi} for the latest-version PDF URL. Biology and medicine preprints.
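The prefix check and the details lookup might look like the sketch below. The `collection` list and the string-valued `version` field are recalled from the public bioRxiv API and are assumptions here, as are the helper names. How the final PDF URL is derived from the chosen record is not specified above, so it is left out.

```python
def is_biorxiv_family(doi):
    """True when the DOI carries the 10.1101 prefix."""
    return doi.strip().lower().startswith("10.1101/")


def details_urls(doi):
    """Candidate details endpoints, one per preprint server."""
    return [
        f"https://api.biorxiv.org/details/{server}/{doi}"
        for server in ("biorxiv", "medrxiv")
    ]


def latest_version(payload):
    """Pick the record with the highest version number, or None."""
    records = payload.get("collection") or []
    if not records:
        return None
    return max(records, key=lambda r: int(r.get("version", 0)))
```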

Sci-Hub mirrors (last resort)

Iterates the configured mirror list (sci-hub.ru, .st, .se, …) only when every OA source missed. Mobile-UA fetches throttled to 1 req/s; short-circuits when the page reports the paper is not in the corpus. Disable with PAPER_FETCH_NO_SCIHUB=1; override mirrors with PAPER_FETCH_SCIHUB_MIRRORS.
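Taken together, the steps above form a first-hit-wins chain. The sketch below shows only that control flow, with each source passed in as a plain callable. The `resolve` function, its return shape, and the `tried` list are hypothetical; only the ordering, the stop-at-first-PDF rule, and the `PAPER_FETCH_NO_SCIHUB=1` opt-out come from the text.

```python
import os


def resolve(doi, oa_sources, last_resort, env=None):
    """Try each (name, resolver) pair in order and stop at the first URL.

    oa_sources: list of (name, callable) pairs; each callable takes a DOI
    and returns a PDF URL or None. last_resort is one more such pair and
    is skipped when PAPER_FETCH_NO_SCIHUB is set to "1".
    """
    env = os.environ if env is None else env
    chain = list(oa_sources)
    if env.get("PAPER_FETCH_NO_SCIHUB") != "1":
        chain.append(last_resort)

    tried = []
    for name, resolver in chain:
        tried.append(name)
        url = resolver(doi)
        if url:
            return {"ok": True, "source": name, "url": url, "tried": tried}
    # Every source missed: report failure along with what was attempted.
    return {"ok": False, "doi": doi, "tried": tried}
```

Keeping the resolvers as injected callables makes the ordering easy to test with stubs and keeps the opt-out in one place.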

Discipline Coverage

Works for every field, not just life sciences or CS. Coverage depends on OA availability, not subject area.

Source	Discipline Scope
Unpaywall	All disciplines — every Crossref DOI (humanities, social sciences, physics, chemistry, economics, …)
Semantic Scholar	All disciplines — cross-domain academic graph
arXiv	Physics, math, CS, statistics, quantitative finance, economics, EE
PubMed Central	Biomedical only
bioRxiv / medRxiv	Biology / medicine preprints only

In practice, Unpaywall + Semantic Scholar alone cover OA papers in chemistry, materials, economics, psychology, and humanities via institutional repositories, SSRN, RePEc, and publisher-hosted OA copies. arXiv/PMC/bioRxiv are additional fallbacks for their specific domains. If no legal OA copy exists, the skill falls back to Sci-Hub mirrors as a last resort; with PAPER_FETCH_NO_SCIHUB=1 set, it skips that step and reports failure with metadata, regardless of discipline.

vs Native Agent

What you get with the skill vs prompting an LLM to "find this PDF."

Feature	Native Agent	This Skill
DOI resolution strategy	Ad-hoc web search	✓ Deterministic 6-source chain
Unpaywall integration	✗	✓ Highest OA hit rate
arXiv / PMC / bioRxiv fallback	Manual	✓ Automatic
Batch download	✗	✓ `--batch dois.txt` or `--batch -` (stdin)
Consistent filenames	✗	✓ `author_year_title.pdf`
Agent-native JSON output	✗	✓ Stable envelope + NDJSON progress
Machine-readable schema	✗	✓ `fetch.py schema`
Idempotent retries	✗	✓ `--idempotency-key` replays original envelope
Typed exit codes	✗	✓ `0`/`1`/`3`/`4` route failures deterministically
Dry-run preview	✗	✓ `--dry-run` resolves without downloading
Host allowlist safety	✗	✓ Restricted to known OA domains
50 MB size cap	✗	✓ Prevents runaway downloads
PDF header validation	✗	✓ Rejects HTML landing pages
Sci-Hub last-resort fallback	None	✓ Mobile-UA, 1 req/s pacing, opt-out via `PAPER_FETCH_NO_SCIHUB=1`
Dependencies	Varies	✓ Python stdlib only
Works across all disciplines	Varies	✓ Any field
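The three safety rows in the table (host allowlist, 50 MB cap, PDF header validation) can be illustrated with small stdlib checks. This is a sketch under assumptions: the allowlist contents are passed in by the caller, 50 MB is read as 50 × 1024 × 1024 bytes, and all function names are hypothetical.

```python
from urllib.parse import urlparse

MAX_BYTES = 50 * 1024 * 1024  # assumed interpretation of "50 MB"


def host_allowed(url, allowed_hosts):
    """True when the URL's host equals or is a subdomain of an allowed host."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


def read_capped(stream, limit=MAX_BYTES, chunk_size=65536):
    """Read a binary stream in chunks, aborting once the cap is exceeded."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf.extend(chunk)
        if len(buf) > limit:
            raise ValueError("download exceeds size cap")
    return bytes(buf)


def looks_like_pdf(data):
    """PDF files begin with the magic bytes %PDF-; HTML pages do not."""
    return data[:5] == b"%PDF-"
```

Checking the magic bytes rather than the Content-Type header is what catches HTML landing pages served under a .pdf URL.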

Install

Pick your platform. Takes 10 seconds. No pip install required.

# Add the 365-skills marketplace, then install paper-fetch
> /plugin marketplace add Agents365-ai/365-skills
> /plugin install paper-fetch

# Optional: set Unpaywall contact email for highest hit rate
export UNPAYWALL_EMAIL=you@example.com

# Via ClawHub registry
clawhub install paper-fetch-pro-skill

# Any agent that supports the Agent Skills format
npx skills add Agents365-ai/365-skills -g

# Via SkillsMP CLI
skills install paper-fetch

Usage

Call directly from the command line, or just ask your agent in natural language.

# Single DOI (auto-detects TTY: JSON when piped, text in a terminal)
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2

# Force human-readable output
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --format text

# Dry-run preview — resolve without downloading
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run

# Batch mode — one DOI per line
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers

# Pipe DOIs from another tool
echo 10.1038/s41586-021-03819-2 | python skills/paper-fetch/scripts/fetch.py --batch -

# Safely retriable batch (replay on retry, no network I/O)
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers \
    --idempotency-key monday-review-batch

# Agent discovery — machine-readable CLI schema
python skills/paper-fetch/scripts/fetch.py schema --pretty

Or just ask your agent: "Download the AlphaFold2 paper PDF to my ~/papers folder."
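For callers that drive the CLI from Python, a thin wrapper along these lines may help. It uses only flags shown above (`--batch`, `--out`, `--idempotency-key`). The function names are hypothetical, and the assumption that stdout holds a single JSON envelope when captured follows from the "JSON when piped" note rather than from a documented contract.

```python
import json
import os
import subprocess
import sys

SCRIPT = "skills/paper-fetch/scripts/fetch.py"


def build_batch_command(dois_file, out_dir, idempotency_key):
    """Assemble argv for a retriable batch run."""
    return [
        sys.executable, SCRIPT,
        "--batch", dois_file,
        "--out", os.path.expanduser(out_dir),
        "--idempotency-key", idempotency_key,
    ]


def run_batch(dois_file, out_dir, idempotency_key):
    """Run the batch and return (exit_code, envelope, stderr_text)."""
    proc = subprocess.run(
        build_batch_command(dois_file, out_dir, idempotency_key),
        capture_output=True,
        text=True,
    )
    # Assumption: stdout is one JSON document when not attached to a TTY.
    envelope = json.loads(proc.stdout) if proc.stdout.strip() else None
    return proc.returncode, envelope, proc.stderr
```

Reusing the same `idempotency_key` on a retry is what lets the CLI replay the original envelope instead of fetching again.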

Related Skills

Part of the Agents365-ai research-skill family — pick the right tool for the job.

🔍