v0.2 — Now with Auto-Update

scholar-deep-research

From a research question to a cited, structured report. Multi-source federation across OpenAlex, arXiv, Crossref, and PubMed — with transparent ranking, citation chasing, and a mandatory self-critique pass before the report ships.

git clone https://github.com/Agents365-ai/scholar-deep-research.git ~/.claude/skills/scholar-deep-research

Why This Skill

Script-driven, stateful, and resumable. Research is saved to research_state.json after every step, so crashes don't cost you the corpus.

🧭

8-Phase Workflow

Scope → Discover → Triage → Deep read → Chase → Synthesize → Self-critique → Report, numbered 0–7. Each phase has a completion gate checked against state before advancing.

🌐

4 Federated Sources

OpenAlex (240M+ works, primary), arXiv (preprints), Crossref (DOI metadata), PubMed (biomedical). Zero API keys required.

📏

Transparent Ranking

Published formula: α·rel + β·cite + γ·rec + δ·venue. Per-paper score components written to state so your report can cite its own methodology (see the sketch after these cards).

🛡️

Mandatory Self-Critique

Phase 6 runs a 14-point adversarial checklist: unanchored claims, venue/author bias, recency collapse, untested high-citation papers. Findings go in the report appendix.

💾

Resumable State

Every query, every paper, every decision lives in research_state.json. Sessions can pause, crash, and resume across days without losing the corpus.

🔁

Auto-Update

Phase 0 Step 0 fast-forwards the skill against its origin on every invocation. Users always run the latest version — without knowing they needed to.
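
To make the ranking card concrete, here is a minimal sketch of how the published formula could be evaluated and persisted per paper. The weights and field names are illustrative assumptions, not the skill's actual defaults.

# Hypothetical sketch of score = α·rel + β·cite + γ·rec + δ·venue.
# Weights and dict keys are assumptions for illustration only.
def rank_score(paper: dict,
               alpha: float = 0.4, beta: float = 0.3,
               gamma: float = 0.2, delta: float = 0.1) -> float:
    components = {
        "rel": paper["relevance"],        # query relevance, 0..1
        "cite": paper["citation_norm"],   # normalized citation count, 0..1
        "rec": paper["recency"],          # recency decay, 0..1
        "venue": paper["venue_score"],    # venue weight, 0..1
    }
    score = (alpha * components["rel"] + beta * components["cite"]
             + gamma * components["rec"] + delta * components["venue"])
    # Persist components so the report can cite its own methodology.
    paper["score_components"] = {**components, "total": score}
    return score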

The Pipeline

Every phase reads and writes research_state.json. Phase 6 can loop back to Phase 1 when self-critique finds coverage gaps; everything else is linear.

flowchart LR
    Q([Question]) --> P0[0 · Scope]
    P0 --> P1[1 · Discover]
    P1 --> P2[2 · Triage]
    P2 --> P3[3 · Deep read]
    P3 --> P4[4 · Chase]
    P4 --> P5[5 · Synthesize]
    P5 --> P6[6 · Self-critique]
    P6 -- blockers --> P1
    P6 --> P7[7 · Report]
    P7 --> OUT([Cited report + .bib])

    STATE[(research_state.json)]
    P0 & P1 & P2 & P3 & P4 & P5 & P6 & P7 <-.-> STATE

    classDef phase fill:#1c2a4a,stroke:#58a6ff,color:#e6edf3;
    classDef state fill:#1a3a1a,stroke:#3fb950,color:#e6edf3;
    class P0,P1,P2,P3,P4,P5,P6,P7 phase;
    class STATE state;

Discovery stops on a saturation signal, not a fixed round count: a round ends when new papers are <20% of hits and no new paper has >100 citations. Self-critique findings are written back to state and into the final report's appendix.
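
That gate is mechanical, so it is easy to sketch. The state fields below are assumptions about research_state.json's shape, not its documented schema.

import json

def discovery_saturated(state_path: str = "research_state.json") -> bool:
    """Hypothetical Phase 1 gate: a round ends when new papers are
    <20% of hits and no new paper has >100 citations."""
    with open(state_path) as f:
        state = json.load(f)
    last = state["rounds"][-1]  # illustrative field names throughout
    new_ratio = last["new_papers"] / max(last["total_hits"], 1)
    top_new_citations = max((p["cited_by"] for p in last["new"]), default=0)
    return new_ratio < 0.20 and top_new_citations <= 100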

Agent-Native CLI

Every script emits exactly one JSON envelope on stdout and exits with a stable code. Built for LLM orchestrators, not just humans. Follows the 7 principles of agent-native CLI design.

Success

# python scripts/search_openalex.py \
#   --query "transformer" --limit 5 --state s.json
{
  "ok": true,
  "data": {
    "source": "openalex",
    "query": "transformer",
    "round": 1,
    "count": 5
  }
}
# exit 0

Failure

# network blip on openalex.org
{
  "ok": false,
  "error": {
    "code": "upstream_error",
    "message": "503 Service Unavailable",
    "retryable": true,
    "source": "openalex"
  }
}
# exit 2 (retryable)

Every script supports --schema pre-parse, so an agent can discover parameters without reading docs. Expensive calls (citation chasing) accept --idempotency-key so retries replay from cache instead of re-spending API budget.
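
The envelope, exit codes, and flags above are enough to drive the scripts programmatically. A sketch of how an orchestrator might consume that contract follows; the retry policy is an assumption, and applying --idempotency-key to every call is illustrative rather than required.

import json
import subprocess
import uuid

def run_tool(args: list[str], retries: int = 3) -> dict:
    """Run a skill script, parse its single JSON envelope from stdout,
    and retry only when the error is marked retryable (sketch)."""
    key = str(uuid.uuid4())  # retries replay from cache, not API budget
    envelope = {"ok": False, "error": {"message": "not run"}}
    for _ in range(retries):
        proc = subprocess.run(["python", *args, "--idempotency-key", key],
                              capture_output=True, text=True)
        envelope = json.loads(proc.stdout)
        if envelope["ok"]:
            return envelope["data"]
        if not envelope["error"].get("retryable"):
            break
    raise RuntimeError(envelope["error"]["message"])

# Discover a script's parameters first, without reading docs:
#   python scripts/search_openalex.py --schema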

vs Native Agent

What you get with the skill vs prompting an LLM directly.

| Feature | Native Agent | This Skill |
|---|---|---|
| Multi-source search | One source at a time | ✓ 4 sources federated |
| Saturation-based stop signal | One shot | ✓ Per-source, min-rounds gate |
| Cross-source deduplication | ✗ | ✓ DOI-first + title similarity |
| Transparent ranking formula | Opaque | ✓ Formula + components in state |
| Forward / backward citation chase | ✗ | ✓ OpenAlex graph, idempotent |
| Resumable state | Stateless per turn | ✓ research_state.json |
| Report archetype selection | Generic outline | ✓ 5 archetypes |
| Mandatory self-critique pass | ✗ | ✓ 14-point checklist (Phase 6) |
| Citation anchors enforced | Claims float | ✓ [^id] required for every claim (sketch below) |
| BibTeX / CSL-JSON / RIS export | ✗ | ✓ Generated from state |
| Confirmation-bias backstop | ✗ | ✓ Explicit critique search |
| Auto-update on invocation | N/A | ✓ Phase 0 Step 0 |
| MCP graceful degradation | Hard-wired to one tool | ✓ Scripts work when MCP fails |
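
The citation-anchor row is mechanically checkable. Below is a minimal sketch of such a check, assuming the [^id] footnote convention named in the table; the claim heuristic and regex are illustrative, not the skill's actual enforcement.

import re

ANCHOR = re.compile(r"\[\^[\w:-]+\]")

def unanchored_claims(report_md: str) -> list[str]:
    """Flag prose paragraphs that carry no [^id] citation anchor (sketch)."""
    flagged = []
    for para in report_md.split("\n\n"):
        text = para.strip()
        if text and not text.startswith("#") and not ANCHOR.search(text):
            flagged.append(text[:80])  # preview of the offending paragraph
    return flagged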

5 Report Archetypes

The right structure for the right question — picked from user intent, not forced into a generic outline.

📖

Literature Review

What is known about a topic. Thematic sections, synthesis, gap analysis. The default when intent is "survey the field."

🔬

Systematic Review

Rigorous PRISMA-lite for a narrow question across many studies. Extraction table + pooled findings + explicit criteria.

🗺️

Scoping Review

"What has been studied, and how?" Breadth over depth. Coverage map, methods inventory, research-gap pivot.

⚖️

Comparative Analysis

A vs B head-to-head. Axes of comparison, per-axis verdict, evidence-backed recommendation.

📝

Grant Background

Narrative introduction for a proposal. Problem significance → what's known → what's missing → why our approach.

🎯

Intent-Driven

Phase 0 matches your question to the right archetype, or asks when it's ambiguous. The template shapes every downstream phase.
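
As a rough illustration of that matching step, a keyword heuristic is sketched below. The cue words and the ask-when-ambiguous fallback are assumptions inferred from the card descriptions, not the skill's actual Phase 0 logic.

ARCHETYPE_CUES = {
    "literature_review":    ["survey", "what is known", "overview"],
    "systematic_review":    ["systematic", "across studies", "prisma"],
    "scoping_review":       ["what has been studied", "landscape", "breadth"],
    "comparative_analysis": [" vs ", "versus", "compare"],
    "grant_background":     ["grant", "proposal", "significance"],
}

def pick_archetype(question: str) -> str | None:
    """Return the single matching archetype, or None to ask the user."""
    q = question.lower()
    hits = [name for name, cues in ARCHETYPE_CUES.items()
            if any(cue in q for cue in cues)]
    return hits[0] if len(hits) == 1 else None  # ambiguous or no match: ask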

Install

Pick your platform. After cloning, pip install -r requirements.txt pulls in httpx and pypdf. No API keys required.

# === Claude Code ===
# Global install (available in all projects)
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  ~/.claude/skills/scholar-deep-research
cd ~/.claude/skills/scholar-deep-research
pip install -r requirements.txt

# Project-level install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  .claude/skills/scholar-deep-research

# === OpenCode ===
# Global install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  ~/.config/opencode/skills/scholar-deep-research
cd ~/.config/opencode/skills/scholar-deep-research
pip install -r requirements.txt

# Project-level install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  .opencode/skills/scholar-deep-research

# === OpenClaw ===
# Via ClawHub registry
clawhub install scholar-deep-research

# Manual install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  ~/.openclaw/skills/scholar-deep-research

# === Hermes ===
# Install under research category
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  ~/.hermes/skills/research/scholar-deep-research

# Or add to ~/.hermes/config.yaml:
# skills:
#   external_dirs:
#     - ~/myskills/scholar-deep-research

# === Agents-standard directory ===
# User-level install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  ~/.agents/skills/scholar-deep-research

# Project-level install
git clone https://github.com/Agents365-ai/scholar-deep-research.git \
  .agents/skills/scholar-deep-research

# === SkillsMP ===
# Via SkillsMP CLI
skills install scholar-deep-research

Easiest install: just tell your agent — "Install https://github.com/Agents365-ai/scholar-deep-research for me and run pip install -r requirements.txt". Claude Code, OpenCode, OpenClaw, Hermes, Codex, and pi-mono will all do the right thing.