AutoSci

Read, think, experiment, write, evolve — the AI research agent with memory that compounds across every project.

⚠️ Status & Update

Thanks to everyone who's been trying AutoSci — the community response has been amazing! AutoSci evolved from our earlier OmegaWiki prototype into what we're building toward: a next-generation research agent that can handle the full scientific lifecycle. We're actively testing and iterating on new features, and more capabilities are on the way. Jump in, break things, and tell us what you think — your feedback and ideas are what's shaping where this goes next. 🙏

🌿 Which branch? main is the stable, lean version. The full system described in our paper (SciMem · SciFlow · SciDAG · SciEvolve) lives on the paper branch, frozen as tag arxiv-v1. Note that the paper branch is a research snapshot, not a finished product: it is under active testing and iteration, and some capabilities described in the paper are still being implemented and refined.

📄 Paper

AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle

📄 Read on arXiv: https://arxiv.org/abs/2605.31468

If you find AutoSci useful in your research, please cite our paper.

📌 Poster & Demo

AutoSci poster.

▶ Watch the AutoSci demo on Bilibili.

🆕 What's New

🛠️ 2026-05-19 · Experiment Overhaul

A possible usage process:

/ideate [research-direction-or-topic] (add --skip-pilot to skip the preliminary pilot experiments)
→ /exp-design <idea-slug>
→ for each experiment block, the recommended flow is /exp-run <slug> [--env local|remote] to deploy, then /exp-status to monitor, then /exp-run <slug> --collect to collect results
→ /exp-eval <experiment-slug>
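The command order above can be written down as a small helper that prints the slash commands to type, in sequence. This is only an illustrative sketch: the function name and slugs are hypothetical, it starts after /ideate, and it assumes /exp-eval is run once per experiment slug, which the changelog does not state explicitly.

```python
def experiment_flow(idea_slug, block_slugs, env="local"):
    """Return the slash commands for one idea, in the recommended order."""
    if env not in ("local", "remote"):
        raise ValueError("env must be 'local' or 'remote'")
    steps = [f"/exp-design {idea_slug}"]
    for slug in block_slugs:
        steps += [
            f"/exp-run {slug} --env {env}",  # deploy
            "/exp-status",                   # monitor
            f"/exp-run {slug} --collect",    # collect results
            f"/exp-eval {slug}",             # assumed: one verdict per experiment
        ]
    return steps


# Hypothetical slugs, for illustration only.
for command in experiment_flow("my-idea", ["main-exp"], env="remote"):
    print(command)
```

The helper only builds strings; the commands themselves still have to be entered in a Claude Code session.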

✨ New Skills
/exp-pilot-run: pilot experiment execution (write code, deploy, monitor, collect raw results).
/exp-pilot-eval: pilot result evaluation (read results, apply lenient verdict logic).
These two skills are built into Phase 5 of /ideate.

🛠️ Modified Skills
/ideate: 5 structured generation paths (A-E) for both Claude and the Review LLM. Phases are restructured: Filter & Validation is merged into Phase 3, Write Wiki moves to Phase 4, and Phase 5 completes pilot design and workflow invocation. Ideas now follow a clearer path, and pilot experiments provide a more reasonable screening mechanism.
/exp-design: a brand-new experiment design process (method candidate generation + 5 experiment block types + iterative ablation loop).
/exp-run: adds a code decision gate, code optimization, and a config check.

🎨 2026-05-18 · /poster — drafted paper → print-ready conference poster

Run /poster after /paper-draft + /paper-compile to turn your finished draft into a self-contained 1400×900 HTML poster and a print-quality PNG. Figures, booktabs tables, and math macros are extracted automatically from your LaTeX source; Claude walks you through picking which figures land in which sections and customizing the header (venue, affiliation logo). Export to PDF from your browser's print dialog. Pipeline adapted from PaperX (arXiv:2602.03866).

🎯 2026-05-12 · /discover from a venue — "what should I read first from ICLR 2024?"

Run /discover --venue iclr --year 2024 (or any conference/year) and get a personalized shortlist of papers from that venue, ranked by relevance to what's already in your wiki. Instead of scrolling a 7000-paper proceedings, you see the dozen that actually matter for your research direction, each with a rationale tied to topics and methods you already track. No new API keys, no ingest side-effects on your wiki — just a ranked reading list. Supports NeurIPS, ICLR, ICML, and other venues covered by Paper Copilot.

📰 2026-05-09 · Daily arXiv — fresh-paper recommendations, on demand or scheduled

Run /daily-arxiv for a one-off pass, or /daily-arxiv setup to schedule the same pipeline in GitHub Actions. The skill builds an evidence packet from arXiv + Semantic Scholar + DeepXiv, lets the LLM rank candidates against your wiki interests, and delivers a digest by e-mail. With an explicit --mode auto-ingest, it calls /ingest for high-confidence picks; in inform mode it only notifies.

🌐 2026-05-06 · Knowledge Graph Visualization — browser + Obsidian

Your research graph now has two ways to explore:

Web UI — run python3 tools/serve.py, open http://localhost:8765/#/graph. Click any node to highlight its neighborhood via BFS, filter by entity type or edge category, double-click to open the full page in the Reader.
Obsidian — run /visualize --obsidian to generate a color-coded graph config, or /visualize --canvas to produce a force-layout Canvas with labeled semantic edges.
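For readers unfamiliar with the neighborhood highlight, the idea can be sketched as a hop-limited breadth-first search over an adjacency map. This is a minimal illustration, not the Web UI's code: the README only says "via BFS", so the function name, the max_hops parameter, and the toy graph are all assumptions.

```python
from collections import deque


def neighborhood(adjacency, start, max_hops=1):
    """Return the set of nodes within max_hops of start, via breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


# Hypothetical toy graph; page names are illustrative only.
graph = {
    "paper-a": ["concept-x", "method-y"],
    "concept-x": ["paper-a", "paper-b"],
    "method-y": ["paper-a"],
    "paper-b": ["concept-x"],
}
print(sorted(neighborhood(graph, "paper-a", max_hops=1)))
```

With max_hops=1 only the direct neighbors of the clicked node are returned; raising the limit widens the highlighted region.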

Team

AutoSci is built by DAIR Lab at Peking University.

Weitong Qian (PKU, Undergraduate · 2023)
Beicheng Xu (PKU, Ph.D. · 2023)
Zhongao Xie (PKU, Undergraduate · 2025)
Bowen Fan (PKU, Undergraduate · 2024)
Guozheng Tang (PKU, Undergraduate · 2024)
Xinzhe Wu (PKU, Undergraduate · 2024)
Jiale Chen (PKU, Undergraduate · 2024)
Mingtian Yang (PKU, Undergraduate · 2024)
Chenyang Di (PKU, Undergraduate · 2023)

...and more contributors who have shaped AutoSci along the way.

What is AutoSci?

Scientific research has traditionally been human-intensive: researchers coordinate literature, ideas, experiments, manuscripts, and review responses across long project cycles. AutoSci is a memory-centric agentic system that automates the full research lifecycle — from paper ingestion to rebuttal — while maintaining structured persistent memory across projects and improving its own procedures over time.

🔬 Works Produced with AutoSci

The following papers were generated end-to-end using AutoSci — from literature ingestion and idea generation to experiment execution and manuscript writing.

Paper	Domain	PDF
Agent-driven iterative optimization of Triton GPU kernels	GPU kernel optimization	📄 PDF
PTM-aware degrader target nomination via calibrated ternary-complex scoring	Biomedical drug discovery	📄 PDF
Forced Honesty Dissociates Polite Speech from Motivated Cognition in LLM Attitude Ratings	LLMs as cognitive models	📄 PDF

Have you used AutoSci in your own research? We'd love to feature your work here — open a PR or drop us a message!

Quick Start

Prerequisites: Python 3.9+, Node.js 18+
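If you want to verify the prerequisites before cloning, a check along these lines may help. It is a sketch under assumptions: the helper names are hypothetical, and it expects the usual `node --version` output format such as v18.17.0.

```python
import sys


def node_major(version_text):
    """Parse the major version from `node --version` output such as 'v18.17.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def meets_prerequisites(python_version, node_version_text):
    """True when Python is 3.9+ and Node.js is 18+."""
    return tuple(python_version[:2]) >= (3, 9) and node_major(node_version_text) >= 18


# Checks the running interpreter against a sample Node.js version string.
print(meets_prerequisites(sys.version_info, "v18.17.0"))
```

In practice the second argument would come from running `node --version` yourself and pasting in the result.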

# 1. Clone
git clone https://github.com/skyllwt/AutoSci.git
cd AutoSci

# 2. Install Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 3. One-click setup
chmod +x setup.sh && ./setup.sh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1
# setup creates a .venv for AutoSci; /init will use it automatically

# 4. Put your own papers in raw/papers/ (.tex or .pdf)
#    Optional: intent notes in raw/notes/, saved pages in raw/web/

# 5. Build your research memory and start a project
claude
# Then type: /init [your-research-topic]

Manual setup (Linux / macOS)

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env                 # Edit to add API keys
cp config/settings.local.json.example .claude/settings.local.json

Manual setup (Windows / PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Copy-Item .env.example .env          # Edit to add API keys
Copy-Item config\settings.local.json.example .claude\settings.local.json

Note: native Windows is supported for the local pipeline. Remote-GPU experiments via /exp-run --env remote rely on ssh/rsync/screen and are best run from WSL2 or Linux/macOS.
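A quick way to see whether a machine has the tools the remote path relies on is to look them up on PATH. The sketch below uses the standard library's shutil.which; the helper name and the injectable which parameter are illustrative choices, not part of AutoSci.

```python
import shutil

# Tools named in the note above for /exp-run --env remote.
REMOTE_TOOLS = ("ssh", "rsync", "screen")


def missing_tools(tools=REMOTE_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result suggests the environment can at least launch the remote workflow; it says nothing about SSH credentials or the remote host itself.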

API Keys

Key	Required?	How to get	What it enables
`ANTHROPIC_API_KEY`	Yes (or use a third-party compatible API — see below)	`claude login` (automatic)	Powers all Claude Code skills
`CLAUDE_CODE_OAUTH_TOKEN`	Optional	`claude setup-token`	GitHub Actions Claude Code auth for Pro/Max users
`SEMANTIC_SCHOLAR_API_KEY`	Optional	semanticscholar.org/product/api (free)	Citation graph, paper search
`DEEPXIV_TOKEN`	Optional	`setup.sh` auto-registers	Semantic search, TLDR, trending
`LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL`	Optional	Any OpenAI-compatible API	Cross-model review; `/daily-arxiv` inform recommendations

Don't have an Anthropic API key? AutoSci runs on Claude Code, which supports any Anthropic-protocol-compatible provider — DeepSeek, Kimi, MiMo, GLM, and more. See the LLM API Configuration section below for setup snippets.

Cross-model review: AutoSci uses a second LLM as an independent reviewer for ideas, experiments, and paper drafts. Works with any OpenAI-compatible API — DeepSeek, OpenAI, Qwen, OpenRouter, SiliconFlow, etc. If not configured, skills still work in Claude-only mode.
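To see which mode your environment would land in, you can check the three review variables from the table above. This is a sketch only: the function name and return shape are hypothetical, and it assumes all three variables must be non-empty for cross-model review, which the README implies but does not spell out.

```python
import os

# Variable names taken from the API Keys table.
REVIEW_KEYS = ("LLM_API_KEY", "LLM_BASE_URL", "LLM_MODEL")


def review_mode(env=None):
    """Return (mode, missing_keys) for a mapping of environment variables."""
    env = os.environ if env is None else env
    missing = [key for key in REVIEW_KEYS if not env.get(key)]
    return ("claude-only", missing) if missing else ("cross-model", [])


mode, missing = review_mode()
print(mode, missing)
```

Passing a plain dict instead of relying on os.environ makes it easy to test a .env file's contents before starting a session.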

LLM API Configuration

AutoSci runs on Claude Code, which speaks the Anthropic API protocol. You can use Claude directly, or route Claude Code to any third-party provider that exposes an Anthropic-compatible endpoint by overriding a few environment variables.

Option A: Native Claude

claude login   # OAuth, no manual config

Option B: Third-party Anthropic-compatible API

Pick a provider below, paste the snippet into ~/.claude/settings.json (or the project's .claude/settings.json), and replace the <...> placeholder with your own API key. Model names and extra options follow each provider's official Claude Code docs.

MiMo / DeepSeek / Kimi / GLM configuration examples

MiMo (Xiaomi)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.xiaomimimo.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-mimo-key>",
    "ANTHROPIC_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "mimo-v2.5-pro",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "mimo-v2.5"
  }
}

DeepSeek

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-deepseek-key>",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

Kimi (Moonshot)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.moonshot.ai/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-moonshot-key>",
    "ANTHROPIC_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "kimi-k2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "kimi-k2.5",
    "ENABLE_TOOL_SEARCH": "false"
  }
}

GLM (Z.AI)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-zai-key>",
    "API_TIMEOUT_MS": "3000000"
  }
}

Z.AI applies a default server-side model mapping, so no explicit ANTHROPIC_MODEL is needed.
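A common slip with the snippets above is leaving a <...> placeholder in place. The following sketch checks a settings.json string for that and for the two variables every snippet sets. The function name and the choice of "required" keys are assumptions made for illustration; Claude Code performs no such check itself as far as this README states.

```python
import json
import re

# A value that is entirely an angle-bracket placeholder, e.g. "<your-zai-key>".
PLACEHOLDER = re.compile(r"^<.*>$")
# Present in all four provider snippets above.
REQUIRED = ("ANTHROPIC_BASE_URL", "ANTHROPIC_AUTH_TOKEN")


def check_settings(text):
    """Return a list of problems found in a settings.json string."""
    env = json.loads(text).get("env", {})
    problems = [f"missing {key}" for key in REQUIRED if key not in env]
    problems += [
        f"placeholder left in {key}"
        for key, value in env.items()
        if isinstance(value, str) and PLACEHOLDER.match(value)
    ]
    return problems


sample = """
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-zai-key>"
  }
}
"""
print(check_settings(sample))
```

An empty list means the basic shape looks right; it does not confirm that the key or endpoint actually works.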

Skip the Claude Code onboarding: when using a third-party key, create or edit .claude.json (~/.claude.json on macOS/Linux) and add { "hasCompletedOnboarding": true }.
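If .claude.json already exists, overwriting it would discard its other entries, so the flag is best merged in. A minimal sketch of that merge, assuming the file is a single JSON object (the helper name is hypothetical):

```python
import json
from pathlib import Path


def mark_onboarded(path):
    """Set hasCompletedOnboarding to true, preserving any existing keys."""
    path = Path(path)
    raw = path.read_text(encoding="utf-8") if path.exists() else ""
    data = json.loads(raw) if raw.strip() else {}
    data["hasCompletedOnboarding"] = True
    path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data


# Example call on macOS/Linux (not executed here):
# mark_onboarded(Path.home() / ".claude.json")
```

Consider backing up the file first, since the function rewrites it in place.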

Skills

AutoSci ships with 30+ slash commands spanning the full research lifecycle.

Phase 0: Setup

Command	What it does
`/setup`	Interactive API key configuration — checks `.env` state and walks through Semantic Scholar, DeepXiv, and Review LLM setup
`/reset`	Destructive cleanup — reset wiki state to a clean scaffold by scope (`wiki / raw / log / checkpoints / all`)

Phase 1: Knowledge Base

Command	What it does
`/prefill`	Seed `wiki/foundations/` with domain background so subsequent `/ingest` doesn't create duplicate concept pages for textbook material
`/init`	Bootstrap the wiki from your source files, with optional discovery, then ingest the final paper set in parallel
`/ingest`	Ingest a paper (local path or arXiv URL) — creates pages and builds all cross-references and graph edges
`/discover`	Build a ranked shortlist of candidate papers (anchor-driven, topic-driven, venue-filtered, or from wiki state) without ingesting
`/edit`	Add or remove raw sources, or update wiki content, per user request
`/ask`	Ask the wiki a question — retrieve and synthesize relevant pages, optionally crystallize the answer back into the wiki
`/check`	Scan the full wiki to detect health issues and produce a tiered fix-recommendation report

Phase 2: Ideation & Experiments

Command	What it does
`/daily-arxiv`	Run or schedule the daily arXiv recommendation feed; delivers a ranked digest by email with optional auto-ingest for high-confidence picks
`/ideate`	Multi-phase research idea generation: landscape scan → dual-model brainstorm → filter & validation → write to wiki → pilot
`/exp-pilot-run`	Pilot experiment execution — write code, deploy, monitor, collect raw results (called by `/ideate` Phase 5)
`/exp-pilot-eval`	Pilot result evaluation — read results, apply success criteria, update idea page (called by `/ideate` Phase 5)
`/novelty`	Multi-source novelty verification via WebSearch + Semantic Scholar + wiki + Review LLM; outputs novelty score and recommendations
`/review`	Cross-model review of any research artifact — outputs structured scores, wiki entity mapping, and improvement suggestions
`/exp-design`	Idea-driven experiment design with iterative ablation — method candidates → benchmark selection → sensitivity analysis → main experiment
`/exp-run`	Full experiment execution pipeline — prepare code → deploy → monitor → collect results
`/exp-status`	View the status of all running experiments; optionally auto-collect completed runs and advance the pipeline
`/exp-eval`	Experiment verdict gate — Review LLM independently judges results and auto-updates the linked idea's status and graph edges
`/refine`	Multi-round iterative improvement — repeatedly calls `/review`, parses feedback, applies fixes, and updates wiki until target score
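The /refine row describes a review-and-fix loop that stops at a target score. Its control flow can be sketched as below. This is an illustration of the loop shape only: review and apply_fixes are hypothetical callables standing in for the /review call and the fix step, and the max_rounds cap is an assumption rather than a documented option.

```python
def refine(artifact, review, apply_fixes, target_score, max_rounds=5):
    """Review, fix, and repeat until the score reaches target_score or rounds run out."""
    history = []
    for _ in range(max_rounds):
        score, feedback = review(artifact)
        history.append(score)
        if score >= target_score:
            break  # good enough: stop without another fix pass
        artifact = apply_fixes(artifact, feedback)
    return artifact, history


# Toy stand-ins: the "score" is the text length and each fix appends one character.
final, scores = refine(
    "ab",
    review=lambda text: (len(text), "make it longer"),
    apply_fixes=lambda text, feedback: text + "x",
    target_score=4,
)
print(final, scores)
```

Keeping the score history makes it easy to see whether successive rounds are actually improving the artifact.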

Phase 3: Writing & Dissemination

Command	What it does
`/survey`	Generate a Related Work section from wiki knowledge — thematic grouping → narrative structure → LaTeX output
`/paper-plan`	Compile a paper outline from the idea graph — evidence map → narrative structure → section + figure + citation plan
`/paper-draft`	Draft a LaTeX paper from `PAPER_PLAN` — write each section from wiki sources, generate figures/tables, verify BibTeX
`/paper-compile`	LaTeX compile → PDF — latexmk compile + auto-fix + page count / anonymity / font checks + submission checklist
`/research`	End-to-end research orchestrator — idea discovery → experiment design → execution → verdict → paper writing with human gates
`/rebuttal`	Parse review comments → atomize concerns → map to wiki → stress-test with Review LLM → generate rebuttal
`/poster`	Generate an academic poster from a drafted paper — distill sections into a single-page HTML poster with figures

Utilities

Command	What it does
`/visualize`	Generate Obsidian graph configs and Canvas knowledge maps; the interactive web graph is served by `tools/serve.py`

Contributing

We welcome contributions and feedback — especially while we're in active iteration. See CONTRIBUTING.md.

Community

Scan to join the AutoSci WeChat group

Citation

If you find AutoSci useful in your research, please cite our paper:

@misc{qian2026autosci,
      title={AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle}, 
      author={Weitong Qian and Beicheng Xu and Zhongao Xie and Bowen Fan and Guozheng Tang and Jiale Chen and Xinzhe Wu and Mingtian Yang and Chenyang Di and Jiajun Li and Lingching Tung and Peichao Lai and Yifei Xia and Ziyi Guo and Yanwei Xu and Yanzhao Qin and Shaoduo Gan and Xupeng Miao and Bin Cui},
      year={2026},
      eprint={2605.31468},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.31468}, 
}

Acknowledgments

Claude Code — the AI agent runtime that powers AutoSci
The /poster pipeline is adapted from PaperX

License

MIT — use it, fork it, build on it.

Built with Claude Code

If this project helps your research, give it a ⭐
