Skip to content

skyllwt/AutoSci

AutoSci Logo

AutoSci

Read, think, experiment, write, evolve — the AI research agent with memory that compounds across every project.

License: MIT Python 3.9+ Claude Code arXiv Status


⚠️ Status & Update

Thanks to everyone who's been trying AutoSci — the community response has been amazing! AutoSci evolved from our earlier OmegaWiki prototype into what we're building toward: a next-generation research agent that can handle the full scientific lifecycle. We're actively testing and iterating on new features, and more capabilities are on the way. Jump in, break things, and tell us what you think — your feedback and ideas are what's shaping where this goes next. 🙏

🌿 Which branch? main is the stable, lean version. The full system described in our paper — SciMem · SciFlow · SciDAG · SciEvolve — lives on the paper branch (frozen as tag arxiv-v1). Note that paper is a research snapshot, not a finished product: it's under active testing and iteration, and some capabilities described in the paper are still being implemented and refined.


📄 Paper

arXiv  ·  📄 Read on arXiv →

If you find AutoSci useful in your research, please cite our paper.


📌 Poster & Demo

AutoSci conference poster
AutoSci poster — click to view full size.
▶ Watch AutoSci on Bilibili
▶ Watch the AutoSci demo on Bilibili

🆕 What's New

🛠️ 2026-05-19 · Experiment Overhaul

A possible usage process:/ideate [research-direction-or-topic](You can use --skip-pilot to decide whether to conduct preliminary experiments) -> /exp-design <idea-slug>-> For each experimental block,recommended flow: /exp-run <slug> [--env local|remote] to deploy → /exp-status to monitor → /exp-run <slug> --collect to collect.->/exp-eval <experiment-slug>

✨ : New Skills /exp-pilot-run — Pilot experiment execution: write code, deploy, monitor, collect raw results. /exp-pilot-eval — Pilot result evaluation: read results, apply lenient verdict logic These two skills are built into Phase5 of /ideate 🛠️ : Modified Skills /ideate 5 structured generation paths (A-E) for both Claude and Review LLM. Phase restructuring: Filter & Validation merged into Phase 3, Write Wiki moved to Phase 4. Phase 5: Finish pilot design and workflow invocation Your ideas will follow a clearer path, and a more reasonable screening mechanism will be established through pilot experiments. /exp-design A brand-new experimental design process:method candidate generation + 5 experiment block types + iterative ablation loop /exp-run Add the code decision gate, code optimization and config check

🎨 2026-05-18 · /poster — drafted paper → print-ready conference poster

Run /poster after /paper-draft + /paper-compile to turn your finished draft into a self-contained 1400×900 HTML poster and a print-quality PNG. Figures, booktabs tables, and math macros are extracted automatically from your LaTeX source; Claude walks you through picking which figures land in which sections and customizing the header (venue, affiliation logo). Export to PDF from your browser's print dialog. Pipeline adapted from PaperX (arXiv:2602.03866).

Example /poster output

🎯 2026-05-12 · /discover from a venue — "what should I read first from ICLR 2024?"

Run /discover --venue iclr --year 2024 (or any conference/year) and get a personalized shortlist of papers from that venue, ranked by relevance to what's already in your wiki. Instead of scrolling a 7000-paper proceedings, you see the dozen that actually matter for your research direction, each with a rationale tied to topics and methods you already track. No new API keys, no ingest side-effects on your wiki — just a ranked reading list. Supports NeurIPS, ICLR, ICML, and other venues covered by Paper Copilot.

📰 2026-05-09 · Daily arXiv — fresh-paper recommendations, on demand or scheduled

Run /daily-arxiv for a one-off pass, or /daily-arxiv setup to schedule the same pipeline in GitHub Actions. The skill builds an evidence packet from arXiv + Semantic Scholar + DeepXiv, lets the LLM rank candidates against your wiki interests, and delivers a digest by e-mail. Explicit --mode auto-ingest calls /ingest for high-confidence picks; inform mode just notifies.

🌐 2026-05-06 · Knowledge Graph Visualization — browser + Obsidian

Your research graph now has two ways to explore:

  • Web UI — run python3 tools/serve.py, open http://localhost:8765/#/graph. Click any node to highlight its neighborhood via BFS, filter by entity type or edge category, double-click to open the full page in the Reader.
  • Obsidian — run /visualize --obsidian to generate a color-coded graph config, or /visualize --canvas to produce a force-layout Canvas with labeled semantic edges.

Team

AutoSci is built by DAIR Lab at Peking University.

Weitong Qian

Weitong Qian
PKU
Undergraduate · 2023
Beicheng Xu

Beicheng Xu
PKU
Ph.D. · 2023
Zhongao Xie

Zhongao Xie
PKU
Undergraduate · 2025
Bowen Fan

Bowen Fan
PKU
Undergraduate · 2024
Guozheng Tang

Guozheng Tang
PKU
Undergraduate · 2024
Xinzhe Wu

Xinzhe Wu
PKU
Undergraduate · 2024
Jiale Chen

Jiale Chen
PKU
Undergraduate · 2024
Mingtian Yang

Mingtian Yang
PKU
Undergraduate · 2024
Chenyang Di

Chenyang Di
PKU
Undergraduate · 2023
...and more contributors who have shaped AutoSci along the way.

What is AutoSci?

Scientific research has traditionally been human-intensive: researchers coordinate literature, ideas, experiments, manuscripts, and review responses across long project cycles. AutoSci is a memory-centric agentic system that automates the full research lifecycle — from paper ingestion to rebuttal — while maintaining structured persistent memory across projects and improving its own procedures over time.

AutoSci system overview

🔬 Works Produced with AutoSci

The following papers were generated end-to-end using AutoSci — from literature ingestion and idea generation to experiment execution and manuscript writing.

Paper Domain PDF
Agent-driven iterative optimization of Triton GPU kernels GPU kernel optimization 📄 PDF
PTM-aware degrader target nomination via calibrated ternary-complex scoring Biomedical drug discovery 📄 PDF
Forced Honesty Dissociates Polite Speech from Motivated Cognition in LLM Attitude Ratings LLMs as cognitive models 📄 PDF

Have you used AutoSci in your own research? We'd love to feature your work here — open a PR or drop us a message!


Quick Start

Prerequisites: Python 3.9+, Node.js 18+

# 1. Clone
git clone https://github.com/skyllwt/AutoSci.git
cd AutoSci

# 2. Install Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 3. One-click setup
chmod +x setup.sh && ./setup.sh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1
# setup creates a .venv for AutoSci; /init will use it automatically

# 4. Put your own papers in raw/papers/ (.tex or .pdf)
#    Optional: intent notes in raw/notes/, saved pages in raw/web/

# 5. Build your research memory and start a project
claude
# Then type: /init [your-research-topic]
Manual setup (Linux / macOS)
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env                 # Edit to add API keys
cp config/settings.local.json.example .claude/settings.local.json
Manual setup (Windows / PowerShell)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Copy-Item .env.example .env          # Edit to add API keys
Copy-Item config\settings.local.json.example .claude\settings.local.json

Note: native Windows is supported for the local pipeline. Remote-GPU experiments via /exp-run --env remote rely on ssh/rsync/screen and are best run from WSL2 or Linux/macOS.

API Keys

Key Required? How to get What it enables
ANTHROPIC_API_KEY Yes (or use a third-party compatible API — see below) claude login (automatic) Powers all Claude Code skills
CLAUDE_CODE_OAUTH_TOKEN Optional claude setup-token GitHub Actions Claude Code auth for Pro/Max users
SEMANTIC_SCHOLAR_API_KEY Optional semanticscholar.org/product/api (free) Citation graph, paper search
DEEPXIV_TOKEN Optional setup.sh auto-registers Semantic search, TLDR, trending
LLM_API_KEY + LLM_BASE_URL + LLM_MODEL Optional Any OpenAI-compatible API Cross-model review; /daily-arxiv inform recommendations

Don't have an Anthropic API key? AutoSci runs on Claude Code, which supports any Anthropic-protocol-compatible provider — DeepSeek, Kimi, MiMo, GLM, and more. See the LLM API Configuration section below for setup snippets.

Cross-model review: AutoSci uses a second LLM as an independent reviewer for ideas, experiments, and paper drafts. Works with any OpenAI-compatible API — DeepSeek, OpenAI, Qwen, OpenRouter, SiliconFlow, etc. If not configured, skills still work in Claude-only mode.


LLM API Configuration / 大模型 API 配置

AutoSci runs on Claude Code, which speaks the Anthropic API protocol. You can use Claude directly, or route Claude Code to any third-party provider that exposes an Anthropic-compatible endpoint by overriding a few environment variables.

AutoSci 基于 Claude Code,Claude Code 使用 Anthropic API 协议通信。你既可以直接使用 Claude,也可以通过覆盖几个环境变量,把 Claude Code 指向任意支持 Anthropic 协议的第三方供应商。

Option A — Native Claude / 原生 Claude

claude login   # OAuth, no manual config / OAuth 登录,无需手动配置

Option B — Third-party Anthropic-compatible API / 第三方 Anthropic 兼容 API

Pick a provider below, paste the snippet into ~/.claude/settings.json (or the project's .claude/settings.json), and replace the <...> placeholder with your own API key. Model names and extra options follow each provider's official Claude Code docs.

从下方任选一个供应商,把对应配置粘贴到 ~/.claude/settings.json(或项目的 .claude/settings.json),并把 <...> 占位符替换为你自己的 API key。模型名与额外选项均来自各供应商官方 Claude Code 文档。

MiMo / DeepSeek / Kimi / GLM 配置示例

MiMo (小米)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.xiaomimimo.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-mimo-key>",
    "ANTHROPIC_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "mimo-v2.5-pro",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "mimo-v2.5"
  }
}

DeepSeek

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-deepseek-key>",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

Kimi (Moonshot)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.moonshot.ai/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-moonshot-key>",
    "ANTHROPIC_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "kimi-k2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "kimi-k2.5",
    "ENABLE_TOOL_SEARCH": "false"
  }
}

GLM (Z.AI)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-zai-key>",
    "API_TIMEOUT_MS": "3000000"
  }
}

Z.AI applies a default server-side model mapping, so no explicit ANTHROPIC_MODEL is needed. Z.AI 默认在服务端做模型映射,无需显式设置 ANTHROPIC_MODEL

Skip the Claude Code onboarding / 跳过 Claude Code 初始引导: when using a third-party key, create or edit .claude.json (~/.claude.json on macOS/Linux) and add { "hasCompletedOnboarding": true }.


Skills

AutoSci ships with 30+ slash commands spanning the full research lifecycle.

View all skills

Phase 0: Setup

Command What it does
/setup Interactive API key configuration — checks .env state and walks through Semantic Scholar, DeepXiv, and Review LLM setup
/reset Destructive cleanup — reset wiki state to a clean scaffold by scope (wiki / raw / log / checkpoints / all)

Phase 1: Knowledge Base

Command What it does
/prefill Seed wiki/foundations/ with domain background so subsequent /ingest doesn't create duplicate concept pages for textbook material
/init Bootstrap the wiki from your source files, with optional discovery, then ingest the final paper set in parallel
/ingest Ingest a paper (local path or arXiv URL) — creates pages and builds all cross-references and graph edges
/discover Build a ranked shortlist of candidate papers (anchor-driven, topic-driven, venue-filtered, or from wiki state) without ingesting
/edit Add or remove raw sources, or update wiki content, per user request
/ask Ask the wiki a question — retrieve and synthesize relevant pages, optionally crystallize the answer back into the wiki
/check Scan the full wiki to detect health issues and produce a tiered fix-recommendation report

Phase 2: Ideation & Experiments

Command What it does
/daily-arxiv Run or schedule the daily arXiv recommendation feed; delivers a ranked digest by email with optional auto-ingest for high-confidence picks
/ideate Multi-phase research idea generation: landscape scan → dual-model brainstorm → filter & validation → write to wiki → pilot
/exp-pilot-run Pilot experiment execution — write code, deploy, monitor, collect raw results (called by /ideate Phase 5)
/exp-pilot-eval Pilot result evaluation — read results, apply success criteria, update idea page (called by /ideate Phase 5)
/novelty Multi-source novelty verification via WebSearch + Semantic Scholar + wiki + Review LLM; outputs novelty score and recommendations
/review Cross-model review of any research artifact — outputs structured scores, wiki entity mapping, and improvement suggestions
/exp-design Idea-driven experiment design with iterative ablation — method candidates → benchmark selection → sensitivity analysis → main experiment
/exp-run Full experiment execution pipeline — prepare code → deploy → monitor → collect results
/exp-status View the status of all running experiments; optionally auto-collect completed runs and advance the pipeline
/exp-eval Experiment verdict gate — Review LLM independently judges results and auto-updates the linked idea's status and graph edges
/refine Multi-round iterative improvement — repeatedly calls /review, parses feedback, applies fixes, and updates wiki until target score

Phase 3: Writing & Dissemination

Command What it does
/survey Generate a Related Work section from wiki knowledge — thematic grouping → narrative structure → LaTeX output
/paper-plan Compile a paper outline from the idea graph — evidence map → narrative structure → section + figure + citation plan
/paper-draft Draft a LaTeX paper from PAPER_PLAN — write each section from wiki sources, generate figures/tables, verify BibTeX
/paper-compile LaTeX compile → PDF — latexmk compile + auto-fix + page count / anonymity / font checks + submission checklist
/research End-to-end research orchestrator — idea discovery → experiment design → execution → verdict → paper writing with human gates
/rebuttal Parse review comments → atomize concerns → map to wiki → stress-test with Review LLM → generate rebuttal
/poster Generate an academic poster from a drafted paper — distill sections into a single-page HTML poster with figures

Utilities

Command What it does
/visualize Generate Obsidian graph configs and Canvas knowledge maps; the interactive web graph is served by tools/serve.py

Contributing

We welcome contributions and feedback — especially while we're in active iteration. See CONTRIBUTING.md.

Community / 交流群

WeChat Group QR Code

Scan to join the AutoSci WeChat group / 扫码加入微信交流群

Citation

If you find AutoSci useful in your research, please cite our paper:

@misc{qian2026autosci,
      title={AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle}, 
      author={Weitong Qian and Beicheng Xu and Zhongao Xie and Bowen Fan and Guozheng Tang and Jiale Chen and Xinzhe Wu and Mingtian Yang and Chenyang Di and Jiajun Li and Lingching Tung and Peichao Lai and Yifei Xia and Ziyi Guo and Yanwei Xu and Yanzhao Qin and Shaoduo Gan and Xupeng Miao and Bin Cui},
      year={2026},
      eprint={2605.31468},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.31468}, 
}

Acknowledgments

  • Claude Code — the AI agent runtime that powers AutoSci
  • The /poster pipeline is adapted from PaperX

License

MIT — use it, fork it, build on it.

Star History

Built with Claude Code

If this project helps your research, give it a ⭐

About

Karpathy's LLM-Wiki vision, fully realized — wiki-centric full-lifecycle AI research platform powered by Claude Code

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors