A didactic example of a proper deep-research pipeline built out of small, single-purpose Atomic Agents.
Unlike a typical "search-and-summarise" agent — generate one set of queries, fetch results, write an answer — this example iterates: it plans sub-topics, researches each one across multiple depth levels, reflects on whether each has enough coverage, and produces a report where every claim is tied to a registered source.
- Plan. A
PlannerAgentbreaks the question into 3–5 durable sub-topics, each seeded with a handful of queries. - Research (per sub-topic, up to N iterations):
- Search (SearXNG) and scrape the top new URLs.
ExtractorAgentpulls atomic, citable claims from each scraped page.ReflectorAgentdecides whether the sub-topic has enough material, or emits follow-up queries for the next iteration.
- Write.
WriterAgentdrafts a cited report from the accumulated state, then runs a second pass over its own draft to strip any sentence whose citation doesn't correspond to a real source.
Every agent has a single responsibility and reads / contributes to a shared ResearchState object. The loop itself lives in main.py as plain Python — no megagent, no hidden control flow.
-
Clone the main Atomic Agents repository:
git clone https://github.com/BrainBlend-AI/atomic-agents
-
Navigate to the Deep Research directory:
cd atomic-agents/atomic-examples/deep-research -
Install dependencies using uv:
uv sync
-
Set up environment variables: Create a
.envfile in thedeep-researchdirectory with:OPENAI_API_KEY=your_openai_api_key SEARXNG_BASE_URL=http://localhost:8080 SEARXNG_API_KEY=your_searxng_secret_key
-
Set up SearXNG:
- Install from the official repository.
- Default configuration expects SearXNG at
http://localhost:8080. - JSON output must be enabled in
settings.yml(look for theformats:key).
-
Run a research query:
uv run python -m deep_research "What is the current state of fusion energy research?"
- One-shot (
python -m deep_research "your question"): plan → research → write, prints a cited report and exits. - Chat (
python -m deep_research): same first turn, then a REPL where each follow-up is routed by aDeciderAgentto either another research pass + Q&A or straight Q&A against the existing state.
deep_research/
├── __main__.py # python -m deep_research entrypoint
├── main.py # Plain orchestrator: plan → research → write (+ chat loop)
├── config.py # Model + connectivity + research budgets
├── state.py # ResearchState dataclass — the one source of truth
├── context_providers.py # Renders state + current date into agent system prompts
├── agents/
│ ├── planner_agent.py # Question → sub-topics (with initial queries)
│ ├── extractor_agent.py # One scraped source → atomic claims
│ ├── reflector_agent.py # Sub-topic state → sufficient? + next queries
│ ├── writer_agent.py # Full state → cited report (draft + verify passes)
│ ├── decider_agent.py # Chat mode: research more, or answer from state?
│ └── qa_agent.py # Chat mode: cited answer from existing state
└── tools/
├── searxng_search.py
└── webpage_scraper.py
All limits live in ResearchBudget inside config.py. Tune to taste:
| Knob | Default | Meaning |
|---|---|---|
num_sub_topics |
4 | Plan width |
max_depth_per_sub_topic |
2 | Max iterations per sub-topic; reflector can stop earlier |
search_results_per_query |
5 | SearXNG page size |
scrape_top_n_per_iteration |
3 | New URLs scraped per iteration |
hard_call_cap |
80 | Global safety net on total agent calls |
Worst-case first turn with defaults: 1 plan + 4×2×(1 extract×3 sources + 1 reflect) = 33 agent calls + 2 writer passes ≈ 35 agent calls, 24 scrapes. Chat follow-ups add a decider call plus either Q&A or another research pass; the hard_call_cap of 80 leaves headroom.
MIT — see the LICENSE file.