Deep Evidence Researcher turns a single question into a structured investigation. You type a question; the agent searches arXiv and Google Scholar for relevant papers, opens the promising ones to read their actual content, and as it goes it saves every concrete claim it finds into a small SQLite notebook — recording not just the claim but its subject, the relation, the value, the verbatim quote, and the source URL. Because every note is structured the same way, the system can automatically group claims that talk about the same thing and flag where sources agree, disagree, or stand alone. When the agent has enough evidence, it writes a short synthesis citing the URLs, and then a separate verifier pass re-reads the notebook to produce a typed report — agreements, disagreements with each side's sources, and a confidence score. The whole loop runs through a multi-provider LLM gateway so you can pick which models (Gemini, Groq, etc.) are allowed to do the thinking, and a Streamlit UI streams every search, page fetch, and note in real time so you can see how the answer was built, not just what it is.
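The structured-note idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the table layout, the column names, and the helpers `add_note` and `group_claims` are assumptions based only on the fields listed above, and the URLs and quotes are placeholders.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: columns mirror the fields described above,
# but the real notes.db layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notes (
    id       INTEGER PRIMARY KEY,
    claim    TEXT NOT NULL,
    subject  TEXT NOT NULL,
    relation TEXT NOT NULL,
    value    TEXT NOT NULL,
    quote    TEXT NOT NULL,
    url      TEXT NOT NULL
)
"""

def add_note(conn, claim, subject, relation, value, quote, url):
    """Store one structured claim."""
    conn.execute(
        "INSERT INTO notes (claim, subject, relation, value, quote, url) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (claim, subject, relation, value, quote, url),
    )

def group_claims(conn):
    """Group notes by (subject, relation) and label each group."""
    groups = defaultdict(lambda: defaultdict(set))
    rows = conn.execute("SELECT subject, relation, value, url FROM notes")
    for subject, relation, value, url in rows:
        key = (subject.strip().lower(), relation.strip().lower())
        groups[key][value.strip().lower()].add(url)

    report = {}
    for key, by_value in groups.items():
        sources = set().union(*by_value.values())
        if len(by_value) > 1:
            status = "disagree"      # same subject and relation, different values
        elif len(sources) > 1:
            status = "agree"         # one value, backed by several sources
        else:
            status = "standalone"    # one value, one source
        report[key] = {
            "status": status,
            "values": {v: sorted(urls) for v, urls in by_value.items()},
        }
    return report

# Placeholder data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
add_note(conn, "A uses method M", "system a", "uses", "method m",
         "placeholder quote 1", "https://example.org/paper-1")
add_note(conn, "A uses method M", "System A", "uses", "Method M",
         "placeholder quote 2", "https://example.org/paper-2")
add_note(conn, "B is faster", "system b", "speed vs a", "faster",
         "placeholder quote 3", "https://example.org/paper-1")
add_note(conn, "B is slower", "system b", "speed vs a", "slower",
         "placeholder quote 4", "https://example.org/paper-3")
add_note(conn, "C is open source", "system c", "license", "open source",
         "placeholder quote 5", "https://example.org/paper-4")
report = group_claims(conn)
```

Grouping on a normalized `(subject, relation)` key is the simplest choice that works; a real implementation would likely need fuzzier matching so that paraphrased subjects land in the same group.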
- mcp_server.py: MCP tools (search_arxiv, search_google_scholar, fetch_page, notes_add, notes_list, notes_grouped)
- researcher.py: native tool-use agent loop + verifier turn (typed VerifierReport)
- app.py: Streamlit UI with live trace, claim groups, and verdict
- llm_gatewayV2/: FastAPI gateway in front of 7 LLM providers (own venv), authored by Rohan Shravan, vendored here with permission
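As a rough picture of what a typed verifier report might look like, here is a sketch using plain dataclasses. Only the name VerifierReport and its three ingredients (agreements, disagreements with each side's sources, a confidence score) come from this README. The field names, the 0 to 1 confidence range, and the `parse_report` helper are assumptions, and the real class in researcher.py may be shaped differently.

```python
from dataclasses import dataclass, field

@dataclass
class Side:
    """One position in a disagreement, with the URLs that support it."""
    value: str
    sources: list[str]

@dataclass
class Agreement:
    subject: str
    relation: str
    value: str
    sources: list[str]

@dataclass
class Disagreement:
    subject: str
    relation: str
    sides: list[Side]

@dataclass
class VerifierReport:
    agreements: list[Agreement] = field(default_factory=list)
    disagreements: list[Disagreement] = field(default_factory=list)
    confidence: float = 0.0  # assumed range: 0.0 to 1.0

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

def parse_report(payload: dict) -> VerifierReport:
    """Build a report from a decoded JSON object (hypothetical wire format)."""
    return VerifierReport(
        agreements=[Agreement(**a) for a in payload.get("agreements", [])],
        disagreements=[
            Disagreement(d["subject"], d["relation"],
                         [Side(**s) for s in d["sides"]])
            for d in payload.get("disagreements", [])
        ],
        confidence=float(payload.get("confidence", 0.0)),
    )

# Placeholder payload, for illustration only.
example = parse_report({
    "agreements": [
        {"subject": "system a", "relation": "uses", "value": "method m",
         "sources": ["https://example.org/paper-1", "https://example.org/paper-2"]},
    ],
    "disagreements": [
        {"subject": "system b", "relation": "speed vs a", "sides": [
            {"value": "faster", "sources": ["https://example.org/paper-1"]},
            {"value": "slower", "sources": ["https://example.org/paper-3"]},
        ]},
    ],
    "confidence": 0.7,
})
```

Validating the shape at parse time means a malformed model response fails loudly instead of reaching the UI as a half-filled verdict.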
Requirements: Python 3.14+, uv, API keys in .env (Gemini / Groq / OpenRouter / etc.)
- Start the LLM gateway (separate shell):
  cd llm_gatewayV2 && ./run.sh # serves :8100
- CLI:
  uv run python researcher.py "Do LLMs reason or pattern-match?"
- Streamlit UI:
  uv run streamlit run app.py
Notes accumulate in notes.db alongside the server.
- Semantic Scholar integration: free API (no key), gives the citation graph (forward + backward refs). For a project called "Deep Evidence", walking citations is the whole game.
- PDF reading (fetch_pdf via pypdf/pdfplumber): half the arXiv hits are PDFs and trafilatura silently fails on them; the agent currently reads only the abstract.
- Tavily / Brave web search: for non-academic questions (news, blogs, docs). Today the toolbelt is research-paper-shaped only.
- Two-pass research mode: pass 1 gathers (5–6 turns), pass 2 drills into contradictions surfaced by notes_grouped. UI toggle. Today the agent stops at "good enough" without using its own contradiction signal.
- Inline citation hover: each (https://…) link in the synthesis reveals the supporting note's quote on hover. Cheap to build, big trust boost.
- Source quality signal: flag peer-reviewed vs preprint vs blog in the notes table; weight synthesis toward stronger sources.
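The source quality signal from the list above could start as something as small as a hostname lookup. The sketch below is one possible shape, not a planned implementation: the host lists, the weights, and the function names `classify_source` and `weigh_values` are all illustrative assumptions, and a hostname is only a crude proxy for peer review.

```python
from urllib.parse import urlparse

# Illustrative host lists and weights; every value here is an assumption.
PREPRINT_HOSTS = {"arxiv.org", "biorxiv.org", "medrxiv.org"}
PEER_REVIEWED_HOSTS = {"aclanthology.org", "nature.com", "science.org"}
WEIGHTS = {"peer_reviewed": 1.0, "preprint": 0.6, "blog": 0.3}

def _host_matches(host, known_hosts):
    """True if host equals a known host or is a subdomain of one."""
    return any(host == h or host.endswith("." + h) for h in known_hosts)

def classify_source(url):
    """Return 'peer_reviewed', 'preprint', or 'blog' (the catch-all)."""
    host = (urlparse(url).hostname or "").lower()
    if _host_matches(host, PREPRINT_HOSTS):
        return "preprint"
    if _host_matches(host, PEER_REVIEWED_HOSTS):
        return "peer_reviewed"
    return "blog"

def weigh_values(values_to_urls):
    """Sum source weights per competing value, so stronger sources count more."""
    return {
        value: round(sum(WEIGHTS[classify_source(u)] for u in urls), 2)
        for value, urls in values_to_urls.items()
    }

# Placeholder URLs, for illustration only.
scores = weigh_values({
    "faster": ["https://arxiv.org/abs/0000.00000", "https://example.org/post"],
    "slower": ["https://www.nature.com/articles/placeholder"],
})
```

With these example weights, one peer-reviewed source outweighs a preprint plus a blog post, which is the kind of tilt the roadmap item describes. The exact numbers would need tuning against real notes.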
llm_gatewayV2/ was authored by Rohan Shravan and vendored into this repo with permission. All other code in this repository is original to AxiomLoop.
