codebase-index ships a stdio MCP server powered by the optional `mcp` extra:

```bash
pip install "codebase-index[mcp]"
codebase-index mcp --root /path/to/repo
```

The server speaks MCP over stdio through FastMCP. Build the index with `codebase-index index` before connecting a client.

Current shipped interfaces:

- `codebase-index` / `cbx` CLI
- `codebase-index mcp --root <repo>` stdio MCP server
- Claude Code skill generated by `codebase-index init --target claude`
- Codex `AGENTS.md` package generated by `--target codex`
- OpenCode command/agent resources generated by `--target opencode`
- Optional hooks and `watch` mode for freshness

Current non-goals:

- No HTTP/SSE transport.
- No streaming result events yet; use `limit` and `token_budget`.
- Client-specific config paths are templates until verified against each client version.

The MCP server exposes the same retrieval contract as the CLI.
| Tool | Purpose | CLI equivalent |
|---|---|---|
| `healthcheck` | Report server, package, config, index, freshness, and safety status | `doctor`, freshness checks |
| `search_code` | Hybrid code search returning ranked `file:line` ranges | `search` |
| `find_symbol` | Locate definitions by symbol name/kind | `symbol` |
| `find_refs` | Return callers/references for a symbol | `refs` |
| `impact_of` | Return affected files/symbols from graph expansion | `impact` |
| `explain_code` | Intent-aware retrieval packet for a natural-language question | `explain` |
| `architecture_overview` | Modules, god nodes, surprising connections, suggested questions | `architecture` |
| `path_between` | Shortest dependency/call path between two symbols or files | `path` |
| `describe_symbol` | Node card: definition, callers, callees, centrality, module | `describe` |
| `index_stats` | Return counts, language coverage, graph stats, freshness | `stats` |

Tool responses are JSON strings returned through MCP content blocks. Every payload, success or error, is wrapped in a stable envelope so clients can branch on the contract without sniffing the shape:

```json
{
  "schema_version": 1,
  "tool": "search_code",
  "index": {
    "exists": true,
    "stale": false,
    "built_at": "2026-05-29T12:00:00Z",
    "files_changed_since_build": 0
  },
  "results": [],
  "recommended_reads": []
}
```

- `schema_version` (int): the payload contract version. Bumped only on a breaking change (field removal or type change); additive fields keep the same version. The current version is 1.
- `tool` (string): the emitting tool name (`search_code`, `find_symbol`, `find_refs`, `impact_of`, `explain_code`, `architecture_overview`, `path_between`, `describe_symbol`, `index_stats`, `healthcheck`).
- The no-index / error path carries the same envelope plus an `"error"` field.

Rules:

- Additive fields are allowed within a `schema_version`.
- Field removal or type changes bump `schema_version`.
- Tool descriptions should include examples and expected failure modes.
- Errors should fail closed: no partial unsafe result when config or index state is unsafe.

Every tool's enveloped output is locked by golden snapshots in
`tests/golden/mcp_*.json` (regenerate intentionally with
`UPDATE_GOLDEN=1 pytest tests/test_mcp_golden.py`), and the `schema_version` /
`tool` values are asserted explicitly so a golden can never silently freeze a
wrong contract version.
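
The golden-snapshot workflow described above can be sketched as follows. This is not the project's actual test code: `check_golden` is a hypothetical helper, the snapshot serialization (sorted keys, two-space indent) is an assumption, and the demo writes to a temporary directory rather than `tests/golden/`.

```python
import json
import os
import tempfile
from pathlib import Path


def check_golden(path: Path, tool: str, payload: dict) -> None:
    """Compare payload with the snapshot at path, or rewrite it when UPDATE_GOLDEN=1."""
    # Assert the contract fields explicitly, before any snapshot is read or written,
    # so regenerating a golden cannot freeze a wrong version or tool name.
    assert payload["schema_version"] == 1
    assert payload["tool"] == tool
    text = json.dumps(payload, indent=2, sort_keys=True) + "\n"
    if os.environ.get("UPDATE_GOLDEN") == "1":
        path.write_text(text, encoding="utf-8")
        return
    assert path.read_text(encoding="utf-8") == text, f"golden mismatch: {path}"


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "mcp_index_stats.json"
    payload = {"schema_version": 1, "tool": "index_stats", "results": []}
    os.environ["UPDATE_GOLDEN"] = "1"
    check_golden(golden, "index_stats", payload)  # writes the snapshot
    os.environ.pop("UPDATE_GOLDEN")
    check_golden(golden, "index_stats", payload)  # compares against it
```
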

```json
{
  "mcpServers": {
    "codebase-index": {
      "command": "codebase-index",
      "args": ["mcp", "--root", "/path/to/repo"]
    }
  }
}
```

```json
{
  "mcpServers": {
    "codebase-index": {
      "command": "codebase-index",
      "args": ["mcp", "--root", "${PWD}"]
    }
  }
}
```

Use the client's MCP server configuration UI or JSON file and register:

```json
{
  "name": "codebase-index",
  "command": "codebase-index",
  "args": ["mcp", "--root", "/path/to/repo"]
}
```

Client-specific config file paths and screenshots should be added only after they are verified against the current client versions.

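
The JSON snippets above share one launch entry, so a script can add it to an existing client configuration without touching other servers. A minimal sketch, assuming the client uses the `mcpServers` layout shown above; `server_entry` and `register` are hypothetical helper names, and where the resulting config is stored is left to the client.

```python
import json


def server_entry(root: str) -> dict:
    """Build the stdio launch entry for a repository root."""
    return {"command": "codebase-index", "args": ["mcp", "--root", root]}


def register(config: dict, root: str) -> dict:
    """Return a copy of config with codebase-index added under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    servers["codebase-index"] = server_entry(root)
    return {**config, "mcpServers": servers}


# "other-server" stands in for whatever the client already has configured.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
updated = register(existing, "/path/to/repo")
print(json.dumps(updated, indent=2))
```
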
Large codebase queries currently support:

- `limit`
- `token_budget`

Future protocol work should add one of:

- `cursor` for paging
- `token_budget` to cap output
- progressive result events where supported by the MCP SDK

Agents should be able to stop after enough context rather than receiving a large static payload.
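
To make the interaction between `limit` and `token_budget` concrete, here is a minimal sketch of trimming a ranked result list under both caps. It does not describe how the server enforces them: the `Hit` shape, the four-characters-per-token estimate, and `take_within_budget` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    path: str
    start_line: int
    end_line: int
    snippet: str


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about four characters per token), not a real tokenizer."""
    return max(1, len(text) // 4)


def take_within_budget(hits: list[Hit], limit: int, token_budget: int) -> list[Hit]:
    """Keep ranked hits in order until the count limit or the token budget is reached."""
    kept: list[Hit] = []
    spent = 0
    for hit in hits:
        if len(kept) >= limit:
            break
        cost = estimate_tokens(hit.snippet)
        if spent + cost > token_budget:
            break  # stop at the first hit that would exceed the budget
        kept.append(hit)
        spent += cost
    return kept


hits = [
    Hit("a.py", 1, 10, "x" * 40),   # about 10 tokens
    Hit("b.py", 5, 20, "y" * 80),   # about 20 tokens
    Hit("c.py", 1, 3, "z" * 40),    # about 10 tokens
]
print([h.path for h in take_within_budget(hits, limit=5, token_budget=30)])
```

Stopping at the first hit that overflows the budget, instead of skipping it and trying lower-ranked hits, keeps the output a strict prefix of the ranking.
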
The MCP server reads the same repository data as the CLI, so it inherits the same trust boundaries:
- Respect `.gitignore`, `.codeindexignore`, `.claudeignore`, and `.cursorignore`.
- Exclude secret filenames and binary/generated/dependency files before parsing.
- Redact tokens, credentials, private keys, JWTs, and connection strings in snippets.
- Make external embeddings opt-in only, with explicit config and warnings.
- Never log source snippets or secrets by default.
- Bind only to stdio unless a future HTTP transport has explicit host/protocol restrictions.
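
The snippet-redaction rule in the list above can be illustrated with a small regex pass. This is a sketch only, not the project's redaction code: the patterns, the `[REDACTED]` marker, and the `redact` function are assumptions, and a production rule set would be broader and tuned against false positives.

```python
import re

# Illustrative patterns only (assumed, not taken from the project).
PATTERNS = [
    # PEM private key blocks
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----", re.DOTALL),
    # JWTs: three dot-separated base64url segments, the first two starting with "eyJ"
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # Connection strings with inline credentials: scheme://user:password@host...
    re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@]+@[^\s'\"]+"),
]
# Assignments whose key name suggests a secret; keeps the key and quotes, drops the value.
ASSIGNMENT = re.compile(
    r"(?i)((?:api[_-]?key|token|secret|password)\s*[:=]\s*)(['\"]?)[^\s'\"]+\2"
)


def redact(snippet: str, marker: str = "[REDACTED]") -> str:
    """Replace secret-looking substrings in a snippet with a marker."""
    for pattern in PATTERNS:
        snippet = pattern.sub(marker, snippet)
    return ASSIGNMENT.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{marker}{m.group(2)}", snippet
    )


print(redact('API_KEY = "sk-12345"'))
print(redact("url = postgres://admin:hunter2@db.internal:5432/app"))
```
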

- Done: `src/codebase_index/mcp/server.py` thin adapter over retrieval/storage code.
- Done: `codebase-index mcp --root <path>` CLI entrypoint.
- Done: `healthcheck`, `search_code`, `find_symbol`, `find_refs`, `impact_of`, `explain_code`, `architecture_overview`, `path_between`, `describe_symbol`, and `index_stats` tools.
- Done: focused tests for tool registration, missing-index behavior, config resolution, and run entrypoint.
- Done: explicit `schema_version` + `tool` envelope on every structured tool payload (including the error path), asserted by `tests/test_mcp_server.py` and `tests/test_mcp_golden.py`.
- Done: golden snapshots for every tool output (`tests/golden/mcp_*.json`).
- Done: unstructured-output registration (`structured_output=False` where supported) so the server loads on `mcp>=1.27` + `pydantic>=2.10`, where auto-detecting a structured schema from the `-> str` return annotation otherwise raises at import time.
- Follow-up: verified client-specific docs for Claude Desktop, Claude Code, Cursor, VS Code, Zed, and Windsurf.
- Follow-up: paging or progressive result support.