diff --git a/AGENTS.md b/AGENTS.md
index ba6eb95..fc9f9b8 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,8 +1,9 @@
 ## codebaseGraph workflow
 
-- Treat the repo-local `.codebaseGraph` graph as the project operating source of truth.
-- Use `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-search --repo-root . --no-refresh --detail slim --context-limit 1 --json` before answering repo-structure questions or performing coding tasks.
-- Use `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-context --repo-root . --profile --no-refresh --detail slim --context-limit 2 --json` when relationships or nearby evidence matter; useful profiles include `definitions`, `dependencies`, `callgraph`, `docs`, `runtime`, and `change_impact`.
+- Treat the repo-local `.codebaseGraph` graph as the project operating source of truth. It is prohibited to read the code source before you find the target files using the graph.
+- AI agents must use block format for `graph-search` and `graph-context`; reserve `--json` for tests, APIs, or explicit structured-payload debugging.
+- Use `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-search --repo-root . --no-refresh --detail slim --context-limit 1 --format block` before answering repo-structure questions or performing coding tasks.
+- Use `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-context --repo-root . --profile --no-refresh --detail slim --context-limit 2 --format block` when relationships or nearby evidence matter; useful profiles include `definitions`, `dependencies`, `callgraph`, `docs`, `runtime`, and `change_impact`.
 - For architecture orientation, run `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-architecture-queries`, then execute selected read-only statements with `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-query "" --repo-root .`.
 - Use `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-schema` or `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph graph-query-helpers` before writing raw graph queries, add `--pretty` for indented JSON when humans need to inspect output, and keep `graph-query` read-only.
 - Refresh the graph with `/Users/rabii/Projects/Repositories/codebaseGraph/.venv/bin/codebase-graph setup --repo-root . --mcp-client none` when files change materially. Setup config: `/Users/rabii/Projects/Repositories/codebaseGraph/.codebaseGraph/config.json`.
diff --git a/README.md b/README.md
index 1cf97c8..400d9dd 100644
--- a/README.md
+++ b/README.md
@@ -1,19 +1,17 @@
 # codebaseGraph
 
-`codebaseGraph` is a generic project/code knowledge graph engine for coding repositories. It materializes Python source, `AGENTS.md`, `CLAUDE.md`, Markdown, and MDX files into a LadyBugDB-backed graph, then exposes search, compact context, schema, query helpers, and a read-only MCP tool surface for coding agents.
+`codebaseGraph` builds a repo-local knowledge graph for coding agents. It materializes Python source, `AGENTS.md`,
+`CLAUDE.md`, Markdown, and MDX files into a LadyBugDB-backed graph, then exposes search, compact context, schema, query
+helpers, and read-only MCP tools.
 
-LadyBugDB is a required runtime dependency. A normal production install must include `real_ladybug`; setup fails before creating repository state if the runtime cannot open a graph database.
+Requires Python 3.10+ and a package build that includes `real_ladybug`.
 
-## Production install
+## Quick start
 
 ```bash
 python -m pip install cbasegraph
-```
-
-From a repository root, run:
-
-```bash
 codebase-graph setup --repo-root .
+codebase-graph graph-search SampleService --repo-root . --no-refresh --format block
 ```
 
 Setup creates:
@@ -25,138 +23,66 @@ Setup creates:
  _graph.ldb
 ```
 
-For a repository named `my-service`, the database path is exactly `.codebaseGraph/my-service_graph.ldb`.
+For a repository named `my-service`, the database path is `.codebaseGraph/my-service_graph.ldb`.
 
-The setup command also:
+The setup command materializes the graph, writes or updates one marked codebaseGraph block in `AGENTS.md` or
+`CLAUDE.md`, and installs a Codex MCP client entry unless skipped.
 
-- Materializes the repository graph into the repo-local database.
-- Writes or updates one marked codebaseGraph block in `AGENTS.md` or `CLAUDE.md`.
-- Installs an MCP client entry named `codebase_graph`, unless skipped.
-
-Useful options:
+Useful setup options:
 
 ```bash
 codebase-graph setup --repo-root /path/to/repo
-codebase-graph setup --mcp-client codex
 codebase-graph setup --mcp-client claude
-codebase-graph setup --mcp-client claude-project
 codebase-graph setup --mcp-client lmstudio
-codebase-graph setup --mcp-client hermes
-codebase-graph setup --mcp-client openclaw
-codebase-graph setup --mcp-client generic
-codebase-graph setup --mcp-config-path /tmp/client-config
-codebase-graph setup --dry-run
 codebase-graph setup --skip-mcp-config
 codebase-graph setup --instructions-target claude
+codebase-graph setup --dry-run --pretty
 ```
 
-`--dry-run` returns the raw server descriptor plus the exact client patch or payload without writing repository graph state, instruction files, or MCP client files.
-
-## MCP installation
-
-The user-facing installer is:
+## MCP install
 
 ```bash
-codebase-graph mcp install
+codebase-graph mcp install --client codex
 ```
 
-By default this installs Codex with a repository-specific server name, for example `codebase_graph_my_service`. It builds the server descriptor from `.codebaseGraph/config.json`, uses the supported native client CLI when available, and falls back to the adapter file writer when the CLI is missing or fails.
+Supported clients are `codex`, `claude`, `claude-project`, `lmstudio`, `hermes`, `openclaw`, and `generic`. + +Server naming: + +- `codebase-graph setup` installs the default MCP server as `codebase_graph`. +- Standalone `codebase-graph mcp install` defaults to `codebase_graph_`. +- Use `--name codebase_graph` to override the standalone installer name. -Useful installer options: +The installer builds the server descriptor from `.codebaseGraph/config.json`, uses a supported native client CLI when +available, and falls back to writing the client config file directly. Use `--dry-run --json` to inspect the emitted +command or config patch before writing, and `--verify` to run a stdio smoke test after installation. ```bash -codebase-graph mcp install --client codex codebase-graph mcp install --client claude --scope user codebase-graph mcp install --client claude-project -codebase-graph mcp install --client lmstudio -codebase-graph mcp install --client hermes -codebase-graph mcp install --client openclaw -codebase-graph mcp install --client generic codebase-graph mcp install --client all --dry-run --json -codebase-graph mcp install --name codebase_graph codebase-graph mcp install --config-path /path/to/.codebaseGraph/config.json codebase-graph mcp install --verify ``` -Native CLI installers are attempted first for Codex, Claude, Claude project scope, and OpenClaw: - -```bash -codex mcp add -- -claude mcp add --transport stdio --scope -- -openclaw mcp set '' -``` - -If native installation is unavailable, codebaseGraph writes the client config file directly. `setup --mcp-client ...` remains supported and delegates to the same installer behavior after materializing graph state and updating instructions. The default MCP server name is `codebase_graph`, which avoids mixed-case tool namespace issues in clients that normalize or validate MCP labels strictly. - -`--dry-run` reports the native command or emitted file patch without calling native CLIs or writing files. 
`--verify` runs a direct stdio MCP smoke test and, where available, asks the client CLI whether it can see the server. - ## MCP usage -Setup and install build one canonical server descriptor and serialize it into the selected client format. When run from a virtual environment, the command may be the absolute path to that environment's `codebase-graph` executable so the MCP client can launch it without relying on shell `PATH`. - -Codex uses `~/.codex/config.toml`: - -```toml -[mcp_servers.codebase_graph] -command = "codebase-graph" -args = ["mcp", "serve", "--config", ".codebaseGraph/config.json"] -startup_timeout_sec = 60 -``` - -Claude Desktop, Claude project config, LM Studio, and generic MCP JSON use an `mcpServers` shape: - -```json -{ - "mcpServers": { - "codebase_graph": { - "type": "stdio", - "command": "codebase-graph", - "args": ["mcp", "serve", "--config", ".codebaseGraph/config.json"] - } - } -} -``` - -OpenClaw uses JSON5-compatible JSON under `mcp.servers`, and Hermes emits YAML under `mcp_servers` in `~/.hermes/config.yaml`. LM Studio reads `~/.lmstudio/mcp.json` and requires enabling "Allow calling servers from mcp.json" in the app. Use `codebase-graph mcp install --dry-run --client --json` to inspect the exact emitted command or patch before installation. - -Client examples: - -```bash -codebase-graph mcp install --client codex -codebase-graph mcp install --client claude -codebase-graph mcp install --client claude-project -codebase-graph mcp install --client lmstudio -codebase-graph mcp install --client hermes -codebase-graph mcp install --client openclaw -codebase-graph mcp install --client generic --dry-run --json -``` - -The server can also be run directly: +Stdio is the default transport for local MCP clients: ```bash codebase-graph mcp serve --config .codebaseGraph/config.json codebase-graph-mcp --config .codebaseGraph/config.json ``` -Stdio is the default transport for local MCP clients. 
An optional local Streamable HTTP transport is available for clients that connect to an HTTP endpoint: +HTTP is available for local endpoint clients: ```bash codebase-graph mcp http --config .codebaseGraph/config.json --host 127.0.0.1 --port 8765 ``` -The HTTP transport rejects non-local bind hosts unless `--allow-remote` is passed. Keep it bound to `127.0.0.1` -for normal use. Remote binding requires a bearer token: - -```bash -CODEBASE_GRAPH_MCP_TOKEN="$(openssl rand -hex 32)" -codebase-graph mcp http --config .codebaseGraph/config.json --host 0.0.0.0 --allow-remote --auth-token-env CODEBASE_GRAPH_MCP_TOKEN -``` - -Clients must send `Authorization: Bearer `. The token gate does not add TLS, rate limiting, authorization scopes, or -a multi-user session model; put remote HTTP behind a trusted network boundary and TLS-terminating proxy. - -HTTP clients must start with JSON-RPC `initialize`, then send the returned `Mcp-Session-Id` response header on later -requests. Requests without a known session id are rejected before tool dispatch. +Keep HTTP bound to `127.0.0.1` for normal use. Remote binding requires `--allow-remote` and a bearer token, but does not +provide TLS, rate limiting, authorization scopes, or a multi-user security model. HTTP clients must initialize first and +send the returned `Mcp-Session-Id` header on later requests. Available MCP tools: @@ -168,83 +94,53 @@ Available MCP tools: - `graph_architecture_queries` - `graph_query` with write-like statements blocked -`graph_query` returns at most 1,000 rows per call and fetches only one extra row to determine whether the result was -truncated. Add a narrower `MATCH` pattern or a query-side `LIMIT` for broader graph exploration. - -For coding-task architecture orientation, call `graph_architecture_queries` first to fetch the grouped read-only Cypher catalog, then run selected statements with `graph_query`. 
- -## Operational diagnostics - -Runtime warning and error paths emit structured JSON events to stderr. Set `CODEBASE_GRAPH_LOG_LEVEL=INFO` to include -setup start/completion diagnostics; the default level is `WARNING`. +`graph_query` returns at most 1,000 rows per call. Add a narrower `MATCH` pattern or a query-side `LIMIT` for broader +graph exploration. -Examples of emitted events include: +## CLI workflow -- `setup.failed` -- `mcp.tool_error` -- `mcp.stdio_parse_error` -- `mcp.http_forbidden_origin` -- `materializer.lock_exists` -- `materializer.stale_lock_removed` - -## CLI graph workflow - -The CLI exposes the same graph workflow as the MCP tools, which is useful in clients that do not surface MCP tools directly: +The CLI mirrors the MCP tools for clients that do not surface MCP directly: ```bash codebase-graph graph-health --repo-root . -codebase-graph graph-search SampleService --repo-root . --no-refresh --detail slim --context-limit 1 --json -codebase-graph graph-context SampleService --repo-root . --profile definitions --no-refresh --detail slim --context-limit 2 --json -codebase-graph graph-schema -codebase-graph graph-query-helpers -codebase-graph graph-architecture-queries --group overview +codebase-graph graph-context SampleService --repo-root . --profile definitions --format block codebase-graph graph-query "MATCH (n) RETURN count(n) AS total_nodes LIMIT 1" --repo-root . ``` -CLI JSON output is minified by default to reduce tokens. Add `--pretty` to JSON-producing commands when you want indented output. Retrieval commands support `--detail standard|slim`; `standard` keeps the full payload, while `slim` drops score diagnostics and duplicate or empty summary fields. - -`graph-query` blocks write-like statements and should be used read-only. The older `search` and `context` commands remain available. 
Setup reports the explicit database and manifest paths to use with them when needed: +Use `--format block` for agent-facing output and `--json --pretty` for structured inspection. Retrieval commands also +support `--detail standard|slim`; `slim` drops score diagnostics and duplicate or empty summary fields. -```bash -codebase-graph search SampleService \ - --source-root . \ - --db .codebaseGraph/_graph.ldb \ - --manifest .codebaseGraph/manifest.json -``` +For coding-task architecture orientation, call `graph_architecture_queries` first, then run selected statements with +`graph_query`. -## Development install +## Development ```bash python -m pip install -e .[dev] -``` - -Run checks: - -```bash python -m pytest ruff check . ``` -## CI and releases - -GitHub Actions runs pytest across Linux, macOS, and Windows for Python 3.10 through 3.14, plus ruff, supply-chain, and package-build validation. Supply-chain checks include dependency consistency, vulnerability advisory scanning, Dependabot update coverage, immutable GitHub Action pins, and CycloneDX SBOM generation. Built wheels and source distributions are smoke-tested with `setup`, `graph-health`, `graph-search`, and a stdio MCP handshake before release. Releases are managed by release-please, use tag-derived package versions, create GitHub Releases with distribution assets and SBOMs, and publish to PyPI through Trusted Publishing. - -Run `python scripts/check_release_gate.py` for local release-gate checks. Use the `--production` confirmations documented in [docs/release.md](docs/release.md) before publishing. - -Conda distribution uses the conda-forge staged-recipes path rather than direct Anaconda.org uploads. See [docs/release.md](docs/release.md) for the release workflow and conda-forge submission checklist. +## Release and security -## Security +CI runs pytest across Linux, macOS, and Windows for Python 3.10 through 3.14, plus ruff, package-build checks, +supply-chain validation, and smoke tests. 
See [docs/release.md](docs/release.md) for the full release process and +conda-forge checklist. -Report suspected vulnerabilities privately. See [SECURITY.md](SECURITY.md) for supported versions, reporting expectations, and the local-first MCP security boundary. +Report suspected vulnerabilities privately. See [SECURITY.md](SECURITY.md) for supported versions, reporting +expectations, and the local-first MCP security boundary. ## Troubleshooting -- Missing LadyBugDB: install a package build that includes `real_ladybug`; setup will fail before creating `.codebaseGraph`. +- Missing LadyBugDB: install a package build that includes `real_ladybug`; setup fails before creating `.codebaseGraph` + if the runtime cannot open a graph database. - Stale graph: rerun `codebase-graph setup --repo-root .` after material source or documentation changes. -- Broken Codex config: rerun `codebase-graph mcp install --client codex --verify`, then check `codex mcp list`. -- Broken Claude config: rerun `codebase-graph mcp install --client claude --scope user --verify` or `codebase-graph mcp install --client claude-project --verify`. -- Broken LM Studio, Hermes, OpenClaw, or generic config: run `codebase-graph mcp install --client --dry-run --json` first, then inspect the emitted payload and target path. -- PATH or executable issues: run setup from the virtual environment that contains `codebase-graph`; the descriptor prefers that absolute executable path. -- Direct smoke test: run `codebase-graph mcp serve --config .codebaseGraph/config.json` and send MCP `initialize`, `tools/list`, and `tools/call` JSON-RPC messages over stdio. -- Unsupported files: binary, vendor, cache, virtualenv, build, dist, `.codebase_graph`, and `.codebaseGraph` paths are skipped. -- Lock/contention errors: stop other graph materialization or setup processes using the same `.codebaseGraph/_graph.ldb`. 
Stale locks with dead writer PIDs are removed automatically; if the error remains, inspect the `.ldb.lock` file before removing it manually. +- Broken client config: rerun `codebase-graph mcp install --client --verify`. +- PATH or executable issues: run setup from the virtual environment that contains `codebase-graph`; the descriptor + prefers that absolute executable path. +- Unsupported files: binary, vendor, cache, virtualenv, build, dist, `.codebase_graph`, and `.codebaseGraph` paths are + skipped. +- Lock errors: stop other graph materialization or setup processes using the same + `.codebaseGraph/_graph.ldb`. Stale locks with dead writer PIDs are removed automatically; if the error + remains, inspect the `.ldb.lock` file before removing it manually. +- Diagnostics: set `CODEBASE_GRAPH_LOG_LEVEL=INFO` to include setup start/completion events on stderr. diff --git a/src/codebase_graph/cli/__init__.py b/src/codebase_graph/cli/__init__.py index 2b8af75..b86f24b 100644 --- a/src/codebase_graph/cli/__init__.py +++ b/src/codebase_graph/cli/__init__.py @@ -8,126 +8,23 @@ from codebase_graph.db import create_ladybug_database from codebase_graph.ingest import GraphMaterializer +from codebase_graph.mcp.graph_commands import ( + add_compact_context_arguments, + add_json_output_arguments, + graph_command_names, + graph_command_spec, + graph_command_specs, +) from codebase_graph.mcp.runtime import runtime_config from codebase_graph.mcp.tools import handle_tool_call -from codebase_graph.ontology import CONTEXT_PROFILES, QUERY_HELPERS, schema_payload -from codebase_graph.reasoning import architecture_query_catalog -from codebase_graph.retrieval import SearchRequest, SearchService +from codebase_graph.retrieval import SearchRequest, SearchService, serialize_graph_block from codebase_graph.setup import SetupError, SetupOptions, run_setup from codebase_graph.setup.clients import supported_client_ids from codebase_graph.setup.installer import McpInstallOptions, 
install_mcp_clients, supported_install_client_ids def main(argv: Sequence[str] | None = None) -> int: - parser = argparse.ArgumentParser(prog="codebase-graph") - subparsers = parser.add_subparsers(dest="command", required=True) - - materialize_parser = subparsers.add_parser("materialize", help="Materialize the code graph") - materialize_parser.add_argument("--source-root", default=".", help="Repository or source root to scan") - materialize_parser.add_argument("--db", default=None, help="LadybugDB path; defaults under .codebaseGraph") - materialize_parser.add_argument("--manifest", default=None, help="Manifest path; defaults under .codebaseGraph") - materialize_parser.add_argument("--mode", choices=("full", "changed"), default="changed") - materialize_parser.add_argument("--no-fts", action="store_true", help="Skip FTS index creation") - _add_json_output_arguments(materialize_parser) - - search_parser = subparsers.add_parser("search", help="Search the code graph with compact context") - _add_search_arguments(search_parser) - - context_parser = subparsers.add_parser("context", help="Return compact context for a search query") - _add_search_arguments(context_parser) - - graph_health_parser = subparsers.add_parser("graph-health", help="Check configured graph paths") - _add_runtime_arguments(graph_health_parser) - _add_json_output_arguments(graph_health_parser) - - graph_search_parser = subparsers.add_parser("graph-search", help="Search the code graph with compact context") - graph_search_parser.add_argument("query", help="Search query") - _add_compact_context_arguments(graph_search_parser) - _add_runtime_arguments(graph_search_parser) - _add_graph_compatibility_arguments(graph_search_parser) - - graph_context_parser = subparsers.add_parser("graph-context", help="Return compact graph context") - graph_context_parser.add_argument("query", nargs="?", help="Search query") - graph_context_parser.add_argument("--node-id", default=None, help="Explicit graph node id") - 
graph_context_parser.add_argument("--node-type", default=None, help="Explicit graph node type") - _add_compact_context_arguments(graph_context_parser) - _add_runtime_arguments(graph_context_parser) - _add_graph_compatibility_arguments(graph_context_parser) - - graph_schema_parser = subparsers.add_parser("graph-schema", help="Return ontology schema, indexes, profiles, and helpers") - _add_json_output_arguments(graph_schema_parser) - graph_query_helpers_parser = subparsers.add_parser("graph-query-helpers", help="Return named read-only graph query helpers") - _add_json_output_arguments(graph_query_helpers_parser) - - graph_architecture_parser = subparsers.add_parser( - "graph-architecture-queries", - help="Return the architecture-discovery query catalog", - ) - graph_architecture_parser.add_argument("--group", default=None, help="Optional architecture query group") - _add_json_output_arguments(graph_architecture_parser) - - graph_query_parser = subparsers.add_parser("graph-query", help="Execute a restricted read-only graph query") - graph_query_parser.add_argument("statement", help="Read-only graph query statement") - graph_query_parser.add_argument("--parameters", default="{}", help="JSON object with query parameters") - graph_query_parser.add_argument("--limit", type=int, default=100, help="Maximum rows to return") - _add_runtime_arguments(graph_query_parser) - _add_json_output_arguments(graph_query_parser) - - setup_parser = subparsers.add_parser("setup", help="Bootstrap codebaseGraph state for a repository") - setup_parser.add_argument("--repo-root", default=".", help="Repository root to configure") - setup_parser.add_argument("--mcp-client", choices=supported_client_ids(), default="codex") - setup_parser.add_argument("--mcp-config-path", default=None, help="Override MCP client config path") - setup_parser.add_argument("--skip-mcp-config", action="store_true", help="Do not write MCP client config") - setup_parser.add_argument("--dry-run", action="store_true", 
help="Return the MCP config patch without writing it") - setup_parser.add_argument( - "--instructions-target", - choices=("auto", "agents", "claude", "skip"), - default="auto", - help="Instruction file to update", - ) - setup_parser.add_argument("--mode", choices=("full", "changed"), default="changed", help="Materialization mode") - setup_parser.add_argument("--json", action="store_true", help="Emit JSON output") - _add_json_output_arguments(setup_parser) - - mcp_parser = subparsers.add_parser("mcp", help="Run or inspect the MCP server") - mcp_subparsers = mcp_parser.add_subparsers(dest="mcp_command", required=True) - install_parser = mcp_subparsers.add_parser("install", help="Install the MCP server in a supported client") - install_parser.add_argument("--client", choices=supported_install_client_ids(include_all=True), default="codex") - install_parser.add_argument("--scope", choices=("local", "user", "project"), default="local") - install_parser.add_argument("--name", default=None, help="MCP server name; defaults to codebase_graph-") - install_parser.add_argument("--config-path", default=None, help="Path to .codebaseGraph/config.json") - install_parser.add_argument("--client-config-path", default=None, help="Override the target MCP client config path") - install_parser.add_argument("--repo-root", default=".", help="Repository root used to find .codebaseGraph/config.json") - install_parser.add_argument("--dry-run", action="store_true", help="Show the install action without writing or invoking CLIs") - install_parser.add_argument("--verify", action="store_true", help="Run direct MCP smoke checks after installation") - install_parser.add_argument("--json", action="store_true", help="Emit JSON output") - _add_json_output_arguments(install_parser) - - serve_parser = mcp_subparsers.add_parser("serve", help="Serve graph tools over MCP stdio") - serve_parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") - 
serve_parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") - serve_parser.add_argument("--db", default=None, help="Override LadyBugDB path") - serve_parser.add_argument("--manifest", default=None, help="Override manifest path") - http_parser = mcp_subparsers.add_parser("http", help="Serve graph tools over Streamable HTTP") - http_parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") - http_parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") - http_parser.add_argument("--db", default=None, help="Override LadyBugDB path") - http_parser.add_argument("--manifest", default=None, help="Override manifest path") - http_parser.add_argument("--host", default="127.0.0.1", help="HTTP bind host; default keeps the server local") - http_parser.add_argument("--port", type=int, default=8765, help="HTTP bind port") - http_parser.add_argument("--path", default="/mcp", help="MCP HTTP endpoint path") - http_parser.add_argument( - "--allow-remote", - action="store_true", - help="Allow binding MCP HTTP to a non-local host; requires an auth token", - ) - http_parser.add_argument( - "--auth-token", - default=None, - help="Bearer token required for HTTP requests; prefer --auth-token-env to avoid shell history exposure", - ) - http_parser.add_argument("--auth-token-env", default=None, help="Environment variable containing the HTTP bearer token") - + parser = _build_parser() args = parser.parse_args(argv) if args.command == "materialize": materializer = GraphMaterializer( @@ -140,79 +37,12 @@ def main(argv: Sequence[str] | None = None) -> int: result = materializer.materialize(mode=args.mode) finally: materializer.close() - _print_json(_result_payload(result), args) + _print_json(result.as_dict(), args) return 0 if args.command in {"search", "context"}: - request = SearchRequest( - query=args.query, - limit=args.limit, - profile=args.profile, - budget=args.budget, - 
max_depth=args.max_depth, - context_limit=args.context_limit, - detail=args.detail, - ) - try: - request.validate() - except ValueError as exc: - parser.error(str(exc)) - materializer = GraphMaterializer( - Path(args.source_root), - db_path=args.db, - manifest_path=args.manifest, - include_fts=True, - ) - if args.no_refresh: - with create_ladybug_database(materializer.db_path, include_fts=True, read_only=True) as store: - payload = SearchService(store).search(request) - else: - try: - materializer.materialize(mode="changed") - payload = SearchService(materializer.store).search(request) - finally: - materializer.close() - _print_json(payload.as_dict(detail=args.detail), args) - return 0 - if args.command == "graph-health": - return _print_tool_payload(parser, "graph_health", {}, args) - if args.command == "graph-search": - return _print_tool_payload(parser, "graph_search", _search_arguments_payload(args), args) - if args.command == "graph-context": - if not args.query and not (args.node_id and args.node_type): - parser.error("graph-context requires a query or both --node-id and --node-type") - if (args.node_id and not args.node_type) or (args.node_type and not args.node_id): - parser.error("graph-context explicit lookup requires both --node-id and --node-type") - payload = _search_arguments_payload(args) - if args.node_id and args.node_type: - payload["node_id"] = args.node_id - payload["node_type"] = args.node_type - return _print_tool_payload(parser, "graph_context", payload, args) - if args.command == "graph-schema": - _print_json(schema_payload(), args) - return 0 - if args.command == "graph-query-helpers": - _print_json({"query_helpers": [helper.as_dict() for helper in QUERY_HELPERS]}, args) - return 0 - if args.command == "graph-architecture-queries": - try: - payload = architecture_query_catalog(group=args.group) - except ValueError as exc: - parser.error(str(exc)) - _print_json(payload, args) - return 0 - if args.command == "graph-query": - try: - parameters 
= json.loads(args.parameters) - except json.JSONDecodeError as exc: - parser.error(f"graph-query --parameters must be a JSON object: {exc}") - if not isinstance(parameters, dict): - parser.error("graph-query --parameters must be a JSON object") - return _print_tool_payload( - parser, - "graph_query", - {"statement": args.statement, "parameters": parameters, "limit": args.limit}, - args, - ) + return _run_legacy_search_command(parser, args) + if args.command in graph_command_names(): + return _run_graph_command(parser, args) if args.command == "setup": try: result = run_setup( @@ -285,38 +115,95 @@ def main(argv: Sequence[str] | None = None) -> int: return 2 +def _build_parser() -> argparse.ArgumentParser: + parser = argparse.ArgumentParser(prog="codebase-graph") + subparsers = parser.add_subparsers(dest="command", required=True) + + materialize_parser = subparsers.add_parser("materialize", help="Materialize the code graph") + materialize_parser.add_argument("--source-root", default=".", help="Repository or source root to scan") + materialize_parser.add_argument("--db", default=None, help="LadybugDB path; defaults under .codebaseGraph") + materialize_parser.add_argument("--manifest", default=None, help="Manifest path; defaults under .codebaseGraph") + materialize_parser.add_argument("--mode", choices=("full", "changed"), default="changed") + materialize_parser.add_argument("--no-fts", action="store_true", help="Skip FTS index creation") + _add_json_output_arguments(materialize_parser) + + search_parser = subparsers.add_parser("search", help="Search the code graph with compact context") + _add_search_arguments(search_parser) + + context_parser = subparsers.add_parser("context", help="Return compact context for a search query") + _add_search_arguments(context_parser) + + for spec in graph_command_specs(): + graph_parser = subparsers.add_parser(spec.command_name, help=spec.help) + spec.add_arguments(graph_parser) + + setup_parser = subparsers.add_parser("setup", 
help="Bootstrap codebaseGraph state for a repository") + setup_parser.add_argument("--repo-root", default=".", help="Repository root to configure") + setup_parser.add_argument("--mcp-client", choices=supported_client_ids(), default="codex") + setup_parser.add_argument("--mcp-config-path", default=None, help="Override MCP client config path") + setup_parser.add_argument("--skip-mcp-config", action="store_true", help="Do not write MCP client config") + setup_parser.add_argument("--dry-run", action="store_true", help="Return the MCP config patch without writing it") + setup_parser.add_argument( + "--instructions-target", + choices=("auto", "agents", "claude", "skip"), + default="auto", + help="Instruction file to update", + ) + setup_parser.add_argument("--mode", choices=("full", "changed"), default="changed", help="Materialization mode") + setup_parser.add_argument("--json", action="store_true", help="Emit JSON output") + _add_json_output_arguments(setup_parser) + + mcp_parser = subparsers.add_parser("mcp", help="Run or inspect the MCP server") + mcp_subparsers = mcp_parser.add_subparsers(dest="mcp_command", required=True) + install_parser = mcp_subparsers.add_parser("install", help="Install the MCP server in a supported client") + install_parser.add_argument("--client", choices=supported_install_client_ids(include_all=True), default="codex") + install_parser.add_argument("--scope", choices=("local", "user", "project"), default="local") + install_parser.add_argument("--name", default=None, help="MCP server name; defaults to codebase_graph-") + install_parser.add_argument("--config-path", default=None, help="Path to .codebaseGraph/config.json") + install_parser.add_argument("--client-config-path", default=None, help="Override the target MCP client config path") + install_parser.add_argument("--repo-root", default=".", help="Repository root used to find .codebaseGraph/config.json") + install_parser.add_argument("--dry-run", action="store_true", help="Show the install 
action without writing or invoking CLIs") + install_parser.add_argument("--verify", action="store_true", help="Run direct MCP smoke checks after installation") + install_parser.add_argument("--json", action="store_true", help="Emit JSON output") + _add_json_output_arguments(install_parser) + + serve_parser = mcp_subparsers.add_parser("serve", help="Serve graph tools over MCP stdio") + serve_parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") + serve_parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") + serve_parser.add_argument("--db", default=None, help="Override LadyBugDB path") + serve_parser.add_argument("--manifest", default=None, help="Override manifest path") + http_parser = mcp_subparsers.add_parser("http", help="Serve graph tools over Streamable HTTP") + http_parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") + http_parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") + http_parser.add_argument("--db", default=None, help="Override LadyBugDB path") + http_parser.add_argument("--manifest", default=None, help="Override manifest path") + http_parser.add_argument("--host", default="127.0.0.1", help="HTTP bind host; default keeps the server local") + http_parser.add_argument("--port", type=int, default=8765, help="HTTP bind port") + http_parser.add_argument("--path", default="/mcp", help="MCP HTTP endpoint path") + http_parser.add_argument( + "--allow-remote", + action="store_true", + help="Allow binding MCP HTTP to a non-local host; requires an auth token", + ) + http_parser.add_argument( + "--auth-token", + default=None, + help="Bearer token required for HTTP requests; prefer --auth-token-env to avoid shell history exposure", + ) + http_parser.add_argument("--auth-token-env", default=None, help="Environment variable containing the HTTP bearer token") + return parser + + def 
_add_search_arguments(parser: argparse.ArgumentParser) -> None: parser.add_argument("query", help="Search query") parser.add_argument("--source-root", default=".", help="Repository or source root to search") parser.add_argument("--db", default=None, help="LadybugDB path; defaults under .codebaseGraph") parser.add_argument("--manifest", default=None, help="Manifest path; defaults under .codebaseGraph") - _add_compact_context_arguments(parser) + add_compact_context_arguments(parser) parser.add_argument("--no-refresh", action="store_true", help="Query the existing graph without changed materialization") parser.add_argument("--json", action="store_true", help="Emit compact JSON output") -def _add_compact_context_arguments(parser: argparse.ArgumentParser) -> None: - parser.add_argument("--limit", type=int, default=3, help="Maximum search hits to return") - parser.add_argument("--profile", choices=sorted(CONTEXT_PROFILES), default="brief", help="Context profile") - parser.add_argument("--budget", type=int, default=600, help="Approximate per-hit context character budget") - parser.add_argument("--max-depth", type=int, default=None, help="Override the context profile depth") - parser.add_argument("--context-limit", type=int, default=3, help="Maximum context items per search hit") - parser.add_argument("--detail", choices=("standard", "slim"), default="standard", help="Output detail level") - _add_json_output_arguments(parser) - - -def _add_runtime_arguments(parser: argparse.ArgumentParser) -> None: - parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") - parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") - parser.add_argument("--db", default=None, help="Override LadyBugDB path") - parser.add_argument("--manifest", default=None, help="Override manifest path") - - -def _add_graph_compatibility_arguments(parser: argparse.ArgumentParser) -> None: - parser.add_argument("--no-refresh", 
action="store_true", help="Accepted for search/context command parity") - parser.add_argument("--json", action="store_true", help="Accepted for search/context command parity") - - def _runtime(args: argparse.Namespace) -> object: return runtime_config( repo_root=args.repo_root, @@ -326,65 +213,80 @@ def _runtime(args: argparse.Namespace) -> object: ) -def _search_arguments_payload(args: argparse.Namespace) -> dict[str, object]: - payload: dict[str, object] = { - "limit": args.limit, - "profile": args.profile, - "budget": args.budget, - "context_limit": args.context_limit, - "detail": args.detail, - } - if args.query: - payload["query"] = args.query - if args.max_depth is not None: - payload["max_depth"] = args.max_depth - return payload - - -def _print_tool_payload( - parser: argparse.ArgumentParser, - tool_name: str, - arguments: dict[str, object], - args: argparse.Namespace, -) -> int: +def _run_legacy_search_command(parser: argparse.ArgumentParser, args: argparse.Namespace) -> int: + try: + request = _search_request_from_args(args) + except ValueError as exc: + parser.error(str(exc)) + materializer = GraphMaterializer( + Path(args.source_root), + db_path=args.db, + manifest_path=args.manifest, + include_fts=True, + ) + if args.no_refresh: + with create_ladybug_database(materializer.db_path, include_fts=True, read_only=True) as store: + payload = SearchService(store).search(request) + else: + try: + materializer.materialize(mode="changed") + payload = SearchService(materializer.store).search(request) + finally: + materializer.close() + _print_payload(payload.as_dict(detail=args.detail), args) + return 0 + + +def _search_request_from_args(args: argparse.Namespace) -> SearchRequest: + request = SearchRequest( + query=args.query, + limit=args.limit, + profile=args.profile, + budget=args.budget, + max_depth=args.max_depth, + context_limit=args.context_limit, + detail=args.detail, + ) + request.validate() + return request + + +def _run_graph_command(parser: 
argparse.ArgumentParser, args: argparse.Namespace) -> int: + spec = graph_command_spec(args.command) try: - payload = handle_tool_call(tool_name, arguments, runtime=_runtime(args)) + arguments = spec.payload_from_args(args) + runtime = _runtime(args) if spec.requires_runtime else None + payload = handle_tool_call(spec.tool_name, arguments, runtime=runtime) except (OSError, ValueError) as exc: parser.error(str(exc)) - _print_json(payload, args) + _print_payload(payload, args) return 0 def _add_json_output_arguments(parser: argparse.ArgumentParser) -> None: - parser.add_argument("--pretty", action="store_true", help="Emit indented JSON output") + add_json_output_arguments(parser) def _print_json(payload: object, args: argparse.Namespace) -> None: print(_json_dumps(payload, pretty=getattr(args, "pretty", False))) +def _print_payload(payload: dict[str, object], args: argparse.Namespace) -> None: + if getattr(args, "json", False): + _print_json(payload, args) + return + if getattr(args, "format", "json") == "block": + print(serialize_graph_block(payload), end="") + return + _print_json(payload, args) + + def _json_dumps(payload: object, *, pretty: bool) -> str: if pretty: return json.dumps(payload, indent=2, sort_keys=True) return json.dumps(payload, separators=(",", ":"), sort_keys=True) -def _result_payload(result: object) -> dict[str, object]: - return { - "mode": getattr(result, "mode"), - "scanned": getattr(result, "scanned"), - "rebuilt": getattr(result, "rebuilt"), - "skipped": getattr(result, "skipped"), - "deleted": getattr(result, "deleted"), - "diagnostics": list(getattr(result, "diagnostics")), - "manifest_path": getattr(result, "manifest_path"), - "rebuilt_paths": list(getattr(result, "rebuilt_paths")), - "skipped_paths": list(getattr(result, "skipped_paths")), - "deleted_paths": list(getattr(result, "deleted_paths")), - "graph_summary": dict(getattr(result, "graph_summary")), - } - - def _print_mcp_install_results(results: Sequence[object]) -> None: for 
result in results: action = getattr(result, "action") diff --git a/src/codebase_graph/ingest/materializer.py b/src/codebase_graph/ingest/materializer.py index ca3576c..96d7d36 100644 --- a/src/codebase_graph/ingest/materializer.py +++ b/src/codebase_graph/ingest/materializer.py @@ -206,6 +206,21 @@ class MaterializationResult: deleted_paths: tuple[str, ...] graph_summary: Mapping[str, Any] + def as_dict(self) -> dict[str, Any]: + return { + "mode": self.mode, + "scanned": self.scanned, + "rebuilt": self.rebuilt, + "skipped": self.skipped, + "deleted": self.deleted, + "diagnostics": list(self.diagnostics), + "manifest_path": self.manifest_path, + "rebuilt_paths": list(self.rebuilt_paths), + "skipped_paths": list(self.skipped_paths), + "deleted_paths": list(self.deleted_paths), + "graph_summary": dict(self.graph_summary), + } + class GraphMaterializer: def __init__( diff --git a/src/codebase_graph/mcp/graph_commands.py b/src/codebase_graph/mcp/graph_commands.py new file mode 100644 index 0000000..5af427b --- /dev/null +++ b/src/codebase_graph/mcp/graph_commands.py @@ -0,0 +1,281 @@ +from __future__ import annotations + +import argparse +import json +from collections.abc import Callable, Sequence +from dataclasses import dataclass +from typing import Any + +from codebase_graph.ontology import CONTEXT_PROFILES +from codebase_graph.retrieval import DETAIL_LEVELS + + +MAX_GRAPH_QUERY_LIMIT = 1000 + +PayloadBuilder = Callable[[argparse.Namespace], dict[str, Any]] +ArgumentAdder = Callable[[argparse.ArgumentParser], None] + + +@dataclass(frozen=True, slots=True) +class GraphCommandSpec: + command_name: str + tool_name: str + help: str + description: str + input_schema: dict[str, Any] + add_arguments: ArgumentAdder + payload_from_args: PayloadBuilder + requires_runtime: bool = True + + def tool_spec(self) -> dict[str, Any]: + return { + "name": self.tool_name, + "description": self.description, + "inputSchema": self.input_schema, + } + + +def graph_command_specs() -> 
tuple[GraphCommandSpec, ...]: + return GRAPH_COMMAND_SPECS + + +def graph_command_names() -> set[str]: + return {spec.command_name for spec in GRAPH_COMMAND_SPECS} + + +def graph_tool_specs() -> list[dict[str, Any]]: + return [spec.tool_spec() for spec in GRAPH_COMMAND_SPECS] + + +def graph_command_spec(command_name: str) -> GraphCommandSpec: + for spec in GRAPH_COMMAND_SPECS: + if spec.command_name == command_name: + return spec + raise KeyError(command_name) + + +def search_arguments_payload(args: argparse.Namespace) -> dict[str, Any]: + payload: dict[str, Any] = { + "limit": args.limit, + "profile": args.profile, + "budget": args.budget, + "context_limit": args.context_limit, + "detail": args.detail, + } + if getattr(args, "query", None): + payload["query"] = args.query + if args.max_depth is not None: + payload["max_depth"] = args.max_depth + return payload + + +def _empty_payload(args: argparse.Namespace) -> dict[str, Any]: + return {} + + +def _architecture_payload(args: argparse.Namespace) -> dict[str, Any]: + payload: dict[str, Any] = {} + if args.group: + payload["group"] = args.group + return payload + + +def _context_payload(args: argparse.Namespace) -> dict[str, Any]: + if not args.query and not (args.node_id and args.node_type): + raise ValueError("graph-context requires a query or both --node-id and --node-type") + if (args.node_id and not args.node_type) or (args.node_type and not args.node_id): + raise ValueError("graph-context explicit lookup requires both --node-id and --node-type") + payload = search_arguments_payload(args) + if args.node_id and args.node_type: + payload["node_id"] = args.node_id + payload["node_type"] = args.node_type + return payload + + +def _query_payload(args: argparse.Namespace) -> dict[str, Any]: + try: + parameters = json.loads(args.parameters) + except json.JSONDecodeError as exc: + raise ValueError(f"graph-query --parameters must be a JSON object: {exc}") from exc + if not isinstance(parameters, dict): + raise 
ValueError("graph-query --parameters must be a JSON object") + return {"statement": args.statement, "parameters": parameters, "limit": args.limit} + + +def add_json_output_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("--pretty", action="store_true", help="Emit indented JSON output") + + +def add_compact_context_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("--limit", type=int, default=3, help="Maximum search hits to return") + parser.add_argument("--profile", choices=sorted(CONTEXT_PROFILES), default="brief", help="Context profile") + parser.add_argument("--budget", type=int, default=600, help="Approximate per-hit context character budget") + parser.add_argument("--max-depth", type=int, default=None, help="Override the context profile depth") + parser.add_argument("--context-limit", type=int, default=3, help="Maximum context items per search hit") + parser.add_argument("--detail", choices=sorted(DETAIL_LEVELS), default="standard", help="Output detail level") + parser.add_argument("--format", choices=("json", "block"), default="json", help="Output format") + add_json_output_arguments(parser) + + +def add_runtime_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("--repo-root", default=".", help="Repository root containing .codebaseGraph/config.json") + parser.add_argument("--config", default=None, help="Path to .codebaseGraph/config.json") + parser.add_argument("--db", default=None, help="Override LadyBugDB path") + parser.add_argument("--manifest", default=None, help="Override manifest path") + + +def add_graph_compatibility_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("--no-refresh", action="store_true", help="Accepted for search/context command parity") + parser.add_argument("--json", action="store_true", help="Accepted for search/context command parity; same as --format json") + + +def _add_graph_health_arguments(parser: argparse.ArgumentParser) -> None: 
+ add_runtime_arguments(parser) + add_json_output_arguments(parser) + + +def _add_graph_search_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("query", help="Search query") + add_compact_context_arguments(parser) + add_runtime_arguments(parser) + add_graph_compatibility_arguments(parser) + + +def _add_graph_context_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("query", nargs="?", help="Search query") + parser.add_argument("--node-id", default=None, help="Explicit graph node id") + parser.add_argument("--node-type", default=None, help="Explicit graph node type") + add_compact_context_arguments(parser) + add_runtime_arguments(parser) + add_graph_compatibility_arguments(parser) + + +def _add_graph_architecture_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("--group", default=None, help="Optional architecture query group") + add_json_output_arguments(parser) + + +def _add_graph_query_arguments(parser: argparse.ArgumentParser) -> None: + parser.add_argument("statement", help="Read-only graph query statement") + parser.add_argument("--parameters", default="{}", help="JSON object with query parameters") + parser.add_argument("--limit", type=int, default=100, help="Maximum rows to return") + add_runtime_arguments(parser) + add_json_output_arguments(parser) + + +def _object_schema( + properties: dict[str, Any] | None = None, + *, + required: Sequence[str] = (), +) -> dict[str, Any]: + schema: dict[str, Any] = { + "type": "object", + "properties": properties or {}, + "additionalProperties": False, + } + if required: + schema["required"] = list(required) + return schema + + +def _search_schema(*, required: Sequence[str]) -> dict[str, Any]: + return _object_schema( + { + "query": {"type": "string"}, + "limit": {"type": "integer", "minimum": 1}, + "profile": {"type": "string"}, + "budget": {"type": "integer", "minimum": 0}, + "max_depth": {"type": "integer", "minimum": 0}, + "context_limit": 
{"type": "integer", "minimum": 0}, + "detail": {"type": "string", "enum": sorted(DETAIL_LEVELS)}, + "output_format": {"type": "string", "enum": ["json", "block"]}, + "node_id": {"type": "string"}, + "node_type": {"type": "string"}, + }, + required=required, + ) + + +GRAPH_COMMAND_SPECS = ( + GraphCommandSpec( + command_name="graph-health", + tool_name="graph_health", + help="Check configured graph paths", + description="Check the configured codebaseGraph database path and manifest path.", + input_schema=_object_schema(), + add_arguments=_add_graph_health_arguments, + payload_from_args=_empty_payload, + ), + GraphCommandSpec( + command_name="graph-search", + tool_name="graph_search", + help="Search the code graph with compact context", + description="Search code, documentation, paths, and dependencies with compact graph context.", + input_schema=_search_schema(required=("query",)), + add_arguments=_add_graph_search_arguments, + payload_from_args=search_arguments_payload, + ), + GraphCommandSpec( + command_name="graph-context", + tool_name="graph_context", + help="Return compact graph context", + description="Return compact context for a search query or explicit node_id/node_type pair.", + input_schema=_search_schema(required=()), + add_arguments=_add_graph_context_arguments, + payload_from_args=_context_payload, + ), + GraphCommandSpec( + command_name="graph-schema", + tool_name="graph_schema", + help="Return ontology schema, indexes, profiles, and helpers", + description="Return ontology schema, search indexes, context profiles, and query helper metadata.", + input_schema=_object_schema(), + add_arguments=add_json_output_arguments, + payload_from_args=_empty_payload, + requires_runtime=False, + ), + GraphCommandSpec( + command_name="graph-query-helpers", + tool_name="graph_query_helpers", + help="Return named read-only graph query helpers", + description="Return named read-only query helpers for common graph exploration tasks.", + input_schema=_object_schema(), + 
add_arguments=add_json_output_arguments, + payload_from_args=_empty_payload, + requires_runtime=False, + ), + GraphCommandSpec( + command_name="graph-architecture-queries", + tool_name="graph_architecture_queries", + help="Return the architecture-discovery query catalog", + description="Return the grouped architecture-discovery Cypher catalog for coding-agent first-step orientation.", + input_schema=_object_schema( + { + "group": { + "type": "string", + "description": "Optional architecture query group to return.", + }, + } + ), + add_arguments=_add_graph_architecture_arguments, + payload_from_args=_architecture_payload, + requires_runtime=False, + ), + GraphCommandSpec( + command_name="graph-query", + tool_name="graph_query", + help="Execute a restricted read-only graph query", + description="Execute a restricted read-only graph query against the configured database.", + input_schema=_object_schema( + { + "statement": {"type": "string"}, + "parameters": {"type": "object"}, + "limit": {"type": "integer", "minimum": 1, "maximum": MAX_GRAPH_QUERY_LIMIT}, + }, + required=("statement",), + ), + add_arguments=_add_graph_query_arguments, + payload_from_args=_query_payload, + ), +) + diff --git a/src/codebase_graph/mcp/tools.py b/src/codebase_graph/mcp/tools.py index 796b533..0d98722 100644 --- a/src/codebase_graph/mcp/tools.py +++ b/src/codebase_graph/mcp/tools.py @@ -8,8 +8,9 @@ from codebase_graph.diagnostics import log_event from codebase_graph.ontology import QUERY_HELPERS, schema_payload from codebase_graph.reasoning import CompactContextBuilder, architecture_query_catalog -from codebase_graph.retrieval import DETAIL_LEVELS, SearchRequest, SearchService +from codebase_graph.retrieval import DETAIL_LEVELS, SearchRequest, SearchService, serialize_graph_block +from .graph_commands import MAX_GRAPH_QUERY_LIMIT, graph_tool_specs from .runtime import GraphRuntimeConfig, open_graph_store READ_ONLY_DENY_RE = re.compile( @@ -19,14 +20,13 @@ r")\b", re.IGNORECASE, ) 
-MAX_GRAPH_QUERY_LIMIT = 1000 class UnknownToolError(ValueError): pass -def handle_tool_call(name: str, arguments: dict[str, Any], *, runtime: GraphRuntimeConfig) -> dict[str, Any]: +def handle_tool_call(name: str, arguments: dict[str, Any], *, runtime: GraphRuntimeConfig | None) -> dict[str, Any]: if name == "graph_health": return _health(runtime) if name == "graph_schema": @@ -36,14 +36,14 @@ def handle_tool_call(name: str, arguments: dict[str, Any], *, runtime: GraphRunt if name == "graph_architecture_queries": return architecture_query_catalog(group=_optional_str(arguments.get("group"))) if name == "graph_search": - with open_graph_store(runtime) as store: + with open_graph_store(_require_runtime(runtime, name)) as store: request = _search_request(arguments) return SearchService(store).search(request).as_dict(detail=request.detail) if name == "graph_context": - with open_graph_store(runtime) as store: + with open_graph_store(_require_runtime(runtime, name)) as store: return _context_payload(store, arguments) if name == "graph_query": - with open_graph_store(runtime) as store: + with open_graph_store(_require_runtime(runtime, name)) as store: return _query_payload(store, arguments) raise UnknownToolError(f"Unknown codebaseGraph MCP tool: {name}") @@ -51,16 +51,25 @@ def handle_tool_call(name: str, arguments: dict[str, Any], *, runtime: GraphRunt def call_tool_result(name: str, arguments: dict[str, Any], *, runtime: GraphRuntimeConfig) -> dict[str, Any]: try: payload = handle_tool_call(name, arguments, runtime=runtime) + return tool_result(name, payload, arguments) except UnknownToolError: raise except Exception as exc: return tool_error_result(name, exc) - return tool_result(payload) -def tool_result(payload: dict[str, Any]) -> dict[str, Any]: +def _require_runtime(runtime: GraphRuntimeConfig | None, tool_name: str) -> GraphRuntimeConfig: + if runtime is None: + raise ValueError(f"{tool_name} requires a graph runtime") + return runtime + + +def tool_result(name: 
str, payload: dict[str, Any], arguments: dict[str, Any] | None = None) -> dict[str, Any]: + text = json.dumps(payload, separators=(",", ":"), sort_keys=True) + if name in {"graph_search", "graph_context"} and _output_format(arguments or {}) == "block": + text = serialize_graph_block(payload) return { - "content": [{"type": "text", "text": json.dumps(payload, separators=(",", ":"), sort_keys=True)}], + "content": [{"type": "text", "text": text}], "structuredContent": payload, "isError": False, } @@ -89,61 +98,7 @@ def tool_error_result(name: str, exc: Exception) -> dict[str, Any]: def tool_specs() -> list[dict[str, Any]]: - return [ - { - "name": "graph_health", - "description": "Check the configured codebaseGraph database path and manifest path.", - "inputSchema": {"type": "object", "properties": {}, "additionalProperties": False}, - }, - { - "name": "graph_search", - "description": "Search code, documentation, paths, and dependencies with compact graph context.", - "inputSchema": _search_schema(required=("query",)), - }, - { - "name": "graph_context", - "description": "Return compact context for a search query or explicit node_id/node_type pair.", - "inputSchema": _search_schema(required=()), - }, - { - "name": "graph_schema", - "description": "Return ontology schema, search indexes, context profiles, and query helper metadata.", - "inputSchema": {"type": "object", "properties": {}, "additionalProperties": False}, - }, - { - "name": "graph_query_helpers", - "description": "Return named read-only query helpers for common graph exploration tasks.", - "inputSchema": {"type": "object", "properties": {}, "additionalProperties": False}, - }, - { - "name": "graph_architecture_queries", - "description": "Return the grouped architecture-discovery Cypher catalog for coding-agent first-step orientation.", - "inputSchema": { - "type": "object", - "properties": { - "group": { - "type": "string", - "description": "Optional architecture query group to return.", - }, - }, - 
"additionalProperties": False, - }, - }, - { - "name": "graph_query", - "description": "Execute a restricted read-only graph query against the configured database.", - "inputSchema": { - "type": "object", - "properties": { - "statement": {"type": "string"}, - "parameters": {"type": "object"}, - "limit": {"type": "integer", "minimum": 1, "maximum": MAX_GRAPH_QUERY_LIMIT}, - }, - "required": ["statement"], - "additionalProperties": False, - }, - }, - ] + return graph_tool_specs() def _health(runtime: GraphRuntimeConfig) -> dict[str, Any]: @@ -275,25 +230,6 @@ def _json_safe(value: Any) -> Any: return str(value) -def _search_schema(*, required: tuple[str, ...]) -> dict[str, Any]: - return { - "type": "object", - "properties": { - "query": {"type": "string"}, - "limit": {"type": "integer", "minimum": 1}, - "profile": {"type": "string"}, - "budget": {"type": "integer", "minimum": 0}, - "max_depth": {"type": "integer", "minimum": 0}, - "context_limit": {"type": "integer", "minimum": 0}, - "detail": {"type": "string", "enum": sorted(DETAIL_LEVELS)}, - "node_id": {"type": "string"}, - "node_type": {"type": "string"}, - }, - "required": list(required), - "additionalProperties": False, - } - - def _optional_int(value: Any) -> int | None: if value is None or value == "": return None @@ -312,3 +248,10 @@ def _detail(arguments: dict[str, Any]) -> str: valid = ", ".join(sorted(DETAIL_LEVELS)) raise ValueError(f"Unknown detail level: {detail}. Valid levels: {valid}") return detail + + +def _output_format(arguments: dict[str, Any]) -> str: + output_format = str(arguments.get("output_format", "json")) + if output_format not in {"json", "block"}: + raise ValueError(f"Unknown output format: {output_format}. 
Valid formats: block, json") + return output_format diff --git a/src/codebase_graph/retrieval/__init__.py b/src/codebase_graph/retrieval/__init__.py index 09fe35e..761622c 100644 --- a/src/codebase_graph/retrieval/__init__.py +++ b/src/codebase_graph/retrieval/__init__.py @@ -5,6 +5,9 @@ intentional_summary_omissions, parse_search_block, serialize_agent_search_block, + serialize_context_block, + serialize_graph_block, + serialize_parseable_search_block, serialize_search_block, ) from .search import DETAIL_LEVELS, CompactContextPayload, SearchHit, SearchRequest, SearchService @@ -19,5 +22,8 @@ "intentional_summary_omissions", "parse_search_block", "serialize_agent_search_block", + "serialize_context_block", + "serialize_graph_block", + "serialize_parseable_search_block", "serialize_search_block", ] diff --git a/src/codebase_graph/retrieval/block_format.py b/src/codebase_graph/retrieval/block_format.py index d4c15d6..10c4020 100644 --- a/src/codebase_graph/retrieval/block_format.py +++ b/src/codebase_graph/retrieval/block_format.py @@ -11,8 +11,8 @@ ONTOLOGY_TERMS = {"Class", "Method", "Scope", "Contains", "outgoing", "path", "span", "id", "label", "rank_score"} -def serialize_search_block(payload: Mapping[str, Any]) -> str: - """Serialize graph-search JSON into a readable ontology-preserving block format.""" +def serialize_parseable_search_block(payload: Mapping[str, Any]) -> str: + """Serialize graph-search JSON into a parseable debug block format.""" lines = [ " | ".join( [ @@ -73,7 +73,7 @@ def serialize_search_block(payload: Mapping[str, Any]) -> str: def serialize_agent_search_block(payload: Mapping[str, Any]) -> str: - """Serialize graph-search JSON into a more aggressive display-only agent block.""" + """Serialize graph-search JSON into the compact runtime block format.""" lines = [f"q {_format_value(str(payload.get('query', '')))}"] current_path: str | None = None result_keys = {_record_key(result) for result in payload.get("results", [])} @@ -121,6 +121,49 
@@ def serialize_agent_search_block(payload: Mapping[str, Any]) -> str: return "\n".join(lines) + "\n" +def serialize_context_block(payload: Mapping[str, Any]) -> str: + """Serialize an explicit graph-context payload into a readable block.""" + header = [ + f"context {payload.get('node_type', '')}", + f"id={_format_value(str(payload.get('node_id', '')))}", + f"profile={_format_value(str(payload.get('profile', '')))}", + ] + lines = [" ".join(header)] + current_path: str | None = None + for context in payload.get("context", []): + context_path = str(context.get("path", "")) + if context_path != current_path: + if len(lines) > 1: + lines.append("") + lines.append(f"file path {_format_value(context_path)}") + current_path = context_path + context_parts = [ + f" {context.get('direction', '')}", + str(context.get("relation", "")), + str(context.get("type", "")), + _format_value(str(context.get("label", ""))), + _format_span(_span(context.get("span", {}))), + ] + context_summary = _meaningful_summary(context) + if context_summary: + context_parts.append(f"summary={_format_value(context_summary)}") + lines.append(" ".join(context_parts)) + return "\n".join(lines) + "\n" + + +def serialize_graph_block(payload: Mapping[str, Any]) -> str: + if "results" in payload: + return serialize_agent_search_block(payload) + if "context" in payload and "node_id" in payload and "node_type" in payload: + return serialize_context_block(payload) + raise ValueError("Block format is only supported for graph-search and graph-context payloads") + + +def serialize_search_block(payload: Mapping[str, Any]) -> str: + """Backward-compatible alias for the parseable debug block format.""" + return serialize_parseable_search_block(payload) + + def canonicalize_search_payload(payload: Mapping[str, Any]) -> dict[str, Any]: records: list[dict[str, Any]] = [] for result in payload.get("results", []): @@ -316,6 +359,9 @@ def _record_key(record: Mapping[str, Any]) -> tuple[str, str, str, tuple[tuple[s 
"canonicalize_search_payload", "intentional_summary_omissions", "parse_search_block", + "serialize_context_block", "serialize_agent_search_block", + "serialize_graph_block", + "serialize_parseable_search_block", "serialize_search_block", ] diff --git a/src/codebase_graph/setup/installer.py b/src/codebase_graph/setup/installer.py index 2799b94..340ec10 100644 --- a/src/codebase_graph/setup/installer.py +++ b/src/codebase_graph/setup/installer.py @@ -7,7 +7,7 @@ import subprocess from dataclasses import dataclass from pathlib import Path -from typing import Any +from typing import Any, Callable from codebase_graph.mcp.protocol import LATEST_PROTOCOL_VERSION @@ -15,14 +15,9 @@ from .descriptor import McpServerDescriptor, build_server_descriptor from .state import MCP_SERVER_NAME, load_setup_config -INSTALL_CLIENTS = ("codex", "claude", "claude-project", "lmstudio", "hermes", "openclaw", "generic") SCOPES = ("local", "user", "project") -NATIVE_EXECUTABLES = { - "codex": "codex", - "claude": "claude", - "claude-project": "claude", - "openclaw": "openclaw", -} +NativeCommandBuilder = Callable[[McpServerDescriptor, str], list[str]] +VisibilityCommandBuilder = Callable[[], list[str]] @dataclass(frozen=True, slots=True) @@ -84,6 +79,94 @@ def as_dict(self) -> dict[str, Any]: return payload +@dataclass(frozen=True, slots=True) +class InstallClientStrategy: + client_id: str + adapter_id: str | None = None + project_adapter_id: str | None = None + forced_scope: str | None = None + native_executable: str | None = None + native_command_builder: NativeCommandBuilder | None = None + visibility_command_builder: VisibilityCommandBuilder | None = None + + def install_scope(self, scope: str) -> str: + return self.forced_scope or scope + + def adapter_client_id(self, scope: str) -> str: + if self.project_adapter_id is not None and self.install_scope(scope) == "project": + return self.project_adapter_id + return self.adapter_id or self.client_id + + def native_command(self, descriptor: 
McpServerDescriptor, *, scope: str) -> list[str] | None: + if self.native_command_builder is None: + return None + return self.native_command_builder(descriptor, self.install_scope(scope)) + + def visibility_command(self) -> list[str] | None: + if self.visibility_command_builder is None: + return None + return self.visibility_command_builder() + + +def _codex_native_command(descriptor: McpServerDescriptor, scope: str) -> list[str]: + return ["codex", "mcp", "add", descriptor.name, "--", descriptor.command, *descriptor.args] + + +def _claude_native_command(descriptor: McpServerDescriptor, scope: str) -> list[str]: + return [ + "claude", + "mcp", + "add", + "--transport", + "stdio", + "--scope", + scope, + descriptor.name, + "--", + descriptor.command, + *descriptor.args, + ] + + +def _openclaw_native_command(descriptor: McpServerDescriptor, scope: str) -> list[str]: + entry = descriptor.stdio_entry(include_type=True) + return ["openclaw", "mcp", "set", descriptor.name, json.dumps(entry, separators=(",", ":"), sort_keys=True)] + + +INSTALL_STRATEGIES: dict[str, InstallClientStrategy] = { + "codex": InstallClientStrategy( + client_id="codex", + native_executable="codex", + native_command_builder=_codex_native_command, + visibility_command_builder=lambda: ["codex", "mcp", "list"], + ), + "claude": InstallClientStrategy( + client_id="claude", + project_adapter_id="claude-project", + native_executable="claude", + native_command_builder=_claude_native_command, + visibility_command_builder=lambda: ["claude", "mcp", "list"], + ), + "claude-project": InstallClientStrategy( + client_id="claude-project", + forced_scope="project", + native_executable="claude", + native_command_builder=_claude_native_command, + visibility_command_builder=lambda: ["claude", "mcp", "list"], + ), + "lmstudio": InstallClientStrategy(client_id="lmstudio"), + "hermes": InstallClientStrategy(client_id="hermes"), + "openclaw": InstallClientStrategy( + client_id="openclaw", + 
        native_executable="openclaw",
+        native_command_builder=_openclaw_native_command,
+        visibility_command_builder=lambda: ["openclaw", "mcp", "list"],
+    ),
+    "generic": InstallClientStrategy(client_id="generic"),
+}
+INSTALL_CLIENTS = tuple(INSTALL_STRATEGIES)
+
+
 def supported_install_client_ids(*, include_all: bool = False) -> tuple[str, ...]:
     values = [*INSTALL_CLIENTS]
     if include_all:
@@ -104,6 +187,7 @@ def install_mcp_clients(options: McpInstallOptions) -> list[McpInstallResult]:

 def install_mcp_server(options: McpInstallOptions) -> McpInstallResult:
     _validate_options(options)
+    strategy = _client_strategy(options.client)
     descriptor = _build_descriptor(options)
     entry = descriptor.stdio_entry()
     if options.skip or options.client == "none":
@@ -119,12 +203,13 @@ def install_mcp_server(options: McpInstallOptions) -> McpInstallResult:
             entry=entry,
         )

-    native_command = _native_command(options.client, descriptor, scope=options.scope)
+    native_command = strategy.native_command(descriptor, scope=options.scope)
     use_native = (
         options.prefer_native
         and options.client_config_path is None
         and native_command is not None
-        and shutil.which(_native_executable(options.client))
+        and strategy.native_executable is not None
+        and shutil.which(strategy.native_executable)
     )
     if options.dry_run:
         if use_native:
@@ -156,14 +241,14 @@ def install_mcp_server(options: McpInstallOptions) -> McpInstallResult:
         descriptor,
         dry_run=False,
         native_command=native_command,
-        native_error=_missing_native_error(options.client) if native_command is not None else None,
+        native_error=_missing_native_error(strategy) if native_command is not None else None,
     )


 def _install_with_failure_result(options: McpInstallOptions, client: str) -> McpInstallResult:
     client_options = McpInstallOptions(
         client=client,
-        scope=_scope_for_client(client, options.scope),
+        scope=_client_strategy(client).install_scope(options.scope),
         setup_config_path=options.setup_config_path,
         server_name=options.server_name,
         client_config_path=options.client_config_path,
@@ -207,7 +292,7 @@ def _file_adapter_result(
     native_command: list[str] | None = None,
     native_error: str | None = None,
 ) -> McpInstallResult:
-    adapter = get_client_adapter(_adapter_client_id(options.client, options.scope))
+    adapter = get_client_adapter(_client_strategy(options.client).adapter_client_id(options.scope))
     path = (
         Path(options.client_config_path).expanduser().resolve()
         if options.client_config_path is not None
@@ -348,7 +433,7 @@ def _verify_stdio(descriptor: McpServerDescriptor, *, timeout: int) -> dict[str,


 def _verify_client_visibility(client: str, server_name: str, *, timeout: int) -> dict[str, Any]:
-    command = _visibility_command(client)
+    command = _client_strategy(client).visibility_command()
     if command is None:
         return {"ok": True, "skipped": True, "reason": f"{client} has no CLI visibility check"}
     executable = command[0]
@@ -418,39 +503,6 @@ def _frame_json_rpc(method: str, params: dict[str, Any], *, request_id: int) ->
     return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


-def _native_command(client: str, descriptor: McpServerDescriptor, *, scope: str) -> list[str] | None:
-    if client == "codex":
-        return ["codex", "mcp", "add", descriptor.name, "--", descriptor.command, *descriptor.args]
-    if client in {"claude", "claude-project"}:
-        return [
-            "claude",
-            "mcp",
-            "add",
-            "--transport",
-            "stdio",
-            "--scope",
-            _scope_for_client(client, scope),
-            descriptor.name,
-            "--",
-            descriptor.command,
-            *descriptor.args,
-        ]
-    if client == "openclaw":
-        entry = descriptor.stdio_entry(include_type=True)
-        return ["openclaw", "mcp", "set", descriptor.name, json.dumps(entry, separators=(",", ":"), sort_keys=True)]
-    return None
-
-
-def _visibility_command(client: str) -> list[str] | None:
-    if client == "codex":
-        return ["codex", "mcp", "list"]
-    if client in {"claude", "claude-project"}:
-        return ["claude", "mcp", "list"]
-    if client == "openclaw":
-        return ["openclaw", "mcp",
"list"]
-    return None
-
-
 def _build_descriptor(options: McpInstallOptions) -> McpServerDescriptor:
     config_path = Path(options.setup_config_path).expanduser().resolve()
     repo_root: Path | None = None
@@ -476,27 +528,16 @@ def _validate_options(options: McpInstallOptions) -> None:
         raise ValueError(f"Unsupported MCP install scope: {options.scope}. Supported scopes: {', '.join(SCOPES)}")


-def _native_executable(client: str) -> str:
-    return NATIVE_EXECUTABLES[client]
-
-
-def _adapter_client_id(client: str, scope: str) -> str:
-    if client == "claude" and scope == "project":
-        return "claude-project"
-    return client
-
-
-def _scope_for_client(client: str, scope: str) -> str:
-    if client == "claude-project":
-        return "project"
-    return scope
+def _client_strategy(client: str) -> InstallClientStrategy:
+    if client == "none":
+        return InstallClientStrategy(client_id="none")
+    return INSTALL_STRATEGIES[client]


-def _missing_native_error(client: str) -> str | None:
-    executable = NATIVE_EXECUTABLES.get(client)
-    if executable is None:
+def _missing_native_error(strategy: InstallClientStrategy) -> str | None:
+    if strategy.native_executable is None:
         return None
-    return f"{executable} executable not found"
+    return f"{strategy.native_executable} executable not found"


 def _subprocess_error(completed: subprocess.CompletedProcess[str]) -> str:
diff --git a/src/codebase_graph/setup/instructions.py b/src/codebase_graph/setup/instructions.py
index 183ad33..98664c7 100644
--- a/src/codebase_graph/setup/instructions.py
+++ b/src/codebase_graph/setup/instructions.py
@@ -76,9 +76,10 @@ def _instruction_block(*, server_name: str, config_path: Path, setup_command: st
     return (
         f"{START_MARKER}\n"
         "## codebaseGraph workflow\n"
-        "- Treat the repo-local `.codebaseGraph` graph as the project operating source of truth.\n"
-        f"- Use `{setup_command} graph-search --repo-root .
--no-refresh --detail slim --context-limit 1 --json` before answering repo-structure questions or performing coding tasks.\n"
-        f"- Use `{setup_command} graph-context --repo-root . --profile --no-refresh --detail slim --context-limit 2 --json` when relationships or nearby evidence matter; useful profiles include `definitions`, `dependencies`, `callgraph`, `docs`, `runtime`, and `change_impact`.\n"
+        "- Treat the repo-local `.codebaseGraph` graph as the project operating source of truth. It is prohibited to read the code source before you find the target files using the graph.\n"
+        "- AI agents must use block format for `graph-search` and `graph-context`; reserve `--json` for tests, APIs, or explicit structured-payload debugging.\n"
+        f"- Use `{setup_command} graph-search --repo-root . --no-refresh --detail slim --context-limit 1 --format block` before answering repo-structure questions or performing coding tasks.\n"
+        f"- Use `{setup_command} graph-context --repo-root . --profile --no-refresh --detail slim --context-limit 2 --format block` when relationships or nearby evidence matter; useful profiles include `definitions`, `dependencies`, `callgraph`, `docs`, `runtime`, and `change_impact`.\n"
         f"- For architecture orientation, run `{setup_command} graph-architecture-queries`, then execute selected read-only statements with `{setup_command} graph-query \"\" --repo-root .`.\n"
         f"- Use `{setup_command} graph-schema` or `{setup_command} graph-query-helpers` before writing raw graph queries, add `--pretty` for indented JSON when humans need to inspect output, and keep `graph-query` read-only.\n"
         f"- Refresh the graph with `{setup_command} setup --repo-root . --mcp-client none` when files change materially.
Setup config: `{config_path.as_posix()}`.\n"
diff --git a/src/codebase_graph/setup/orchestrator.py b/src/codebase_graph/setup/orchestrator.py
index 59bf6a5..d7c667f 100644
--- a/src/codebase_graph/setup/orchestrator.py
+++ b/src/codebase_graph/setup/orchestrator.py
@@ -151,19 +151,10 @@ def run_setup(options: SetupOptions) -> SetupResult:


 def _materialization_payload(result: Any) -> dict[str, Any]:
-    return {
-        "mode": getattr(result, "mode"),
-        "scanned": getattr(result, "scanned"),
-        "rebuilt": getattr(result, "rebuilt"),
-        "skipped": getattr(result, "skipped"),
-        "deleted": getattr(result, "deleted"),
-        "diagnostics": list(getattr(result, "diagnostics")),
-        "manifest_path": getattr(result, "manifest_path"),
-        "rebuilt_paths": list(getattr(result, "rebuilt_paths")),
-        "skipped_paths": list(getattr(result, "skipped_paths")),
-        "deleted_paths": list(getattr(result, "deleted_paths")),
-        "graph_summary": dict(getattr(result, "graph_summary")),
-    }
+    as_dict = getattr(result, "as_dict", None)
+    if callable(as_dict):
+        return as_dict()
+    raise TypeError(f"Unsupported materialization result: {type(result).__name__}")


 def _dry_run_materialization(paths: SetupPaths) -> Any:
@@ -202,6 +193,21 @@ class _DryRunMaterialization:
     deleted_paths: tuple[str, ...]
= ()
     graph_summary: dict[str, Any] = field(default_factory=dict)

+    def as_dict(self) -> dict[str, Any]:
+        return {
+            "mode": self.mode,
+            "scanned": self.scanned,
+            "rebuilt": self.rebuilt,
+            "skipped": self.skipped,
+            "deleted": self.deleted,
+            "diagnostics": list(self.diagnostics),
+            "manifest_path": self.manifest_path,
+            "rebuilt_paths": list(self.rebuilt_paths),
+            "skipped_paths": list(self.skipped_paths),
+            "deleted_paths": list(self.deleted_paths),
+            "graph_summary": dict(self.graph_summary),
+        }
+

 def _config_would_change(path: Path, payload: dict[str, Any]) -> bool:
     if not path.exists():
diff --git a/tests/test_graph_output_block_format.py b/tests/test_graph_output_block_format.py
index 77ac6fd..6a8f38a 100644
--- a/tests/test_graph_output_block_format.py
+++ b/tests/test_graph_output_block_format.py
@@ -11,6 +11,8 @@
     canonicalize_search_payload,
     parse_search_block,
     serialize_agent_search_block,
+    serialize_context_block,
+    serialize_parseable_search_block,
     serialize_search_block,
 )

@@ -32,9 +34,10 @@ def test_token_counting_uses_encoded_text_length() -> None:

 def test_raw_vs_block_comparison_preserves_search_service_fixture() -> None:
     payload = json.loads(FIXTURE_PATH.read_text(encoding="utf-8"))
-    block = serialize_search_block(payload)
+    block = serialize_parseable_search_block(payload)

     assert parse_search_block(block) == canonicalize_search_payload(payload)
+    assert serialize_search_block(payload) == block


 def test_l_same_is_only_emitted_for_matching_context_spans() -> None:
@@ -122,6 +125,30 @@ def test_agent_block_reduces_display_only_boilerplate() -> None:
     ) in block


+def test_context_block_serializes_explicit_node_context() -> None:
+    block = serialize_context_block(
+        {
+            "node_id": "Class:943d6556d328f1c7ca67",
+            "node_type": "Class",
+            "profile": "definitions",
+            "context": [
+                {
+                    "direction": "outgoing",
+                    "relation": "Contains",
+                    "type": "Method",
+                    "label": "search",
+                    "path": "src/codebase_graph/retrieval/search.py",
+                    "span":
{"line_start": 123, "line_end": 149},
+                }
+            ],
+        }
+    )
+
+    assert block.startswith("context Class id=Class:943d6556d328f1c7ca67 profile=definitions")
+    assert "file path src/codebase_graph/retrieval/search.py" in block
+    assert "outgoing Contains Method search L123-L149" in block
+
+
 def _load_benchmark_script() -> Any:
     spec = importlib.util.spec_from_file_location("compare_graph_output_tokens", SCRIPT_PATH)
     assert spec is not None
diff --git a/tests/test_mcp_installer.py b/tests/test_mcp_installer.py
index b238607..7809422 100644
--- a/tests/test_mcp_installer.py
+++ b/tests/test_mcp_installer.py
@@ -10,6 +10,8 @@
 from codebase_graph.setup.clients import get_client_adapter
 from codebase_graph.setup.descriptor import build_server_descriptor
 from codebase_graph.setup.installer import (
+    INSTALL_CLIENTS,
+    INSTALL_STRATEGIES,
     McpInstallOptions,
     default_server_name,
     install_mcp_clients,
@@ -22,6 +24,18 @@ def test_default_server_name_is_namespace_safe() -> None:
     assert default_server_name("My Service") == "codebase_graph_my_service"


+def test_install_strategy_registry_covers_advertised_clients() -> None:
+    assert set(INSTALL_CLIENTS) == set(INSTALL_STRATEGIES)
+    for client, strategy in INSTALL_STRATEGIES.items():
+        assert strategy.adapter_client_id("local")
+        if strategy.native_command_builder is not None:
+            assert strategy.native_executable
+        if client == "claude":
+            assert strategy.adapter_client_id("project") == "claude-project"
+        if client == "claude-project":
+            assert strategy.install_scope("local") == "project"
+
+
 def test_codex_native_command_generation_uses_repo_server_name(
     tmp_path: Path,
     monkeypatch: pytest.MonkeyPatch,
diff --git a/tests/test_mcp_portability.py b/tests/test_mcp_portability.py
index ec44dcc..f13eecd 100644
--- a/tests/test_mcp_portability.py
+++ b/tests/test_mcp_portability.py
@@ -186,6 +186,12 @@ def test_stdio_mcp_wire_initialize_list_call_and_tool_error(tmp_path: Path) -> N
             "tools/call",
             {"name": "graph_search", "arguments": {"query":
"SampleService", "limit": 2}},
         )
+        block_search = _rpc(
+            proc.stdin,
+            proc.stdout,
+            "tools/call",
+            {"name": "graph_search", "arguments": {"query": "SampleService", "limit": 2, "output_format": "block"}},
+        )
         failure = _rpc(
             proc.stdin,
             proc.stdout,
@@ -201,9 +207,13 @@ def test_stdio_mcp_wire_initialize_list_call_and_tool_error(tmp_path: Path) -> N
     graph_search_tool = next(tool for tool in listed["result"]["tools"] if tool["name"] == "graph_search")
     assert "context_limit" in graph_search_tool["inputSchema"]["properties"]
     assert graph_search_tool["inputSchema"]["properties"]["detail"]["enum"] == ["slim", "standard"]
+    assert graph_search_tool["inputSchema"]["properties"]["output_format"]["enum"] == ["json", "block"]
     assert health["result"]["structuredContent"]["ok"] is True
     assert search["result"]["structuredContent"]["results"]
     assert "\n " not in search["result"]["content"][0]["text"]
+    assert block_search["result"]["structuredContent"] == search["result"]["structuredContent"]
+    assert block_search["result"]["content"][0]["text"].startswith("q SampleService\n")
+    assert "id=Class:" in block_search["result"]["content"][0]["text"]
     assert "error" not in failure
     assert failure["result"]["isError"] is True
     assert failure["result"]["structuredContent"]["error"]["type"] == "ValueError"
diff --git a/tests/test_search.py b/tests/test_search.py
index efa5787..1e76066 100644
--- a/tests/test_search.py
+++ b/tests/test_search.py
@@ -7,11 +7,12 @@

 import pytest

-from codebase_graph.cli import main as cli_main
+from codebase_graph.cli import _build_parser, main as cli_main
 from codebase_graph.db import GraphNeighbor, SearchIndexRow
 from codebase_graph.ingest import GraphMaterializer
+from codebase_graph.mcp.graph_commands import graph_command_spec, graph_tool_specs
 from codebase_graph.mcp.runtime import GraphRuntimeConfig
-from codebase_graph.mcp.tools import MAX_GRAPH_QUERY_LIMIT, _query_payload, handle_tool_call
+from codebase_graph.mcp.tools import MAX_GRAPH_QUERY_LIMIT,
_query_payload, handle_tool_call, tool_specs
 from codebase_graph.reasoning import CompactContextBuilder, ContextNode
 from codebase_graph.retrieval.search import CompactContextPayload, SearchHit, SearchRequest, SearchService

@@ -508,6 +509,31 @@ def test_cli_graph_commands_match_mcp_tool_payloads(tmp_path: Path, capsys: pyte
     assert "score" not in search_payload["results"][0]
     assert len(search_payload["results"][0].get("context", [])) <= 1

+    assert cli_main([
+        "graph-search",
+        "SampleService",
+        "--repo-root",
+        source_root.as_posix(),
+        "--db",
+        db_path.as_posix(),
+        "--manifest",
+        manifest_path.as_posix(),
+        "--limit",
+        "2",
+        "--context-limit",
+        "1",
+        "--detail",
+        "slim",
+        "--no-refresh",
+        "--format",
+        "block",
+    ]) == 0
+    block_output = capsys.readouterr().out
+    assert block_output.startswith("q SampleService\n")
+    assert "file path sample_project/service.py" in block_output
+    assert "id=Class:" in block_output
+    assert not block_output.lstrip().startswith("{")
+
     hit = next(item for item in search_payload["results"] if item["label"] == "SampleService")
     context_args = {
         "node_id": hit["id"],
@@ -539,6 +565,31 @@ def test_cli_graph_commands_match_mcp_tool_payloads(tmp_path: Path, capsys: pyte
     ]) == 0
     assert json.loads(capsys.readouterr().out) == handle_tool_call("graph_context", context_args, runtime=runtime)

+    assert cli_main([
+        "graph-context",
+        "--node-id",
+        hit["id"],
+        "--node-type",
+        hit["type"],
+        "--repo-root",
+        source_root.as_posix(),
+        "--db",
+        db_path.as_posix(),
+        "--manifest",
+        manifest_path.as_posix(),
+        "--profile",
+        "definitions",
+        "--limit",
+        "1",
+        "--detail",
+        "slim",
+        "--format",
+        "block",
+    ]) == 0
+    context_block = capsys.readouterr().out
+    assert context_block.startswith(f"context {hit['type']} id={hit['id']} profile=definitions\n")
+    assert "file path " in context_block
+
     statement = "MATCH (n) RETURN count(n) AS total_nodes LIMIT 1"
     query_args = {"statement": statement, "parameters": {}, "limit": 5}
     assert cli_main([
@@ -556,6 +607,80 @@ def test_cli_graph_commands_match_mcp_tool_payloads(tmp_path: Path, capsys: pyte
     assert json.loads(capsys.readouterr().out) == handle_tool_call("graph_query", query_args, runtime=runtime)


+def test_graph_command_specs_drive_mcp_tool_specs() -> None:
+    assert tool_specs() == graph_tool_specs()
+
+
+def test_graph_command_specs_build_cli_payloads() -> None:
+    parser = _build_parser()
+    cases = [
+        (
+            [
+                "graph-search",
+                "SampleService",
+                "--limit",
+                "2",
+                "--context-limit",
+                "1",
+                "--detail",
+                "slim",
+            ],
+            "graph_search",
+            {
+                "query": "SampleService",
+                "limit": 2,
+                "profile": "brief",
+                "budget": 600,
+                "context_limit": 1,
+                "detail": "slim",
+            },
+        ),
+        (
+            [
+                "graph-context",
+                "--node-id",
+                "Class:1",
+                "--node-type",
+                "Class",
+                "--profile",
+                "definitions",
+                "--limit",
+                "1",
+                "--detail",
+                "slim",
+            ],
+            "graph_context",
+            {
+                "node_id": "Class:1",
+                "node_type": "Class",
+                "limit": 1,
+                "profile": "definitions",
+                "budget": 600,
+                "context_limit": 3,
+                "detail": "slim",
+            },
+        ),
+        (
+            [
+                "graph-query",
+                "MATCH (n) RETURN n",
+                "--parameters",
+                '{"limit": 1}',
+                "--limit",
+                "5",
+            ],
+            "graph_query",
+            {"statement": "MATCH (n) RETURN n", "parameters": {"limit": 1}, "limit": 5},
+        ),
+    ]
+    for argv, tool_name, expected_payload in cases:
+        args = parser.parse_args(argv)
+        spec = graph_command_spec(args.command)
+
+        assert spec.tool_name == tool_name
+        assert spec.payload_from_args(args) == expected_payload
+
+
 def test_cli_graph_metadata_commands_do_not_open_graph_db(capsys: pytest.CaptureFixture[str]) -> None:
     assert cli_main(["graph-schema"]) == 0
     schema_output = capsys.readouterr().out
diff --git a/tests/test_setup_workflow.py b/tests/test_setup_workflow.py
index 3fcda65..5ec3739 100644
--- a/tests/test_setup_workflow.py
+++ b/tests/test_setup_workflow.py
@@ -1,6 +1,7 @@
 from __future__ import annotations

 import json
+import re
 import sys
 from pathlib import Path

@@ -16,7 +17,7 @@
 from codebase_graph.mcp.runtime
import runtime_config
 from codebase_graph.mcp.server import McpGraphServer, handle_tool_call
 from codebase_graph.setup import SetupError, SetupOptions, run_setup
-from codebase_graph.setup.instructions import END_MARKER, START_MARKER
+from codebase_graph.setup.instructions import END_MARKER, START_MARKER, upsert_instruction_block
 from codebase_graph.setup.mcp_config import configure_mcp_client, server_entry
 from codebase_graph.setup.state import build_setup_config, derive_setup_paths, load_setup_config, write_setup_config

@@ -58,10 +59,18 @@ def test_setup_cli_creates_state_db_mcp_config_instructions_and_searchable_docs(
     assert agents_text.count(END_MARKER) == 1
     assert "graph-search" in agents_text
     assert "graph-context" in agents_text
+    assert "--format block" in agents_text
+    assert re.search(r"graph-search .*--json", agents_text) is None
+    assert re.search(r"graph-context .*--json", agents_text) is None
+    assert "AI agents must use block format" in agents_text
     assert "graph-architecture-queries" in agents_text
     assert "MCP server" not in agents_text
     assert "graph_architecture_queries" not in agents_text
     assert "graph_query" not in agents_text
+    assert (
+        "It is prohibited to read the code source before you find the target files using the graph."
+        in agents_text
+    )
     mcp_payload = tomllib.loads(mcp_config_path.read_text(encoding="utf-8"))
     assert "otherServer" not in mcp_payload.get("mcp_servers", {})
     assert mcp_payload["mcp_servers"]["codebase_graph"]["args"] == [
@@ -108,6 +117,26 @@ def test_setup_cli_creates_state_db_mcp_config_instructions_and_searchable_docs(
     assert any(hit["label"] == "SampleService" for hit in symbol_payload["results"])


+def test_claude_instruction_target_uses_block_format(tmp_path: Path) -> None:
+    repo_root = tmp_path / "fresh_repo"
+    repo_root.mkdir()
+
+    result = upsert_instruction_block(
+        repo_root,
+        target="claude",
+        server_name="codebase_graph",
+        config_path=repo_root / ".codebaseGraph" / "config.json",
+    )
+    claude_text = (repo_root / "CLAUDE.md").read_text(encoding="utf-8")
+
+    assert result.action == "created"
+    assert result.path == (repo_root / "CLAUDE.md").as_posix()
+    assert not (repo_root / "AGENTS.md").exists()
+    assert "--format block" in claude_text
+    assert re.search(r"graph-search .*--json", claude_text) is None
+    assert re.search(r"graph-context .*--json", claude_text) is None
+
+
 def test_mcp_config_dry_run_preserves_existing_json_servers(tmp_path: Path) -> None:
     config_path = tmp_path / "mcp.json"
     config_path.write_text(