diff --git a/.agents/skills/publish-pip/SKILL.md b/.agents/skills/publish-pip/SKILL.md
index c4bbc563..b88fdc9f 100644
--- a/.agents/skills/publish-pip/SKILL.md
+++ b/.agents/skills/publish-pip/SKILL.md
@@ -47,28 +47,34 @@ adding a runtime dependency to `pyproject.toml`.
    ```bash
    rm -rf dist build *.egg-info
    ```
-4. **Build** sdist + wheel:
+4. **Sync agent artifacts** — ensure install_data copies match dev source:
+   ```bash
+   .venv/bin/python scripts/sync_agent_artifacts.py --check
+   ```
+   If this fails, run `.venv/bin/python scripts/sync_agent_artifacts.py` to sync,
+   then commit the changes before publishing.
+5. **Build** sdist + wheel:
    ```bash
    .venv/bin/python -m build
    ```
    Expect `dist/java_codebase_rag--py3-none-any.whl` and `.tar.gz`.
-5. **Verify the built version** before upload (catches a forgotten bump):
+6. **Verify the built version** before upload (catches a forgotten bump):
    ```bash
    .venv/bin/python -c "import zipfile,glob; w=glob.glob('dist/*.whl')[0]; z=zipfile.ZipFile(w); m=[n for n in z.namelist() if n.endswith('METADATA')][0]; print([l for l in z.read(m).decode().splitlines() if l.startswith('Version')][0])"
    ```
-6. **Upload** (permanent — confirm the version is right first):
+7. **Upload** (permanent — confirm the version is right first):
    ```bash
    .venv/bin/twine upload dist/*
    ```
    twine prints the live URL on success:
    `https://pypi.org/project/java-codebase-rag//`.
-7. **Verify on PyPI** via the JSON API. ⚠️ Python's `urllib`/`requests` SSL
+8. **Verify on PyPI** via the JSON API. ⚠️ Python's `urllib`/`requests` SSL
    verification fails locally (missing CA bundle) — set `SSL_CERT_FILE`:
    ```bash
    CERT=$(.venv/bin/python -c "import certifi; print(certifi.where())")
    SSL_CERT_FILE="$CERT" .venv/bin/python -c "import urllib.request,json; d=json.load(urllib.request.urlopen('https://pypi.org/pypi/java-codebase-rag/json')); print('latest:', d['info']['version'])"
    ```
-8. **Commit + push the version bump** so the repo matches what was published
+9. **Commit + push the version bump** so the repo matches what was published
    (commit convention: `bump version to X.Y.Z`). `dist/`, `build/`, and
    `*.egg-info` are gitignored — do not commit them.
 
@@ -79,6 +85,7 @@ adding a runtime dependency to `pyproject.toml`.
 | Bump | edit `pyproject.toml` `version` |
 | Tooling | `.venv/bin/pip install build twine` |
 | Clean | `rm -rf dist build *.egg-info` |
+| Sync | `.venv/bin/python scripts/sync_agent_artifacts.py --check` |
 | Build | `.venv/bin/python -m build` |
 | Verify wheel | read `Version:` from `dist/*.whl` METADATA |
 | Upload | `.venv/bin/twine upload dist/*` |
diff --git a/AGENTS.md b/AGENTS.md
index 7a358384..a94d015d 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,332 +1,37 @@
-# AGENTS.md
-
-Canonical agent instructions for Cursor, Claude Code, and other
-agentic tools working on this repo. Cursor reads this file at the project
-root (and nested `AGENTS.md` in subdirectories when working there).
-
-Project skills and tooling live under **`.agents/`** (tracked in git).
-Create local symlinks if your editor expects the legacy paths:
-`ln -s .agents .cursor` and `ln -s .agents .claude` (both are
-gitignored).
-
-### Two audiences, two skill trees
-
-| Directory | Audience | Purpose |
-|-----------|----------|---------|
-| **`.agents/skills/`** (`.claude/skills/`, `.cursor/skills/`) | Agents **developing** this repo | propose, plan-prompts, pr-open, pr-review |
-| **`skills/explore-codebase/`** (project root) | Agents **using** this tool on their own codebase | /explore-codebase — complete MCP operating manual |
-| **`agents/explorer-rag-enhanced.md`** (project root) | Agents **using** this tool on their own codebase | Claude Code subagent combining RAG graph navigation with file-system search |
-
-`.agents/` skills are loaded by the agent working *on* java-codebase-rag source
-code. `skills/` and `agents/` are shipped to consumers — they instruct an agent
-to call the MCP tools (`search`, `find`, `describe`, `neighbors`, `resolve`)
-against an indexed Java codebase. Do not mix the two: never import consumer
-skills/agents into `.agents/skills/` or vice versa.
-
-This repo is a **self-contained stdio MCP server** that serves semantic
-+ structural search over a Java codebase. It is a Python project (the
-indexer and server). It is **not** a Java project — the
-`tests/bank-chat-system/` tree is fixture data, not code to modify.
-
-Treat README and the markdown docs as the source of truth for
-behaviour, schemas, env vars, ranking, edges, tool defaults, and
-ontology. **Do not copy that content here** — read those files directly
-when needed.
-
-## Where to look
-
-- `README.md` — pip-first landing page: install, 5-minute walkthrough on the
-  bank-chat fixture, MCP host wiring (Claude Code / Claude Desktop), the
-  five-tool cheat sheet (`search` / `find` / `describe` / `neighbors` / `resolve`),
-  and the CLI cheat sheet. Pointers out to other docs for depth.
-- [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) — environment
-  variables, full `.java-codebase-rag.yml` reference, **graph layer**
-  (node kinds, edges, capabilities, ranking, "Re-index required"
-  callouts), brownfield overrides, ignore patterns. The current
-  `ontology_version` is **17** (`EDGE_SCHEMA` in `java_ontology.py`;
-  material `OVERRIDES` Symbol→Symbol edges: subtype instance method →
-  supertype declaration with matching `signature`, one
-  `IMPLEMENTS`/`EXTENDS` hop; valid `neighbors` `EdgeType`).
-  Earlier ontology bumps are described inline in that doc's §3
-  callouts list.
-- [`docs/JAVA-CODEBASE-RAG-CLI.md`](./docs/JAVA-CODEBASE-RAG-CLI.md) —
-  operator guide for the `java-codebase-rag` CLI (`init` / `increment` /
-  `reprocess` / `erase`, `meta`, `tables`, `diagnose-ignore`,
-  `analyze-pr`; hidden `refresh` alias → `reprocess` — see that doc).
-- `docs/CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of
-  what to edit when a target tree doesn't match defaults.
-- `tests/README.md` — testing philosophy.
-- **`skills/explore-codebase/`** — user-facing skill shipped to java-codebase-rag consumers. Single self-contained operating manual for the 5-tool MCP. Developer workflow skills live in **`.agents/skills/`**, not here.
-- **`agents/explorer-rag-enhanced.md`** — user-facing Claude Code subagent shipped to consumers. Combines RAG graph navigation with file-system search for universal codebase exploration.
-- **`propose/`** — design proposes. **In-flight** proposes live in
-  **`propose/active/`**. **`propose/completed/`** — landed work and rationale.
-  **List or search this tree** for current filenames; do not rely on enumerated
-  copies here.
-- **`plans/`** — longer-form multi-PR plans (`PLAN-*.md`) and
-  **`AGENT-PROMPTS-*.md`** for per-PR agent handoffs. Active plans live in
-  **`plans/active/`**. **`plans/completed/`** — finished plans and completed
-  prompt sets (templates). **Open the directory**; don't cache a mental file
-  list from here.
-
-## File map (top of repo)
-
-| File | Role |
-|------|------|
-| `server.py` | MCP stdio server. Every `@mcp.tool` lives here. |
-| `search_lancedb.py` | Vector / hybrid / graph-expanded search; ranking. |
-| `build_ast_graph.py` | Tree-sitter → LadybugDB graph builder (full rebuild). Owns `pass1`–`pass6` (`pass5` emits `HTTP_CALLS` / `ASYNC_CALLS` caller edges; `pass6_match_edges` resolves cross-service / intra-service / ambiguous / phantom / unresolved match outcomes — ontology 7). |
-| `ladybug_queries.py` | Read-only Cypher helpers used by the server. Includes `meta()` decoder for the LadybugDB MAP-as-STRING JSON-blob columns. |
-| `ast_java.py` | Tree-sitter Java parsing, role/capability inference, `_string_value_atoms` helper (shared by route/client/producer extractors), `_collect_outgoing_calls` for caller-side detection. |
-| `graph_enrich.py` | `module` / `microservice` resolution, `BrownfieldOverrides` (route + role + capability + http client + async producer), meta-annotation walk, `resolve_routes_for_method` / `resolve_http_client_for_method` / `resolve_async_producer_for_method`. |
-| `java_ontology.py` | Source of truth for `VALID_ROLES`, `VALID_CAPABILITIES`, `VALID_CLIENT_KINDS`, `VALID_HTTP_CALL_STRATEGIES`, `VALID_ASYNC_CALL_STRATEGIES`, `VALID_HTTP_CALL_MATCHES`. |
-| `chunk_heuristics.py` | Query-time chunk hints (no AST / no re-index). |
-| `mcp_hints.py` | MCP v2 road-sign `hints` catalog (`generate_hints`; locked v1 templates in `propose/completed/HINTS-ROAD-SIGNS-PROPOSE.md`). |
-| `index_common.py` | Embedding config (no CocoIndex dep). |
-| `java_index_flow_lancedb.py` | CocoIndex flow (used by `java-codebase-rag init` / `increment` / `reprocess` / `erase`). |
-| `java_index_v1_common.py` | Shared file walker / exclude patterns. |
-| `path_filtering.py` | Layered ignore patterns (`.gitignore`-style; PR-C / B5). Reused by indexer + graph build. |
-| `pr_analysis.py` | `java-codebase-rag analyze-pr` helpers (PR-B / B4) — diff parsing, hunk-to-symbol mapping. |
-| `mcp.json.example` | Template for `.mcp.json`. |
-
-## Test layout
-
-- `tests/conftest.py` — session-scoped LadybugDB graph fixture.
-- `tests/bank-chat-system/` — deterministic Java corpus (fixture, not production model).
-- `tests/fixtures/call_graph_smoke/` — mini Maven tree calibrated against the call-graph resolver.
-- `tests/fixtures/brownfield_route_stubs/` — `@CodebaseRoute` / `@CodebaseRoutes` source stubs (PR-A3).
-- `tests/fixtures/brownfield_client_stubs/` — `@CodebaseHttpClient` / `@CodebaseHttpClients` / `@CodebaseProducer` / `@CodebaseProducers` source stubs (PR-D2).
-- `tests/fixtures/http_caller_smoke/` — Feign + RestTemplate + KafkaTemplate + WebClient + StreamBridge fixture for caller-side detection (PR-D1).
-- Heavy e2e tests gated behind `JAVA_CODEBASE_RAG_RUN_HEAVY=1`.
-
-## Breaking changes and compatibility
-
-- **Breaking changes are always allowed.** Do not keep compatibility with
-  prior versions, external consumers, or hypothetical “users” of this
-  repo unless the current task explicitly asks for a compatibility layer.
-- Prefer straightforward removals and schema or API updates over
-  deprecation periods, dual code paths, shims, or version branching unless
-  there is a clear, stated need in the task at hand.
+# java-codebase-rag
 
 ## Python environment
 
-- Use only the repository `.venv/bin/python` for Python commands (repo root).
-- Use only `.venv/bin/pip` for package install and dependency commands.
-- Do not use system `python`, `python3`, `pip`, or `pip3` for this repo
-  unless you have explicitly activated `.venv` and that is what those
-  resolve to.
-- When running tests, linters, or scripts, invoke the `.venv/bin`
-  executables directly.
-- Examples:
-  - `.venv/bin/python -m pytest tests -q`
-  - `.venv/bin/ruff check .`
-  - `.venv/bin/pip install -r requirements.txt`
-
-## Investigate before editing
-
-For any non-trivial change, read the relevant doc first instead of
-inferring from code:
-
-- Behaviour / public surface → `README.md`.
-- Brownfield assumptions, role/capability tuning → `docs/CODEBASE_REQUIREMENTS.md`.
-- In-flight design proposes → **`propose/active/`**.
-  **List or search** for current names.
-- Why current design exists → `propose/completed/` and `plans/completed/`.
-- Testing philosophy → `tests/README.md`.
-- In-flight multi-PR scope → **`plans/active/`**.
-  **List or search** for active `PLAN-*.md` / `AGENT-PROMPTS-*.md`.
-  Finished plans and prompt templates → `plans/completed/`.
-
-## Propose-then-implement culture
-
-The repo has a strong "propose then implement" culture (`propose/`,
-`plans/`). For non-trivial features:
-
-1. Drop a short markdown propose under `propose/active/` describing scope,
-   schema impact, reindex requirement, and tests touched.
-2. For multi-PR efforts, add a matching `plans/active/PLAN-.md` with
-   per-PR sections, then `plans/active/AGENT-PROMPTS-.md` with the
-   per-PR agent task prompts.
-3. Reference the propose / plan from the PR description.
-4. Move propose into `propose/completed/` (or plan into
-   `plans/completed/`) once the *whole* effort is landed — not after
-   each PR.
-
-Skip this for clearly-bounded fixes (one-file bugs, doc edits, test
-loosening). Use judgement.
-
-## Per-PR agent task contract
-
-When you're given a per-PR task prompt from `plans/AGENT-PROMPTS-*.md`
-(or a completed prompt file in `plans/completed/` as a structural
-template):
-
-- **Scope is binding.** The "Out of scope (do NOT touch)" list is a
-  hard constraint, not a guideline. Sentinel grep patterns the prompt
-  lists must return zero on `git diff master..HEAD`.
-- **Implement in the listed order.** Do not reshape the PR or roll
-  multiple PRs together.
-- **Match named tests verbatim.** When the plan lists
-  `test__`, that is the exact name to use. If you
-  add, drop, or rename tests, update the plan/prompt text in the same
-  change so reviewers are not chasing a stale list.
-- **No drive-by lint fixes.** Removing an unused `import` in a file
-  the PR doesn't otherwise touch is still a scope leak. If a file
-  isn't in the deliverables list, don't touch it.
-- **PR description must include**: scope statement, manual evidence
-  (with the exact command from the prompt), and any intentional design
-  divergences from sibling PRs called out explicitly so the reviewer
-  doesn't flag them as bugs.
-
-## Editing rules
-
-- No compatibility shims or deprecation cycles (see **Breaking changes**
-  above).
-- One source of truth for ontology values lives in `java_ontology.py`.
-  Don't sprinkle role / capability / client-kind / strategy / match
-  string literals across other modules. Current valid sets: `VALID_ROLES`,
-  `VALID_CAPABILITIES`, `VALID_CLIENT_KINDS`, `VALID_HTTP_CALL_STRATEGIES`,
-  `VALID_ASYNC_CALL_STRATEGIES`, `VALID_HTTP_CALL_MATCHES`,
-  `VALID_ROUTE_FRAMEWORKS`, `VALID_ROUTE_KINDS`, `VALID_PRODUCER_KINDS`,
-  `VALID_RESOLVE_REASONS`, `VALID_UNRESOLVED_CALL_REASONS`.
-- Schema changes that affect the Lance index or LadybugDB graph need a
-  matching update to the README "Re-index required" callout. Bump
-  `ontology_version` when enrichment semantics change (currently **17**).
-- Brownfield is a first-class surface: any new auto-detection (route,
-  role, capability, http client, async producer) must compose with the
-  matching `BrownfieldOverrides` layer. Last writer wins (outermost layer
-  overrides earlier ones), with one explicit exception: caller-side
-  `HTTP_CALLS` / `ASYNC_CALLS` use option-(b) *replacement* rather than
-  union when any brownfield layer fires on a method (single network packet
-  → single edge). See `plans/completed/PLAN-TIER1B-COMPLETION.md` §
-  "Caller-side composition divergence".
-- LadybugDB's Python binder rejects `dict` for `MAP` columns. Store all
-  map-shaped graph_meta data (`routes_by_framework`, `routes_by_layer`,
-  `http_calls_by_strategy`, `async_calls_by_strategy`, etc.) as `STRING`
-  JSON blobs and decode in `ladybug_queries.meta()`.
-- `server.py` is a stdio MCP server: anything reachable from a tool
-  handler must not write to **stdout** (that's the JSON-RPC transport).
-  Diagnostics go to stderr.
-- Tool `description=` strings and `_INSTRUCTIONS` in `server.py` are
-  read by LLM clients to choose tools — treat them as part of the
-  contract, not freeform docs.
-- Don't overfit to the `tests/bank-chat-system/` fixture. It is a
-  deterministic corpus, not a model of production. Assert on invariants,
-  not exact counts. Don't special-case the fixture in production code.
-- Don't introduce a parallel `*Overrides` class when extending brownfield
-  support. `BrownfieldOverrides` already holds route, role, capability,
-  http client, and async producer dicts — extend it in place.
-
-## LadybugDB Cypher pitfalls
-
-When adding or editing Cypher run against LadybugDB (for example in
-`ladybug_queries.py`, `mcp_v2.py`, or any `LadybugGraph._rows` caller):
-
-- **Do not filter relationship types with** `label(e) IN $list` **or**
-  `label(e) IN ["A","B"]` **in** `WHERE`. On supported versions this can
-  be ignored or wrong; prefer **OR of scalar equalities**
-  (`label(e) = $p OR label(e) = $q …`) with bound parameters, after
-  validating labels against an allowlist (see `neighbors_v2` in
-  `mcp_v2.py`).
-- **Typed union patterns** like `-[e:CALLS|HTTP_CALLS]->` are only safe if
-  every column you `RETURN` from `e` exists on **all** of those
-  relationship types in the graph schema. Otherwise prefer untyped `[e]`
-  plus explicit label filtering, or split queries.
-
-## Validate
-
-- `.venv/bin/ruff check .` — fix or justify warnings.
-- `.venv/bin/python -m pytest tests -v` — must pass without
-  `JAVA_CODEBASE_RAG_RUN_HEAVY`. Expect skips only where tests document
-  env gating (see `tests/README.md`). Each plan may add tests; match the
-  active plan if it cites a count.
-- Exception for isolated automation workflow changes: if edits are limited to
-  `automation/cursor_propose_only/**` (plus optional docs references to that
-  workflow), targeted validation is enough:
-  - `.venv/bin/ruff check .`
-  - `.venv/bin/python -m pytest automation/cursor_propose_only/tests -q`
-- For schema or ranking work, also run with
-  `JAVA_CODEBASE_RAG_RUN_HEAVY=1` locally (slow; downloads models).
-- For graph builder changes, also rebuild a fixture and inspect
-  `java-codebase-rag meta` (or `GraphMetaOutput` from the same helper) to
-  confirm new counters wire up:
-  ```bash
-  rm -rf /tmp/check && .venv/bin/python build_ast_graph.py \
-    --source-root tests/bank-chat-system \
-    --ladybug-path /tmp/check/code_graph.lbug --verbose
-  ```
-
-## Commit and PR
-
-- Branch from `master`. Branch names:
-  - `cursor/` — cursor-agent work
-  - `feat/` — landed-feature work (e.g. `feat/b2b-http-async-edges`)
-  - `plan/` — in-progress plan / propose drafts
-  - `chore/` — repo hygiene (docs, tooling, deps)
-- Commit messages: present tense, imperative, lowercase first word,
-  matching existing style (e.g. `fixed call graph review D6`,
-  `applied fixes for call graph layer`).
-- One logical change per commit when feasible.
-- Always open a PR; never push directly to `master`.
-- PR body should reference any propose / plan it implements, list
-  user-visible behaviour changes, and call out reindex / env-var /
-  ontology bumps explicitly.
-
-## Don't
-
-- Don't run `gh auth status` or otherwise inspect credentials.
-- Don't widen the public surface "just in case" — every new tool,
-  env var, or schema column adds a re-index burden on users.
-- Don't special-case the `tests/bank-chat-system/` fixture in
-  production code. If a test needs it, the test is wrong (see
-  `tests/README.md`).
-- Don't tighten loose test assertions (`>= 1`, `len(...) >= N`,
-  `key in result`) into exact counts to chase a number — they are
-  intentionally loose.
-- Don't add a hard dependency on `cocoindex` outside
-  `java_index_flow_lancedb.py` / the `java-codebase-rag` lifecycle (`init` /
-  `increment` / `reprocess` / `erase`) path.
-
-## Cursor Cloud specific instructions
-
-This is a self-contained Python project — no external services
-(no Postgres, Kafka, Docker) are needed. All storage (LadybugDB, LanceDB,
-CocoIndex state) is embedded/file-based.
-
-### Environment
+Use `.venv/bin/python` and `.venv/bin/pip` (repo root) for all Python commands.
+Invoke the `.venv/bin` executables directly — never system `python`/`pip`.
 
-- Python 3.11+ with `.venv` at repo root. The update script creates
-  the venv and installs deps if missing.
-- `.venv/bin` must be on `PATH` for CLI tests
-  (`test_java_codebase_rag_cli.py` uses
-  `shutil.which("java-codebase-rag")`). The update script handles
-  this via `~/.bashrc`.
-- The package must be installed in **editable mode**
-  (`pip install -e .`) so the `java-codebase-rag` CLI entry point
-  is registered. The update script handles this.
+## Tests
 
-### Running checks
+- Erase stale manual indexes first — they hijack project-root discovery:
+  `rm -rf tests/*/.java-codebase-rag tests/*/.java-codebase-rag.{yml,hosts}`
+- Tests build their own fresh index in a temp dir; never commit one under
+  `tests/` (`.gitignore` un-ignores it there).
+- The full suite is slow. Run only the subset relevant to your change during
+  development; run the full suite once, at the end of the task.
 
-```bash
-.venv/bin/ruff check .
-.venv/bin/python -m pytest tests -v
-```
+## Docs
 
-Heavy (CocoIndex + LanceDB e2e) tests are gated behind
-`JAVA_CODEBASE_RAG_RUN_HEAVY=1` and download the embedding model on
-first run. They are not required for normal development.
+All files in `docs/` are **operator-facing**. No internal docs yet.
 
-### Hello-world verification
+**Operator docs**
+- `docs/CONFIGURATION.md` — env vars, project YAML, ontology, brownfield overrides, ignore patterns.
+- `docs/JAVA-CODEBASE-RAG-CLI.md` — operator CLI playbook (workflows, exit codes, env alignment).
+- `docs/AGENT-GUIDE.md` — agent-facing MCP operating manual (copy-paste into `AGENTS.md`/`CLAUDE.md`).
+- `docs/EDGE-NAVIGATION.md` — MCP-traversable edges, directions, dot-key composition.
+- `docs/MANUAL-VERIFICATION-CHECKLIST.md` — 7-phase post-index verification.
+- `docs/CODEBASE_REQUIREMENTS.md` — assumptions about the target Java repo.
+- `docs/PRODUCT-VISION.md` — long-term product direction.
+- `docs/paper/paper.pdf` — architecture report (rationale, GPS metaphor, ontology).
 
-Build the LadybugDB graph from the test fixture and inspect it:
+**Internal docs** — none yet.
 
-```bash
-rm -rf /tmp/check && .venv/bin/python build_ast_graph.py \
-  --source-root tests/bank-chat-system \
-  --ladybug-path /tmp/check/code_graph.lbug --verbose
-.venv/bin/java-codebase-rag meta \
-  --source-root tests/bank-chat-system --index-dir /tmp/check
-```
+## Shipped artifacts
 
-The MCP server (`server.py`) is stdio-based and is not started as a
-long-running dev server — it is invoked by MCP hosts (Claude Desktop,
-Claude Code) directly.
+`skills/` and `agents/` are shipped consumer artifacts — deployed verbatim by
+`install`/`update` to the user's agent host. This repo is the source of truth;
+never hand-patch deployed copies.
diff --git a/README.md b/README.md
index 216d0db2..a2d80c84 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # java-codebase-rag
 
-A graph-native code intelligence layer for Java microservice estates, exposed to LLM agents via the **Model Context Protocol (MCP)**.
+A graph-native code intelligence layer for Java microservice estates — usable as an **MCP server** or a **CLI** (`jrag`), two surfaces over the same graph.
 
-The system extracts a deterministic property graph from Java source (tree-sitter), stores it in **LadybugDB** (graph) alongside a **LanceDB** vector index (chunks), and exposes a deliberately small MCP surface — **five tools**: `search`, `find`, `describe`, `neighbors`, `resolve` — that collapse onto three primitive agent operations: **locate**, **inspect**, **walk**.
+The system extracts a deterministic property graph from Java source (tree-sitter), stores it in **LadybugDB** (graph) alongside a **LanceDB** vector index (chunks), and exposes two agent surfaces, picked at install time (`java-codebase-rag install --surface mcp|cli`): the **MCP** surface ships five tools — `search`, `find`, `describe`, `neighbors`, `resolve` — over stdio; the **CLI** surface ships `jrag`, one command per engineering intent. Both collapse onto three primitive operations: **locate**, **inspect**, **walk**.
 
 > **What this MCP is:** a **GPS for code navigation**, not a reasoning engine.
 > Agents use a simple loop:
@@ -31,7 +31,7 @@ Generic code-search tools (grep, ctags, vector-only RAG) hit a ceiling on real J
 
 - **Brownfield annotations as a first-class override.** Real Java estates have hand-rolled HTTP clients, dynamic topic names, reflection-heavy routing. `@CodebaseHttpRoute`, `@CodebaseAsyncRoute`, `@CodebaseHttpClient`, and `@CodebaseProducer` let you pin the truth in source. They have **exclusive priority** — when a symbol is annotated, framework-convention inference is skipped entirely. You get a correct graph on legacy code without rewriting it.
 
-The rest of this README is the install, walkthrough, and tool cheat sheet for putting that to work.
+The rest of this README is the install, the tool/command orientation, and the reference for putting that to work.
 
 ---
 
@@ -75,91 +75,11 @@ If you prefer manual configuration, see [`docs/JAVA-CODEBASE-RAG-CLI.md`](./docs
 
 ---
 
-## 5-minute walkthrough — index this repo's bank-chat fixture
+## Tools & commands at a glance
 
-This repo ships a small multi-module Spring fixture under [`tests/bank-chat-system/`](./tests/bank-chat-system/) (`chat-core` + `chat-assign`) that the test suite uses for calibration. You can index it and confirm the install works end-to-end in under five minutes — no agent host required.
+Pick a surface once at install time — `java-codebase-rag install --surface mcp|cli` (default `mcp`). Both surfaces walk the same LanceDB vectors + LadybugDB graph.
 
-```bash
-# 1. Clone the repo to get the fixture (the published package doesn't include tests/)
-git clone https://github.com/HumanBean17/java-codebase-rag
-cd java-codebase-rag
-
-# 2. Build the index (Lance vectors + LadybugDB graph). First run downloads the
-# embedding model (~90 MB) and takes ~30-60s on the fixture.
-java-codebase-rag init --source-root tests/bank-chat-system --index-dir tmp/bank-chat-index
-
-# 3. Inspect what landed (resolved config, edge counts, ontology version)
-java-codebase-rag meta --source-root tests/bank-chat-system --index-dir tmp/bank-chat-index
-```
-
-> **Windows users:** these smoke-test snippets use POSIX shell syntax (`VAR=value` prefix, `\` line continuations). Run them under **Git Bash** or **WSL**, or skip straight to `java-codebase-rag install`, which wires up MCP registration and configuration without a shell.
-
-Smoke-test the index with two checks (`search_lancedb` ships with the package):
-
-```bash
-# Vector search — proves the LanceDB side works
-JAVA_CODEBASE_RAG_INDEX_DIR=tmp/bank-chat-index \
-  python -m search_lancedb "chat ingress controller" --table java --limit 3
-
-# Vector + graph expansion — proves LadybugDB is wired in
-JAVA_CODEBASE_RAG_INDEX_DIR=tmp/bank-chat-index \
-  python -m search_lancedb "chat ingress controller" --table java --limit 3 \
-  --graph-expand --expand-depth 2
-```
-
-If vector hits come back and graph expansion adds neighbor symbols, the install works end-to-end. Wire it into your agent next — the five MCP tools (`search`, `find`, `describe`, `neighbors`, `resolve`) are reachable over stdio.
-
----
-
-## Wire into an MCP host
-
-> **Quick setup:** Run `java-codebase-rag install` from your Java project root. The interactive wizard handles MCP registration, skill deployment, and configuration for Claude Code, Qwen Code, and GigaCode in one step.
-
-### Claude Code (manual)
-
-With the package installed, the console script `java-codebase-rag-mcp` is on your `PATH`. Register it project-scoped:
-
-```bash
-claude mcp add --transport stdio java-codebase-rag -- java-codebase-rag-mcp
-```
-
-**Zero-env-var configuration:** The tool automatically walks up the directory tree to find `.java-codebase-rag.yml`, so you don't need to set `JAVA_CODEBASE_RAG_SOURCE_ROOT` when working from within a project. Just place the config file at your project root and the tool will find it. See [`mcp.json.example`](./mcp.json.example) for the minimal configuration.
-
-If you need to override defaults, you can set env vars (`JAVA_CODEBASE_RAG_INDEX_DIR`, `JAVA_CODEBASE_RAG_SOURCE_ROOT`, `SBERT_MODEL`, …) in `.mcp.json` or your shell profile. For a full configuration template, see [`mcp.json.example`](./mcp.json.example). Official docs: [Claude Code settings](https://docs.anthropic.com/en/docs/claude-code/settings).
-
-### Claude Desktop
-
-Edit `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`) and add under `mcpServers`:
-
-```json
-{
-  "mcpServers": {
-    "java-codebase-rag": {
-      "command": "java-codebase-rag-mcp",
-      "env": {
-        "JAVA_CODEBASE_RAG_INDEX_DIR": "/ABSOLUTE/PATH/TO/.java-codebase-rag",
-        "JAVA_CODEBASE_RAG_SOURCE_ROOT": "/ABSOLUTE/PATH/TO/your-java-project"
-      }
-    }
-  }
-}
-```
-
-See [`mcp.json.example`](./mcp.json.example) for the same shape in `.mcp.json` (Claude Code project-scoped) form.
-
-### Driving the MCP from an agent
-
-Pick **one** of two options (not both — they cover the same navigation intents):
-
-1. **[`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md)** (recommended for most) — standalone MCP operating manual. Copy-paste the `BEGIN`/`END` block into your project's `QWEN.md`, `CLAUDE.md`, or `AGENTS.md`. Contains: five-tool reference, `NodeFilter` / edge taxonomy, ontology glossary, recovery playbook, and navigation patterns. Self-contained — no external file dependencies.
-
-2. **[`/explore-codebase`](./skills/explore-codebase/SKILL.md)** (for hosts with skill discovery) — single self-contained skill with the complete operating manual. If your MCP host supports skill discovery (Claude Code, Qwen Code, Cursor), load `/explore-codebase` to get the full tool reference, edge taxonomy, decision tree, and recovery playbook in one shot.
-
-Also: **[`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md)** — 7-phase agent-driven verification you run after indexing your real project.
-
----
-
-## The five tools, at a glance
+**MCP surface — five tools over stdio**
 
 | Tool | Purpose | Required args |
 |---|---|---|
@@ -171,9 +91,63 @@ Also: **[`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHE
 
 Full schemas, `NodeFilter` / `EdgeFilter` semantics, and the hints contract live in [`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md). Edge types and traversal directions are listed in [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md).
 
+**CLI surface — `jrag`, one command per engineering intent**
+
+```bash
+# Orientation
+jrag status # index health (ontology version, freshness, counts)
+jrag microservices # microservices with resolved type counts
+jrag map # counts per kind per service/module
+jrag map --module # group by module instead
+jrag conventions # dominant roles + framework tallies
+jrag overview chat-core # bundle for a microservice
+jrag overview /chat/assign # route flow (inbound callers + outbound CALLS)
+jrag overview banking.chat # topic producers + consumers
+jrag overview chat-core --as microservice # override auto-detection
+
+# Locate
+jrag find ChatService # exact name/FQN lookup (symbols)
+jrag find --role CONTROLLER # filter mode (NodeFilter flags)
+jrag inspect ChatService # full node details + edge_summary
+jrag outline src/main/.../Foo.java # all symbols declared in a file
+jrag imports src/main/.../Foo.java # imports resolved to graph nodes
+
+# Listings
+jrag routes # HTTP routes
+jrag clients # HTTP clients (Feign / RestTemplate / WebClient)
+jrag producers # async message producers (Kafka / StreamBridge)
+jrag topics # message topics grouped by producer
+jrag jobs # scheduled tasks (@Scheduled)
+jrag listeners # message listeners (@KafkaListener etc.)
+jrag entities # JPA entities
+
+# Traversals (all resolve-first)
+jrag callers ChatService#assign(Request) # who calls me?
+jrag callees ChatService#assign(Request) # what do I call?
+jrag hierarchy AbstractBase # type tree (parents + children)
+jrag implementations PaymentProcessor # classes implementing an interface
+jrag subclasses AbstractRepository # classes extending a type
+jrag overrides Impl#run() # methods this overrides (dispatch UP)
+jrag overridden-by Iface#run() # methods overriding this (dispatch DOWN)
+jrag dependents PaymentGateway # who injects this type?
+jrag dependencies ChatService # types this injects +jrag impact PaymentGateway # fleet-wide blast radius +jrag decompose ChatIngressController#assign # role-waterfall flow +jrag flow /chat/assign # request flow through a route +jrag connection chat-core # cross-service connections + +# Semantic search +jrag search "assign a chat agent" # semantic over Lance (java table) +jrag search "kafka" --table all # java + sql + yaml tables +jrag search "audit" --hybrid # vector + keyword hybrid +jrag search "audit" --offset 5 # paginated +``` + +Every `` command takes human-readable identifiers (FQN / simple name / route path / topic) — never raw node IDs. Output contract, flags, and the resolve-first rule are in [`jrag` — agent CLI](#jrag--agent-cli) below. + ### Three-layer architecture -Layer 1 (storage) → Layer 2 (5 MCP tools) → Layer 3 (skill). The [`/explore-codebase`](./skills/explore-codebase/SKILL.md) skill provides the full operating manual for Layer 2. See the [architecture diagram in `skills/README.md`](./skills/README.md#three-layer-architecture). +Layer 1 (storage) → Layer 2 (5 MCP tools **or** the `jrag` CLI) → Layer 3 (skill). The MCP-surface skill **[`/explore-codebase`](./skills/explore-codebase/SKILL.md)** documents the 5-tool MCP; the CLI-surface skill **[`/explore-codebase-cli`](./skills/explore-codebase-cli/SKILL.md)** documents the `jrag` CLI (PR-JRAG-5). See the [architecture diagram in `skills/README.md`](./skills/README.md#three-layer-architecture). --- @@ -209,6 +183,71 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi --- +## jrag — agent CLI + +`jrag` is a separate console script (alongside `java-codebase-rag`) built for AI +coding agents. It gives the agent **one command per engineering intent** and +takes human-readable identifiers (FQN / simple name / route path / topic) — +never raw node IDs. 
Every `` command resolves the identifier via +`resolve_v2` as the first step; on `many` it returns candidates and stops, on +`none` it returns `not_found`. Auto-pick is forbidden. + +The default output is compact text (a deliberate divergence from the operator +CLI's TTY heuristic — `jrag` is agent-facing/non-TTY). `--format json` emits the +shared envelope verbatim. Every command emits the same envelope shape: + +```json +{ + "status": "ok", + "nodes": {"com.example.Foo": {"kind": "symbol", "fqn": "com.example.Foo"}}, + "edges": [{"edge_type": "CALLS", "confidence": 0.9, "target": "com.example.Bar#baz()"}], + "root": "com.example.Foo", + "agent_next_actions": ["jrag callees com.example.Foo#bar()"], + "truncated": false +} +``` + +No raw graph node id ever appears on either surface: `nodes` is keyed by each +node's natural identifier (FQN for symbols, `METHOD path` for routes, +`member_fqn->target` for clients, `topic:` for topics), `root` is the +root's natural identifier, and each edge carries `target` (the referenced node's +identifier) instead of a graph id. The agent reuses these identifiers directly +as the next command's `` — there is nothing else to pass. + +`agent_next_actions` carries up to 5 contextual next-step hints (e.g. after +`inspect`, the agent sees `jrag callers `, `jrag callees `, etc. for +the edges the root actually has). Omitted from JSON when empty. + +The full command catalog lives in [Tools & commands at a glance](#tools--commands-at-a-glance). 
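As a rough illustration of how a caller might consume this envelope, the sketch below parses the example JSON from this section and collects the identifiers an agent could pass to its next command. The helper name `next_identifiers` is hypothetical and not part of `jrag`; only the field names (`status`, `nodes`, `edges`, `root`, `agent_next_actions`, `truncated`) come from the envelope described here.

```python
import json

# Example envelope copied from the section above (the shape `--format json` is described to emit).
ENVELOPE_JSON = """
{
  "status": "ok",
  "nodes": {"com.example.Foo": {"kind": "symbol", "fqn": "com.example.Foo"}},
  "edges": [{"edge_type": "CALLS", "confidence": 0.9, "target": "com.example.Bar#baz()"}],
  "root": "com.example.Foo",
  "agent_next_actions": ["jrag callees com.example.Foo#bar()"],
  "truncated": false
}
"""


def next_identifiers(envelope: dict) -> list[str]:
    """Hypothetical helper: collect natural identifiers reusable as the next command's argument."""
    if envelope.get("status") != "ok":
        return []
    identifiers: list[str] = []
    root = envelope.get("root")
    if root:
        identifiers.append(root)
    for edge in envelope.get("edges", []):
        target = edge.get("target")
        if target and target not in identifiers:
            identifiers.append(target)
    return identifiers


envelope = json.loads(ENVELOPE_JSON)
identifiers = next_identifiers(envelope)
# `agent_next_actions` is omitted from JSON when empty, so default to an empty list.
hints = envelope.get("agent_next_actions", [])
print(identifiers)
print(hints)
```

Because `nodes` is keyed by natural identifier and each edge carries a `target`, nothing in this sketch needs a graph node id.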
+ +### Flags + +| Flag | Scope | Effect | +|------|-------|--------| +| `--format text\|json` | all | output format (default: text) | +| `--service ` | listings/traversals | filter by microservice | +| `--module ` | listings/traversals | filter by module | +| `--limit ` | listings/traversals | cap results (default 20; `limit+1` fetch detects truncation) | +| `--offset ` | `find`, `search` only | paginate (other commands reject it) | +| `--kind symbol\|route\|client\|producer` | `` commands | resolve hint | +| `--java-kind`, `--role`, `--fqn-prefix` | `` commands | client-side post-filters | +| `--index-dir ` | all | override index directory | + +`--offset` is intentionally NOT a global flag: only `find` and `search` route +through backends that accept it. Every other command rejects it. + +A missing or stale index produces an actionable `status: error` envelope (exit +2) rather than a traceback: + +``` +error: No index at /path/to/code_graph.lbug. Run: java-codebase-rag init --source-root +``` + +See [`plans/active/PLAN-JRAG-CLI.md`](./plans/active/PLAN-JRAG-CLI.md) for the +full design and per-PR breakdown. + +--- + ## Further reading | Document | What's in it | @@ -218,10 +257,9 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) | Environment variables, project YAML, graph ontology, brownfield overrides, ignore patterns. | | [`docs/JAVA-CODEBASE-RAG-CLI.md`](./docs/JAVA-CODEBASE-RAG-CLI.md) | CLI operator playbook: workflows, exit codes, env alignment. | | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. | -| [`skills/`](./skills/) | Single `/explore-codebase` skill — complete MCP operating manual for hosts with skill discovery (alternative to copy-pasting AGENT-GUIDE). See [`skills/README.md`](./skills/README.md). 
| +| [`skills/`](./skills/) | `/explore-codebase` (MCP surface) + `/explore-codebase-cli` (CLI surface) skills — operating manuals for hosts with skill discovery (alternative to copy-pasting AGENT-GUIDE). See [`skills/README.md`](./skills/README.md). | | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. | | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | -| [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). | | [`docs/PRODUCT-VISION.md`](./docs/PRODUCT-VISION.md) | Long-term product direction. | --- diff --git a/agents/explorer-rag-cli.md b/agents/explorer-rag-cli.md new file mode 100644 index 00000000..62b9083b --- /dev/null +++ b/agents/explorer-rag-cli.md @@ -0,0 +1,291 @@ +--- +name: explorer-rag-cli +description: "MUST BE USED PROACTIVELY. Universal read-only explorer agent that drives the `jrag` CLI for graph-native codebase navigation (callers, callees, routes, clients, producers, impact, search, inspect, flow, overview) and falls back to file-system search (grep, glob, file reading). Use for any exploration task: locating code, tracing dependencies, finding patterns, answering 'where is X' or 'who calls Y'. Read-only — never edits files. This is the CLI-surface counterpart to explorer-rag-enhanced (which uses the MCP tools)." +--- + +You are a universal codebase explorer — a read-only search and navigation specialist that drives the **`jrag` CLI** (the agent-facing shell surface of java-codebase-rag) and falls back to **broad file-system search** (grep, glob, file reading) when the index is missing or stale. + +## Core Principles + +1. **Read-only.** Never edit, write, or modify any file. 
Only locate, read, and report. +2. **Names in, names out.** Every `` is human-readable (FQN / simple name / route path / topic). Raw node IDs are never required — `jrag` resolves internally. +3. **One command per intent.** `jrag` collapses resolve + walk into one call. Pick the command that matches the intent; do not chain resolve→inspect→traverse manually. +4. **Smallest sufficient tool.** Pick the lightest tool that answers the question. Don't run `jrag impact` when a single `jrag callers` suffices; don't `Grep` the whole repo when `jrag inspect ` answers exactly. +5. **Excerpts over dumps.** When searching broadly, read excerpts and relevant sections rather than entire files. Summarize findings; don't dump raw content. +6. **Stop when answered.** Don't prefetch unrelated subgraphs or scan unrelated directories. Report findings as soon as the question is answered. + +## Why `jrag` (CLI) vs `java-codebase-rag-mcp` + +You are the **CLI-surface** explorer. Use `jrag` shell commands (`jrag callers`, `jrag inspect`, `jrag search`, …), NOT the MCP tools (`search`/`find`/`describe`/`neighbors`/`resolve`). One surface per project — running both strands the agent in two vocabularies. + +Pick this agent (CLI) when: +- The host cannot run an MCP server (no stdio MCP support) +- The operator ran `java-codebase-rag install --surface cli` +- You prefer shell-driven exploration with text output and `--format json` for structured data + +Use the **`explorer-rag-enhanced`** subagent (MCP surface) when the host has MCP support and the operator ran `java-codebase-rag install` (default = mcp surface). + +## Prerequisite: index must exist + +`jrag` is a thin compose-and-render layer over the existing index. If the project has not been indexed, every command exits 2 with an actionable envelope. Verify with `jrag status` first when in doubt: + +``` +jrag status +``` + +If it exits 2, ask the operator to run `java-codebase-rag init --source-root `. 
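The prerequisite check above could be scripted roughly as follows. This is a minimal sketch, not part of `jrag`: the function names `run_command` and `index_state` are hypothetical, and treating exit code 0 as "index present" is an assumption (the text only states that a missing index exits 2).

```python
import subprocess
from typing import Callable, Sequence


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code.

    Raises FileNotFoundError if the executable (here `jrag`) is not on PATH.
    """
    return subprocess.run(list(argv), capture_output=True, text=True, check=False).returncode


def index_state(run: Callable[[Sequence[str]], int] = run_command) -> str:
    """Classify the index from the exit code of `jrag status`."""
    code = run(["jrag", "status"])
    if code == 2:
        # Documented behaviour: a missing index exits 2 with an actionable envelope.
        return "missing_index"
    if code == 0:
        # Assumption: a zero exit code means the index exists and is usable.
        return "ready"
    return "unknown"
```

Passing the runner in as a parameter keeps the decision logic testable without a real index; on `missing_index`, the agent would stop and ask the operator to run `java-codebase-rag init` rather than retrying.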
+ +## Tool Inventory + +### `jrag` command groups + +Run `jrag --help` for the canonical list. Groups: + +| Group | Commands | +| --- | --- | +| **Orientation** | `status`, `microservices`, `map`, `conventions`, `overview` | +| **Locate** | `find`, `search` | +| **Listings** | `routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` | +| **Traversal** | `callers`, `callees`, `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `impact`, `flow`, `dependencies`, `connection` | +| **Inspection** | `inspect`, `outline`, `imports` | + +### Common flags (every command) + +``` +--service Filter by microservice +--module Filter by module +--limit Cap on results (default 20; 10 for fan-out commands) +--format text|json Output format (default: text) +--detail brief|normal|full Output detail (default: normal) — orthogonal to --format; + both modes honor it. brief=name @service; normal=+module/role/ + file/score; full=+signature/annotations/snippet. inspect and the + orientation commands default to full. +--index-dir Index directory override +``` + +`--offset` is supported **only** on `find` and `search`. Other commands emit `truncated: more results — narrow your query` when capped. + +### File-system tools + +`Grep` (content search), `Glob` (find files by name/pattern), `Read` (read files, with `offset`/`limit`). + +### Other tools + +`Bash` (read-only: `git log`, `git blame`, `ls`, `find`), `WebSearch`, `WebFetch`. + +--- + +## Decision Framework + +### When to use `jrag` vs file-system tools + +| Question type | Primary approach | +| --- | --- | +| "Who calls method M?" | `jrag callers ` | +| "What does M call?" | `jrag callees ` | +| "Where is class X?" | `jrag inspect `; fallback `Grep`/`Glob` | +| "All controllers in service S" | `jrag find --role CONTROLLER --service S` | +| "Routes/endpoints in service S" | `jrag routes --service S` | +| "Who implements interface T?" 
| `jrag implementations ` | +| "What does T inject?" | `jrag dependencies ` | +| "Where is T injected? Who depends on T?" | `jrag dependents ` | +| "Impact of changing X?" | `jrag impact ` (bounded fan-in) | +| "Trace request flow A→B" | `jrag flow ` → `jrag connection A B` | +| "Orient in service S" | `jrag overview ` | +| "Find files matching pattern" | `Glob` | +| "Search for text/regex in files" | `Grep` | +| "Read config/build/test files" | `Read` | +| "Who changed this and when?" | Bash: `git log` / `git blame` | +| "How is this concept used?" | Both: `jrag search ""` for fuzzy discovery, `Grep` for text patterns | +| "Natural-language 'find X'" | `jrag search ""` → `jrag inspect ` | + +### Escalation pattern + +1. **Try the most targeted command first.** Identifier-shaped → `jrag inspect `. Structural question → matching traversal (`callers`/`implementations`/…). +2. **Fall back gracefully.** `jrag` returns empty / `not_found` → `Grep`/`Glob` against actual source files. +3. **Cross-validate.** When CLI results and file contents disagree, **trust the file**: the index may be stale. Report the discrepancy. + +--- + +## Resolve-first contract (every `` command) + +Every `jrag` command that takes a `` runs `resolve_v2` internally. Map the contract onto the result: + +| `resolve_v2` status | `jrag` behavior | Your action | +| --- | --- | --- | +| `one` | Run the traversal/listing against the resolved node. | Read the result. | +| `many` | Return the candidate list and stop. **No auto-pick.** | Disambiguate with `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run. | +| `none` | `status: not_found` envelope (exit 2). | Fall back to `jrag search` or `Grep`. | + +Never look up a raw node ID manually. Pass an FQN, simple name, prior `sym:`/`route:`/`client:`/`producer:` id, route path, or topic. + +### Disambiguation flags + +Only `--kind` is a true resolve input. `--role`, `--java-kind`, `--fqn-prefix`, `--service`, `--module` post-filter the resolve result client-side.
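The contract table above can be read as a small decision function. The sketch below is illustrative only: `narrow` and `plan` are hypothetical names, the resolve status is taken as an input rather than parsed from real `jrag` output, and the command is assumed to have the shape `["jrag", <command>, <identifier>]`. The flags used (`--kind`, `--role`, `--fqn-prefix`, `--service`) are the ones listed in this section.

```python
from typing import Optional


def narrow(argv: list[str], *, kind: Optional[str] = None, role: Optional[str] = None,
           fqn_prefix: Optional[str] = None, service: Optional[str] = None) -> list[str]:
    """Append whichever disambiguation flags were supplied to a copy of the command."""
    flags = {"--kind": kind, "--role": role, "--fqn-prefix": fqn_prefix, "--service": service}
    out = list(argv)
    for flag, value in flags.items():
        if value is not None:
            out += [flag, value]
    return out


def plan(resolve_status: str, argv: list[str], **hints: Optional[str]) -> tuple[str, list[str]]:
    """Map a resolve status (`one` / `many` / `none`) to the agent's next step."""
    if resolve_status == "one":
        # Resolved to a single node: just read the result of the original command.
        return ("read", argv)
    if resolve_status == "many":
        # Never auto-pick: re-run the same command with narrowing flags.
        return ("rerun", narrow(argv, **hints))
    if resolve_status == "none":
        # Assumption: argv[2] is the identifier, reused here as the search query.
        return ("fallback", ["jrag", "search", argv[2]])
    raise ValueError(f"unexpected resolve status: {resolve_status!r}")


command = ["jrag", "callers", "ChatService"]
print(plan("many", command, kind="symbol", service="chat-core"))
print(plan("none", command))
```

If the `jrag search` fallback also comes back empty, the remaining step in the contract is a plain `Grep` over the source tree.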
+ +--- + +## Output envelope + +`--format` (text|json) and `--detail` (brief|normal|full) are **orthogonal**: +`--format` picks the representation, `--detail` picks how much of each node/edge is +shown, and both modes honor the same detail level. Default is `text` + `normal` +(name @service + module/role/file/score); `inspect` and orientation commands default +to `full`. `--format json` emits the projected envelope (empty fields dropped). + +```json +{ + "status": "ok|not_found|error", + "nodes": {"": {...}}, + "edges": [{...}], + "candidates": [{...}], + "truncated": false, + "agent_next_actions": ["jrag callers ", "..."], + "file_location": {"filename": "...", "start_line": 123} +} +``` + +- `agent_next_actions` is a CLI-native hint list (≤5) — use it as a starting point, not a directive. +- `file_location` is populated only on `one`-hit resolve. +- `truncated` is computed via +1-fetch on `find`/`search`; other commands emit `truncated: more results — narrow your query` when capped. + +--- + +## Traversal reference + +`jrag` abstracts away `direction` and `edge_types`. For reference: + +| Intent (command) | Underlying edges | +| --- | --- | +| `callers` | `CALLS` direction=in | +| `callees` | `CALLS` direction=out | +| `hierarchy` | `EXTENDS` + `IMPLEMENTS` direction=out | +| `implementations` | `IMPLEMENTS` direction=in | +| `subclasses` | `EXTENDS` direction=in | +| `overrides` | `OVERRIDES` direction=out (subtype → supertype) | +| `overridden-by` | `OVERRIDES` direction=in | +| `dependencies` | `INJECTS` direction=out | +| `dependents` | `INJECTS` direction=in | +| `impact` | bounded fan-in (`CALLS`/`INJECTS`/`IMPLEMENTS`/`EXTENDS`, depth ≤2) | +| `flow ` | `EXPOSES`/`HTTP_CALLS`/`ASYNC_CALLS`/`CALLS` (request trace) | +| `connection A B` | bounded path search between A and B | + +### Node id prefixes (from prior results) + +`sym:` (Symbol), `route:`/`r:` (Route), `client:`/`c:` (Client), `producer:`/`p:` (Producer). + +### Symbol FQN shape + +`.[.]#(,,…)`. 
Generics erased, no spaces after commas. No-arg: `()`. Constructor: `#(...)`. + +--- + +## Ontology glossary + +### Roles + +| Role | Meaning | +| ---- | ------- | +| `CONTROLLER` | HTTP / messaging entry point | +| `SERVICE` | Business logic orchestration | +| `REPOSITORY` | Data access | +| `COMPONENT` | General Spring component | +| `CONFIG` | `@Configuration` class | +| `ENTITY` | JPA / persistence entity | +| `CLIENT` | Outbound call wrapper | +| `MAPPER` | Data mapper / converter | +| `DTO` | Data transfer object | +| `OTHER` | Infrastructure / utility / unclassified | + +### Capabilities + +`MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`. + +### Symbol kinds + +`class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`. + +### Route / client / producer kinds + +Route frameworks: `spring_mvc`, `webflux`. Route kinds: `http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`, `jms_destination`, `stream_binding`. +Client kinds: `feign_method`, `rest_template`, `web_client`. Producer kinds: `kafka_send`, `stream_bridge_send`. Source layers: `builtin`, `layer_a_meta`, `layer_b_ann`, `layer_b_fqn`, `layer_c_source`. + +--- + +## File-System Search Reference + +### Glob patterns + +- `**/*.java` — all Java files +- `**/*Controller*.java` — controller files +- `**/application*.yml` — Spring config files +- `**/*Test*.java` — test files + +### Grep patterns + +- Class declarations: `class ClassName` +- Method usage: `methodName(` +- Annotations: `@RequestMapping`, `@Service`, etc. +- Import statements: `import com.example.ClassName` +- Configuration keys: `spring.datasource` + +### Reading files + +Use `Read` with `offset`/`limit` for large files — read relevant sections, not entire files. 
+ +--- + +## Recovery Playbook + +| Symptom | Fix | +| ------- | --- | +| `jrag status` exits 2 | Run `java-codebase-rag init --source-root `; retry | +| `status: not_found` | Try `jrag search ""`; or `find --fqn-prefix`; fallback `Grep` | +| `many` candidates | Add `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run | +| `find` returns too much | Add `--service`, `--fqn-prefix`, `--path-prefix`, `--topic-prefix` | +| Empty `search` | Try `--table all`; `find --fqn-prefix`; `Grep` directly | +| `truncated: true` | Narrow the query, or page with `--offset` (`find`/`search` only) | +| Empty results across commands | Index missing/stale → `Grep`/`Glob`/`Read`; ask operator to rebuild | +| CLI vs file disagree | Trust the file; report stale index | +| `--offset` rejected | Only `find`/`search` accept it; other commands narrow via filters | + +After two failed attempts on the same intent, stop and report what was tried and what failed. + +--- + +## Workflow Patterns + +### Pattern: "explain feature X" + +1. `jrag search "X"` → pick top 1–3 hits +2. `jrag inspect ` for full record +3. Targeted traversal (`callees` / `implementations` / `dependents`) +4. Stop when you can answer the question + +### Pattern: "where is X used?" + +1. `jrag inspect ` (resolves; if `many`, disambiguate) +2. `jrag callers ` and `jrag dependents ` +3. If CLI misses: `Grep` for the symbol name +4. Report all usage sites with file:line + +### Pattern: "find all Y in the codebase" + +1. Structural: `jrag find --role [--service ]` +2. Textual: `Grep` for the pattern +3. Broad: `Glob` for files + `Grep` for content +4. Summarize findings; don't dump raw lists + +### Pattern: "trace the flow from A to B" + +1. `jrag flow ` to trace the request +2. `jrag connection A B` to confirm a path exists +3. Use `Grep` to fill gaps where the graph index is incomplete +4. Report the trace with file:line references + +### Pattern: "orient in service S" + +1. 
`jrag overview ` (bundle of routes/clients/producers) +2. `jrag conventions --service ` (dominant roles + framework tallies) +3. `jrag map --service ` (type counts) +4. `jrag routes --service ` (entry points) diff --git a/automation/__init__.py b/automation/__init__.py deleted file mode 100644 index 8b137891..00000000 --- a/automation/__init__.py +++ /dev/null @@ -1 +0,0 @@ - diff --git a/automation/cursor_propose_only/README.md b/automation/cursor_propose_only/README.md deleted file mode 100644 index 86df43f0..00000000 --- a/automation/cursor_propose_only/README.md +++ /dev/null @@ -1,146 +0,0 @@ -# Cursor propose-only automation - -This workflow is intentionally isolated under `automation/cursor_propose_only/` -so orchestration sources are not mixed with production runtime code, main docs, -or the primary test suite. - -## Commands - -From repository root: - -- `.venv/bin/python automation/cursor_propose_only/cli.py prepare ...` -- `.venv/bin/python automation/cursor_propose_only/cli.py evaluate ...` -- `.venv/bin/python automation/cursor_propose_only/execute.py ...` -- `.venv/bin/python automation/cursor_propose_only/autopilot.py ...` - -## Select specific proposals - -If you only want a subset, pass `--proposal` multiple times: - -```bash -.venv/bin/python automation/cursor_propose_only/cli.py prepare \ - --repo-root . \ - --proposal-dir propose \ - --output-dir .agents/reports/propose_automation_selected \ - --proposal HTTP-ROUTE-METHOD-ENUM-PROPOSE.md \ - --proposal ENHANCED-ROLE-RECOGNITION-PROPOSE.md \ - --rounds 3 \ - --min-severity medium -``` - -Notes: - -- `--proposal` paths may be absolute or relative to `--proposal-dir` -- when `--proposal` is provided, `--glob` is ignored -- add `--include-completed` if selected files can come from `propose/completed/` or `propose/stale/` - -## Generate a propose-only workflow bundle - -```bash -.venv/bin/python automation/cursor_propose_only/cli.py prepare \ - --repo-root . 
\ - --proposal-dir propose \ - --output-dir .agents/reports/propose_automation \ - --rounds 3 \ - --min-severity medium -``` - -Generated artifacts: - -- `.agents/reports/propose_automation/workflow.json` -- `.agents/reports/propose_automation/jobs//planner_prompt.md` -- `.agents/reports/propose_automation/jobs//reviewer_prompt_round1.md` -- `.agents/reports/propose_automation/jobs//reviewer_prompt_round2.md` -- `.agents/reports/propose_automation/jobs//reviewer_prompt_round3.md` - -## Evaluate each reviewer response - -Use one fresh reviewer session per round. Save each reviewer response to a file, -then evaluate it with severity gating. - -```bash -.venv/bin/python automation/cursor_propose_only/cli.py evaluate \ - --workflow .agents/reports/propose_automation/workflow.json \ - --job-id \ - --round 1 \ - --review-file /path/to/review_round1.md \ - --min-severity medium \ - --write -``` - -Status transitions: - -- actionable issue found -> `needs_fixes` -- approved with no actionable issues -> `ready_to_merge` -- final round still failing -> `blocked_after_reviews` - -## Automate implementation after plans are ready - -When `plans/AGENT-PROMPTS-.md` exists, run `execute.py` to iterate PR -sections in order, run implementation command(s), run review loops, and mark -tasks as `ready_to_merge` / `merged`. - -```bash -.venv/bin/python automation/cursor_propose_only/execute.py \ - --repo-root . 
\ - --workflow .agents/reports/propose_automation/workflow.json \ - --rounds 3 \ - --min-severity medium \ - --implementation-command 'cursor-agent run --model auto --prompt-file {task_prompt_file}' \ - --review-command 'cursor-agent run --model auto --prompt-file {review_prompt_file}' \ - --merge-command 'gh pr merge {pr_url} --squash --delete-branch' \ - --run -``` - -Notes: - -- without `--run`, `execute.py` performs a dry-run and only stages prompts/state -- command templates support placeholders such as `{task_prompt_file}`, - `{review_prompt_file}`, `{pr_url}`, `{branch}`, `{base}`, `{round}` -- workflow state is persisted in `.agents/reports/propose_automation/workflow.json` - -## Fully automated (single command) - -If you want: "I provide propose(s), workflow runs, I come back to ready PRs", -use `autopilot.py`. - -```bash -.venv/bin/python automation/cursor_propose_only/autopilot.py \ - --repo-root . \ - --proposal-dir propose \ - --output-dir .agents/reports/propose_automation_selected \ - --proposal ENHANCED-ROLE-RECOGNITION-PROPOSE.md \ - --planning-rounds 2 \ - --planning-min-severity medium \ - --implementation-rounds 2 \ - --implementation-min-severity medium \ - --planner-command 'cursor-agent run --model auto --prompt-file {planner_prompt_file}' \ - --planning-review-command 'cursor-agent run --model auto --prompt-file {review_prompt_file}' \ - --implementation-command 'cursor-agent run --model auto --prompt-file {task_prompt_file}' \ - --implementation-review-command 'cursor-agent run --model auto --prompt-file {review_prompt_file}' \ - --merge-command 'gh pr merge {pr_url} --squash --delete-branch' \ - --run -``` - -Behavior: - -1. Generates workflow bundle (`prepare` equivalent) -2. Runs planner command for each selected proposal -3. Runs planning review rounds with severity gating and planner-fix loops -4. Parses generated `plans/AGENT-PROMPTS-*.md` -5. Runs implementation and implementation-review loops per PR section -6. 
Marks tasks `ready_to_merge` or `merged` (if merge command is provided) - -## Reviewer format convention - -For consistent parsing, reviewer findings should follow: - -- `[CRITICAL] ...` -- `[HIGH] ...` -- `[MEDIUM] ...` -- `[LOW] ...` -- `[TRIVIAL] ...` - -When no actionable issues remain, reviewer should return: - -- `APPROVED` diff --git a/automation/cursor_propose_only/__init__.py b/automation/cursor_propose_only/__init__.py deleted file mode 100644 index 8b137891..00000000 --- a/automation/cursor_propose_only/__init__.py +++ /dev/null @@ -1 +0,0 @@ - diff --git a/automation/cursor_propose_only/autopilot.py b/automation/cursor_propose_only/autopilot.py deleted file mode 100644 index c5ef6bfe..00000000 --- a/automation/cursor_propose_only/autopilot.py +++ /dev/null @@ -1,386 +0,0 @@ -#!/usr/bin/env python3 - -from __future__ import annotations - -import argparse -import json -import subprocess -import sys -from dataclasses import dataclass -from datetime import UTC, datetime -from pathlib import Path -from typing import Any, Callable - -ROOT = Path(__file__).resolve().parents[2] -if str(ROOT) not in sys.path: - sys.path.insert(0, str(ROOT)) - - -@dataclass(frozen=True, slots=True) -class CommandResult: - success: bool - command: str - returncode: int - stdout: str - stderr: str - - -def _now_utc() -> str: - return datetime.now(UTC).isoformat() - - -def _run_shell_command(command_template: str, variables: dict[str, str], cwd: Path) -> CommandResult: - command = command_template.format(**variables) - completed = subprocess.run( - ["/bin/bash", "-lc", command], - cwd=str(cwd), - text=True, - capture_output=True, - check=False, - ) - return CommandResult( - success=completed.returncode == 0, - command=command, - returncode=completed.returncode, - stdout=completed.stdout, - stderr=completed.stderr, - ) - - -def _write_text(path: Path, content: str) -> Path: - path.parent.mkdir(parents=True, exist_ok=True) - path.write_text(content, encoding="utf-8") - return path - - 
-def _write_command_log(path: Path, result: CommandResult) -> Path: - return _write_text( - path, - f"$ {result.command}\n\n[stdout]\n{result.stdout}\n\n[stderr]\n{result.stderr}\n", - ) - - -def _render_planner_fix_prompt(base_prompt: str, actionable_issues: list[dict[str, str]]) -> str: - bullets = "\n".join(f"- [{issue['severity'].upper()}] {issue['summary']}" for issue in actionable_issues) - return ( - f"{base_prompt}\n\n" - "## Reviewer findings to fix\n" - f"{bullets}\n\n" - "Update plan and cursor prompts to resolve all actionable findings." - ) - - -def _load_workflow(path: Path) -> dict[str, Any]: - return json.loads(path.read_text(encoding="utf-8")) - - -def _save_workflow(path: Path, workflow: dict[str, Any]) -> None: - path.write_text(json.dumps(workflow, indent=2, sort_keys=True), encoding="utf-8") - - -def _resolve_command(cli_value: str, file_value: str, *, name: str) -> str: - if file_value: - return Path(file_value).read_text(encoding="utf-8").strip() - if cli_value: - return cli_value - raise ValueError(f"Missing required {name}: pass --{name.replace('_', '-')} or --{name.replace('_', '-')}-file") - - -def run_autopilot( - *, - repo_root: Path, - proposal_dir: Path, - output_dir: Path, - selected_proposals: list[str], - proposal_glob: str, - include_completed: bool, - planning_rounds: int, - planning_min_severity: str, - implementation_rounds: int, - implementation_min_severity: str, - planner_command: str, - planning_review_command: str, - implementation_command: str, - implementation_review_command: str, - merge_command: str | None, - run: bool, - command_runner: Callable[[str, dict[str, str], Path], CommandResult] | None = None, - pr_url_regex: str | None = None, -) -> dict[str, Any]: - from automation.cursor_propose_only.execute import execute_workflow - from automation.cursor_propose_only.workflow import evaluate_review, prepare_bundle - - runner = command_runner or _run_shell_command - workflow = prepare_bundle( - repo_root=repo_root, - 
proposal_dir=proposal_dir, - output_dir=output_dir, - rounds=planning_rounds, - min_severity=planning_min_severity, - pattern=proposal_glob, - include_completed=include_completed, - selected_proposals=selected_proposals, - ) - workflow_path = output_dir / "workflow.json" - - for job in workflow.get("jobs", []): - job_id = str(job.get("job_id", "job")) - job_dir = output_dir / "jobs" / job_id - planner_prompt_file = repo_root / str(job.get("planner_prompt_path", "")) - planner_prompt_text = planner_prompt_file.read_text(encoding="utf-8") - - if not run: - job["planning_status"] = "ready_for_planner" - job.setdefault("planning_history", []).append( - { - "phase": "planner", - "dry_run": True, - "command_template": planner_command, - "at": _now_utc(), - } - ) - job["execution_status"] = "skipped_planning_pending" - continue - - variables = { - "job_id": job_id, - "round": "0", - "propose_path": str(job.get("propose_path", "")), - "plan_path": str(job.get("plan_path", "")), - "agent_prompts_path": str(job.get("agent_prompts_path", "")), - "planner_prompt_file": str(planner_prompt_file), - "review_prompt_file": "", - "review_output_file": "", - "issues_file": "", - } - planner_result = runner(planner_command, variables, repo_root) - planner_log = _write_command_log(job_dir / "planning_command_round0.log", planner_result) - job.setdefault("planning_history", []).append( - { - "phase": "planner", - "returncode": planner_result.returncode, - "success": planner_result.success, - "log_path": str(planner_log), - "at": _now_utc(), - } - ) - if not planner_result.success: - job["planning_status"] = "blocked_planner_failed" - continue - - planning_approved = False - for round_number in range(1, planning_rounds + 1): - review_prompt_file = repo_root / str(job["reviewer_prompt_paths"][round_number - 1]) - review_output_file = job_dir / f"planning_review_output_round{round_number}.md" - variables.update( - { - "round": str(round_number), - "review_prompt_file": 
str(review_prompt_file), - "review_output_file": str(review_output_file), - } - ) - review_result = runner(planning_review_command, variables, repo_root) - _write_text( - review_output_file, review_result.stdout or review_result.stderr or "No planning review output captured.\n" - ) - job.setdefault("planning_history", []).append( - { - "phase": "planning_review", - "round_number": round_number, - "returncode": review_result.returncode, - "success": review_result.success, - "review_output_file": str(review_output_file), - "at": _now_utc(), - } - ) - if not review_result.success: - job["planning_status"] = "blocked_planning_review_failed" - break - - review_eval = evaluate_review(review_output_file.read_text(encoding="utf-8"), min_severity=planning_min_severity) - job.setdefault("reviews", []).append( - { - "round_number": round_number, - "review_file": str(review_output_file), - **review_eval, - } - ) - if review_eval["approved"]: - planning_approved = True - break - - if round_number < planning_rounds: - issues_file = _write_text( - job_dir / f"planning_review_issues_round{round_number}.json", - json.dumps(review_eval.get("actionable_issues", []), indent=2, sort_keys=True), - ) - fix_prompt = _render_planner_fix_prompt(planner_prompt_text, review_eval.get("actionable_issues", [])) - fix_prompt_file = _write_text(job_dir / f"planner_fix_prompt_round{round_number}.md", fix_prompt) - variables.update( - { - "planner_prompt_file": str(fix_prompt_file), - "issues_file": str(issues_file), - "round": str(round_number), - } - ) - fix_result = runner(planner_command, variables, repo_root) - fix_log = _write_command_log(job_dir / f"planning_fix_command_round{round_number}.log", fix_result) - job.setdefault("planning_history", []).append( - { - "phase": "planner_fix", - "round_number": round_number, - "returncode": fix_result.returncode, - "success": fix_result.success, - "log_path": str(fix_log), - "at": _now_utc(), - } - ) - if not fix_result.success: - 
job["planning_status"] = "blocked_planner_fix_failed" - break - else: - job["planning_status"] = "blocked_after_planning_reviews" - - if planning_approved: - job["planning_status"] = "ready_to_execute" - elif "planning_status" not in job: - job["planning_status"] = "blocked_after_planning_reviews" - - _save_workflow(workflow_path, workflow) - - if run: - workflow = execute_workflow( - workflow_path=workflow_path, - repo_root=repo_root, - rounds=implementation_rounds, - min_severity=implementation_min_severity, - implementation_command=implementation_command, - review_command=implementation_review_command, - merge_command=merge_command, - dry_run=False, - command_runner=runner, - pr_url_regex=pr_url_regex, - ) - else: - workflow = execute_workflow( - workflow_path=workflow_path, - repo_root=repo_root, - rounds=implementation_rounds, - min_severity=implementation_min_severity, - implementation_command=implementation_command, - review_command=implementation_review_command, - merge_command=merge_command, - dry_run=True, - command_runner=runner, - pr_url_regex=pr_url_regex, - ) - workflow["autopilot_updated_at_utc"] = _now_utc() - _save_workflow(workflow_path, workflow) - return workflow - - -def build_parser() -> argparse.ArgumentParser: - parser = argparse.ArgumentParser( - description="Single-command proposal -> planning -> implementation -> review automation." 
- ) - parser.add_argument("--repo-root", default=".", help="Repository root.") - parser.add_argument("--proposal-dir", default="propose", help="Proposal directory.") - parser.add_argument("--output-dir", default="reports/propose_automation", help="Workflow output directory.") - parser.add_argument("--glob", default="*-PROPOSE.md", help="Proposal glob pattern.") - parser.add_argument("--proposal", action="append", default=[], help="Specific proposal file (repeatable).") - parser.add_argument("--include-completed", action="store_true", help="Allow selecting from propose/completed.") - parser.add_argument("--planning-rounds", type=int, default=3, help="Planning review rounds.") - parser.add_argument( - "--planning-min-severity", - choices=("trivial", "low", "medium", "high", "critical"), - default="medium", - help="Planning actionable severity threshold.", - ) - parser.add_argument("--implementation-rounds", type=int, default=3, help="Implementation review rounds.") - parser.add_argument( - "--implementation-min-severity", - choices=("trivial", "low", "medium", "high", "critical"), - default="medium", - help="Implementation actionable severity threshold.", - ) - parser.add_argument("--planner-command", default="", help="Planner command template.") - parser.add_argument("--planner-command-file", default="", help="File containing planner command template.") - parser.add_argument("--planning-review-command", default="", help="Planning reviewer command template.") - parser.add_argument( - "--planning-review-command-file", - default="", - help="File containing planning reviewer command template.", - ) - parser.add_argument("--implementation-command", default="", help="Implementation command template.") - parser.add_argument( - "--implementation-command-file", - default="", - help="File containing implementation command template.", - ) - parser.add_argument("--implementation-review-command", default="", help="Implementation reviewer command template.") - 
parser.add_argument( - "--implementation-review-command-file", - default="", - help="File containing implementation reviewer command template.", - ) - parser.add_argument("--merge-command", default="", help="Optional merge command template.") - parser.add_argument("--merge-command-file", default="", help="Optional file containing merge command template.") - parser.add_argument("--pr-url-regex", default="", help="Optional custom PR URL extraction regex.") - parser.add_argument( - "--run", - action="store_true", - help="Run full automation. Without this flag, runner performs dry-run state staging.", - ) - return parser - - -def main(argv: list[str] | None = None) -> int: - parser = build_parser() - args = parser.parse_args(argv) - try: - planner_command = _resolve_command(args.planner_command, args.planner_command_file, name="planner_command") - planning_review_command = _resolve_command( - args.planning_review_command, args.planning_review_command_file, name="planning_review_command" - ) - implementation_command = _resolve_command( - args.implementation_command, args.implementation_command_file, name="implementation_command" - ) - implementation_review_command = _resolve_command( - args.implementation_review_command, - args.implementation_review_command_file, - name="implementation_review_command", - ) - merge_command = ( - _resolve_command(args.merge_command, args.merge_command_file, name="merge_command") - if (args.merge_command or args.merge_command_file) - else None - ) - except ValueError as exc: - parser.error(str(exc)) - return 2 - - workflow = run_autopilot( - repo_root=Path(args.repo_root).resolve(), - proposal_dir=Path(args.proposal_dir).resolve(), - output_dir=Path(args.output_dir).resolve(), - selected_proposals=list(args.proposal or []), - proposal_glob=args.glob, - include_completed=bool(args.include_completed), - planning_rounds=int(args.planning_rounds), - planning_min_severity=args.planning_min_severity, - 
implementation_rounds=int(args.implementation_rounds), - implementation_min_severity=args.implementation_min_severity, - planner_command=planner_command, - planning_review_command=planning_review_command, - implementation_command=implementation_command, - implementation_review_command=implementation_review_command, - merge_command=merge_command, - run=bool(args.run), - pr_url_regex=(args.pr_url_regex or None), - ) - print(json.dumps(workflow, indent=2, sort_keys=True)) - return 0 - - -if __name__ == "__main__": - raise SystemExit(main()) diff --git a/automation/cursor_propose_only/cli.py b/automation/cursor_propose_only/cli.py deleted file mode 100644 index 7680bc22..00000000 --- a/automation/cursor_propose_only/cli.py +++ /dev/null @@ -1,20 +0,0 @@ -#!/usr/bin/env python3 - -from __future__ import annotations - -import sys -from pathlib import Path - -ROOT = Path(__file__).resolve().parents[2] -if str(ROOT) not in sys.path: - sys.path.insert(0, str(ROOT)) - - -def _main() -> int: - from automation.cursor_propose_only.workflow import main - - return main() - - -if __name__ == "__main__": - raise SystemExit(_main()) diff --git a/automation/cursor_propose_only/execute.py b/automation/cursor_propose_only/execute.py deleted file mode 100644 index 5ad259a5..00000000 --- a/automation/cursor_propose_only/execute.py +++ /dev/null @@ -1,517 +0,0 @@ -from __future__ import annotations - -import argparse -import json -import re -import subprocess -import sys -from dataclasses import dataclass -from datetime import UTC, datetime -from pathlib import Path -from typing import Any, Callable - -ROOT = Path(__file__).resolve().parents[2] -if str(ROOT) not in sys.path: - sys.path.insert(0, str(ROOT)) - -_PR_SECTION_RE = re.compile(r"^##\s+(PR-[^\n]+)$", re.MULTILINE) -_PROMPT_BLOCK_RE = re.compile(r"\*\*Prompt:\*\*\s*\n\s*(`{3,})\n(.*?)\n\1", re.DOTALL) -_FIELD_RE = re.compile(r"^\*\*(Branch|Base|Plan section):\*\*\s*(.+?)\s*$", re.MULTILINE) -_BACKTICK_VALUE_RE = 
re.compile(r"`([^`]+)`") -_DEFAULT_PR_URL_RE = re.compile(r"https://github\.com/[^/\s]+/[^/\s]+/pull/\d+") - - -@dataclass(frozen=True, slots=True) -class PromptTask: - task_id: str - heading: str - branch: str - base: str - plan_section: str - prompt_body: str - - -@dataclass(frozen=True, slots=True) -class CommandResult: - success: bool - command: str - returncode: int - stdout: str - stderr: str - - -def _now_utc() -> str: - return datetime.now(UTC).isoformat() - - -def _evaluate_review(review_text: str, *, min_severity: str) -> dict[str, Any]: - from automation.cursor_propose_only.workflow import evaluate_review - - return evaluate_review(review_text, min_severity=min_severity) - - -def _json_dump(path: Path, payload: dict[str, Any]) -> None: - path.parent.mkdir(parents=True, exist_ok=True) - path.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8") - - -def _json_load(path: Path) -> dict[str, Any]: - return json.loads(path.read_text(encoding="utf-8")) - - -def _safe_name(value: str) -> str: - return re.sub(r"[^a-zA-Z0-9._-]+", "-", value).strip("-") or "task" - - -def parse_agent_prompts(markdown_text: str) -> list[PromptTask]: - matches = list(_PR_SECTION_RE.finditer(markdown_text)) - if not matches: - return [] - tasks: list[PromptTask] = [] - for index, match in enumerate(matches): - start = match.start() - end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown_text) - section_text = markdown_text[start:end] - heading = match.group(1).strip() - task_id = heading.split("—", 1)[0].strip() - fields = {m.group(1).strip().lower(): m.group(2).strip() for m in _FIELD_RE.finditer(section_text)} - branch = _extract_backtick_or_raw(fields.get("branch", "")) - base = _extract_backtick_or_raw(fields.get("base", "master")) or "master" - plan_section = fields.get("plan section", "") - prompt_match = _PROMPT_BLOCK_RE.search(section_text) - if prompt_match is None: - raise ValueError(f"Missing Prompt block for section 
{heading!r}") - prompt_body = prompt_match.group(2).strip() - tasks.append( - PromptTask( - task_id=task_id, - heading=heading, - branch=branch, - base=base, - plan_section=plan_section.strip(), - prompt_body=prompt_body, - ) - ) - return tasks - - -def _extract_backtick_or_raw(value: str) -> str: - if not value: - return "" - backtick_match = _BACKTICK_VALUE_RE.search(value) - if backtick_match: - return backtick_match.group(1).strip() - return value.strip(" .") - - -def run_shell_command( - command_template: str, - variables: dict[str, str], - *, - working_directory: Path, -) -> CommandResult: - command = command_template.format(**variables) - completed = subprocess.run( - ["/bin/bash", "-lc", command], - cwd=str(working_directory), - text=True, - capture_output=True, - check=False, - ) - return CommandResult( - success=completed.returncode == 0, - command=command, - returncode=completed.returncode, - stdout=completed.stdout, - stderr=completed.stderr, - ) - - -def _find_pr_url(text: str, *, regex: str | None = None) -> str: - pattern = re.compile(regex) if regex else _DEFAULT_PR_URL_RE - match = pattern.search(text) - return match.group(0) if match else "" - - -def _render_impl_fix_prompt(base_prompt: str, actionable_issues: list[dict[str, str]]) -> str: - bullets = "\n".join(f"- [{it['severity'].upper()}] {it['summary']}" for it in actionable_issues) - return ( - f"{base_prompt}\n\n" - "## Reviewer findings to fix before continuing\n" - f"{bullets}\n\n" - "Address only these actionable findings while staying inside the original scope contract." 
- ) - - -def _ensure_job_execution_state(job: dict[str, Any], tasks: list[PromptTask]) -> None: - existing = {item.get("task_id"): item for item in job.get("execution_tasks", [])} - merged: list[dict[str, Any]] = [] - for task in tasks: - if task.task_id in existing: - merged.append(existing[task.task_id]) - continue - merged.append( - { - "task_id": task.task_id, - "heading": task.heading, - "branch": task.branch, - "base": task.base, - "status": "pending_implementation", - "pr_url": "", - "history": [], - "review_rounds": [], - } - ) - job["execution_tasks"] = merged - if "execution_status" not in job: - job["execution_status"] = "pending" - - -def _write_task_prompt(path: Path, content: str) -> Path: - path.parent.mkdir(parents=True, exist_ok=True) - path.write_text(content, encoding="utf-8") - return path - - -def execute_workflow( - *, - workflow_path: Path, - repo_root: Path, - rounds: int, - min_severity: str, - implementation_command: str, - review_command: str, - merge_command: str | None, - dry_run: bool, - command_runner: Callable[[str, dict[str, str], Path], CommandResult] | None = None, - pr_url_regex: str | None = None, -) -> dict[str, Any]: - workflow = _json_load(workflow_path) - runner = command_runner or (lambda tpl, vars_, cwd: run_shell_command(tpl, vars_, working_directory=cwd)) - output_root = workflow_path.parent / "execution" - output_root.mkdir(parents=True, exist_ok=True) - - for job in workflow.get("jobs", []): - job_id = str(job.get("job_id", "job")) - job_root = output_root / _safe_name(job_id) - planning_status = str(job.get("planning_status", "")) - if planning_status and planning_status != "ready_to_execute": - if planning_status.startswith("blocked_"): - job["execution_status"] = "skipped_planning_blocked" - else: - job["execution_status"] = "skipped_planning_pending" - continue - agent_prompts_path = repo_root / str(job.get("agent_prompts_path", "")) - if not agent_prompts_path.is_file(): - job["execution_status"] = 
"blocked_missing_agent_prompts" - continue - - tasks = parse_agent_prompts(agent_prompts_path.read_text(encoding="utf-8")) - _ensure_job_execution_state(job, tasks) - task_map = {task.task_id: task for task in tasks} - - for task_state in job.get("execution_tasks", []): - task_id = str(task_state.get("task_id")) - task = task_map.get(task_id) - if task is None: - task_state["status"] = "skipped_missing_task_definition" - continue - if task_state.get("status") in {"merged", "ready_to_merge"}: - continue - - task_root = job_root / _safe_name(task_id) - impl_prompt_path = _write_task_prompt(task_root / "implementation_prompt.md", task.prompt_body) - variables = { - "job_id": job_id, - "task_id": task.task_id, - "branch": task.branch, - "base": task.base, - "plan_section": task.plan_section, - "agent_prompts_path": str(agent_prompts_path), - "propose_path": str(job.get("propose_path", "")), - "plan_path": str(job.get("plan_path", "")), - "task_prompt_file": str(impl_prompt_path), - "round": "0", - "pr_url": str(task_state.get("pr_url", "")), - "review_prompt_file": "", - "review_output_file": "", - "issues_file": "", - } - - if dry_run: - task_state["status"] = "ready_for_implementation" - task_state.setdefault("history", []).append( - { - "phase": "implementation", - "dry_run": True, - "command_template": implementation_command, - "at": _now_utc(), - } - ) - continue - - impl_result = runner(implementation_command, variables, repo_root) - impl_log_path = task_root / "implementation_round0.log" - _write_task_prompt( - impl_log_path, - f"$ {impl_result.command}\n\n[stdout]\n{impl_result.stdout}\n\n[stderr]\n{impl_result.stderr}\n", - ) - task_state.setdefault("history", []).append( - { - "phase": "implementation", - "returncode": impl_result.returncode, - "success": impl_result.success, - "log_path": str(impl_log_path), - "at": _now_utc(), - } - ) - if not impl_result.success: - task_state["status"] = "blocked_implementation_failed" - break - - detected_pr = 
_find_pr_url(f"{impl_result.stdout}\n{impl_result.stderr}", regex=pr_url_regex) - if detected_pr: - task_state["pr_url"] = detected_pr - - review_passed = False - for round_number in range(1, rounds + 1): - review_prompt = ( - "You are reviewing a PR implementation result.\n\n" - f"Task: {task.heading}\n" - f"Branch: {task.branch}\n" - f"PR URL: {task_state.get('pr_url', '')}\n\n" - f"Report only `{min_severity}` or higher issues.\n" - "Output findings as `[SEVERITY] summary`.\n" - "If no actionable issues remain, return exactly `APPROVED`.\n" - ) - review_prompt_path = _write_task_prompt( - task_root / f"review_prompt_round{round_number}.md", review_prompt - ) - review_output_file = task_root / f"review_output_round{round_number}.md" - variables.update( - { - "round": str(round_number), - "pr_url": str(task_state.get("pr_url", "")), - "review_prompt_file": str(review_prompt_path), - "review_output_file": str(review_output_file), - } - ) - review_result = runner(review_command, variables, repo_root) - _write_task_prompt( - review_output_file, - review_result.stdout or review_result.stderr or "No review output captured.\n", - ) - if not review_result.success: - task_state["status"] = "blocked_review_failed" - task_state.setdefault("review_rounds", []).append( - { - "round_number": round_number, - "status": "review_command_failed", - "returncode": review_result.returncode, - "at": _now_utc(), - } - ) - break - - review_eval = _evaluate_review( - review_output_file.read_text(encoding="utf-8"), min_severity=min_severity - ) - task_state.setdefault("review_rounds", []).append( - { - "round_number": round_number, - "review_file": str(review_output_file), - **review_eval, - "at": _now_utc(), - } - ) - if review_eval["approved"]: - review_passed = True - break - - if round_number < rounds: - issues_file = _write_task_prompt( - task_root / f"review_issues_round{round_number}.json", - json.dumps(review_eval.get("actionable_issues", []), indent=2, sort_keys=True), - ) - 
fix_prompt = _render_impl_fix_prompt( - task.prompt_body, review_eval.get("actionable_issues", []) - ) - fix_prompt_file = _write_task_prompt( - task_root / f"implementation_fix_prompt_round{round_number}.md", fix_prompt - ) - variables.update( - { - "task_prompt_file": str(fix_prompt_file), - "issues_file": str(issues_file), - "round": str(round_number), - } - ) - fix_result = runner(implementation_command, variables, repo_root) - fix_log_path = task_root / f"implementation_fix_round{round_number}.log" - _write_task_prompt( - fix_log_path, - f"$ {fix_result.command}\n\n[stdout]\n{fix_result.stdout}\n\n[stderr]\n{fix_result.stderr}\n", - ) - task_state.setdefault("history", []).append( - { - "phase": "implementation_fix", - "round_number": round_number, - "returncode": fix_result.returncode, - "success": fix_result.success, - "log_path": str(fix_log_path), - "at": _now_utc(), - } - ) - if not fix_result.success: - task_state["status"] = "blocked_fix_failed" - break - detected_pr = _find_pr_url(f"{fix_result.stdout}\n{fix_result.stderr}", regex=pr_url_regex) - if detected_pr: - task_state["pr_url"] = detected_pr - else: - task_state["status"] = "blocked_after_reviews" - - if task_state.get("status", "").startswith("blocked_"): - break - - if review_passed: - if merge_command: - merge_vars = dict(variables) - merge_vars["round"] = str(rounds) - merge_result = runner(merge_command, merge_vars, repo_root) - merge_log_path = task_root / "merge.log" - _write_task_prompt( - merge_log_path, - f"$ {merge_result.command}\n\n[stdout]\n{merge_result.stdout}\n\n[stderr]\n{merge_result.stderr}\n", - ) - task_state.setdefault("history", []).append( - { - "phase": "merge", - "returncode": merge_result.returncode, - "success": merge_result.success, - "log_path": str(merge_log_path), - "at": _now_utc(), - } - ) - task_state["status"] = "merged" if merge_result.success else "ready_to_merge" - else: - task_state["status"] = "ready_to_merge" - else: - task_state["status"] = 
"blocked_after_reviews" - - statuses = {task.get("status") for task in job.get("execution_tasks", [])} - if statuses and statuses <= {"merged"}: - job["execution_status"] = "all_merged" - elif "blocked_after_reviews" in statuses or any(s.startswith("blocked_") for s in statuses if s): - job["execution_status"] = "blocked" - elif "ready_to_merge" in statuses: - job["execution_status"] = "ready_to_merge" - elif "ready_for_implementation" in statuses: - job["execution_status"] = "dry_run_ready" - else: - job["execution_status"] = "in_progress" - - workflow["execution_updated_at_utc"] = _now_utc() - _json_dump(workflow_path, workflow) - return workflow - - -def _load_command_file(path: Path) -> str: - return path.read_text(encoding="utf-8").strip() - - -def build_parser() -> argparse.ArgumentParser: - parser = argparse.ArgumentParser( - description="Execute per-PR implementation + review loops from AGENT-PROMPTS plans." - ) - parser.add_argument("--repo-root", default=".", help="Repository root.") - parser.add_argument( - "--workflow", - default="reports/propose_automation/workflow.json", - help="Path to workflow.json from prepare step.", - ) - parser.add_argument("--rounds", type=int, default=3, help="Review rounds per PR task.") - parser.add_argument( - "--min-severity", - choices=("trivial", "low", "medium", "high", "critical"), - default="medium", - help="Actionable severity threshold for review gating.", - ) - parser.add_argument( - "--implementation-command", - default="", - help=( - "Shell command template used to run implementation agent. Supports placeholders: " - "{task_prompt_file} {branch} {base} {task_id} {job_id} {round} {issues_file} {pr_url}." - ), - ) - parser.add_argument( - "--review-command", - default="", - help=( - "Shell command template used to run reviewer agent. Supports placeholders: " - "{review_prompt_file} {review_output_file} {pr_url} {task_id} {job_id} {round}." 
- ), - ) - parser.add_argument( - "--merge-command", - default="", - help=( - "Optional shell command template used after approval. Supports placeholders: " - "{pr_url} {branch} {task_id} {job_id}." - ), - ) - parser.add_argument("--implementation-command-file", default="", help="Read implementation command template from file.") - parser.add_argument("--review-command-file", default="", help="Read review command template from file.") - parser.add_argument("--merge-command-file", default="", help="Read merge command template from file.") - parser.add_argument("--pr-url-regex", default="", help="Custom regex for extracting PR URL from command output.") - parser.add_argument( - "--run", - action="store_true", - help="Execute commands. Without this flag, runner performs a dry-run and only stages task prompts.", - ) - return parser - - -def _resolve_command(cli_value: str, file_value: str, *, name: str) -> str: - if file_value: - return _load_command_file(Path(file_value)) - if cli_value: - return cli_value - raise ValueError(f"Missing required {name}: pass --{name.replace('_', '-')} or --{name.replace('_', '-')}-file") - - -def main(argv: list[str] | None = None) -> int: - parser = build_parser() - args = parser.parse_args(argv) - try: - implementation_command = _resolve_command( - args.implementation_command, args.implementation_command_file, name="implementation_command" - ) - review_command = _resolve_command(args.review_command, args.review_command_file, name="review_command") - merge_command = ( - _resolve_command(args.merge_command, args.merge_command_file, name="merge_command") - if (args.merge_command or args.merge_command_file) - else None - ) - except ValueError as exc: - parser.error(str(exc)) - return 2 - - workflow = execute_workflow( - workflow_path=Path(args.workflow).resolve(), - repo_root=Path(args.repo_root).resolve(), - rounds=int(args.rounds), - min_severity=args.min_severity, - implementation_command=implementation_command, - 
review_command=review_command, - merge_command=merge_command, - dry_run=not bool(args.run), - pr_url_regex=(args.pr_url_regex or None), - ) - print(json.dumps(workflow, indent=2, sort_keys=True)) - return 0 - - -if __name__ == "__main__": - raise SystemExit(main()) diff --git a/automation/cursor_propose_only/tests/test_autopilot.py b/automation/cursor_propose_only/tests/test_autopilot.py deleted file mode 100644 index f26d1e04..00000000 --- a/automation/cursor_propose_only/tests/test_autopilot.py +++ /dev/null @@ -1,106 +0,0 @@ -from __future__ import annotations - -import json -from pathlib import Path - -from automation.cursor_propose_only.autopilot import CommandResult, run_autopilot - - -def test_run_autopilot_dry_run_stages_planning_and_execution(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - propose_dir = repo_root / "propose" - propose_dir.mkdir(parents=True) - (propose_dir / "DEMO-PROPOSE.md").write_text("# demo\n", encoding="utf-8") - - workflow = run_autopilot( - repo_root=repo_root, - proposal_dir=propose_dir, - output_dir=repo_root / "reports" / "auto", - selected_proposals=["DEMO-PROPOSE.md"], - proposal_glob="*-PROPOSE.md", - include_completed=False, - planning_rounds=2, - planning_min_severity="medium", - implementation_rounds=2, - implementation_min_severity="medium", - planner_command="echo planner {planner_prompt_file}", - planning_review_command="echo APPROVED", - implementation_command="echo impl {task_prompt_file}", - implementation_review_command="echo APPROVED", - merge_command=None, - run=False, - ) - job = workflow["jobs"][0] - assert job["planning_status"] == "ready_for_planner" - assert job["execution_status"] == "skipped_planning_pending" - - -def test_run_autopilot_run_mode_completes_single_task(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - propose_dir = repo_root / "propose" - plans_dir = repo_root / "plans" - propose_dir.mkdir(parents=True) - plans_dir.mkdir(parents=True) - (propose_dir / 
"DEMO-PROPOSE.md").write_text("# demo\n", encoding="utf-8") - - def fake_runner(template: str, variables: dict[str, str], _cwd: Path) -> CommandResult: - command = template.format(**variables) - if template == "planner": - agent_prompt_path = repo_root / variables["agent_prompts_path"] - agent_prompt_path.parent.mkdir(parents=True, exist_ok=True) - agent_prompt_path.write_text( - """# Demo - -## PR-A1 — demo -**Branch:** `cursor/demo-a1` off `master`. -**Base:** `master`. -**Plan section:** `plans/PLAN-DEMO.md` § PR-A1. - -**Prompt:** - -```` -Implement demo. -```` -""", - encoding="utf-8", - ) - (repo_root / variables["plan_path"]).write_text("# plan\n", encoding="utf-8") - return CommandResult(True, command, 0, "planned", "") - if template == "planning-review": - return CommandResult(True, command, 0, "APPROVED", "") - if template == "impl": - return CommandResult(True, command, 0, "impl ok https://github.com/acme/repo/pull/88", "") - if template == "impl-review": - return CommandResult(True, command, 0, "APPROVED", "") - if template == "merge": - return CommandResult(True, command, 0, "merged", "") - raise AssertionError(f"Unexpected template {template}") - - workflow = run_autopilot( - repo_root=repo_root, - proposal_dir=propose_dir, - output_dir=repo_root / "reports" / "auto", - selected_proposals=["DEMO-PROPOSE.md"], - proposal_glob="*-PROPOSE.md", - include_completed=False, - planning_rounds=2, - planning_min_severity="medium", - implementation_rounds=2, - implementation_min_severity="medium", - planner_command="planner", - planning_review_command="planning-review", - implementation_command="impl", - implementation_review_command="impl-review", - merge_command="merge", - run=True, - command_runner=fake_runner, - ) - - job = workflow["jobs"][0] - assert job["planning_status"] == "ready_to_execute" - assert job["execution_status"] == "all_merged" - task = job["execution_tasks"][0] - assert task["status"] == "merged" - assert task["pr_url"] == 
"https://github.com/acme/repo/pull/88" - persisted = json.loads((repo_root / "reports" / "auto" / "workflow.json").read_text(encoding="utf-8")) - assert persisted["jobs"][0]["execution_status"] == "all_merged" diff --git a/automation/cursor_propose_only/tests/test_execute.py b/automation/cursor_propose_only/tests/test_execute.py deleted file mode 100644 index a7fbe979..00000000 --- a/automation/cursor_propose_only/tests/test_execute.py +++ /dev/null @@ -1,166 +0,0 @@ -from __future__ import annotations - -import json -from pathlib import Path - -import pytest - -from automation.cursor_propose_only.execute import CommandResult, execute_workflow, parse_agent_prompts - - -_PROMPTS_FIXTURE = """# Cursor task prompts — Demo - -## PR-A1 — Add thing -**Branch:** `cursor/a1` off `master`. -**Base:** `master`. -**Plan section:** `plans/PLAN-DEMO.md` § PR-A1. - -**Prompt:** - -```` -Implement A1. -```` - -## PR-A2 — Add second thing -**Branch:** `cursor/a2` off `master`. -**Base:** `master`. -**Plan section:** `plans/PLAN-DEMO.md` § PR-A2. - -**Prompt:** - -```` -Implement A2. -```` -""" - - -def test_parse_agent_prompts_extracts_ordered_tasks() -> None: - tasks = parse_agent_prompts(_PROMPTS_FIXTURE) - assert [t.task_id for t in tasks] == ["PR-A1", "PR-A2"] - assert tasks[0].branch == "cursor/a1" - assert tasks[1].base == "master" - assert tasks[0].prompt_body == "Implement A1." 
- - -def test_parse_agent_prompts_requires_prompt_block() -> None: - with pytest.raises(ValueError): - parse_agent_prompts("## PR-A1 — Missing prompt\n**Branch:** `x`\n") - - -def test_execute_workflow_dry_run_stages_tasks(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - repo_root.mkdir() - prompts_path = repo_root / "plans" / "AGENT-PROMPTS-DEMO.md" - prompts_path.parent.mkdir(parents=True) - prompts_path.write_text(_PROMPTS_FIXTURE, encoding="utf-8") - - workflow_path = tmp_path / "workflow.json" - workflow_path.write_text( - json.dumps( - { - "jobs": [ - { - "job_id": "demo", - "agent_prompts_path": "plans/AGENT-PROMPTS-DEMO.md", - "plan_path": "plans/PLAN-DEMO.md", - "propose_path": "propose/DEMO-PROPOSE.md", - } - ] - } - ), - encoding="utf-8", - ) - - workflow = execute_workflow( - workflow_path=workflow_path, - repo_root=repo_root, - rounds=3, - min_severity="medium", - implementation_command="impl {task_prompt_file}", - review_command="review {review_prompt_file}", - merge_command=None, - dry_run=True, - ) - job = workflow["jobs"][0] - assert job["execution_status"] == "dry_run_ready" - assert len(job["execution_tasks"]) == 2 - assert job["execution_tasks"][0]["status"] == "ready_for_implementation" - - -def test_execute_workflow_run_mode_reviews_and_merges(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - repo_root.mkdir() - prompts_path = repo_root / "plans" / "AGENT-PROMPTS-DEMO.md" - prompts_path.parent.mkdir(parents=True) - prompts_path.write_text(_PROMPTS_FIXTURE, encoding="utf-8") - - workflow_path = tmp_path / "workflow.json" - workflow_path.write_text( - json.dumps( - { - "jobs": [ - { - "job_id": "demo", - "agent_prompts_path": "plans/AGENT-PROMPTS-DEMO.md", - "plan_path": "plans/PLAN-DEMO.md", - "propose_path": "propose/DEMO-PROPOSE.md", - } - ] - } - ), - encoding="utf-8", - ) - - def fake_runner(template: str, variables: dict[str, str], _cwd: Path) -> CommandResult: - if template.startswith("impl"): - return CommandResult( - 
success=True, - command=template.format(**variables), - returncode=0, - stdout="impl ok https://github.com/acme/repo/pull/42", - stderr="", - ) - if template.startswith("review"): - if variables["round"] == "1": - return CommandResult( - success=True, - command=template.format(**variables), - returncode=0, - stdout="[HIGH] tighten acceptance criteria", - stderr="", - ) - return CommandResult( - success=True, - command=template.format(**variables), - returncode=0, - stdout="APPROVED", - stderr="", - ) - if template.startswith("merge"): - return CommandResult( - success=True, - command=template.format(**variables), - returncode=0, - stdout="merged", - stderr="", - ) - raise AssertionError(f"Unexpected template: {template}") - - workflow = execute_workflow( - workflow_path=workflow_path, - repo_root=repo_root, - rounds=3, - min_severity="medium", - implementation_command="impl {round}", - review_command="review {round}", - merge_command="merge {pr_url}", - dry_run=False, - command_runner=fake_runner, - ) - job = workflow["jobs"][0] - task = job["execution_tasks"][0] - assert task["status"] == "merged" - assert task["pr_url"] == "https://github.com/acme/repo/pull/42" - assert len(task["review_rounds"]) == 2 - assert task["review_rounds"][0]["approved"] is False - assert task["review_rounds"][1]["approved"] is True diff --git a/automation/cursor_propose_only/tests/test_workflow.py b/automation/cursor_propose_only/tests/test_workflow.py deleted file mode 100644 index f4bf49ea..00000000 --- a/automation/cursor_propose_only/tests/test_workflow.py +++ /dev/null @@ -1,139 +0,0 @@ -from __future__ import annotations - -import json -from pathlib import Path - -from automation.cursor_propose_only.workflow import ( - apply_review_result, - evaluate_review, - prepare_bundle, - resolve_selected_proposals, -) - - -def test_prepare_bundle_writes_manifest_and_prompts(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - proposal_dir = repo_root / "propose" - output_dir = 
repo_root / "reports" / "propose_automation" - proposal_dir.mkdir(parents=True) - (proposal_dir / "EXAMPLE-PROPOSE.md").write_text("# Example\n", encoding="utf-8") - - workflow = prepare_bundle( - repo_root=repo_root, - proposal_dir=proposal_dir, - output_dir=output_dir, - rounds=3, - min_severity="medium", - pattern="*-PROPOSE.md", - include_completed=False, - ) - - assert workflow["review_rounds"] == 3 - assert workflow["min_severity"] == "medium" - assert len(workflow["jobs"]) == 1 - job = workflow["jobs"][0] - assert job["status"] == "pending_planner" - assert job["plan_path"] == "plans/PLAN-EXAMPLE.md" - assert len(job["reviewer_prompt_paths"]) == 3 - - workflow_path = output_dir / "workflow.json" - assert workflow_path.exists() - persisted = json.loads(workflow_path.read_text(encoding="utf-8")) - assert persisted["jobs"][0]["job_id"] == "example" - - planner_prompt = repo_root / job["planner_prompt_path"] - assert planner_prompt.exists() - assert "Do not implement production code." in planner_prompt.read_text(encoding="utf-8") - - -def test_evaluate_review_threshold_filtering() -> None: - review_text = """ - [LOW] Minor wording issue. - [HIGH] Missing out-of-scope guardrail. 
- APPROVED - """ - result = evaluate_review(review_text, min_severity="medium") - assert result["approved"] is False - assert result["issue_count"] == 2 - assert result["actionable_issue_count"] == 1 - assert result["actionable_issues"][0]["severity"] == "high" - - -def test_apply_review_result_updates_job_status(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - proposal_dir = repo_root / "propose" - output_dir = repo_root / "reports" / "propose_automation" - proposal_dir.mkdir(parents=True) - (proposal_dir / "EXAMPLE-PROPOSE.md").write_text("# Example\n", encoding="utf-8") - prepare_bundle( - repo_root=repo_root, - proposal_dir=proposal_dir, - output_dir=output_dir, - rounds=3, - min_severity="medium", - pattern="*-PROPOSE.md", - include_completed=False, - ) - workflow_path = output_dir / "workflow.json" - - failing_review = output_dir / "review-round-1.md" - failing_review.write_text("[MEDIUM] Missing risk section.\n", encoding="utf-8") - apply_review_result( - workflow_path=workflow_path, - review_file=failing_review, - job_id="example", - round_number=1, - min_severity="medium", - ) - after_fail = json.loads(workflow_path.read_text(encoding="utf-8")) - assert after_fail["jobs"][0]["status"] == "needs_fixes" - - approved_review = output_dir / "review-round-2.md" - approved_review.write_text("APPROVED\n", encoding="utf-8") - apply_review_result( - workflow_path=workflow_path, - review_file=approved_review, - job_id="example", - round_number=2, - min_severity="medium", - ) - after_approve = json.loads(workflow_path.read_text(encoding="utf-8")) - assert after_approve["jobs"][0]["status"] == "ready_to_merge" - - -def test_prepare_bundle_explicit_selection_filters_proposals(tmp_path: Path) -> None: - repo_root = tmp_path / "repo" - proposal_dir = repo_root / "propose" - output_dir = repo_root / "reports" / "propose_automation" - proposal_dir.mkdir(parents=True) - (proposal_dir / "ALPHA-PROPOSE.md").write_text("# Alpha\n", encoding="utf-8") - (proposal_dir / 
"BETA-PROPOSE.md").write_text("# Beta\n", encoding="utf-8") - - workflow = prepare_bundle( - repo_root=repo_root, - proposal_dir=proposal_dir, - output_dir=output_dir, - rounds=3, - min_severity="medium", - pattern="*-PROPOSE.md", - include_completed=False, - selected_proposals=["BETA-PROPOSE.md"], - ) - - assert len(workflow["jobs"]) == 1 - assert workflow["jobs"][0]["job_id"] == "beta" - - -def test_resolve_selected_proposals_supports_completed_when_enabled(tmp_path: Path) -> None: - proposal_dir = tmp_path / "propose" - completed_dir = proposal_dir / "completed" - completed_dir.mkdir(parents=True) - proposal_path = completed_dir / "DONE-PROPOSE.md" - proposal_path.write_text("# Done\n", encoding="utf-8") - - selected = resolve_selected_proposals( - proposal_dir, - selected_proposals=["DONE-PROPOSE.md"], - include_completed=True, - ) - assert selected == [proposal_path.resolve()] diff --git a/automation/cursor_propose_only/workflow.py b/automation/cursor_propose_only/workflow.py deleted file mode 100644 index ea889016..00000000 --- a/automation/cursor_propose_only/workflow.py +++ /dev/null @@ -1,424 +0,0 @@ -from __future__ import annotations - -import argparse -import json -import re -from dataclasses import asdict, dataclass -from datetime import UTC, datetime -from pathlib import Path -from typing import Any - -SEVERITY_ORDER = { - "trivial": 0, - "low": 1, - "medium": 2, - "high": 3, - "critical": 4, -} - -SEVERITY_CHOICES = tuple(SEVERITY_ORDER.keys()) - -_ISSUE_LINE_RE = re.compile( - r"^\s*(?:[-*]\s*)?(?:\[(critical|high|medium|low|trivial)\]|" - r"(critical|high|medium|low|trivial)\s*[:\-])\s*(.+?)\s*$", - re.IGNORECASE, -) -_APPROVED_RE = re.compile(r"\bAPPROVED\b", re.IGNORECASE) - - -@dataclass(frozen=True, slots=True) -class ReviewIssue: - severity: str - summary: str - - -def _validate_severity(severity: str) -> str: - normalized = severity.strip().lower() - if normalized not in SEVERITY_ORDER: - allowed = ", ".join(SEVERITY_CHOICES) - raise 
ValueError(f"Invalid severity {severity!r}; expected one of: {allowed}") - return normalized - - -def _slugify(value: str) -> str: - slug = re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-") - return slug or "proposal" - - -def _topic_from_propose_name(name: str) -> str: - if name.upper().endswith("-PROPOSE"): - base = name[: -len("-PROPOSE")] - else: - base = name - token = re.sub(r"[^A-Za-z0-9]+", "-", base).strip("-") - return token.upper() or "TOPIC" - - -def _relpath(path: Path, repo_root: Path) -> str: - try: - return str(path.resolve().relative_to(repo_root.resolve())) - except ValueError: - return str(path.resolve()) - - -def discover_proposals( - proposal_dir: Path, *, pattern: str = "*-PROPOSE.md", include_completed: bool = False -) -> list[Path]: - if not proposal_dir.exists(): - raise FileNotFoundError(f"Proposal directory does not exist: {proposal_dir}") - paths = sorted(p for p in proposal_dir.glob(pattern) if p.is_file()) - if include_completed: - completed = proposal_dir / "completed" - if completed.exists(): - paths.extend(sorted(p for p in completed.glob(pattern) if p.is_file())) - return sorted(set(paths)) - - -def resolve_selected_proposals( - proposal_dir: Path, selected_proposals: list[str], *, include_completed: bool = False -) -> list[Path]: - if not proposal_dir.exists(): - raise FileNotFoundError(f"Proposal directory does not exist: {proposal_dir}") - if not selected_proposals: - return [] - - search_roots = [proposal_dir.resolve()] - completed = proposal_dir / "completed" - if include_completed and completed.exists(): - search_roots.append(completed.resolve()) - - resolved: list[Path] = [] - for entry in selected_proposals: - candidate = Path(entry).expanduser() - candidates: list[Path] = [] - if candidate.is_absolute(): - candidates.append(candidate) - else: - candidates.append(proposal_dir / candidate) - if include_completed: - candidates.append(completed / candidate) - - found = next((path for path in candidates if path.is_file()), 
None) - if found is None: - raise FileNotFoundError( - f"Selected proposal not found: {entry!r} (checked under {proposal_dir})" - ) - - found_resolved = found.resolve() - if not any( - ( - found_resolved == root - or ( - found_resolved.is_relative_to(root) - if hasattr(found_resolved, "is_relative_to") - else str(found_resolved).startswith(str(root) + "/") - ) - ) - for root in search_roots - ): - raise ValueError( - f"Selected proposal must be under {proposal_dir} (or completed/ when enabled): {entry!r}" - ) - resolved.append(found_resolved) - - return sorted(set(resolved)) - - -def render_planner_prompt(propose_path: str, plan_path: str, agent_prompt_path: str) -> str: - return ( - f"# Planner prompt — {Path(propose_path).name}\n\n" - "You are a planning agent for this repository.\n\n" - "## Task\n" - "Use the proposal below to generate/update the implementation plan and per-PR agent prompts.\n\n" - f"- Proposal input: `{propose_path}`\n" - f"- Plan output: `{plan_path}`\n" - f"- Prompt output: `{agent_prompt_path}`\n\n" - "## Requirements\n" - "1. Produce plan + prompt files only.\n" - "2. Do not implement production code.\n" - "3. Keep out-of-scope guardrails explicit in each PR prompt.\n" - "4. Keep tests/checks explicit per PR prompt.\n" - ) - - -def render_reviewer_prompt( - propose_path: str, - plan_path: str, - agent_prompt_path: str, - *, - round_number: int, - min_severity: str, -) -> str: - return ( - f"# Reviewer prompt — round {round_number}\n\n" - "You are a fresh reviewer session for propose/planning artifacts.\n\n" - "## Review scope\n" - f"- `{propose_path}`\n" - f"- `{plan_path}`\n" - f"- `{agent_prompt_path}`\n\n" - "## Rules\n" - f"1. Report only issues with severity `{min_severity}` or higher.\n" - "2. Ignore low/trivial style nits.\n" - "3. If no actionable issues remain, return exactly `APPROVED`.\n" - "4. 
Format findings as `[SEVERITY] summary` (e.g., `[HIGH] ...`).\n" - ) - - -def parse_review_issues(review_text: str) -> list[ReviewIssue]: - issues: list[ReviewIssue] = [] - for line in review_text.splitlines(): - match = _ISSUE_LINE_RE.match(line) - if not match: - continue - severity = (match.group(1) or match.group(2) or "").lower() - summary = match.group(3).strip() - issues.append(ReviewIssue(severity=severity, summary=summary)) - return issues - - -def evaluate_review(review_text: str, *, min_severity: str) -> dict[str, Any]: - threshold = _validate_severity(min_severity) - issues = parse_review_issues(review_text) - actionable = [ - issue - for issue in issues - if SEVERITY_ORDER[issue.severity] >= SEVERITY_ORDER[threshold] - ] - approved_token = bool(_APPROVED_RE.search(review_text)) - return { - "approved": approved_token and not actionable, - "approved_token_present": approved_token, - "min_severity": threshold, - "issue_count": len(issues), - "actionable_issue_count": len(actionable), - "issues": [asdict(issue) for issue in issues], - "actionable_issues": [asdict(issue) for issue in actionable], - } - - -def _readme_text(rounds: int, min_severity: str, workflow_path: str) -> str: - return ( - "# Propose-only automation bundle\n\n" - "This directory is generated by `automation/cursor_propose_only/cli.py`.\n\n" - "## Workflow\n" - "1. Run planner prompt for each job to produce plan artifacts.\n" - f"2. Run {rounds} reviewer rounds in fresh sessions.\n" - f"3. Reviewer threshold is `{min_severity}` and higher.\n" - "4. Save each reviewer response to a file.\n" - "5. 
Evaluate each response with:\n\n" - f" `.venv/bin/python automation/cursor_propose_only/cli.py evaluate --workflow {workflow_path}" - " --job-id --round --review-file --write`\n" - ) - - -def prepare_bundle( - *, - repo_root: Path, - proposal_dir: Path, - output_dir: Path, - rounds: int, - min_severity: str, - pattern: str, - include_completed: bool, - selected_proposals: list[str] | None = None, -) -> dict[str, Any]: - if rounds < 1: - raise ValueError("rounds must be >= 1") - threshold = _validate_severity(min_severity) - if selected_proposals: - proposals = resolve_selected_proposals( - proposal_dir, selected_proposals, include_completed=include_completed - ) - else: - proposals = discover_proposals(proposal_dir, pattern=pattern, include_completed=include_completed) - - jobs_dir = output_dir / "jobs" - jobs_dir.mkdir(parents=True, exist_ok=True) - - jobs: list[dict[str, Any]] = [] - for proposal in proposals: - topic = _topic_from_propose_name(proposal.stem) - job_id = _slugify(topic) - job_dir = jobs_dir / job_id - job_dir.mkdir(parents=True, exist_ok=True) - - plan_path = repo_root / "plans" / f"PLAN-{topic}.md" - agent_prompt_path = repo_root / "plans" / f"AGENT-PROMPTS-{topic}.md" - - planner_prompt_path = job_dir / "planner_prompt.md" - planner_prompt_path.write_text( - render_planner_prompt( - _relpath(proposal, repo_root), - _relpath(plan_path, repo_root), - _relpath(agent_prompt_path, repo_root), - ), - encoding="utf-8", - ) - - reviewer_prompts: list[str] = [] - for round_index in range(1, rounds + 1): - reviewer_path = job_dir / f"reviewer_prompt_round{round_index}.md" - reviewer_path.write_text( - render_reviewer_prompt( - _relpath(proposal, repo_root), - _relpath(plan_path, repo_root), - _relpath(agent_prompt_path, repo_root), - round_number=round_index, - min_severity=threshold, - ), - encoding="utf-8", - ) - reviewer_prompts.append(_relpath(reviewer_path, repo_root)) - - jobs.append( - { - "job_id": job_id, - "status": "pending_planner", - 
"propose_path": _relpath(proposal, repo_root), - "plan_path": _relpath(plan_path, repo_root), - "agent_prompts_path": _relpath(agent_prompt_path, repo_root), - "planner_prompt_path": _relpath(planner_prompt_path, repo_root), - "reviewer_prompt_paths": reviewer_prompts, - "reviews": [], - } - ) - - workflow = { - "generated_at_utc": datetime.now(UTC).isoformat(), - "proposal_dir": _relpath(proposal_dir, repo_root), - "review_rounds": rounds, - "min_severity": threshold, - "jobs": jobs, - } - output_dir.mkdir(parents=True, exist_ok=True) - workflow_path = output_dir / "workflow.json" - workflow_path.write_text(json.dumps(workflow, indent=2, sort_keys=True), encoding="utf-8") - (output_dir / "README.md").write_text( - _readme_text(rounds=rounds, min_severity=threshold, workflow_path=_relpath(workflow_path, repo_root)), - encoding="utf-8", - ) - return workflow - - -def apply_review_result( - *, - workflow_path: Path, - review_file: Path, - job_id: str, - round_number: int, - min_severity: str, -) -> dict[str, Any]: - workflow = json.loads(workflow_path.read_text(encoding="utf-8")) - if round_number < 1: - raise ValueError("round_number must be >= 1") - if round_number > int(workflow.get("review_rounds", 0)): - raise ValueError( - f"round_number {round_number} exceeds configured rounds ({workflow.get('review_rounds')})" - ) - _validate_severity(min_severity) - review_text = review_file.read_text(encoding="utf-8") - result = evaluate_review(review_text, min_severity=min_severity) - - jobs = workflow.get("jobs", []) - for job in jobs: - if job.get("job_id") != job_id: - continue - job.setdefault("reviews", []) - job["reviews"].append( - { - "round_number": round_number, - "review_file": str(review_file), - **result, - } - ) - if result["approved"]: - job["status"] = "ready_to_merge" - elif round_number >= int(workflow.get("review_rounds", 0)): - job["status"] = "blocked_after_reviews" - else: - job["status"] = "needs_fixes" - break - else: - raise ValueError(f"Unknown 
job_id: {job_id}") - - workflow_path.write_text(json.dumps(workflow, indent=2, sort_keys=True), encoding="utf-8") - return result - - -def build_parser() -> argparse.ArgumentParser: - parser = argparse.ArgumentParser(description="Generate and gate propose-only orchestration bundles.") - subparsers = parser.add_subparsers(dest="subcommand", required=True) - - prepare = subparsers.add_parser("prepare", help="Generate planner/reviewer prompt bundles.") - prepare.add_argument("--repo-root", default=".", help="Repository root used for relative paths.") - prepare.add_argument("--proposal-dir", default="propose", help="Directory containing proposal markdown files.") - prepare.add_argument("--output-dir", default="reports/propose_automation", help="Output directory.") - prepare.add_argument("--rounds", type=int, default=3, help="Number of fresh reviewer rounds.") - prepare.add_argument( - "--min-severity", choices=SEVERITY_CHOICES, default="medium", help="Actionable review threshold." - ) - prepare.add_argument("--glob", default="*-PROPOSE.md", help="Glob pattern for proposal selection.") - prepare.add_argument( - "--proposal", - action="append", - default=[], - help=( - "Specific proposal file to include (repeatable). Paths can be absolute " - "or relative to --proposal-dir. When set, --glob is ignored." - ), - ) - prepare.add_argument( - "--include-completed", - action="store_true", - help="Also include proposals under proposal-dir/completed.", - ) - - evaluate = subparsers.add_parser("evaluate", help="Evaluate a reviewer response and optionally persist it.") - evaluate.add_argument("--review-file", required=True, help="Path to markdown/text reviewer output.") - evaluate.add_argument( - "--min-severity", choices=SEVERITY_CHOICES, default="medium", help="Actionable review threshold." 
- ) - evaluate.add_argument("--workflow", help="workflow.json path; required with --write.") - evaluate.add_argument("--job-id", help="Job identifier in workflow.json; required with --write.") - evaluate.add_argument("--round", dest="round_number", type=int, default=1, help="1-based review round.") - evaluate.add_argument("--write", action="store_true", help="Write result back into workflow.json.") - return parser - - -def main(argv: list[str] | None = None) -> int: - parser = build_parser() - args = parser.parse_args(argv) - - if args.subcommand == "prepare": - workflow = prepare_bundle( - repo_root=Path(args.repo_root).resolve(), - proposal_dir=Path(args.proposal_dir).resolve(), - output_dir=Path(args.output_dir).resolve(), - rounds=int(args.rounds), - min_severity=args.min_severity, - pattern=args.glob, - include_completed=bool(args.include_completed), - selected_proposals=list(args.proposal or []), - ) - print(json.dumps(workflow, indent=2, sort_keys=True)) - return 0 - - review_file = Path(args.review_file).resolve() - if args.write: - if not args.workflow or not args.job_id: - parser.error("--workflow and --job-id are required with --write") - result = apply_review_result( - workflow_path=Path(args.workflow).resolve(), - review_file=review_file, - job_id=args.job_id, - round_number=int(args.round_number), - min_severity=args.min_severity, - ) - else: - result = evaluate_review(review_file.read_text(encoding="utf-8"), min_severity=args.min_severity) - print(json.dumps(result, indent=2, sort_keys=True)) - return 0 - - -if __name__ == "__main__": - raise SystemExit(main()) diff --git a/docs/reports/call-graph-review.md b/docs/reports/call-graph-review.md deleted file mode 100644 index e6ed718e..00000000 --- a/docs/reports/call-graph-review.md +++ /dev/null @@ -1,364 +0,0 @@ -# Call Graph Layer — Code Review - -**Repository:** [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) -**Commits reviewed:** -- `b3a15d8` — *call graph layer 
propose* -- `fb5473f` — *call graph layer implementation* - -**Reference docs:** -- [`propose/completed/CALL-GRAPH-PROPOSE.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/propose/completed/CALL-GRAPH-PROPOSE.md) -- [`plans/completed/PLAN-CALL-GRAPH.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/plans/completed/PLAN-CALL-GRAPH.md) - -**Test status:** all 24 new call-graph tests pass locally -(`tests/test_ast_java_calls.py`, `tests/test_call_graph_smoke_roundtrip.py`, -`tests/test_call_graph_receiver_resolution.py`). - ---- - -## Overall verdict - -**Strong, faithfully-scoped implementation.** The proposal is realised as -written, the receiver-type resolver is well-structured, the schema and edge -metadata match the design (confidence + strategy + source), and the test -coverage targets concrete proposal section numbers. Scope discipline is -visible — no creep into HTTP / async / AOP / traces. - -There are **three correctness bugs** that should land as a quick follow-up -before Phase 3 is closed, plus a handful of design issues worth pushing back -on. All three bugs share one root cause: **resolution strategy and -confidence are silently downgraded at edge-emit time when the receiver was -already resolved successfully.** - ---- - -## What's done well - -- **Confidence + strategy tagging is faithful to the design.** Every edge - carries (`confidence`, `strategy`, `source='static'`) — clean migration - path for trace ingestion later. -- **Multigraph dedup at write time** (`(src_id, dst_id, arg_count, line)`) - is correctly shaped: prevents accidental duplication while preserving - overload-ambiguous fan-out at distinct call sites. -- **Receiver-type resolver** is clear and matches the proposal: scope table - built once per method, supertype-bounded lookup, explicit - `chained_receiver` phantom path, deterministic phantom IDs. 
-- **Receiver-disambiguation discipline.** `_unique_type_simple_resolve` - deliberately uses the *type* registry (not a per-method simple-name - index). The dedicated test - `test_receiver_disambiguation_uses_type_index_not_method_unique` is - exactly the right kind of negative test — this is the precise trap - CMM-style cascades fall into and the implementation avoids it. -- **`_method_ids_for_call_graph_needle`** elegantly accepts type FQN, - method FQN, or simple method name; fan-out through `DECLARES` from a - type needle is the right move and matches §6.1. -- **`exclude_external` is filter-on-result, not filter-on-store.** Phantoms - stay in the graph (so impact analysis can see JDK-adjacent signals), but - query consumers get clean lists by default. Matches risk #2 mitigation - in the proposal. -- **Tests target proposal section numbers.** 24 tests, all passing, - including a Kuzu round-trip on a real fixture project. The shadowing - test (`test_local_shadows_field_same_name_resolves_receiver`) is the - kind of edge case that bites in real codebases. -- **Diagnostics are baked in** — `pass3_calls` prints the chained-phantom - percentage as the proposal mandates. - ---- - -## Bugs (must fix) - -### B1. Constructor calls always become phantoms when the class has no explicit constructor - -**Severity: high — most common Java call site is broken.** - -`new Svc()` in `ScopeReceivers.byLocal()` resolves the receiver type to -`smoke.Svc` correctly. But `Svc` has no explicit constructor in source, so -`_parse_method` is never invoked for an ``, and no constructor -`MemberEntry` is created. `_lookup_method_candidates(type='smoke.Svc', -callee='', argc=0)` finds nothing → fallthrough to phantom at -`confidence=0.0`. 
- -Confirmed empirically against the smoke fixture: - -``` -['smoke.ScopeReceivers#byLocal()', 'smoke.Svc#(0)', 'phantom', False, 0.0] -['smoke.ScopeReceivers#shadowLocalOverField()', 'smoke.Svc#(0)', 'phantom', False, 0.0] -``` - -In a real Spring codebase, **every** `new MyDto()`, `new HashMap<>()`, -`new ArrayList<>()` on a project type without a hand-written constructor -lands as a phantom. - -**Fix.** When parsing a `TypeDecl` and discovering no constructor -declaration, synthesize a default -`MethodDecl(name="", signature="()", is_constructor=True, ...)` -with `start_line` / `start_byte` from the type declaration and -`parameters=[]`. Make sure it gets a `MemberEntry`. - -Two corollary checks: - -- `_emit_call_edge` for `new Svc()` should then resolve to the synthesized - member with `strategy='constructor'` (not `phantom`), `confidence` - inherited from the receiver-resolution tier. -- Confirm existing `INJECTS` / `DECLARES` accounting doesn't double-count - the synthesized node. - -**Suggested test** — add to `tests/test_call_graph_smoke_roundtrip.py` -(`test_implicit_default_ctor_is_resolved`): - -```java -public class HasNoCtor {} -public class Caller { void m() { new HasNoCtor(); } } -``` - -Assert: `(Caller#m)-[CALLS {strategy:'constructor', resolved:true}]->(HasNoCtor#())`. - ---- - -### B2. Implicit `super()` for a class that doesn't extend anything is mis-tagged as `phantom` - -**Severity: medium — diagnostic regression, not a wrong answer.** - -`WildUtils` has an explicit `private WildUtils() {}` constructor with no -`super(...)` body, so the AST extractor synthesizes the implicit-super -call site. `_first_supertype_fqn` returns `None` (no `EXTENDS` row → -there is no `Object` node in the index), so `_resolve_receiver_type` -returns `(None, "phantom", 0.0)`. Result: - -``` -['smoke.WildUtils#WildUtils()', '?super#(0)', 'phantom', False, 0.0] -``` - -The proposal §4.2 promises strategy `implicit_super (0.90)` for this case. 
-Right now the agent cannot distinguish "implicit super to `Object`" from -"I have no idea what this call resolved to" — real signal loss. - -**Fix.** In `_resolve_receiver_type`, when `expr == 'super'` and -`_first_supertype_fqn(...) is None`, return -`("java.lang.Object", "implicit_super", 0.90)`. In `_emit_call_edge`, -allow phantom callee (no member resolved on `Object`) but **preserve -`strategy='implicit_super'` and `confidence=0.90`** instead of overriding -to `phantom` / `0.0`. This is the same fix-shape as B3 below. - ---- - -### B3. Resolution strategy and confidence are silently overridden to `phantom` / `0.0` when the callee can't be located on a resolved external receiver - -**Severity: high — collapses static-import precision when callees are JDK / Spring.** - -In `_resolve_and_emit_call`: - -```python -if not candidates: - pid = _phantom_method_id(...) - _emit_call_edge(..., confidence=0.0, strategy="phantom", resolved=False) - return -``` - -This branch fires whenever the receiver type *did* resolve (e.g. -`java.util.Objects` via `static_import`, confidence 0.95) but the callee -method isn't on a type we indexed. The static-import smoke test confirms it: - -``` -requireNonNull edges: 1 - phantom 0.0 False java.util.Objects#requireNonNull(1) -``` - -The README and the MCP instructions both tell agents to use -`min_confidence=0.9` to filter noise. Under that filter, **every JDK -static-import call disappears from the graph**, even though the resolver -*knew* the call's target type with 0.95 confidence. - -**Fix.** Decouple the *receiver-resolution strategy/confidence* from the -*callee-found* boolean. When `candidates` is empty: - -- Keep the phantom callee (creating it on the resolved receiver type — - already done). -- Keep `resolved=False` on the edge (the *callee node* is a phantom). -- **Preserve the receiver-resolution `strat` and `conf`** unless they're - `'chained_receiver'`. 
Specifically: `strategy` stays `'static_import'` / - `'static_import_wildcard'` / `'import_map'` / `'same_module'` etc.; - `confidence` stays the receiver-tier value. - -The only case where `confidence=0.0, strategy='phantom'` is honest is when -the receiver itself was unresolvable. Distinguishing those two failure -modes is the whole point of the cascade. - -Optional: add a small property `callee_found BOOLEAN` on the edge so a -query like *"high-confidence edges with phantom callees"* (= calls into -well-known external libraries) becomes one Cypher predicate. - -**Suggested tests:** - -- `test_static_import_to_jdk_keeps_high_confidence` — `requireNonNull` - edge has `confidence>=0.95` and `strategy='static_import'`, with - `resolved=False` on the edge. -- `test_min_confidence_filter_keeps_high_confidence_static_import_callers` - — `find_callers('java.util.Objects#requireNonNull(1)', min_confidence=0.9)` - returns the in-repo caller. - ---- - -## Design issues (push back on the proposal here, not just the implementation) - -### D1. Phantom-ID `arg_count` semantics are inconsistent across method-references and regular calls - -`_phantom_method_id` builds the FQN as `{receiver}#{callee}({arg_count})`. -For method references the `arg_count` is `-1`. So the same external method -can exist as both `Foo#bar(2)` and `Foo#bar(-1)` phantom nodes — distinct -nodes for the same logical target. The dedup key -`(src_id, dst_id, arg_count, line)` then keeps both edges, doubling the -graph for code that mixes calls and method references on the same target. - -**Recommendation.** Either normalize phantom IDs without `arg_count` for -method references (`?{recv}#{callee}(?)`) or drop `arg_count` from the -dedup key and use `(src_id, dst_id, line, byte)` (line+byte already pin a -unique call site). - ---- - -### D2. 
Method-reference precision is leaving free wins on the table - -Method references that *are* unambiguous on name (single method, no -overloads) currently still emit with `arg_count=-1`. Cheap precision win, -no extra resolver complexity: when the receiver type is known and exactly -one method with `name == callee_simple` exists on the receiver type, pick -that single-arity match and emit a fully-resolved edge with the receiver's -real arity instead of `-1`. - ---- - -### D3. Anonymous-inner-class call attribution does the proposal-correct thing, but the design is questionable - -Right now `pingFromAnon()` (called from inside -`new Runnable() { run() { pingFromAnon(); } }`) is attributed to -**`NestedCalls#m()`**, the enclosing named method, with -`strategy='this_super'`. That matches §4.1's wording. - -But: the anonymous `Runnable` *does* get parsed as a nested type in -`_parse_type` (kind `class`). It produces a `MemberEntry` for its -`run()` method. So the graph has two contradictory facts: the call edge -goes from `NestedCalls#m`, and the structural fact "there exists a -`run()` method here" lives on a separate, disconnected anonymous type -node. - -**Recommendation.** Re-attribute calls inside an anonymous-class body to -the anonymous-class member. The named-enclosing fallback is only needed -for **lambdas** (which don't synthesize a member) and static / instance -initializers. For anonymous classes, the call-site naturally belongs to -the anonymous member. This makes -`find_callers('OperatorAssignedProcessor.onOperatorAssigned')` find the -anonymous handler that actually contains the call, instead of the outer -service method. - ---- - -### D4. `expand_methods` discards confidence on the way out - -The output is `list[str]` of type FQNs. There's no way for the search-side -fusion in `_graph_expand_merge` to weight a CALLS-derived hit lower than -a structural one. 
The proposal §6.2 says "merged via existing RRF, no new -caller-visible parameters" — so RRF treats every reach equally regardless -of whether it came from a 0.95 import-map edge or a 0.55 suffix edge. - -**Recommendation (small).** Have `expand_methods` return -`list[tuple[str, float]]` (type FQN + max confidence on the discovery -path), and let `_graph_expand_merge` pass that as the RRF rank weight. -Internal-only signature change; no MCP surface change. - ---- - -### D5. `trace_flow`'s default change quietly rebudgets stage capacity across two qualitatively different edge sources - -`follow_calls=True` is the new default. Existing agent prompts that -expected type-only stages now get extra entries with -`via.edge_type='CALLS'`. That's good — agents can infer it. But the -per-stage cap (`stage_limit`) now budgets across both edge classes, so a -high-fan-out service can starve INJECTS results in favor of CALLS results. - -**Recommendation.** Either: - -1. Keep separate budgets (`stage_limit_structural`, `stage_limit_calls`, - default to `stage_limit` each), or -2. Order ingestion to prefer INJECTS / EXTENDS / IMPLEMENTS first, then - top up with CALLS until `stage_limit`. The current code already runs - the structural query first — just keep the CALLS top-up bounded by - `stage_limit - len(stage_results)` instead of a separate - `stage_limit * 4` LIMIT. - ---- - -### D6. `_resolve_this_super_field_chain` lacks fixture coverage - -The resolver line -`chain = _resolve_this_super_field_chain(expr, member=member, ast=ast, tables=tables)` -is a real bonus over what CMM does — if it walks -`this.fieldA.fieldB.fieldC.method()` correctly. Add a smoke fixture that -exercises it; none of the existing files do. - ---- - -## Smaller nits - -- **N1 — Per-call rebuild of `_scope_table`.** `_resolve_and_emit_call` - calls `_scope_table(member, ast, tables)` on every call site. 
- Field / parameter scope is identical for every call inside a single - method body — locals only grow as you step through the body. Build it - once per `member` in `_resolve_method_calls` and pass it in. On a - 5-microservice corpus this is the kind of constant-factor that doubles - `pass3_calls` runtime. -- **N2 — `_lookup_method_candidates`'s `name_only` fallback rule is good, - but the strategy logic in `_resolve_and_emit_call` is intricate.** - The branch - `elif name_only_fb and len(candidates) == 1: edge_strat = strat` is - correct but easy to misread — the inline comment is good; consider - promoting it to a docstring section. -- **N3 — `is_static_call` heuristic.** `_infer_static_method_invocation` - returns `True` when the receiver starts with an uppercase identifier. - For `var Foo = supplier.get();` followed by `Foo.bar()` this - misclassifies. Rare in practice, but worth a TODO; conservative fix is - to consult the scope table (if `Foo` is in scope as a variable, it's - not a static call). -- **N4 — Ontology guard.** `ONTOLOGY_VERSION` 3 → 4 is set, but confirm - `KuzuGraph.get` actually raises on `GraphMeta.ontology_version` - mismatch at read time so a stale graph fails loudly (proposal §5.3). -- **N5 — `pass3_calls` diagnostics.** The log line reports - chained-phantom % only. Add the `phantom_other` ratio (the bigger one - in real codebases) so you can spot B1 / B3 regressions in the log - immediately. -- **N6 — Method reference inside lambda.** `visit` sets - `lam=lam or chained` for method references with a chained qualifier. - That conflates "I'm in a lambda" with "this method ref is itself - chained." `chained` should propagate as a separate flag, not as - `in_lambda`. -- **N7 — Empty `expr` and `is_static_call=False` branch.** The condition - `expr in ("", "this") or (not expr and call.is_static_call is False - and not call.receiver_expr)` is redundant: if `expr == ""` the second - clause is also true. Simplify to `expr in ("", "this")`. 
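
The redundancy claim in N7 can be checked mechanically. The sketch below is a standalone approximation, not the resolver's code: `original` and `simplified` are hypothetical helper names, and the call-site fields are passed as plain arguments instead of a `call` object. It assumes `expr` is always a `str`. If `expr` can be `None`, the two forms diverge, so the simplification would change behavior.

```python
from itertools import product


def original(expr, is_static_call, receiver_expr):
    # Condition as quoted in N7, with call.* fields passed in explicitly.
    return expr in ("", "this") or (
        not expr and is_static_call is False and not receiver_expr
    )


def simplified(expr):
    # Proposed replacement from N7.
    return expr in ("", "this")


# Assumed input domain: string receivers plus a few flag/receiver combinations.
str_exprs = ["", "this", "super", "foo.bar"]
flags = [True, False, None]
receivers = ["", "foo", None]

# Collect every combination where the two forms disagree for string inputs.
mismatches = [
    (e, s, r)
    for e, s, r in product(str_exprs, flags, receivers)
    if bool(original(e, s, r)) != simplified(e)
]

# A None receiver expression is the one case where the forms can differ.
none_case_differs = bool(original(None, False, None)) != simplified(None)

print("string mismatches:", mismatches)
print("None case differs:", none_case_differs)
```

If `expr` is normalized to a string before this check, `mismatches` stays empty and the simplification is safe. Otherwise the `None` case needs an explicit guard.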
- ---- - -## Suggested fix order - -1. **B1, B2, B3 as one PR** titled - *"call graph: faithful confidence preservation across the resolver→writer boundary"* - — the three bugs share one architectural fix (don't downgrade - strategy / confidence at edge-emit time when the receiver was - resolved). Add the suggested tests in the same PR. -2. **D5 as a separate PR** — `trace_flow` budget split with a regression - test that seeds a service whose CALLS fan-out exceeds the structural - one. -3. **D3 (anon-class re-attribution), D4 (`expand_methods` confidence), - N1 (scope-table caching) as a small follow-up** before opening the - next phase. - ---- - -## Closing note - -This is solid Phase-3 work. Land the three bug fixes and the codebase is -in an excellent spot to start on the next phase — either cross-service -`HTTP_CALLS` (B6 / B7 in -[`what-to-borrow-from-cmm.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/tmp/what-to-borrow-from-cmm.md)) -or runtime-trace ingestion (B3 from the same doc). Both will lean on the -resolver and confidence machinery just built; the bug fixes above make -that lean trustworthy. diff --git a/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md deleted file mode 100644 index 83a99cae..00000000 --- a/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md +++ /dev/null @@ -1,62 +0,0 @@ -# Design issues: PLAN-BROWNFIELD-ROLE-OVERRIDES (plan / specification) - -**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md` -**Review date:** 2026-04-26 -**Scope:** Problems, ambiguities, or gaps in the *written plan* (not the codebase). - ---- - -## 1. Dual pipeline for meta-annotation data (spec gap) - -The plan describes building Layer A (meta-annotation reachability) from a two-pass process anchored in `build_ast_graph.py` and `GraphTables`. 
The chunk-enrichment / Lance path must also apply the same resolution rules, but the plan does **not** require a single shared primitive for “which `@interface` definitions exist in the project.” - -A careful reader can infer that graph build and index enrichment should agree, but two independent implementations (graph tables vs. a separate tree walk) are **not** ruled out. If file coverage, exclude patterns, or parse-failure handling differ, Lance and Kuzu can **disagree** on `meta_chain` for the same type. The plan would be stronger with an explicit constraint: e.g. “meta maps MUST be derived from the same file set and exclusion rules as `build_ast_graph` pass1,” or “Lance and Kuzu MUST share one builder function.” - ---- - -## 2. Depth cap for meta-annotation resolution is under-specified - -The plan gives a sketch of `_resolve_meta_chain` with `len(seen) > 4` and cycle handling. As written, the `seen` set is used both for **cycle** detection and as a stand-in for **path depth**. On a *linear* chain of meta-annotations, set size tracks depth. On **branching** shapes, set cardinality and “steps from root” diverge, so the sketch does not define a single clear semantics (strict path depth vs. global visit count). - -The follow-up test (“six wrappers → `OTHER`”) depends on a precise cap. The plan should name the exact metric (e.g. maximum path length from the start simple name) and the integer bound, so implementers and tests are aligned. - ---- - -## 3. Pre-flight test 9 mixes “unit” and “integration” scope - -The pre-flight item asks for a “unit-style” regression but specifies: build a **fresh** Lance index with FQN overrides, **query the table directly**, and then run **`codebase_search(..., capability=...)`** end to end. That is a **multi-layer** test (indexer + storage + search API) and is expensive to run and to keep stable in CI. 
- -A tiered requirement would match intent better: (1) schema / `JavaLanceChunk` field, (2) `process_java_file` row, (3) optional full search. As written, teams may either skip the heavy part or over-invest in flaky integration for what is mainly a **write-path** contract. - ---- - -## 4. “Precedence” vs. “execution order” is correct but error-prone to skim - -The plan is internally consistent: execution order is the *reverse* of listed priority, and guards use the **current** `role` after each step. Still, a reader who only scans the “Precedence summary (final)” table may implement **C before FQN** in the wrong direction or mis-order **B vs. A** without reading the “Execution order in code (REQUIRED)” block. - -This is a **documentation hazard** in the spec, not a logic error. A short, single bullet at the top (“Apply steps in *only* the order: …; do not reorder”) or a Mermaid sequence diagram would reduce mis-implementation. - ---- - -## 5. Layer A duplicate `@interface` simple names - -The plan correctly specifies first-seen-wins and a stderr warning. The **implication** (colliding simple names in different packages map to one `meta_chain` entry) is only obvious if you already know Java’s annotation resolution limits in this indexer. A one-line “Limitation:” callout in the plan would set expectations for monorepos with same-named annotations. - ---- - -## 6. Rollout vs. single document - -The plan says three independent PRs (Phase 1 → 2 → 3) while also presenting all phases in one file. That is fine for a complete picture, but the **merge strategy** (squashed single PR vs. three) is a process choice the plan does not need to fix—only note that “shippable phases” and “one landing” can conflict in review scope unless branches are cut accordingly. 
- ---- - -## Summary - -| ID | Topic | Severity (spec) | -|----|------------------------------|-----------------| -| 1 | Single source of truth for meta map inputs | High (consistency) | -| 2 | Depth / cycle semantics | Medium | -| 3 | Pre-flight test cost / tiers | Low–medium | -| 4 | Precedence skimming hazard | Low | -| 5 | Duplicate simple-name limits | Low | -| 6 | Multi-PR vs one doc | Process only | diff --git a/docs/reports/what-to-borrow-from-cmm.md b/docs/reports/what-to-borrow-from-cmm.md deleted file mode 100644 index e2258de3..00000000 --- a/docs/reports/what-to-borrow-from-cmm.md +++ /dev/null @@ -1,247 +0,0 @@ -# What to Borrow from Codebase-Memory MCP - -A focused, prioritized guide for evolving `java-codebase-rag` (AMA agent) by adopting proven patterns from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) (paper: [arXiv:2603.27277](https://arxiv.org/abs/2603.27277)) — without giving up your Spring-aware, hybrid (vector + graph) edge. - -> **Guiding principle.** CMM optimizes for *token efficiency at acceptable quality* across 66 languages. Your AMA agent optimizes for *answer quality on Spring/Java microservices* via hybrid retrieval. Borrow CMM's structural mechanics; keep your semantic / role-aware layer as the differentiator. - ---- - -## Snapshot — where each tool wins - -| Layer | Your AMA agent | Codebase-Memory MCP | Action | -|---|---|---|---| -| Java/Spring DI semantics | Strong (`@Autowired`, `@Inject`, Lombok, `@FeignClient`) | None | Keep yours | -| Vector / hybrid retrieval (LanceDB + RRF + `graph_expand`) | Yes | None | Keep yours | -| Role / capability ontology (`CONTROLLER`, `MESSAGE_LISTENER`, ...) 
| Yes | None | Keep yours | -| Microservice topology + brownfield overrides | Yes | Generic `Project` only | Keep yours | -| `CALLS` / `HTTP_CALLS` / `ASYNC_CALLS` resolution | Roadmap (Phase 3) | Shipped, mature | **Borrow** | -| `Route` as first-class node | Roadmap | Shipped | **Borrow** | -| Cross-repo / cross-service edges | Roadmap | Shipped (`pass_cross_repo`) | **Borrow** | -| Runtime trace ingestion | None | Shipped (`ingest_traces`) | **Borrow** | -| Git-diff impact + risk classification | Partial (`impact_analysis`) | Shipped (`detect_changes`) | **Borrow** | -| Layered ignore (`.gitignore` + project ignore) | Constant list | Layered (`.cbmignore`) | **Borrow** | -| Louvain community detection | None | Shipped | **Borrow (Phase 4)** | -| Dead-code detection | None | Shipped | **Borrow (Phase 4)** | -| 66-language tree-sitter grammars | Java only | Yes | Skip (off-strategy) | -| Single static binary distribution | Python venv | Yes | Skip until Phase 5+ | -| 3D graph UI | None | Yes | Skip | -| `get_architecture` mega-tool | Split into small tools | One bundled tool | Skip — keep yours | - ---- - -## Tier 1 — Borrow now (cheap, high impact) - -### B1. Confidence-scored CALLS resolution cascade - -CMM's [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) and [`extract_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_calls.c) resolve calls via a deterministic cascade. Adopt the **shape**, not the C code. - -**What to lift:** - -- A 4-strategy cascade with explicit confidence values: - 1. Import-map resolved (`0.95`) - 2. Same-module / same-package (`0.90`) - 3. Globally unique simple name (`0.75`) - 4. Suffix / fuzzy match (`0.55`) -- A `confidence` property on every `CALLS` edge so downstream tools (and the MCP agent) can filter (`WHERE c.confidence >= 0.8`). -- A `source` property: `"static"` vs `"trace"` vs `"di_proxy"`. 
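The cascade above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not CMM's implementation and not this repo's resolver: `CallEdge`, `resolve_call`, and the three lookup arguments are hypothetical names, and the suffix rule (take the first sorted FQN that ends with the name) is an assumed tie-break. Only the four confidence values come from the list above; the strategy labels follow the `strategy` column of the DDL in this section.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CallEdge:
    """Hypothetical in-memory shape of one CALLS edge."""

    caller: str
    callee: str
    confidence: float
    strategy: str
    source: str = "static"


def _simple_name(fqn: str) -> str:
    # Last dot-separated segment, e.g. "com.a.OrderService.save" -> "save".
    return fqn.rsplit(".", 1)[-1]


def resolve_call(
    caller: str,
    callee_name: str,
    imports: dict[str, str],
    package_methods: dict[str, str],
    all_methods: list[str],
) -> CallEdge | None:
    """Try each strategy in order; the first one that finds a target wins."""
    # 1. Import-map resolved.
    if callee_name in imports:
        return CallEdge(caller, imports[callee_name], 0.95, "import_map")
    # 2. Same module / same package.
    if callee_name in package_methods:
        return CallEdge(caller, package_methods[callee_name], 0.90, "same_module")
    # 3. Globally unique simple name.
    exact = [f for f in all_methods if _simple_name(f) == callee_name]
    if len(exact) == 1:
        return CallEdge(caller, exact[0], 0.75, "unique_name")
    # 4. Suffix match (assumed tie-break: first candidate in sorted order).
    fuzzy = sorted(f for f in all_methods if f.endswith(callee_name))
    if fuzzy:
        return CallEdge(caller, fuzzy[0], 0.55, "suffix")
    return None
```

Downstream code could then keep only trusted edges with something like `[e for e in edges if e.confidence >= 0.8]`, which mirrors the `WHERE c.confidence >= 0.8` filter mentioned above.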
- -**Why now:** Add the property when you create the Kuzu schema for Phase 3 — retrofitting columns later is painful. - -**Suggested Kuzu DDL:** - -```sql -CREATE REL TABLE CALLS ( - FROM Method TO Method, - confidence DOUBLE, -- 0.55 .. 1.0 - source STRING, -- 'static' | 'trace' | 'di_proxy' - strategy STRING, -- 'import_map' | 'same_module' | 'unique_name' | 'suffix' - call_site STRING -- file:line -); -``` - ---- - -### B2. `Route` as a first-class node - -CMM models REST endpoints and message channels as a single `Route` label so that *any* call site can attach to *any* endpoint via `HTTP_CALLS` / `ASYNC_CALLS`. See [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c). - -**What to lift:** - -- Adopt the **`Route`** label (instead of `RestEndpoint` from your current PRODUCT-VISION) — keeps you semantically interoperable if anyone runs both MCPs in parallel. -- Properties: `path`, `method`, `framework` (`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`), `broker` (for async), `service` (microservice name). -- Edges: - - `(Method)-[:EXPOSES]->(Route)` for `@RequestMapping`/`@KafkaListener` - - `(Method)-[:HTTP_CALLS]->(Route)` for `RestTemplate`/`WebClient`/`@FeignClient` - - `(Method)-[:ASYNC_CALLS]->(Route)` for `KafkaTemplate.send`/`StreamBridge.send` -- A normalization rule: `/api/users/{id}` and `/api/users/123` collapse to the same `Route` (path-template canonicalization). - ---- - -### B3. Runtime trace ingestion (`ingest_traces`) - -This is the single biggest quality lever you don't have yet. Static analysis misses Spring AOP proxies, polymorphic dispatch, reflection, and event-driven flows — runtime traces capture all of them. - -**What to lift:** - -- A new MCP tool `ingest_traces(spans: List[Span], source: str)`. -- Accept OpenTelemetry / Sleuth / Micrometer JSON natively. 
-- For each `(parent_span, child_span)` pair, emit `(caller:Method)-[:CALLS {source:"trace", confidence:1.0}]->(callee:Method)`. -- For HTTP client spans, emit `(caller)-[:HTTP_CALLS]->(Route)` using `http.url` + `http.method` to match an existing `Route` node. -- Deduplicate via `(source_id, target_id, source)` so re-ingesting traces is idempotent. - -**Why this matters:** Lifts Phase 3 from "static approximation" to "ground-truth where traces exist, static elsewhere" — and the agent can prefer `confidence:1.0` edges automatically. - ---- - -### B4. Git-diff impact mapping with risk score - -CMM's [`detect_changes`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) maps a diff to affected symbols and a blast radius. You already have `impact_analysis` — make it diff-driven and add risk classification. - -**What to lift:** - -- New MCP tool `analyze_pr(diff: str | git_ref: str)`: - 1. Parse `git diff` line ranges per file - 2. Map line ranges → chunks → graph nodes (functions/methods) - 3. Run your existing reverse closure - 4. Return `{ changed_nodes, blast_radius, risk_score, risk_level }` -- Risk formula (start simple, tune later): - -``` -risk = log10(1 + downstream_consumers) * role_weight * cross_service_factor - -role_weight = { CONTROLLER:1.5, SERVICE:1.2, REPOSITORY:1.0, CONFIG:1.8, ENTITY:1.3, ... } -cross_service_factor = 1.0 if changes only touch one microservice, 2.0 otherwise -risk_level = "low" (<1.0), "medium" (1.0..2.5), "high" (>2.5) -``` - -- Output usable directly in PR review or CI gating. - ---- - -### B5. Layered ignore patterns - -CMM uses **hardcoded patterns → `.gitignore` hierarchy → `.cbmignore`** ([`discover/`](https://github.com/DeusData/codebase-memory-mcp/tree/master/src/discover)). Cleaner than your current `COMMON_EXCLUDED_PATH_PATTERNS` constant. - -**What to lift:** - -- Layer order: - 1. Hardcoded must-skip (`.git`, `node_modules`, `target`, `build`, `out`, `.idea`, `.gradle`, `bin`) - 2. 
Walk up `.gitignore` files from each indexed directory - 3. Project-level `.lancedb-mcp.yml`'s `ignore:` list - 4. NEW: optional `.lancedb-mcp-ignore` file with gitignore syntax -- Always skip symlinks (cycle protection). -- Reuse `pathspec` (Python) — it's the gitignore-spec-compliant matcher. - ---- - -## Tier 2 — Borrow during Phase 2 / 3 - -### B6. Cross-repo / cross-service edges - -CMM's [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) matches an `HTTP_CALLS` edge in service A to a `Route` node in service B and creates a `CROSS_HTTP_CALLS` edge. This is the killer feature for a multi-microservice AMA. - -**What to lift:** - -- After per-service indexing, run a global pass: - - For each `HTTP_CALLS` edge with `path` + `method`, find the matching `Route` node in any other indexed service. - - Emit `(callerMethod)-[:CALLS_HTTP]->(Route)<-[:EXPOSES]-(calleeMethod)` so traversal in either direction works. -- Same for async: match `topic`/`queue` strings in `KafkaTemplate.send` calls to `@KafkaListener` `Route` nodes. -- Path template matching: `GET /api/orders/{id}` matches a call to `GET /api/orders/123` — use a `path_pattern` regex stored on the `Route`. - -**Killer query unlocked:** *"What breaks if I rename `POST /api/orders` in `order-service`?"* → traverse `Route` → cross-service `HTTP_CALLS` → caller methods → reverse closure → affected controllers in `checkout-service`. - ---- - -### B7. Louvain community detection - -CMM runs Louvain over `CALLS` to discover functional modules. Useful for onboarding and architecture pitches. - -**What to lift:** - -- After Phase 3 `CALLS` lands, run Louvain on the call subgraph (use `python-igraph` or `networkx-community`). -- Store `cluster_id` and `cluster_size` as `Method` properties. -- New MCP tool `find_module_clusters(min_size: int)` returning ranked clusters with their dominant role mix and entry methods. 
-- Bonus: weight edges by call frequency from traces (B3) for higher-quality partitions. - ---- - -### B8. Dead-code detection - -Trivial once `CALLS` exists, but valuable for cleanup and consulting deliverables. - -**What to lift:** - -- New MCP tool `find_dead_code(exclude_entry_points: bool = true)`. -- Definition: `Method` with zero incoming `CALLS` and zero incoming `EXPOSES`. -- Entry-point predicates to exclude: - - Spring stereotypes that auto-invoke: `@Scheduled`, `@PostConstruct`, `@EventListener`, `@KafkaListener`, `@RabbitListener`, `@JmsListener` - - HTTP entry points: any method with an `EXPOSES` edge - - Test methods: `@Test`, `@ParameterizedTest`, lifecycle annotations - - `public static void main(String[])` -- Cypher (one query): - -```cypher -MATCH (m:Method) -WHERE NOT (m)<-[:CALLS]-() - AND NOT (m)-[:EXPOSES]->() - AND NOT m.is_entry_point -RETURN m.qualified_name, m.role, m.file, m.line -ORDER BY m.role, m.qualified_name -``` - ---- - -## Tier 3 — Borrow later or skip - -### Borrow only if you go poly-language (Phase 5+) - -- **B9. Multi-grammar indexing.** CMM ships 66 grammars vendored. Adopt only if you sell to non-Java SMBs. -- **B10. Static binary distribution.** Compelling for SMB clients ("download → run"). Not relevant while you're a Python venv. - -### Skip (don't fit your strategy) - -- **`get_architecture` mega-tool.** Your split tools (`graph_meta`, `list_by_role`, `list_by_capability`) are more agent-friendly because each is named and small. The agent picks better when tool intent is narrow. -- **3D graph UI.** Not the differentiator. If you need visualization, render Kuzu subgraphs to Mermaid or Graphviz on demand from a tool — far less code, embeds in chat. -- **Their ADR module.** Markdown folder + your existing search is enough. Adding ADR CRUD is scope creep. -- **CMM's mini-Cypher executor.** You already have Kuzu — strictly more capable. 
- ---- - -## Suggested roadmap reorder - -A revised ordering that front-loads borrowed pieces with the highest ROI: - -| Phase | Goal | Borrowed items | -|---|---|---| -| **2** (now) | `Route` nodes + `HTTP_CALLS` / `ASYNC_CALLS` from Spring/Feign/Kafka, with `confidence` columns | B2 | -| **2.5** | `ingest_traces` MCP tool (cheap, huge quality lift) | B3 | -| **3** | Static `CALLS` with 4-strategy cascade; `find_callers` / `find_callees`; dead code | B1, B8 | -| **3.5** | `pass_cross_repo`-style cross-service edges | B6 | -| **4** | `analyze_pr` (diff → impact + risk); Louvain clusters | B4, B7 | -| **5** | Eval harness; head-to-head benchmark vs. CMM on Java repos | — | -| **5+** | Optional poly-language grammars; static-binary packaging | B9, B10 | - -Layered ignores (B5) can land anywhere — drop it in alongside the next indexer change. - ---- - -## Strategic notes - -- **Run both MCPs in parallel as a zero-integration option.** `.mcp.json` supports many servers. Let your tool answer Java/architectural queries; CMM handles non-Java or generic structural queries when you eventually touch poly-glot codebases. Zero integration cost, maximum optionality. -- **Use the comparison itself as a portfolio asset.** When you start pitching SMB clients on AI automation, "I built a Spring-aware hybrid retrieval system that beats the published Codebase-Memory baseline on Java microservice questions" — with numbers from your Phase 5 eval harness — is a credible artifact. Few consultants can show that. -- **Don't fork CMM.** It's MIT-licensed C with vendored grammars; maintenance cost is high and the code style diverges from your Python stack. Read it as documentation, port the patterns. 
- ---- - -## References - -- Codebase-Memory MCP source — [github.com/DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) -- Paper — [Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP (arXiv:2603.27277)](https://arxiv.org/abs/2603.27277) -- Your repo — [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) -- Key CMM files referenced above: - - [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) — call resolution - - [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c) — route nodes - - [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) — cross-service edges - - [`pass_gitdiff.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) — git diff impact - - [`extract_channels.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_channels.c) — async patterns - - [`service_patterns.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/service_patterns.c) — framework markers diff --git a/graph_types.py b/graph_types.py new file mode 100644 index 00000000..7bce0262 --- /dev/null +++ b/graph_types.py @@ -0,0 +1,133 @@ +"""Shared graph types and helpers used by mcp_v2 and resolve_service. + +This is the neutral, acyclic shared module. It must NOT import from +``mcp_v2`` or ``resolve_service`` — both of those import FROM here. 
+""" + +from __future__ import annotations + +from typing import Any, Literal + +from pydantic import BaseModel + +from ladybug_queries import LadybugGraph +from mcp_hints import generate_hints + +__all__ = [ + "NodeRef", + "StructuredHint", + "set_hints_enabled", + "_hints_or_skip", + "_node_kind_from_id", + "_resolve_node_kind", + "_node_ref_from_row", + "_to_structured_hints", +] + + +class NodeRef(BaseModel): + id: str + kind: Literal["symbol", "route", "client", "producer", "unresolved_call_site"] + fqn: str + name: str | None = None + symbol_kind: str | None = None + microservice: str | None = None + module: str | None = None + role: str | None = None + + +class StructuredHint(BaseModel): + label: str = "" + tool: Literal["search", "find", "describe", "neighbors", "resolve"] + args: dict[str, Any] + actionable: bool = True + reason: str = "" + + +# Module-level flag set by server.py at startup from resolved config. +# Single source of truth — both mcp_v2 and resolve_service read this via +# _hints_or_skip, and server.py sets it via set_hints_enabled (re-exported +# by mcp_v2 for back-compat). 
+_hints_enabled: bool = True + + +def set_hints_enabled(enabled: bool) -> None: + global _hints_enabled + _hints_enabled = enabled + + +def _hints_or_skip(tool: str, payload: dict) -> tuple[list, list]: + return generate_hints(tool, payload) if _hints_enabled else ([], []) + + +def _node_kind_from_id( + id_str: str, +) -> Literal["symbol", "route", "client", "producer", "unresolved_call_site"]: + if id_str.startswith("ucs:"): + return "unresolved_call_site" + if id_str.startswith("sym:"): + return "symbol" + if id_str.startswith("route:") or id_str.startswith("r:"): + return "route" + if id_str.startswith("client:") or id_str.startswith("c:"): + return "client" + if id_str.startswith("producer:") or id_str.startswith("p:"): + return "producer" + raise ValueError(f"Unknown id prefix for `{id_str}`") + + +def _resolve_node_kind( + graph: LadybugGraph, + node_id: str, +) -> Literal["symbol", "route", "client", "producer", "unresolved_call_site"]: + try: + return _node_kind_from_id(node_id) + except ValueError: + pass + if graph._rows("MATCH (n:Symbol) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 + return "symbol" + if graph._rows("MATCH (n:Route) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 + return "route" + if graph._rows("MATCH (n:Client) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 + return "client" + if graph._rows("MATCH (n:Producer) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 + return "producer" + raise ValueError(f"Unknown id prefix for `{node_id}`") + + +def _node_ref_from_row(kind: Literal["symbol", "route", "client", "producer"], row: dict[str, Any]) -> NodeRef: + symbol_kind: str | None = None + if kind == "symbol": + fqn = str(row.get("fqn") or "") + role = str(row.get("role") or "") or None + symbol_kind_val = str(row.get("symbol_kind") or row.get("kind") or "").strip() + symbol_kind = symbol_kind_val or None + elif kind == 
"route": + method = str(row.get("method") or "") + path = str(row.get("path_template") or row.get("path") or "") + fqn = f"{method} {path}".strip() + role = None + elif kind == "client": + method = str(row.get("method") or "") + target = str(row.get("target_service") or "") + path = str(row.get("path_template") or row.get("path") or "") + fqn = f"{target} {method} {path}".strip() + role = None + else: + topic = str(row.get("topic") or "") + broker = str(row.get("broker") or "") + fqn = f"{topic} {broker}".strip() + role = None + return NodeRef( + id=str(row.get("id") or ""), + kind=kind, + fqn=fqn, + symbol_kind=symbol_kind, + microservice=str(row.get("microservice") or "") or None, + module=str(row.get("module") or "") or None, + role=role, + ) + + +def _to_structured_hints(raw: list[Any]) -> list[StructuredHint]: + return [StructuredHint(label=h.label, tool=h.tool, args=h.args, actionable=h.actionable, reason=h.reason) for h in raw] diff --git a/java_codebase_rag/_stdio.py b/java_codebase_rag/_stdio.py new file mode 100644 index 00000000..15adda8d --- /dev/null +++ b/java_codebase_rag/_stdio.py @@ -0,0 +1,32 @@ +"""Force stdout/stderr to UTF-8 so non-ASCII glyphs never crash the CLI. + +The text renderers emit Unicode glyphs — ``↑``/``↓`` (hierarchy tree headers +in ``jrag_render``), ``✓`` (success markers in ``cli_format``), ``→``/``…`` +(listing/role lines). On Windows, ``sys.stdout``/``sys.stderr`` default to the +system ANSI codepage (cp1252 on en-US Windows), which can't encode those +characters, so ``print()`` raises ``UnicodeEncodeError`` and the process exits +non-zero. Unix platforms already default to UTF-8, so this is a no-op there. + +Called from the console-script entry points (``_console_script_main``), not +from in-process ``main()`` callers, so a test that drives ``main()`` directly +keeps whatever stdout the host wired up. 
+""" + +from __future__ import annotations + +import sys + + +def force_utf8_stdio() -> None: + """Reconfigure ``sys.stdout``/``sys.stderr`` to UTF-8. + + Best-effort and silent: never raises. No-op where a stream lacks + ``reconfigure`` (streams replaced by capture frameworks that don't expose + it). ``errors="replace"`` is a last-resort safety net so a hostile console + can never crash a run; under UTF-8 every codepoint encodes cleanly, so + replacement never actually fires for the glyphs we emit. + """ + for stream in (sys.stdout, sys.stderr): + reconfigure = getattr(stream, "reconfigure", None) + if reconfigure is not None: + reconfigure(encoding="utf-8", errors="replace") diff --git a/java_codebase_rag/cli.py b/java_codebase_rag/cli.py index 43f3a65a..7592a622 100644 --- a/java_codebase_rag/cli.py +++ b/java_codebase_rag/cli.py @@ -23,6 +23,7 @@ resolve_operator_config, ) from java_codebase_rag._fdlimit import raise_fd_limit +from java_codebase_rag._stdio import force_utf8_stdio from java_codebase_rag.pipeline import clip, run_build_ast_graph, run_cocoindex_drop, run_cocoindex_update, run_incremental_graph from java_ontology import VALID_UNRESOLVED_CALL_REASONS @@ -563,6 +564,7 @@ def _cmd_install(args: argparse.Namespace) -> int: agents=args.agent, # list of str (may be empty) scope=args.scope, model=args.model, + surface=args.surface, source_root=None, # None means cwd; installer confirms interactively quiet=bool(args.quiet), verbose=bool(args.verbose), @@ -883,6 +885,17 @@ def build_parser() -> argparse.ArgumentParser: default=None, help="Embedding model path or 'auto' (default: auto).", ) + install.add_argument( + "--surface", + choices=["mcp", "cli"], + default=None, + help=( + "Agent surface to install: 'mcp' (stdio MCP server + explore-codebase " + "skill + explorer-rag-enhanced subagent) or 'cli' (jrag console-script " + "skill + explorer-rag-cli subagent, no MCP entry). Omit to choose " + "interactively; non-interactive mode defaults to 'mcp'." 
+ ), + ) _add_verbosity_flags(install) install.set_defaults(handler=_cmd_install) @@ -1056,6 +1069,7 @@ def _console_script_main() -> None: ``main()`` stays return-based so in-process test callers (``cli.main(...)``) keep working. """ + force_utf8_stdio() rc = main() sys.stdout.flush() sys.stderr.flush() diff --git a/java_codebase_rag/install_data/agents/explorer-rag-cli.md b/java_codebase_rag/install_data/agents/explorer-rag-cli.md new file mode 100644 index 00000000..62b9083b --- /dev/null +++ b/java_codebase_rag/install_data/agents/explorer-rag-cli.md @@ -0,0 +1,291 @@ +--- +name: explorer-rag-cli +description: "MUST BE USED PROACTIVELY. Universal read-only explorer agent that drives the `jrag` CLI for graph-native codebase navigation (callers, callees, routes, clients, producers, impact, search, inspect, flow, overview) and falls back to file-system search (grep, glob, file reading). Use for any exploration task: locating code, tracing dependencies, finding patterns, answering 'where is X' or 'who calls Y'. Read-only — never edits files. This is the CLI-surface counterpart to explorer-rag-enhanced (which uses the MCP tools)." +--- + +You are a universal codebase explorer — a read-only search and navigation specialist that drives the **`jrag` CLI** (the agent-facing shell surface of java-codebase-rag) and falls back to **broad file-system search** (grep, glob, file reading) when the index is missing or stale. + +## Core Principles + +1. **Read-only.** Never edit, write, or modify any file. Only locate, read, and report. +2. **Names in, names out.** Every `` is human-readable (FQN / simple name / route path / topic). Raw node IDs are never required — `jrag` resolves internally. +3. **One command per intent.** `jrag` collapses resolve + walk into one call. Pick the command that matches the intent; do not chain resolve→inspect→traverse manually. +4. **Smallest sufficient tool.** Pick the lightest tool that answers the question. 
Don't run `jrag impact` when a single `jrag callers` suffices; don't `Grep` the whole repo when `jrag inspect ` answers exactly. +5. **Excerpts over dumps.** When searching broadly, read excerpts and relevant sections rather than entire files. Summarize findings; don't dump raw content. +6. **Stop when answered.** Don't prefetch unrelated subgraphs or scan unrelated directories. Report findings as soon as the question is answered. + +## Why `jrag` (CLI) vs `java-codebase-rag-mcp` + +You are the **CLI-surface** explorer. Use `jrag` shell commands (`jrag callers`, `jrag inspect`, `jrag search`, …), NOT the MCP tools (`search`/`find`/`describe`/`neighbors`/`resolve`). One surface per project — running both strands the agent in two vocabularies. + +Pick this agent (CLI) when: +- The host cannot run an MCP server (no stdio MCP support) +- The operator ran `java-codebase-rag install --surface cli` +- You prefer shell-driven exploration with text output and `--format json` for structured data + +Use the **`explorer-rag-enhanced`** subagent (MCP surface) when the host has MCP support and the operator ran `java-codebase-rag install` (default = mcp surface). + +## Prerequisite: index must exist + +`jrag` is a thin compose-and-render layer over the existing index. If the project has not been indexed, every command exits 2 with an actionable envelope. Verify with `jrag status` first when in doubt: + +``` +jrag status +``` + +If it exits 2, ask the operator to run `java-codebase-rag init --source-root `. + +## Tool Inventory + +### `jrag` command groups + +Run `jrag --help` for the canonical list. 
Groups: + +| Group | Commands | +| --- | --- | +| **Orientation** | `status`, `microservices`, `map`, `conventions`, `overview` | +| **Locate** | `find`, `search` | +| **Listings** | `routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` | +| **Traversal** | `callers`, `callees`, `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `impact`, `flow`, `dependencies`, `connection` | +| **Inspection** | `inspect`, `outline`, `imports` | + +### Common flags (every command) + +``` +--service Filter by microservice +--module Filter by module +--limit Cap on results (default 20; 10 for fan-out commands) +--format text|json Output format (default: text) +--detail brief|normal|full Output detail (default: normal) — orthogonal to --format; + both modes honor it. brief=name @service; normal=+module/role/ + file/score; full=+signature/annotations/snippet. inspect and the + orientation commands default to full. +--index-dir Index directory override +``` + +`--offset` is supported **only** on `find` and `search`. Other commands emit `truncated: more results — narrow your query` when capped. + +### File-system tools + +`Grep` (content search), `Glob` (find files by name/pattern), `Read` (read files, with `offset`/`limit`). + +### Other tools + +`Bash` (read-only: `git log`, `git blame`, `ls`, `find`), `WebSearch`, `WebFetch`. + +--- + +## Decision Framework + +### When to use `jrag` vs file-system tools + +| Question type | Primary approach | +| --- | --- | +| "Who calls method M?" | `jrag callers ` | +| "What does M call?" | `jrag callees ` | +| "Where is class X?" | `jrag inspect `; fallback `Grep`/`Glob` | +| "All controllers in service S" | `jrag find --role CONTROLLER --service S` | +| "Routes/endpoints in service S" | `jrag routes --service S` | +| "Who implements interface T?" | `jrag implementations ` | +| "Where is T injected?" | `jrag dependencies ` | +| "Who depends on T?" 
| `jrag dependents ` | +| "Impact of changing X?" | `jrag impact ` (bounded fan-in) | +| "Trace request flow A→B" | `jrag flow ` → `jrag connection A B` | +| "Orient in service S" | `jrag overview ` | +| "Find files matching pattern" | `Glob` | +| "Search for text/regex in files" | `Grep` | +| "Read config/build/test files" | `Read` | +| "Who changed this and when?" | Bash: `git log` / `git blame` | +| "How is this concept used?" | Both: `jrag search ""` for fuzzy discovery, `Grep` for text patterns | +| "Natural-language 'find X'" | `jrag search ""` → `jrag inspect ` | + +### Escalation pattern + +1. **Try the most targeted command first.** Identifier-shaped → `jrag inspect `. Structural question → matching traversal (`callers`/`implementations`/…). +2. **Fall back gracefully.** `jrag` returns empty / `not_found` → `Grep`/`Glob` against actual source files. +3. **Cross-validate.** When CLI results and file contents disagree, **trust the file** — the index may be stale. Report the discrepancy. + +--- + +## Resolve-first contract (every `` command) + +Every `jrag` command that takes a `` runs `resolve_v2` internally. Map the contract onto the result: + +| `resolve_v2` status | `jrag` behavior | Your action | +| --- | --- | --- | +| `one` | Run the traversal/listing against the resolved node. | Read the result. | +| `many` | Return the candidate list and stop. **No auto-pick.** | Disambiguate with `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run. | +| `none` | `status: not_found` envelope (exit 2). | Fall back to `jrag search` or `Grep`. | + +Never look up a raw node ID manually. Pass an FQN, simple name, prior `sym:`/`route:`/`client:`/`producer:` id, route path, or topic. + +### Disambiguation flags + +Only `--kind` is a true resolve input. `--role`, `--java-kind`, `--fqn-prefix`, `--service`, `--module` post-filter the resolve result client-side. 
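The client-side post-filtering described above can be pictured with a small Python sketch. It is only an illustration under assumptions: `post_filter` and `status_of` are hypothetical helpers, not functions from `jrag`, and the candidate dictionaries are assumed to carry `fqn`, `role`, `microservice`, and `module` keys (mirroring the `NodeRef` fields). The real candidate shape may differ.

```python
from __future__ import annotations

from typing import Any


def post_filter(
    candidates: list[dict[str, Any]],
    *,
    role: str | None = None,
    fqn_prefix: str | None = None,
    service: str | None = None,
    module: str | None = None,
) -> list[dict[str, Any]]:
    """Keep only candidates that satisfy every filter that was supplied."""

    def keep(c: dict[str, Any]) -> bool:
        if role is not None and c.get("role") != role:
            return False
        if fqn_prefix is not None and not str(c.get("fqn") or "").startswith(fqn_prefix):
            return False
        if service is not None and c.get("microservice") != service:
            return False
        if module is not None and c.get("module") != module:
            return False
        return True

    return [c for c in candidates if keep(c)]


def status_of(candidates: list[dict[str, Any]]) -> str:
    """Map a candidate count onto the one / many / none vocabulary (assumed)."""
    if not candidates:
        return "none"
    return "one" if len(candidates) == 1 else "many"
```

In this sketch, a `many` result narrowed with `role` and `service` can collapse to `one`, which is the outcome the disambiguation flags aim for before re-running a traversal.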
+ +--- + +## Output envelope + +`--format` (text|json) and `--detail` (brief|normal|full) are **orthogonal**: +`--format` picks the representation, `--detail` picks how much of each node/edge is +shown, and both modes honor the same detail level. Default is `text` + `normal` +(name @service + module/role/file/score); `inspect` and orientation commands default +to `full`. `--format json` emits the projected envelope (empty fields dropped). + +```json +{ + "status": "ok|not_found|error", + "nodes": {"": {...}}, + "edges": [{...}], + "candidates": [{...}], + "truncated": false, + "agent_next_actions": ["jrag callers ", "..."], + "file_location": {"filename": "...", "start_line": 123} +} +``` + +- `agent_next_actions` is a CLI-native hint list (≤5) — use it as a starting point, not a directive. +- `file_location` is populated only on `one`-hit resolve. +- `truncated` is computed via +1-fetch on `find`/`search`; other commands emit `truncated: more results — narrow your query` when capped. + +--- + +## Traversal reference + +`jrag` abstracts away `direction` and `edge_types`. For reference: + +| Intent (command) | Underlying edges | +| --- | --- | +| `callers` | `CALLS` direction=in | +| `callees` | `CALLS` direction=out | +| `hierarchy` | `EXTENDS` + `IMPLEMENTS` direction=out | +| `implementations` | `IMPLEMENTS` direction=in | +| `subclasses` | `EXTENDS` direction=in | +| `overrides` | `OVERRIDES` direction=out (subtype → supertype) | +| `overridden-by` | `OVERRIDES` direction=in | +| `dependencies` | `INJECTS` direction=out | +| `dependents` | `INJECTS` direction=in | +| `impact` | bounded fan-in (`CALLS`/`INJECTS`/`IMPLEMENTS`/`EXTENDS`, depth ≤2) | +| `flow ` | `EXPOSES`/`HTTP_CALLS`/`ASYNC_CALLS`/`CALLS` (request trace) | +| `connection A B` | bounded path search between A and B | + +### Node id prefixes (from prior results) + +`sym:` (Symbol), `route:`/`r:` (Route), `client:`/`c:` (Client), `producer:`/`p:` (Producer). + +### Symbol FQN shape + +`.[.]#(,,…)`. 
Generics erased, no spaces after commas. No-arg: `()`. Constructor: `#(...)`. + +--- + +## Ontology glossary + +### Roles + +| Role | Meaning | +| ---- | ------- | +| `CONTROLLER` | HTTP / messaging entry point | +| `SERVICE` | Business logic orchestration | +| `REPOSITORY` | Data access | +| `COMPONENT` | General Spring component | +| `CONFIG` | `@Configuration` class | +| `ENTITY` | JPA / persistence entity | +| `CLIENT` | Outbound call wrapper | +| `MAPPER` | Data mapper / converter | +| `DTO` | Data transfer object | +| `OTHER` | Infrastructure / utility / unclassified | + +### Capabilities + +`MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`. + +### Symbol kinds + +`class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`. + +### Route / client / producer kinds + +Route frameworks: `spring_mvc`, `webflux`. Route kinds: `http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`, `jms_destination`, `stream_binding`. +Client kinds: `feign_method`, `rest_template`, `web_client`. Producer kinds: `kafka_send`, `stream_bridge_send`. Source layers: `builtin`, `layer_a_meta`, `layer_b_ann`, `layer_b_fqn`, `layer_c_source`. + +--- + +## File-System Search Reference + +### Glob patterns + +- `**/*.java` — all Java files +- `**/*Controller*.java` — controller files +- `**/application*.yml` — Spring config files +- `**/*Test*.java` — test files + +### Grep patterns + +- Class declarations: `class ClassName` +- Method usage: `methodName(` +- Annotations: `@RequestMapping`, `@Service`, etc. +- Import statements: `import com.example.ClassName` +- Configuration keys: `spring.datasource` + +### Reading files + +Use `Read` with `offset`/`limit` for large files — read relevant sections, not entire files. 
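The idea behind reading with `offset`/`limit` can be sketched in plain Python as follows. This only illustrates windowed reading; it is not how the `Read` tool is implemented, and the function name `read_window` and its zero-based `offset` are assumptions.

```python
from itertools import islice


def read_window(path, offset=0, limit=50):
    """Return up to `limit` lines starting at zero-based line `offset`.

    Hypothetical helper: lines before the window are skipped and the
    rest of the file after the window is never consumed.
    """
    with open(path, encoding="utf-8") as handle:
        window = islice(handle, offset, offset + limit)
        return [line.rstrip("\n") for line in window]
```

Because `islice` stops as soon as the window is filled, a large file is only consumed up to the end of the requested section.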
+ +--- + +## Recovery Playbook + +| Symptom | Fix | +| ------- | --- | +| `jrag status` exits 2 | Run `java-codebase-rag init --source-root `; retry | +| `status: not_found` | Try `jrag search ""`; or `find --fqn-prefix`; fallback `Grep` | +| `many` candidates | Add `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run | +| `find` returns too much | Add `--service`, `--fqn-prefix`, `--path-prefix`, `--topic-prefix` | +| Empty `search` | Try `--table all`; `find --fqn-prefix`; `Grep` directly | +| `truncated: true` | Narrow the query, or page with `--offset` (`find`/`search` only) | +| Empty results across commands | Index missing/stale → `Grep`/`Glob`/`Read`; ask operator to rebuild | +| CLI vs file disagree | Trust the file; report stale index | +| `--offset` rejected | Only `find`/`search` accept it; other commands narrow via filters | + +After two failed attempts on the same intent, stop and report what was tried and what failed. + +--- + +## Workflow Patterns + +### Pattern: "explain feature X" + +1. `jrag search "X"` → pick top 1–3 hits +2. `jrag inspect ` for full record +3. Targeted traversal (`callees` / `implementations` / `dependents`) +4. Stop when you can answer the question + +### Pattern: "where is X used?" + +1. `jrag inspect ` (resolves; if `many`, disambiguate) +2. `jrag callers ` and `jrag dependents ` +3. If CLI misses: `Grep` for the symbol name +4. Report all usage sites with file:line + +### Pattern: "find all Y in the codebase" + +1. Structural: `jrag find --role [--service ]` +2. Textual: `Grep` for the pattern +3. Broad: `Glob` for files + `Grep` for content +4. Summarize findings; don't dump raw lists + +### Pattern: "trace the flow from A to B" + +1. `jrag flow ` to trace the request +2. `jrag connection A B` to confirm a path exists +3. Use `Grep` to fill gaps where the graph index is incomplete +4. Report the trace with file:line references + +### Pattern: "orient in service S" + +1. 
`jrag overview ` (bundle of routes/clients/producers) +2. `jrag conventions --service ` (dominant roles + framework tallies) +3. `jrag map --service ` (type counts) +4. `jrag routes --service ` (entry points) diff --git a/java_codebase_rag/install_data/skills/explore-codebase-cli/SKILL.md b/java_codebase_rag/install_data/skills/explore-codebase-cli/SKILL.md new file mode 100644 index 00000000..97c2b6f3 --- /dev/null +++ b/java_codebase_rag/install_data/skills/explore-codebase-cli/SKILL.md @@ -0,0 +1,251 @@ +--- +name: explore-codebase-cli +description: "MUST BE USED PROACTIVELY. Universal read-only codebase exploration via the `jrag` CLI — one command per engineering intent (callers, callees, routes, clients, producers, impact, search, inspect, flow, overview). Use for any exploration: locating code, tracing dependencies, finding patterns, 'where is X', 'who calls Y', 'find all controllers', 'trace the flow from A to B'. Combines graph navigation with file-system search (grep, glob, file reading). Do NOT use when the answer is already in open context or for a single known file — read that file directly." +--- + +# /explore-codebase-cli — Universal codebase exploration via `jrag` + +Read-only exploration combining **graph navigation through the `jrag` CLI** with **broad file-system search**. This is the CLI surface of java-codebase-rag; it loads the same index used by the MCP server but exposes one shell command per engineering intent instead of five MCP tools. + +## When to use + +Any time you need to search, locate, navigate, or explore the codebase. **Do NOT use when** the answer is already in open context or for a single known file — read that file directly. + +## Core Principles + +1. **Read-only.** Never edit, write, or modify any file. +2. **Names in, names out.** Every `` is human-readable (FQN / simple name / route path / topic). Raw node IDs are never required. +3. **One command per intent.** `jrag` collapses resolve + walk into one call. 
Pick the command that matches the intent; do not chain resolve→describe→neighbors manually. +4. **Stop when answered.** Don't prefetch unrelated subgraphs or directories. + +## Why `jrag` (CLI) vs `java-codebase-rag` (MCP) + +| Aspect | `jrag` CLI | MCP server (`java-codebase-rag-mcp`) | +| --- | --- | --- | +| Surface | Shell — one command per intent | 5 stdio MCP tools (`search` / `find` / `describe` / `neighbors` / `resolve`) | +| Resolve | **Internalized** — every `` command runs `resolve_v2` first | Explicit — agent calls `resolve` then `describe` / `neighbors` | +| Output | Compact text by default; `--format json` for the envelope; `--detail brief\|normal\|full` (orthogonal to format) | JSON-RPC envelope | +| Host fit | Any agent that can run shell commands | MCP-aware hosts (Claude Code, Claude Desktop, Qwen Code, GigaCode) | +| Index | Reuses the operator's `~/.java-codebase-rag` / `.java-codebase-rag/` index | Same | + +Pick **one** surface per project — running both strands the agent in two vocabularies. This skill is for the CLI surface. + +## Prerequisite: index must exist + +`jrag` is a thin compose-and-render layer over the existing index. If the project has not been indexed, every command exits 2 with an actionable envelope: + +``` +status: error +message: No index at . Run: java-codebase-rag init --source-root +``` + +Verify with `jrag status` first when in doubt. + +## Tool Inventory + +### `jrag` command groups + +Run `jrag --help` for the canonical list. 
Groups (PR-JRAG-1a..4): + +| Group | Commands | +| --- | --- | +| **Orientation** | `status`, `microservices`, `map`, `conventions`, `overview` | +| **Locate** | `find`, `search` | +| **Listings** | `routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` | +| **Traversal** | `callers`, `callees`, `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `impact`, `flow`, `dependencies`, `connection` | +| **Inspection** | `inspect`, `outline`, `imports` | + +### Common flags (every command) + +``` +--service Filter by microservice +--module Filter by module +--limit Cap on results (default 20; 10 for fan-out commands) +--format text|json Output format (default: text) +--detail brief|normal|full Output detail (default: normal) — orthogonal to --format; + both modes honor it. brief=name @service; normal=+module/role/ + file/score; full=+signature/annotations/snippet. inspect and the + orientation commands (status/microservices/map/conventions/overview) + default to full. +--index-dir Index directory override (default: discovered from cwd) +``` + +`--offset` is supported **only** on `find` and `search` (they route through `find_v2` / `search_v2` which accept it). Other commands emit `truncated: more results — narrow your query` when capped. + +### File-system tools + +- **Grep** — content search by pattern/regex +- **Glob** — find files by name/path pattern (`**/*.java`, `**/*Controller*.java`, `**/application*.yml`) +- **Read** — read files (`offset`/`limit` for large files) + +### Other: **Bash** (read-only: `git log`, `git blame`, `ls`, `find`), **WebSearch**/**WebFetch** (external lookups) + +--- + +## Decision Framework + +| User asks… | First `jrag` command | Follow-up | +| ---------- | -------------------- | --------- | +| "Is the index fresh?" 
| `jrag status` | — | +| Identifier-shaped string (FQN / simple name) | `jrag inspect ` | `callers` / `callees` | +| Fuzzy / NL "where is X" | `jrag search ""` | `inspect ` | +| All controllers in service S | `jrag find --role CONTROLLER --service S` | `callees` | +| Interfaces in service S | `jrag find --java-kind interface --service S` | `implementations` | +| HTTP / messaging entry points | `jrag routes [--framework …] [--method …]` | `inspect ` | +| Outbound HTTP clients | `jrag clients [--calls-service …]` | `callees ` | +| Outbound async producers | `jrag producers [--topic-prefix …]` | `callees ` | +| Topics + consumers/producers | `jrag topics [--topic-prefix …]` | — | +| Who calls method M? | `jrag callers ` | `inspect ` | +| What does M call? | `jrag callees ` | `inspect ` | +| Who hits this route? | `jrag callers ` | — | +| Who implements interface T? | `jrag implementations ` | — | +| Subtypes of class C? | `jrag subclasses ` | — | +| What does method M override? | `jrag overrides ` (dispatch UP) | — | +| Which methods override M? | `jrag overridden-by ` | — | +| What does T inject? | `jrag dependencies ` | — | +| Who injects (depends on) T? | `jrag dependents ` | — | +| Blast-radius of changing X? | `jrag impact ` (bounded fan-in) | `Grep` fallback | +| Trace request flow A→B | `jrag flow ` | `connection ` | +| File outline | `jrag outline ` | `inspect ` | +| File imports | `jrag imports ` | — | +| "Explain service S" | `jrag overview ` | `routes` / `clients` / `producers` | +| "Explain route /topic" | `jrag overview ` | `flow` | +| Find files matching pattern | `Glob` | `Read` | +| Search for text in files | `Grep` | `Read` | +| Who changed X and when? | Bash: `git log`/`git blame` | — | +| "How is this configured?" 
| `Glob` + `Grep` for config keys; `jrag search "" --table yaml` | `Read` sections | + +**Escalation:** ① Most targeted command first → ② Fall back gracefully (`callers` empty → `Grep`) → ③ Cross-validate (CLI vs file disagree → **trust the file** — index may be stale). + +**Rules of thumb:** Structure beats vector for exact questions (`find` / `inspect` + traversal); vector beats structure for fuzzy discovery (`search`); file-system beats stale index. + +--- + +## Resolve-first contract (every `` command) + +Every `jrag` command that takes a `` runs `resolve_v2` internally and maps the contract onto the envelope: + +| `resolve_v2` status | `jrag` behavior | +| --- | --- | +| `one` | Run the traversal/listing against the resolved node. | +| `many` | Return the candidate list and stop. **No auto-pick.** Disambiguate with `--kind`, `--role`, `--fqn-prefix`, etc. | +| `none` | Emit `status: not_found` envelope (exit 2). Fall back to `search` or `Grep`. | + +You never need to look up a raw node ID. Pass an FQN, simple name, `sym:`/`route:`/`client:`/`producer:` id (from a prior call), route path, topic, etc. + +### Disambiguation flags + +Only `--kind` is a true resolve input (`hint_kind`). The other narrowing flags (`--role`, `--java-kind`, `--fqn-prefix`, `--service`, `--module`) post-filter the resolve result client-side. If a post-filter collapses `many` → `one`, the command proceeds; if it still leaves `many`, the narrowed candidates are returned. + +--- + +## Output envelope + +`--format` (text|json) and `--detail` (brief|normal|full) are **orthogonal**: +`--format` picks the representation, `--detail` picks how much of each node/edge is +shown, and **both modes honor the same detail level** through one projection seam. + +- Default is `text` + `normal`: a one-line-per-row listing that includes + `name @service module=… role=… file=… score=…` (the cheap, high-value fields). + `inspect` and the orientation commands default to `full` (their purpose is detail). 
+- `--detail brief` reproduces the ultra-terse `name @service` line (escape hatch). +- `--detail full` adds an indented block per row (`signature`, `annotations`, + `snippet` for search, `data`/`edge_summary` for inspect). +- `--format json` emits the **projected** envelope (same field set as the text at + that detail level). Empty fields are dropped at every level (no `null` noise). + +`--format json` envelope shape (fields omitted when empty): + +```json +{ + "status": "ok|not_found|error", + "nodes": {"": {...}}, + "edges": [{...}], + "candidates": [{...}], + "truncated": false, + "agent_next_actions": ["jrag callers ", "..."], + "file_location": {"filename": "...", "start_line": 123} +} +``` + +- `truncated` is computed via +1-fetch on `find`/`search` (pass `--limit`, observe `truncated`, narrow or page with `--offset`); other commands emit `truncated: more results — narrow your query` when capped (no `--offset`). +- `agent_next_actions` is a CLI-native hint list (≤5) mapping the current result's edge labels to the next `jrag` command — use it as a starting point, not a directive. +- `file_location` is populated only on `one`-hit resolve (carries the resolved node's `filename` + `start_line`). + +--- + +## Traversal direction reference + +`jrag` abstracts away `direction` and `edge_types` — you name the intent, it picks the edges. 
For reference, the mapping is: + +| Intent (command) | Underlying edges | +| --- | --- | +| `callers` | `CALLS` direction=in | +| `callees` | `CALLS` direction=out | +| `hierarchy` | `EXTENDS` + `IMPLEMENTS` direction=out | +| `implementations` | `IMPLEMENTS` direction=in | +| `subclasses` | `EXTENDS` direction=in | +| `overrides` | `OVERRIDES` direction=out (subtype → supertype) | +| `overridden-by` | `OVERRIDES` direction=in (virtual `OVERRIDDEN_BY` out) | +| `dependencies` | `INJECTS` direction=out | +| `dependents` | `INJECTS` direction=in | +| `impact` | bounded fan-in: `CALLS`/`INJECTS`/`IMPLEMENTS`/`EXTENDS` direction=in (depth ≤2) | +| `flow ` | `trace_request_flow`: `EXPOSES`/`HTTP_CALLS`/`ASYNC_CALLS`/`CALLS` | +| `connection A B` | bounded search over the same edge set between A and B | + +### Node id prefixes (from prior results) + +`sym:` (Symbol), `route:`/`r:` (Route), `client:`/`c:` (Client), `producer:`/`p:` (Producer). Pass these verbatim if you have them; otherwise use the human-readable name. + +### Symbol FQN shape + +`.[.]#(,,…)`. Generics erased, no spaces after commas. No-arg: `()`. Constructor: `#(...)`. + +--- + +## Ontology glossary + +**Roles:** `CONTROLLER` | `SERVICE` | `REPOSITORY` | `COMPONENT` | `CONFIG` | `ENTITY` | `CLIENT` | `MAPPER` | `DTO` | `OTHER`. + +**Capabilities:** `MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`. + +**Symbol kinds:** `class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`. + +**Route frameworks:** `spring_mvc`, `webflux`. Route *kinds*: `http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`, `jms_destination`, `stream_binding`. + +**Client kinds:** `feign_method`, `rest_template`, `web_client`. **Producer kinds:** `kafka_send`, `stream_bridge_send`. **Source layers (client/producer):** `builtin`, `layer_a_meta`, `layer_b_ann`, `layer_b_fqn`, `layer_c_source`. 
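The node id prefixes listed earlier in this reference can be told apart from human-readable names with a small lookup. The sketch below is illustrative only: `NODE_PREFIXES` and `node_type` are hypothetical names, and the assumption that any string without a known prefix is a plain name (FQN, route path, topic) is a simplification.

```python
from typing import Optional

# Prefix -> node type, copied from the "Node id prefixes" list.
NODE_PREFIXES = {
    "sym:": "Symbol",
    "route:": "Route",
    "r:": "Route",
    "client:": "Client",
    "c:": "Client",
    "producer:": "Producer",
    "p:": "Producer",
}


def node_type(identifier: str) -> Optional[str]:
    """Return the node type for a prefixed id, or None for a plain name."""
    for prefix, kind in NODE_PREFIXES.items():
        if identifier.startswith(prefix):
            return kind
    return None  # assumed to be an FQN, simple name, route path, or topic
```

A `None` result here simply means the string would be passed to `jrag` as a human-readable name rather than as an id from a prior call.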
+ +--- + +## Recovery Playbook + +**After two failed attempts on the same intent, stop and report command, args, and result snippet.** + +| Symptom | Fix | +| ------- | --- | +| `status: error` "No index at …" | Run `java-codebase-rag init --source-root ` then retry | +| `status: not_found` | Try `jrag search ""`; or `find --fqn-prefix …`; fallback `Grep` | +| `many` candidates returned | Add `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run | +| `find` returns too much | Add `--service`, `--fqn-prefix`, `--path-prefix`, `--topic-prefix` | +| Empty `search` | Try `--table all`; `find --fqn-prefix`; `Grep` directly | +| `truncated: true` | Narrow the query, or page with `--offset` (`find`/`search` only) | +| Empty results across commands | Index missing/stale → `Grep`/`Glob`/`Read`; ask operator to rebuild (`java-codebase-rag reprocess`) | +| CLI vs file disagree | **Trust the file**; report stale index | +| `--offset` rejected | Only `find`/`search` accept it; other commands narrow via filters | +| Wrong node picked | Resolve must be ambiguous — pass `--kind` to narrow | + +--- + +## Workflow Patterns + +**"Explain feature X":** `jrag search "X"` → pick 1–3 hits → `jrag inspect ` → targeted traversal (`callees`/`implementations`) → stop when answered. + +**"Where is X used?":** `jrag inspect ` (resolves) → `jrag callers ` and `jrag dependents ` → `Grep` fallback → report all sites with file:line. + +**"Find all Y":** Structural → `jrag find --role [--service ]`. Textual → `Grep`. Broad → `Glob` + `Grep`. Summarize, don't dump. + +**"Trace flow from A to B":** `jrag flow ` to trace the request → `jrag connection A B` to confirm a path → `Grep` gaps → report with file:line. + +**"How is this configured?":** `Glob` for `**/application*.yml` → `Grep` for the key → `Read` sections → `jrag search "" --table yaml` supplement. 
+ +**"Orient in a new service":** `jrag overview ` (bundle) → `jrag conventions --service ` (dominant roles) → `jrag map --service ` (counts) → `jrag routes --service ` (entry points). diff --git a/java_codebase_rag/installer.py b/java_codebase_rag/installer.py index be682d9b..26fbebab 100644 --- a/java_codebase_rag/installer.py +++ b/java_codebase_rag/installer.py @@ -22,10 +22,19 @@ import yaml Scope = Literal["project", "user"] +Surface = Literal["mcp", "cli"] # MCP server name constant _MCP_SERVER_NAME = "java-codebase-rag" +# Marker file written at install time so a CLI-only install (no MCP entry) is +# still visible to ``update``. Lives at the project/source root alongside +# ``.java-codebase-rag.yml``. JSON shape: +# {"version": 1, "hosts": [{"host": "claude-code", "scope": "project", +# "surface": "mcp"|"cli"}, ...]} +_MARKER_FILE_NAME = ".java-codebase-rag.hosts" +_MARKER_FILE_VERSION = 1 + # Exit code constants EXIT_SUCCESS = 0 EXIT_PARTIAL = 1 @@ -40,6 +49,20 @@ class ArtifactResult(NamedTuple): error: str | None +class ConfiguredHost(NamedTuple): + """A host installed on this machine: which host, which scope, which surface. + + Replaces the prior 2-tuple ``(HostConfig, scope)`` returned by + ``detect_configured_hosts`` so ``update`` can route the refresh through the + correct ``Surface`` (an MCP-surface install refreshes MCP+skill+agent; a + CLI-surface install refreshes the CLI skill+agent only). + """ + + host: "HostConfig" + scope: Scope + surface: Surface + + @dataclass(frozen=True) class HostConfig: """Configuration for an agent host.""" @@ -94,6 +117,37 @@ def agents_dir(self, scope: Scope, cwd: Path) -> Path: } +# --------------------------------------------------------------------------- +# ArtifactManifest — single source of truth for which artifacts each surface +# ships. Iterated by both ``deploy_artifacts`` and ``refresh_artifacts`` so +# adding/removing an artifact is one edit, not two. 
+# +# Each entry is a 3-tuple ``(kind, package_path, dest_relative)``: +# - ``kind``: "mcp" dispatches to ``_deploy_mcp_config`` / ``_refresh_mcp_config`` +# (the MCP config path is host/scope-resolved inside those helpers — +# ``package_path`` and ``dest_relative`` are unused for this kind). +# - ``kind``: "skill" | "agent" dispatches to ``_deploy_file`` / ``_refresh_file``. +# - ``package_path``: relative path under ``install_data/``. +# - ``dest_relative``: relative path under ``host.scope_path(scope, cwd)``. +# +# The ``mcp`` surface carries the MCP config entry; the ``cli`` surface does +# NOT (a CLI install never registers an MCP server). +# --------------------------------------------------------------------------- +ArtifactManifestEntry = tuple[str, str, str] + +ARTIFACT_MANIFEST: dict[Surface, list[ArtifactManifestEntry]] = { + "mcp": [ + ("mcp", "", ""), + ("skill", "skills/explore-codebase/SKILL.md", "skills/explore-codebase/SKILL.md"), + ("agent", "agents/explorer-rag-enhanced.md", "agents/explorer-rag-enhanced.md"), + ], + "cli": [ + ("skill", "skills/explore-codebase-cli/SKILL.md", "skills/explore-codebase-cli/SKILL.md"), + ("agent", "agents/explorer-rag-cli.md", "agents/explorer-rag-cli.md"), + ], +} + + def prompt( prompt_type: str, message: str, @@ -421,36 +475,117 @@ def select_scope(*, non_interactive: bool, cli_scope: str | None) -> Scope: return selected # type: ignore -def resolve_mcp_command(*, non_interactive: bool) -> str: - """Resolve the absolute path to java-codebase-rag-mcp. +def select_surface( + *, + non_interactive: bool, + cli_surface: str | None, + prefill: Surface | None = None, +) -> Surface: + """Select 'mcp' or 'cli' surface (PR-JRAG-5). + + The MCP surface registers the stdio MCP server (today's behavior). The CLI + surface ships the ``jrag`` console-script skill+subagent instead — no MCP + entry is registered. + + Args: + non_interactive: If True, honor ``cli_surface`` (default ``"mcp"``). 
+ cli_surface: Surface from the ``--surface`` CLI flag. + prefill: On re-run, the surface recorded in the existing marker file. + When set and the user does not pick otherwise, this is preserved. + + Returns: + Selected surface (``"mcp"`` or ``"cli"``). + + Raises: + SystemExit(2): if ``cli_surface`` is invalid. + """ + if cli_surface: + if cli_surface not in ("mcp", "cli"): + print(f"Error: Invalid surface '{cli_surface}'. Must be 'mcp' or 'cli'.") + raise SystemExit(2) + return cli_surface # type: ignore + + if non_interactive: + # Default to MCP for back-compat when no flag is passed. + return "mcp" + + print( + "Note: 'mcp' surface registers the java-codebase-rag MCP server (5 tools: " + "search/find/describe/neighbors/resolve)." + ) + print( + " 'cli' surface deploys the `jrag` console-script skill+subagent " + "(one command per intent, no MCP server)." + ) + + choices = ["mcp", "cli"] + if prefill is not None: + # Surface the prior choice first so the user can keep it with Enter. + choices = [prefill] + [c for c in ("mcp", "cli") if c != prefill] + default = prefill + else: + default = "mcp" + + selected = prompt( + "select", + "Select agent surface:", + choices=choices, + default=default, + ) + + if not selected: + return default + return selected # type: ignore + + +def resolve_mcp_command(*, non_interactive: bool, surface: Surface = "mcp") -> str: + """Resolve the absolute path to the runtime binary for the chosen surface. - Returns the path string for use as MCP 'command' value. + - ``surface="mcp"`` (today's behavior): resolve ``java-codebase-rag-mcp``; + on missing + non-interactive, exit with code 2. + - ``surface="cli"``: resolve the ``jrag`` console script instead. The CLI + surface registers no MCP server, so the MCP binary is irrelevant — + never raise ``SystemExit(2)`` for a missing MCP binary on this surface. + If ``jrag`` is missing, fall through to the interactive prompt (or + non-interactive exit) parameterized for ``jrag``. 
Args: - non_interactive: If True, exit with code 2 when not found + non_interactive: If True, exit with code 2 when the target binary + is not found. + surface: Which surface's binary to resolve. Returns: - Absolute path to java-codebase-rag-mcp executable + Absolute path to the resolved executable. Raises: - SystemExit(2): If not found and non-interactive, or user aborts + SystemExit(2): If not found and non-interactive, or user aborts. """ - mcp_path = shutil.which("java-codebase-rag-mcp") + binary_name, display_name = _surface_binary(surface) + resolved = shutil.which(binary_name) - if mcp_path: - return mcp_path + if resolved: + return resolved # Not found on PATH if non_interactive: - print("Error: `java-codebase-rag-mcp` not found on PATH.") - print("Ensure `java-codebase-rag` is installed, then re-run with `--non-interactive --agent `.") + print(f"Error: `{display_name}` not found on PATH.") + if surface == "mcp": + print( + "Ensure `java-codebase-rag` is installed, then re-run with " + "`--non-interactive --agent `." + ) + else: + print( + "Ensure `java-codebase-rag` is installed (provides the `jrag` " + "console script), then re-run with `--non-interactive --agent `." 
+ ) raise SystemExit(2) # Interactive: prompt user for path - print("Warning: `java-codebase-rag-mcp` not found on PATH.") + print(f"Warning: `{display_name}` not found on PATH.") user_path = prompt( "text", - "Enter the full path to java-codebase-rag-mcp (or 'abort'):", + f"Enter the full path to {display_name} (or 'abort'):", default="abort", ) @@ -466,7 +601,7 @@ def resolve_mcp_command(*, non_interactive: bool) -> str: print(f"Error: Path {path_obj} does not exist or is not a file.") user_path = prompt( "text", - "Enter the full path to java-codebase-rag-mcp (or 'abort'):", + f"Enter the full path to {display_name} (or 'abort'):", default="abort", ) if user_path == "abort" or not user_path: @@ -482,6 +617,18 @@ def resolve_mcp_command(*, non_interactive: bool) -> str: return str(path_obj.resolve()) +def _surface_binary(surface: Surface) -> tuple[str, str]: + """Return ``(shutil_which_target, user_display_name)`` for a surface. + + The CLI surface resolves the ``jrag`` console script (no MCP server is + registered, so the MCP binary is irrelevant). The MCP surface keeps + today's behavior. + """ + if surface == "cli": + return ("jrag", "jrag") + return ("java-codebase-rag-mcp", "java-codebase-rag-mcp") + + def merge_mcp_config(config_path: Path, host: HostConfig, *, mcp_command: str) -> bool: """Read, merge, write MCP config. Returns True if entry was added/updated. @@ -562,53 +709,52 @@ def deploy_artifacts( *, non_interactive: bool, mcp_command: str, + surface: Surface = "mcp", ) -> list[ArtifactResult]: """Deploy artifacts (MCP config, skill, agent) to selected hosts. + Iterates ``ARTIFACT_MANIFEST[surface]`` so both surfaces share one source + of truth. The keyword-only ``surface`` defaults to ``"mcp"`` so existing + direct-call sites in tests keep working unchanged. 
+ Args: hosts: List of HostConfig objects to deploy to scope: Installation scope ("project" or "user") cwd: Current working directory non_interactive: If True, skip overwrite prompts - mcp_command: Resolved absolute path to java-codebase-rag-mcp + mcp_command: Resolved absolute path to the runtime binary + (``java-codebase-rag-mcp`` for ``mcp`` surface; ``jrag`` for + ``cli`` surface — unused for the latter since CLI ships no MCP + config). + surface: Which artifact set to deploy (default ``"mcp"`` for back-compat). Returns: List of ArtifactResult objects for each deployment """ results = [] + manifest = ARTIFACT_MANIFEST[surface] for host in hosts: - # Deploy MCP config - mcp_config_path = host.mcp_config_path(scope, cwd) - mcp_result = _deploy_mcp_config( - mcp_config_path, - host, - non_interactive=non_interactive, - mcp_command=mcp_command, - ) - results.append(mcp_result) - - # Deploy skill - skills_dir = host.skills_dir(scope, cwd) - skill_dest = skills_dir / "explore-codebase" / "SKILL.md" - skill_result = _deploy_file( - skill_dest, - "skills/explore-codebase/SKILL.md", - artifact_type="skill", - non_interactive=non_interactive, - ) - results.append(skill_result) - - # Deploy agent - agents_dir = host.agents_dir(scope, cwd) - agent_dest = agents_dir / "explorer-rag-enhanced.md" - agent_result = _deploy_file( - agent_dest, - "agents/explorer-rag-enhanced.md", - artifact_type="agent", - non_interactive=non_interactive, - ) - results.append(agent_result) + for kind, package_path, dest_relative in manifest: + if kind == "mcp": + # Only the MCP surface carries this entry; the CLI manifest + # has no "mcp" row by construction. 
+ mcp_config_path = host.mcp_config_path(scope, cwd) + result = _deploy_mcp_config( + mcp_config_path, + host, + non_interactive=non_interactive, + mcp_command=mcp_command, + ) + else: + dest_path = host.scope_path(scope, cwd) / dest_relative + result = _deploy_file( + dest_path, + package_path, + artifact_type=kind, + non_interactive=non_interactive, + ) + results.append(result) return results @@ -1001,32 +1147,137 @@ def handle_rerun(cwd: Path, *, non_interactive: bool) -> dict | None: return existing_config -def detect_configured_hosts(cwd: Path) -> list[tuple[HostConfig, str]]: - """Scan project + user config files for java-codebase-rag MCP entries. +def detect_configured_hosts(cwd: Path) -> list[ConfiguredHost]: + """Detect hosts installed under ``cwd`` (project) and ``$HOME`` (user). + + Reads the marker file (``.java-codebase-rag.hosts``) written at install + time. Falls back to the legacy MCP-entry scan with ``surface="mcp"`` when + the marker is absent (pre-marker installs from earlier versions). + + The marker is the single source of truth for CLI-surface installs (which + register no MCP entry); without it, a CLI-only install would be invisible + to ``update`` (the legacy scan only finds MCP entries). Args: - cwd: Current working directory (for project-scope configs) + cwd: Current working directory (project root for project-scope configs) Returns: - List of (host_config, scope) tuples where scope is "project" or "user" + List of ``ConfiguredHost(host, scope, surface)`` tuples in marker order + (or MCP-scan order in the legacy fallback path). """ - detected = [] - - # Check all hosts in both project and user scopes + marker_hosts = _read_hosts_marker(cwd) + if marker_hosts is not None: + return marker_hosts + + # Legacy fallback: scan MCP entries + assume ``mcp`` surface. Pre-marker + # installs only ever shipped the MCP surface, so this back-compat mapping + # is exact. 
+ detected: list[ConfiguredHost] = [] for host_name, host_config in HOSTS.items(): # Check project scope project_mcp_path = host_config.mcp_config_path("project", cwd) if _has_java_codebase_rag_entry(project_mcp_path): - detected.append((host_config, "project")) + detected.append(ConfiguredHost(host_config, "project", "mcp")) # Check user scope user_mcp_path = host_config.mcp_config_path("user", cwd) if _has_java_codebase_rag_entry(user_mcp_path): - detected.append((host_config, "user")) + detected.append(ConfiguredHost(host_config, "user", "mcp")) return detected +def _marker_path(cwd: Path) -> Path: + """Return the marker file path for a project root.""" + return cwd / _MARKER_FILE_NAME + + +def _write_hosts_marker( + project_root: Path, configured: list[ConfiguredHost] +) -> None: + """Write the marker file recording the installed host/scope/surface set. + + Round-trips with ``_read_hosts_marker``. Silently overwrites an existing + marker so re-runs (install over an existing install) reflect the latest + wizard answers. + """ + payload = { + "version": _MARKER_FILE_VERSION, + "hosts": [ + {"host": ch.host.name, "scope": ch.scope, "surface": ch.surface} + for ch in configured + ], + } + tmp_name = None + try: + with tempfile.NamedTemporaryFile( + mode="w", + dir=project_root, + prefix=f".{_MARKER_FILE_NAME}.", + delete=False, + ) as tmp: + json.dump(payload, tmp, indent=2) + tmp.flush() + os.fsync(tmp.fileno()) + tmp_name = tmp.name + # os.replace (not os.rename): on Windows, os.rename raises when the + # destination exists — the documented re-run path overwrites the prior + # marker. os.replace atomically overwrites cross-platform (PR #371 + # fixed this same pattern elsewhere). + os.replace(tmp_name, _marker_path(project_root)) + except (IOError, OSError) as e: + if tmp_name: + try: + os.unlink(tmp_name) + except OSError: + pass + # Non-fatal: ``update`` will fall back to the MCP-entry scan. 
Surface + # a warning so the operator notices, but do not abort the install. + print(f"Warning: failed to write {_marker_path(project_root)}: {e}") + + +def _read_hosts_marker(cwd: Path) -> list[ConfiguredHost] | None: + """Read the marker file. Return ``None`` if missing or unparseable. + + On parse/version errors, returns ``None`` so the caller falls back to the + MCP-entry scan rather than crashing mid-update. + """ + marker = _marker_path(cwd) + if not marker.is_file(): + return None + try: + with open(marker, "r") as f: + payload = json.load(f) + except (json.JSONDecodeError, IOError, OSError): + return None + + if not isinstance(payload, dict): + return None + + raw_hosts = payload.get("hosts", []) + if not isinstance(raw_hosts, list): + return None + + configured: list[ConfiguredHost] = [] + for entry in raw_hosts: + if not isinstance(entry, dict): + return None + host_name = entry.get("host") + scope = entry.get("scope") + surface = entry.get("surface", "mcp") + if host_name not in HOSTS: + return None + if scope not in ("project", "user"): + return None + if surface not in ("mcp", "cli"): + return None + configured.append( + ConfiguredHost(HOSTS[host_name], scope, surface) # type: ignore[arg-type] + ) + + return configured + + def _has_java_codebase_rag_entry(config_path: Path) -> bool: """Check if MCP config file has a java-codebase-rag entry. @@ -1056,49 +1307,47 @@ def refresh_artifacts( *, force: bool, dry_run: bool, + surface: Surface = "mcp", ) -> list[ArtifactResult]: """Overwrite skill and agent files from package data. Skip MCP if entry is correct. + Iterates ``ARTIFACT_MANIFEST[surface]`` so both surfaces share one source + of truth (PR-JRAG-5). The keyword-only ``surface`` defaults to ``"mcp"`` + so existing direct-call sites in tests keep working unchanged. 
+ Args: host: HostConfig for the agent host scope: Installation scope ("project" or "user") cwd: Current working directory force: If True, overwrite all files even if matching dry_run: If True, print changes without writing + surface: Which artifact set to refresh (default ``"mcp"`` for back-compat). Returns: List of ArtifactResult objects for each artifact """ results = [] - - # Refresh skill file - skills_dir = host.skills_dir(scope, cwd) - skill_dest = skills_dir / "explore-codebase" / "SKILL.md" - skill_result = _refresh_file( - skill_dest, - "skills/explore-codebase/SKILL.md", - artifact_type="skill", - force=force, - dry_run=dry_run, - ) - results.append(skill_result) - - # Refresh agent file - agents_dir = host.agents_dir(scope, cwd) - agent_dest = agents_dir / "explorer-rag-enhanced.md" - agent_result = _refresh_file( - agent_dest, - "agents/explorer-rag-enhanced.md", - artifact_type="agent", - force=force, - dry_run=dry_run, - ) - results.append(agent_result) - - # Refresh MCP config (update command path if needed) - mcp_config_path = host.mcp_config_path(scope, cwd) - mcp_result = _refresh_mcp_config(mcp_config_path, host, force=force, dry_run=dry_run) - results.append(mcp_result) + manifest = ARTIFACT_MANIFEST[surface] + + for kind, package_path, dest_relative in manifest: + if kind == "mcp": + # Refresh MCP config (update command path if needed). + # NOTE: only the MCP surface has a "mcp" row in its manifest — + # ``_refresh_mcp_config`` (and therefore ``resolve_mcp_command``) + # is NEVER reached on the CLI surface by construction. The CLI + # surface ships no MCP entry, so there is nothing to refresh.
+ mcp_config_path = host.mcp_config_path(scope, cwd) + result = _refresh_mcp_config(mcp_config_path, host, force=force, dry_run=dry_run) + else: + dest_path = host.scope_path(scope, cwd) / dest_relative + result = _refresh_file( + dest_path, + package_path, + artifact_type=kind, + force=force, + dry_run=dry_run, + ) + results.append(result) return results @@ -1321,9 +1570,16 @@ def run_update( # Refresh artifacts for each host all_results = [] - for host_config, scope in configured_hosts: - print(f"\nRefreshing {host_config.name} ({scope} scope)...") - results = refresh_artifacts(host_config, scope, cwd, force=force, dry_run=dry_run) + for host_config, scope, surface in configured_hosts: + print(f"\nRefreshing {host_config.name} ({scope} scope, surface={surface})...") + results = refresh_artifacts( + host_config, + scope, + cwd, + force=force, + dry_run=dry_run, + surface=surface, + ) all_results.extend(results) # Check for partial failures @@ -1460,6 +1716,7 @@ def run_install( agents: list[str] | None, scope: str | None, model: str | None, + surface: str | None = None, source_root: Path | None = None, quiet: bool = False, verbose: bool = False, @@ -1471,6 +1728,7 @@ def run_install( agents: List of agent names from CLI flags scope: Scope from CLI flag model: Model from CLI flag + surface: Surface from CLI flag (``"mcp"`` or ``"cli"``; default ``"mcp"``) source_root: Source root path (defaults to cwd if None) quiet: If True, suppress output verbose: If True, raw-relay subprocess indexing output (no Live region) @@ -1511,21 +1769,30 @@ def run_install( # Stage 2: Embedding model resolved_model = resolve_model(model, non_interactive=non_interactive) - # Stage 3-4: Agent host + scope selection + # Stage 3-4: Agent host + scope + surface selection + prior_surface = _prior_surface_from_marker(cwd) try: hosts = select_hosts(non_interactive=non_interactive, cli_agents=agents) selected_scope = select_scope(non_interactive=non_interactive, cli_scope=scope) + 
selected_surface = select_surface( + non_interactive=non_interactive, + cli_surface=surface, + prefill=prior_surface, + ) except SystemExit as e: return e.code - # Stage 5: Artifact deployment - mcp_command = resolve_mcp_command(non_interactive=non_interactive) + # Stage 5: Artifact deployment (manifest iterates the chosen surface) + mcp_command = resolve_mcp_command( + non_interactive=non_interactive, surface=selected_surface + ) results = deploy_artifacts( hosts, selected_scope, source_root, non_interactive=non_interactive, mcp_command=mcp_command, + surface=selected_surface, ) # Check for partial failures @@ -1552,6 +1819,14 @@ def run_install( # Critical failures return 1 + # Record the host/scope/surface set so a later ``update`` can route the + # refresh through the right surface — critical for CLI-only installs (no + # MCP entry to scan). + configured = [ + ConfiguredHost(h, selected_scope, selected_surface) for h in hosts + ] + _write_hosts_marker(source_root, configured) + # Stage 6: Index + finish # Generate YAML config yaml_content = generate_yaml_config( @@ -1587,3 +1862,17 @@ def run_install( if init_outcome is False: return 1 return 0 + + +def _prior_surface_from_marker(cwd: Path) -> Surface | None: + """Return the (single) surface recorded in the existing marker, if any. + + On multi-surface installs (rare but possible across hosts), returns the + first recorded surface — the wizard prefill is a UX nicety, not a contract. + Returns ``None`` when no marker exists (fresh install) or the marker is + unparseable. + """ + configured = _read_hosts_marker(cwd) + if not configured: + return None + return configured[0].surface diff --git a/java_codebase_rag/jrag.py b/java_codebase_rag/jrag.py new file mode 100644 index 00000000..fd2b0cf1 --- /dev/null +++ b/java_codebase_rag/jrag.py @@ -0,0 +1,3526 @@ +"""jrag - agent-facing CLI (PR-JRAG-1a foundation). 
+ +Compose-and-render layer over the existing backend (``resolve_v2``, +``LadybugGraph``, ``mcp_v2`` handlers, ``run_search``). v1 loads the index +in-process per call (no daemon); reuses the operator's index directory and +config resolver (``resolve_operator_config`` + ``apply_to_os_environ``). + +PR-JRAG-1a ships only the foundation: ``build_parser`` (with ``--offset`` +intentionally NOT global - registered only on find/search in PR-1b/PR-4), +``_resolve_cfg`` (operator config reuse), ``_load_graph`` (actionable error +envelopes), ``main`` (``raise_fd_limit`` first; stdout envelope + stderr +traceback on error), and the ``status`` command. Later PRs add subcommands and +fill the ``agent_next_actions`` hook. + +Lazy-import invariant: ``build_parser()`` imports NO backend modules - so +``jrag --help`` stays fast and free of torch/sentence_transformers/mcp_v2. +Backend imports (``resolve_service``, ``ladybug_queries``, +``resolve_operator_config``, ``jrag_envelope`` helpers) live inside command +handlers. Sentinel: + python -c "import java_codebase_rag.jrag as j; j.build_parser()" +loads no torch / sentence_transformers / mcp_v2. +""" +from __future__ import annotations + +import argparse +import os +import sys +import traceback +from pathlib import Path + +from java_codebase_rag._fdlimit import raise_fd_limit +from java_codebase_rag._stdio import force_utf8_stdio + +__all__ = ["build_parser", "main", "_console_script_main"] + + +class _IndexNotFound(RuntimeError): + """Raised when no LadybugDB graph exists at the resolved path.""" + + +class _IndexStale(RuntimeError): + """Raised when the on-disk graph's ontology is older than required.""" + + +# Generous limit for the topics --consumer-in / listeners --topic-prefix +# compose fetches (these resolve cross-topic edges and should not silently +# truncate the listener/consumer set under typical fixture sizes). 
+_CONSUMER_FETCH_LIMIT = 200 + + +def _load_graph_or_error(args: argparse.Namespace): + """Resolve config + load graph; on missing/stale index, print an error + envelope and return ``(cfg, graph_or_None, rc)``. + + Shared by every listing command so the cfg/load/error frame is not + hand-copied. ``rc`` is 2 on error (envelope already printed), 0 on success. + """ + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + cfg = _resolve_cfg(args) + try: + graph = _load_graph(cfg) + except (_IndexNotFound, _IndexStale) as exc: + env = Envelope(status="error", message=str(exc)) + print(render(env, fmt=args.format, detail=args.detail)) + return cfg, None, 2 + return cfg, graph, 0 + + +def _clamped_limit(args: argparse.Namespace) -> int: + """Return the limit clamped so ``limit+1 <= 500`` (backend clamp).""" + raw_limit = args.limit if args.limit is not None else 20 + return min(raw_limit, 499) + + +def _render_listing(rows, *, limit: int, args: argparse.Namespace, noun: str) -> int: + """Apply +1-fetch truncation, build the envelope, render as a listing. + + Shared by the listing commands whose backend returns a flat row list + (routes / clients / producers). ``rows`` must already be the limit+1 + fetch. Renders as the default shape (no ``shape=``). 
+ """ + from java_codebase_rag.jrag_envelope import Envelope, mark_truncated, next_actions_hook, to_envelope_rows + from java_codebase_rag.jrag_render import render + + node_list = to_envelope_rows(rows) if rows and not isinstance(rows[0], dict) else list(rows) + display_nodes_list, truncated = mark_truncated(node_list, limit) + display_nodes = {node["id"]: node for node in display_nodes_list} + + env = Envelope(status="ok", nodes=display_nodes, truncated=truncated) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun=noun)) + return 0 + + +def _symbol_hit_to_dict(hit) -> dict: + """Convert a ``SymbolHit`` (dataclass) to the envelope node dict shape. + + Carries the FULL ``SymbolHit``: ``filename`` / ``start_line`` so the + projector can compose the ``file`` field at ``--detail normal``, and + ``signature`` / ``annotations`` / ``capabilities`` / ``modifiers`` / + ``package`` / ``parent_id`` / ``resolved`` so ``--detail full`` is genuinely + rich. The projector (:func:`jrag_envelope.project_node`) trims per detail + level at render time — callers build rich and let the seam trim, inverting + the old "trim at construction" that coupled detail to format. Empty values + are dropped by the projector, so carrying them here is harmless. Byte + offsets (``start_byte`` / ``end_byte``) are intentionally dropped — pure + noise, never a display field. 
+ """ + return { + "id": hit.id, + "kind": "symbol", + "fqn": hit.fqn, + "name": hit.name, + "symbol_kind": hit.kind, + "microservice": hit.microservice, + "module": hit.module, + "role": hit.role, + "filename": hit.filename, + "start_line": hit.start_line, + "end_line": hit.end_line, + "signature": hit.signature, + "annotations": list(hit.annotations or []), + "capabilities": list(hit.capabilities or []), + "modifiers": list(hit.modifiers or []), + "package": hit.package, + "parent_id": hit.parent_id, + "resolved": hit.resolved, + } + + +def build_parser() -> argparse.ArgumentParser: + """Argparse builder. Imports no backend modules. + + ``--offset`` is intentionally NOT a global flag (PR-JRAG-1a contract): it + is added only to ``find`` / ``search`` subparsers in PR-JRAG-1b / PR-JRAG-4 + (those commands route through ``find_v2`` / ``search_v2`` which take an + ``offset``). In 1a, no subparser has ``--offset``. + """ + description = ( + "jrag - agent-facing CLI for graph-native code intelligence.\n\n" + "Every command resolves the identifier (FQN / simple name /\n" + "route path / topic) as the first step and maps one/many/none onto a\n" + "single envelope. Default output is compact text; `--format json` emits\n" + "the envelope verbatim.\n\n" + "Commands by group:\n" + " health: status\n" + " locate: find, inspect\n" + " listings: routes, clients, producers, topics, jobs, listeners,\n" + " entities\n" + " traversal: callers, callees, hierarchy, implementations, subclasses,\n" + " overrides, overridden-by, dependents, impact, decompose,\n" + " flow, dependencies, connection, outline, imports\n" + " orientation: microservices, map, conventions, overview\n" + " search: search\n\n" + "Run `jrag --help` for command-specific options." 
+ ) + parser = argparse.ArgumentParser( + prog="jrag", + description=description, + formatter_class=argparse.RawDescriptionHelpFormatter, + exit_on_error=False, + ) + subparsers = parser.add_subparsers(dest="command") + + # Common flags applied per command via parents=[_common_parser()]. NOT + # global so commands can override defaults (e.g. fan-out commands use + # limit=10). The helper builds a FRESH parser each call so every subparser + # owns its own --detail Action object — argparse `parents` shares Action + # objects by reference, and `set_defaults(detail=...)` mutates the shared + # action's default (CPython walks `self._actions`), so a single shared + # `common` made `status.set_defaults(detail="full")` poison every other + # subparser into defaulting to "full". A fresh parser per subparser isolates + # the override to the command that asked for it. + def _common_parser() -> argparse.ArgumentParser: + common = argparse.ArgumentParser(add_help=False) + common.add_argument("--service", type=str, default=None, help="Filter by microservice.") + common.add_argument("--module", type=str, default=None, help="Filter by module.") + common.add_argument( + "--limit", type=int, default=20, help="Cap on results (default 20; 10 for fan-out)." + ) + common.add_argument( + "--index-dir", + type=str, + default=None, + dest="index_dir", + help="Index directory override (default: discovered from cwd).", + ) + common.add_argument( + "--format", + choices=("text", "json"), + default="text", + help="Output format (default: text).", + ) + common.add_argument( + "--detail", + choices=("brief", "normal", "full"), + default="normal", + help=( + "Output detail level (default normal) — ORTHOGONAL to --format: both " + "text and json honor it. brief = identity only (name @service); " + "normal = +module/role/file/score; full = +signature/annotations/snippet." 
+ ), + ) + return common + + status = subparsers.add_parser( + "status", + help="Print index freshness, ontology version, and counts.", + parents=[_common_parser()], + description=( + "Index health and freshness. Reports ontology version, source root, " + "built_at, parse_errors, edge counts, and the counts dictionary from " + "GraphMeta. Exits 2 with an actionable envelope if the index is " + "missing or stale." + ), + ) + status.set_defaults(handler=_cmd_status, detail="full") + + # find subparser (PR-JRAG-1b) + find = subparsers.add_parser( + "find", + help="Find nodes by query or filter.", + parents=[_common_parser()], + description=( + "Find nodes by query or filter. Two modes:\n" + " Query mode (positional ): search by exact name/FQN (symbols only).\n" + " Filter mode (no positional): apply structured filters (NodeFilter flags).\n" + "Kind inference: domain flags (--http-method, --client-kind, --producer-kind) imply\n" + "route/client/producer when --kind is omitted. Contradiction emits an error envelope.\n" + "Query mode + non-symbol kind (explicit or inferred) errors: name/FQN lookup only\n" + "searches symbols; drop the positional and use filter mode for routes/clients/producers." + ), + ) + find.add_argument("query", nargs="?", default=None, help="Search query (name/FQN). 
Omit for filter mode.") + find.add_argument( + "--kind", + choices=("symbol", "route", "client", "producer"), + default=None, + help="Node kind (omit for auto-inference from domain flags).", + ) + find.add_argument("--role", type=str, default=None, help="Filter by role.") + find.add_argument("--exclude-role", type=str, default=None, help="Exclude by role.") + find.add_argument("--java-kind", type=str, default=None, help="Filter by Java symbol kind.") + find.add_argument("--annotation", type=str, default=None, help="Filter by annotation.") + find.add_argument("--capability", type=str, default=None, help="Filter by capability.") + find.add_argument("--framework", type=str, default=None, help="Filter by framework.") + find.add_argument("--source-layer", type=str, default=None, help="Filter by source layer.") + find.add_argument("--fqn-prefix", type=str, default=None, help="Filter by FQN prefix.") + find.add_argument("--http-method", type=str, default=None, help="Filter by HTTP method (route).") + find.add_argument("--path-prefix", type=str, default=None, help="Filter by path prefix (route).") + find.add_argument("--client-kind", type=str, default=None, help="Filter by client kind (client).") + find.add_argument("--calls-service", type=str, default=None, help="Filter by target service (client).") + find.add_argument("--calls-path-prefix", type=str, default=None, help="Filter by target path prefix (client).") + find.add_argument("--producer-kind", type=str, default=None, help="Filter by producer kind (producer).") + find.add_argument("--topic-prefix", type=str, default=None, help="Filter by topic prefix (producer).") + find.add_argument( + "--offset", + type=int, + default=0, + help="Page offset (filter mode only; ignored in query mode).", + ) + find.set_defaults(handler=_cmd_find) + + # inspect subparser (PR-JRAG-1b) + inspect = subparsers.add_parser( + "inspect", + help="Inspect a node by query.", + parents=[_common_parser()], + description=( + "Inspect a node by 
resolving a query (name/FQN) and returning its full details\n" + "including edge_summary. Uses resolve_v2 internally; on ambiguous candidates,\n" + "returns them (no auto-pick). On not_found, returns an error envelope." + ), + ) + inspect.add_argument("query", help="Search query (name/FQN).") + inspect.add_argument( + "--kind", + choices=("symbol", "route", "client", "producer"), + default=None, + help="Hint for resolve (omitted for broad search).", + ) + inspect.add_argument("--java-kind", type=str, default=None, help="Post-filter by Java symbol kind.") + inspect.add_argument("--role", type=str, default=None, help="Post-filter by role.") + inspect.add_argument("--fqn-prefix", type=str, default=None, help="Post-filter by FQN prefix.") + inspect.set_defaults(handler=_cmd_inspect, detail="full") + + # routes subparser (PR-JRAG-2) + routes = subparsers.add_parser( + "routes", + help="List HTTP routes.", + parents=[_common_parser()], + description=( + "List HTTP routes by microservice, framework, path prefix, or method. " + "Returns route nodes (no resolve step)." + ), + ) + routes.add_argument("--framework", type=str, default=None, help="Filter by framework.") + routes.add_argument("--path-prefix", type=str, default=None, help="Filter by path prefix.") + routes.add_argument("--method", type=str, default=None, help="Filter by HTTP method.") + routes.set_defaults(handler=_cmd_routes, detail="full") + + # clients subparser (PR-JRAG-2) + clients = subparsers.add_parser( + "clients", + help="List HTTP clients.", + parents=[_common_parser()], + description=( + "List HTTP clients by microservice, client kind, target service, or path prefix. " + "Returns client nodes (no resolve step)." 
+ ), + ) + clients.add_argument("--client-kind", type=str, default=None, help="Filter by client kind.") + clients.add_argument("--calls-service", type=str, default=None, help="Filter by target service.") + clients.add_argument("--path-prefix", type=str, default=None, help="Filter by path prefix.") + clients.set_defaults(handler=_cmd_clients, detail="full") + + # producers subparser (PR-JRAG-2) + producers = subparsers.add_parser( + "producers", + help="List async message producers.", + parents=[_common_parser()], + description=( + "List async message producers by microservice, producer kind, or topic prefix. " + "Returns producer nodes (no resolve step)." + ), + ) + producers.add_argument("--producer-kind", type=str, default=None, help="Filter by producer kind.") + producers.add_argument("--topic-prefix", type=str, default=None, help="Filter by topic prefix.") + producers.set_defaults(handler=_cmd_producers, detail="full") + + # topics subparser (PR-JRAG-2) + topics = subparsers.add_parser( + "topics", + help="List message topics (producer-grouped).", + parents=[_common_parser()], + description=( + "List message topics grouped by producer. " + "No :Topic node exists; this command groups producers by topic name. " + "--consumer-in resolves consumers (listener methods) via EXPOSES edges to Route(topic)." + ), + ) + topics.add_argument("--topic-prefix", type=str, default=None, help="Filter by topic prefix.") + topics.add_argument("--producer-in", type=str, default=None, help="Scope producers to this microservice.") + topics.add_argument("--consumer-in", type=str, default=None, help="Show consumers from this microservice.") + topics.set_defaults(handler=_cmd_topics, detail="full") + + # jobs subparser (PR-JRAG-2) + jobs = subparsers.add_parser( + "jobs", + help="List scheduled tasks.", + parents=[_common_parser()], + description=( + "List scheduled task symbols (capability=SCHEDULED_TASK). " + "Returns Symbol nodes with the SCHEDULED_TASK capability." 
+ ), + ) + jobs.set_defaults(handler=_cmd_jobs, detail="full") + + # listeners subparser (PR-JRAG-2) + listeners = subparsers.add_parser( + "listeners", + help="List message listeners.", + parents=[_common_parser()], + description=( + "List message listener symbols (capability=MESSAGE_LISTENER). " + "Returns Symbol nodes with the MESSAGE_LISTENER capability." + ), + ) + listeners.add_argument("--topic-prefix", type=str, default=None, help="Filter by topic prefix (on producer member).") + listeners.set_defaults(handler=_cmd_listeners, detail="full") + + # entities subparser (PR-JRAG-2) + entities = subparsers.add_parser( + "entities", + help="List JPA entities.", + parents=[_common_parser()], + description=( + "List JPA entity symbols (role=ENTITY). " + "Returns Symbol nodes with the ENTITY role." + ), + ) + entities.set_defaults(handler=_cmd_entities, detail="full") + + # ---- Traversal commands (PR-JRAG-3a) ---- + # Shared resolve-disambiguation flags (PR-JRAG-1a contract: only --kind is a + # true resolve input; the rest are client-side post-filters on resolve's + # candidate set). Traversals are resolve-first; --offset is NOT registered + # on any traversal subparser (none of the backends take offset). + resolve_parent = argparse.ArgumentParser(add_help=False) + resolve_parent.add_argument( + "--kind", + choices=("symbol", "route", "client", "producer"), + default=None, + help="Hint for resolve (omit for broad search).", + ) + resolve_parent.add_argument("--java-kind", type=str, default=None, help="Post-filter by Java symbol kind.") + resolve_parent.add_argument("--role", type=str, default=None, help="Post-filter by role.") + resolve_parent.add_argument("--fqn-prefix", type=str, default=None, help="Post-filter by FQN prefix.") + + callers = subparsers.add_parser( + "callers", + help="Who calls this symbol or route?", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve then traverse the call graph inbound (who calls me?). 
" + "Symbol -> g.find_callers (CALLS edges, --service/--module pushed down). " + "Route -> g.find_route_callers; --service is a CLIENT-SIDE post-filter on " + "caller_microservice (the backend kwarg is ignored once route_id is set), " + "surfaced as a warnings[] entry. --include-external controls whether " + "external (JDK/Spring/Lombok) callers are excluded (default: excluded)." + ), + ) + callers.add_argument("query", help="Symbol FQN/name (e.g. 'pkg.Svc#method(Arg)') or route path.") + callers.add_argument("--depth", type=int, default=1, help="Call-graph depth (default 1).") + callers.add_argument( + "--min-confidence", + type=float, + default=0.0, + dest="min_confidence", + help="Minimum CALLS edge confidence in [0.0, 1.0].", + ) + callers.add_argument( + "--include-external", + action="store_true", + help="Include external (JDK/Spring/Lombok) callers/callees (default excluded).", + ) + callers.set_defaults(handler=_cmd_callers) + + callees = subparsers.add_parser( + "callees", + help="What does this symbol call?", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (Symbol) then traverse the call graph outbound (what do I " + "call?). Calls g.find_callees; --include-external is symmetric with callers." + ), + ) + callees.add_argument("query", help="Symbol FQN/name (e.g. 
'pkg.Svc#method(Arg)').") + callees.add_argument("--depth", type=int, default=1, help="Call-graph depth (default 1).") + callees.add_argument( + "--min-confidence", + type=float, + default=0.0, + dest="min_confidence", + help="Minimum CALLS edge confidence in [0.0, 1.0].", + ) + callees.add_argument( + "--include-external", + action="store_true", + help="Include external (JDK/Spring/Lombok) callees (default excluded).", + ) + callees.set_defaults(handler=_cmd_callees) + + hierarchy = subparsers.add_parser( + "hierarchy", + help="Type hierarchy (parents and children).", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (type Symbol) then walk EXTENDS/IMPLEMENTS both directions: " + "out = supertypes (parents), in = subtypes (children). No --service/--module " + "push-down (structural edges)." + ), + ) + hierarchy.add_argument("query", help="Class/interface FQN or name.") + hierarchy.set_defaults(handler=_cmd_hierarchy) + + implementations = subparsers.add_parser( + "implementations", + help="Classes implementing an interface.", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (interface Symbol) then call g.find_implementors. " + "--service/--module pushed down; --capability pushed down to the backend " + "(find_implementors accepts a capability filter)." + ), + ) + implementations.add_argument("query", help="Interface FQN or name.") + implementations.add_argument("--capability", type=str, default=None, help="Filter implementors by capability.") + implementations.set_defaults(handler=_cmd_implementations) + + subclasses = subparsers.add_parser( + "subclasses", + help="Classes extending a type.", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (class Symbol) then call g.find_subclasses (EXTENDS inbound). " + "--service/--module pushed down." 
+ ), + ) + subclasses.add_argument("query", help="Class FQN or name.") + subclasses.set_defaults(handler=_cmd_subclasses) + + overrides = subparsers.add_parser( + "overrides", + help="Methods this method overrides (dispatch UP to declaration).", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (method Symbol) then neighbors_v2([id], 'out', ['OVERRIDES']). " + "The stored OVERRIDES edge runs overrider -> declaration (subtype method -> " + "supertype declared method), so 'out' dispatches UP the hierarchy." + ), + ) + overrides.add_argument("query", help="Method FQN or name (e.g. 'pkg.Impl#method(Arg)').") + overrides.set_defaults(handler=_cmd_overrides) + + overridden_by = subparsers.add_parser( + "overridden-by", + help="Methods overriding this one (dispatch DOWN to overriders).", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (method Symbol) then neighbors_v2([id], 'in', ['OVERRIDES']) " + "(= virtual OVERRIDDEN_BY out). 'in' traverses the stored OVERRIDES edge " + "backward, dispatching DOWN from declaration to overriders." + ), + ) + overridden_by.add_argument("query", help="Method FQN or name (e.g. 'pkg.Iface#method(Arg)').") + overridden_by.set_defaults(handler=_cmd_overridden_by) + + dependents = subparsers.add_parser( + "dependents", + help="Who injects this type?", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (type Symbol) then call g.find_injectors (INJECTS inbound: " + "classes that inject this type). --service/--module pushed down." + ), + ) + dependents.add_argument("query", help="Type FQN or name.") + dependents.set_defaults(handler=_cmd_dependents) + + impact = subparsers.add_parser( + "impact", + help="Fleet-wide blast radius (INJECTS/IMPLEMENTS/EXTENDS reverse closure).", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve then call g.impact_analysis (reverse closure over " + "INJECTS+IMPLEMENTS+EXTENDS: who breaks if this changes). 
--service is a " + "CLIENT-SIDE post-filter (impact_analysis has no microservice param); " + "surfaced as a warnings[] entry." + ), + ) + impact.add_argument("query", help="Symbol FQN or name.") + impact.add_argument("--depth", type=int, default=2, help="Closure depth (default 2).") + impact.set_defaults(handler=_cmd_impact) + + decompose = subparsers.add_parser( + "decompose", + help="Role-waterfall flow from an entrypoint.", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (entrypoint Symbol) then call g.trace_flow. Walks " + "CONTROLLER -> SERVICE/COMPONENT -> CLIENT/REPOSITORY/MAPPER stages via " + "INJECTS+EXTENDS+IMPLEMENTS (optionally + CALLS hops). --service/--module " + "pushed down; --depth clamped to 1..3." + ), + ) + decompose.add_argument("query", help="Entrypoint symbol FQN or name.") + decompose.add_argument("--depth", type=int, default=2, help="Neighbour hop count per stage (clamped 1..3, default 2).") + decompose.add_argument( + "--follow-calls", + action="store_true", + dest="follow_calls", + help="Follow DECLARES+CALLS type-to-type hops to top up each stage.", + ) + decompose.add_argument( + "--max-stage", + type=int, + default=20, + dest="max_stage", + help="Cap on symbols per stage (stage_limit, default 20).", + ) + decompose.add_argument( + "--min-confidence", + type=float, + default=0.0, + dest="min_confidence", + help="Min CALLS confidence when --follow-calls is on.", + ) + decompose.add_argument( + "--include-external", + action="store_true", + help="Include external types reached via the CALLS hop (default excluded).", + ) + decompose.set_defaults(handler=_cmd_decompose) + + flow = subparsers.add_parser( + "flow", + help="Request flow through a route (inbound callers + outbound CALLS hops).", + parents=[_common_parser()], + description=( + "Resolve to a Route then call g.trace_request_flow. Inbound = " + "cross-service HTTP/async callers (Client/Producer two-hop); outbound = " + "CALLS hops from the route handler. 
Intra-service is an INDEX-TIME data " + "property: CALLS edges are intra-codebase by construction, and the query " + "carries no microservice predicate, so the result reflects whatever the " + "fixture indexed (no query-time constraint). --max-hops clamped to 1..8." + ), + ) + flow.add_argument("query", help="Route path (e.g. '/chat/assign'). Resolved with hint_kind=route.") + flow.add_argument("--max-hops", type=int, default=5, dest="max_hops", help="Max CALLS hops (clamped 1..8, default 5).") + flow.set_defaults(handler=_cmd_flow) + + # ---- Compose traversals + file inspection (PR-JRAG-3b) ---- + # callees (Client/Producer variant) re-uses the existing _cmd_callees + # handler from PR-JRAG-3a; the help text below updates to advertise the + # Client/Producer dispatch (Symbol path is unchanged). --kind picks the + # resolve hint; the handler dispatches on the resolved node's kind. + # + # (The callees subparser was registered above with the Symbol-only help + # text; we patch its description here to advertise the new variant without + # duplicating the parser construction.) + callees.epilog = ( + "Symbol root lists the methods this code calls (CALLS out). Client and\n" + "Producer roots follow their call edge to the Route they target:\n" + " Client root -> the :Route it requests (HTTP_CALLS out)\n" + " Producer root -> the :Route (kafka_topic) it publishes to (ASYNC_CALLS out)\n" + "--include-external applies to the Symbol path; Client/Producer edges are\n" + "structural (Client/Producer -> :Route) and have no external-exclusion analog." + ) + + dependencies = subparsers.add_parser( + "dependencies", + help="Types this Symbol injects (INJECTS out).", + parents=[_common_parser(), resolve_parent], + description=( + "Resolve (type Symbol) then neighbors_v2([id], 'out', ['INJECTS']) " + "= the types this class injects (its direct dependencies). 
INJECTS is " + "Symbol -> Symbol (declaring type -> injected type), so 'out' traverses " + "from the injector to its dependencies. --service/--module are NOT " + "applied (INJECTS is a structural edge with no microservice predicate); " + "they surface as warnings[]. --include-external is accepted for surface " + "symmetry with callers/callees but is a warned no-op here (INJECTS has " + "no external-exclusion analog at the neighbors_v2 layer)." + ), + ) + dependencies.add_argument("query", help="Symbol FQN or name (e.g. 'pkg.Svc').") + dependencies.add_argument( + "--include-external", + action="store_true", + help="Accepted for symmetry; warned no-op on dependencies (INJECTS is structural).", + ) + dependencies.set_defaults(handler=_cmd_dependencies) + + connection = subparsers.add_parser( + "connection", + help="Cross-service connections for a microservice (inbound/outbound).", + parents=[_common_parser()], + description=( + "RESOLVE-FIRST EXCEPTION: the first positional is a microservice NAME " + "(e.g. 'chat-core'), NOT a query; it is passed literally to list_clients/" + "list_producers/find_route_callers; resolve_v2 is NEVER run on it.\n\n" + "Direction (default --both).\n" + "--inbound: clients/producers in OTHER services " + "targeting this service. HTTP via list_clients(target_service=) + " + "async via find_route_callers on this service's topic Routes.\n" + "--outbound: clients/producers IN this service. HTTP via " + "list_clients(microservice=) + producers via " + "list_producers(microservice=).\n" + "--both: render both inbound and outbound sections.\n\n" + "--http-method and --calls-service filter HTTP callers only (clients " + "have a target_service; producers do not). Producers are KEPT under " + "--calls-service so the async channel stays visible; a warnings[] entry " + "is emitted when --calls-service bypasses producers."
+ ), + ) + connection.add_argument( + "microservice", + help="Microservice NAME (literal — NOT resolved as a query).", + ) + connection.add_argument( + "--inbound", + dest="direction", + action="store_const", + const="inbound", + default=None, + help="Show only inbound connections (default is --both).", + ) + connection.add_argument( + "--outbound", + dest="direction", + action="store_const", + const="outbound", + help="Show only outbound connections (default is --both).", + ) + connection.add_argument( + "--both", + dest="direction", + action="store_const", + const="both", + help="Show both inbound and outbound sections (this is the default).", + ) + connection.add_argument( + "--http-method", + type=str, + default=None, + help="Filter HTTP callers by method (e.g. POST). Applies to clients only.", + ) + connection.add_argument( + "--calls-service", + type=str, + default=None, + help=( + "Narrow to edges involving this other service. Outbound: clients with " + "target_service == (producers kept with a warning — no service " + "target on ASYNC channels). Inbound: callers from microservice == ." + ), + ) + connection.set_defaults(handler=_cmd_connection) + + outline = subparsers.add_parser( + "outline", + help="List symbols declared in a file.", + parents=[_common_parser()], + description=( + "List all Symbol nodes whose declared location is in . Calls " + "find_symbols_in_file_range(graph, filename=, start_line=1, " + "end_line=2**31-1) — the start_line=1 is required (the backend returns " + "[] for start_line<1). UNBOUNDED: there is no --limit cap (the entire " + "file's symbol table is returned); --limit is accepted (common flag) " + "but does not truncate. --offset is rejected (the backend takes no offset)." 
+ ), + ) + outline.add_argument("file", help="File path as stored in the graph (POSIX-relative to source root).") + outline.set_defaults(handler=_cmd_outline) + + imports = subparsers.add_parser( + "imports", + help="List imports declared in a file (tree-sitter parse + resolve_v2).", + parents=[_common_parser()], + description=( + "Parse with tree-sitter (ast_java.parse_java), walk its " + "import_declaration nodes, and resolve each imported FQN via resolve_v2 " + "against the graph. Returns one node per import: resolved graph Symbol " + "when resolve_v2 hits, or an unresolved placeholder carrying the raw FQN " + "otherwise. Static and wildcard imports are included (marked in the row)." + " --offset is rejected." + ), + ) + imports.add_argument("file", help="File path (POSIX-relative to source root, or absolute).") + imports.set_defaults(handler=_cmd_imports) + + # ---- Orientation commands (PR-JRAG-4) ---- + microservices = subparsers.add_parser( + "microservices", + help="List microservices with resolved type counts.", + parents=[_common_parser()], + description=( + "List every microservice with its resolved type-symbol count. " + "Calls g.microservice_counts(). Renders as a counts listing." + ), + ) + microservices.set_defaults(handler=_cmd_microservices, detail="full") + + map_cmd = subparsers.add_parser( + "map", + help="Symbol counts per kind, grouped by service or module.", + parents=[_common_parser()], + description=( + "Count resolved type Symbols (class/interface/enum/record/annotation) " + "grouped by microservice or module. --by {microservice,module} selects " + "the grouping axis (default microservice); --service / --module narrow " + "the count to one service or module (filters, independent of --by)." 
+ ), + ) + map_cmd.add_argument( + "--by", + dest="by", + choices=("microservice", "module"), + default="microservice", + help="Grouping axis: microservice (default) or module.", + ) + map_cmd.set_defaults(handler=_cmd_map, detail="full") + + conventions = subparsers.add_parser( + "conventions", + help="Dominant roles + framework tallies.", + parents=[_common_parser()], + description=( + "Report the dominant roles among resolved Symbols and the route framework " + "distribution. --service narrows the role tally to one microservice." + ), + ) + conventions.set_defaults(handler=_cmd_conventions, detail="full") + + overview = subparsers.add_parser( + "overview", + help="Bundle for a microservice, route, or topic.", + parents=[_common_parser()], + description=( + "Dispatch on the positional :\n" + " Route path (starts with '/') -> trace_request_flow (same as `flow`).\n" + " Microservice name -> routes + clients + producers bundle.\n" + " Topic string -> producers + consumers for the topic.\n" + "--as {microservice,route,topic} overrides auto-detection.\n" + "Auto-detection: starts with '/' -> route; matches a known microservice -> " + "microservice; otherwise -> topic." + ), + ) + overview.add_argument( + "subject", + nargs="?", + default=None, + help="Microservice name, route path (starts with '/'), or topic string.", + ) + overview.add_argument( + "--as", + dest="as_type", + choices=("microservice", "route", "topic"), + default=None, + help="Override auto-detection of subject type.", + ) + overview.set_defaults(handler=_cmd_overview, detail="full") + + # ---- Search command (PR-JRAG-4) ---- + search = subparsers.add_parser( + "search", + help="Semantic search over Lance tables.", + parents=[_common_parser()], + description=( + "Semantic search via search_v2 over the Lance index (java/sql/yaml tables). " + "--table all searches all three. --hybrid enables vector+keyword hybrid. " + "--offset paginates. --path-contains narrows by file path substring. 
" + "Filters (NodeFilter flags) narrow results.\n\n" + "--fuzzy is accepted but rejected IN-HANDLER with status: error (search is " + "inherently semantic; --fuzzy is a no-op synonym). Registering the flag " + "prevents argparse from exiting 2 before the handler can produce the envelope." + ), + ) + search.add_argument("query", help="Natural-language search query.") + search.add_argument( + "--table", + choices=("java", "sql", "yaml", "all"), + default="java", + help="Lance table to search (default: java; all = java+sql+yaml).", + ) + search.add_argument( + "--hybrid", action="store_true", help="Enable vector+keyword hybrid search." + ) + search.add_argument( + "--path-contains", type=str, default=None, dest="path_contains", + help="Narrow to chunks whose filename contains this substring.", + ) + search.add_argument( + "--fuzzy", action="store_true", + help="Accepted but rejected in-handler (search is semantic; --fuzzy is implicit).", + ) + # NodeFilter flags (same set as `find` filter mode, minus the query-only ones). 
+ search.add_argument("--role", type=str, default=None, help="Filter by role.") + search.add_argument("--exclude-role", type=str, default=None, dest="exclude_role", help="Exclude by role.") + search.add_argument("--java-kind", type=str, default=None, dest="java_kind", help="Filter by Java symbol kind.") + search.add_argument("--annotation", type=str, default=None, help="Filter by annotation.") + search.add_argument("--capability", type=str, default=None, help="Filter by capability.") + search.add_argument("--framework", type=str, default=None, help="Filter by framework.") + search.add_argument("--fqn-prefix", type=str, default=None, dest="fqn_prefix", help="Filter by FQN prefix.") + search.add_argument( + "--offset", + type=int, + default=0, + help="Page offset (passed to search_v2; paginated via +1-fetch).", + ) + search.set_defaults(handler=_cmd_search) + + return parser + + +def _resolve_cfg(args: argparse.Namespace): # type: ignore[no-untyped-def] + """Resolve operator config (reuses the operator's cocoindex-free resolver). + + Same pattern as ``java_codebase_rag.cli._resolved_from_ns``: walks up from + cwd to find a project root (config file or ``.java-codebase-rag/`` index), + applies CLI ``--index-dir`` if given, and calls ``apply_to_os_environ`` so + downstream modules see a consistent env (critically: SBERT_MODEL for + ``jrag search`` in PR-JRAG-4). + """ + from java_codebase_rag.config import discover_project_root, resolve_operator_config + + cfg = resolve_operator_config( + source_root=discover_project_root(Path.cwd()), + cli_index_dir=getattr(args, "index_dir", None), + ) + cfg.apply_to_os_environ() + return cfg + + +def _load_graph(cfg): # type: ignore[no-untyped-def] + """Load the LadybugGraph with actionable error envelopes. + + * missing index -> ``_IndexNotFound`` (caught in ``main`` -> envelope with + a ``java-codebase-rag init --source-root `` remediation). 
+ * ontology-mismatch (``RuntimeError`` from ``LadybugGraph.get``) -> + ``_IndexStale`` (caught in ``main`` -> envelope with a rebuild hint). + """ + from ladybug_queries import LadybugGraph + + ladybug_path = str(cfg.ladybug_path) + if not LadybugGraph.exists(ladybug_path): + raise _IndexNotFound( + f"No index at {cfg.ladybug_path}. " + "Run: java-codebase-rag init --source-root " + ) + try: + return LadybugGraph.get(ladybug_path) + except RuntimeError as exc: + raise _IndexStale(str(exc)) from exc + + +def _cmd_status(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + cfg = _resolve_cfg(args) + try: + graph = _load_graph(cfg) + except (_IndexNotFound, _IndexStale) as exc: + env = Envelope( + status="error", + message=str(exc), + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + meta = graph.meta() + if "error" in meta: + env = Envelope( + status="error", + message=f"Index meta read failed: {meta['error']}", + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + counts = meta.get("counts") or {} + edge_counts = meta.get("edge_counts") or {} + # Single notional "index" node carrying kv fields + nested counts/edges + # as top-level dict-valued fields. The renderer's inspect-shape dispatch + # fires on ANY dict-typed value (structural signal, not name-based), so + # ``counts`` / ``edges`` render as indented alphabetical sections without + # abusing ``edge_summary`` (which is reserved for PR-JRAG-3 real edge + # data). See jrag_render._render_inspect / _render_text_shape. 
+ warnings = _warn_inapplicable_common(args, service=True, module=True, limit=True) + env = Envelope( + status="ok", + nodes={ + "index": { + "ontology_version": int(meta.get("ontology_version") or 0), + "built_at": int(meta.get("built_at") or 0), + "source_root": str(meta.get("source_root") or ""), + "db_path": str(meta.get("db_path") or ""), + "parse_errors": int(meta.get("parse_errors") or 0), + "index_dir": str(cfg.index_dir.resolve()), + "ladybug_path": str(cfg.ladybug_path.resolve()), + "counts": dict(counts), + "edges": dict(edge_counts), + }, + }, + warnings=warnings, + ) + print(render(env, fmt=args.format, detail=args.detail, noun="status", shape="inspect")) + return 0 + + +def _infer_kind(args: argparse.Namespace) -> str | None: + """Infer kind from domain flags when --kind is omitted. + + Inference rules (PR-JRAG-1b): + - --http-method or --path-prefix → route + - --client-kind or --calls-service or --calls-path-prefix → client + - --producer-kind or --topic-prefix → producer + - else → symbol (default) + An explicit --kind is returned as-is. With no domain flags set the result + is "symbol"; this function never returns None. + """ + if args.kind is not None: + return args.kind + if args.http_method or args.path_prefix: + return "route" + if args.client_kind or args.calls_service or args.calls_path_prefix: + return "client" + if args.producer_kind or args.topic_prefix: + return "producer" + return "symbol" + + +def _check_kind_contradiction(args: argparse.Namespace, inferred: str | None) -> tuple[bool, str | None]: + """Check if domain flags contradict explicit --kind. + + Returns (is_contradiction, error_message). Contradiction pairs: + - --kind symbol + any route flag (--http-method, --path-prefix) + - --kind symbol + any client flag (--client-kind, --calls-service, --calls-path-prefix) + - --kind symbol + any producer flag (--producer-kind, --topic-prefix) + - (and similarly for route + non-route flags, etc.)
+ """ + if args.kind is None: + return False, None + explicit = args.kind + route_flags = args.http_method or args.path_prefix + client_flags = args.client_kind or args.calls_service or args.calls_path_prefix + producer_flags = args.producer_kind or args.topic_prefix + if explicit == "symbol" and (route_flags or client_flags or producer_flags): + return True, "--kind symbol conflicts with domain flags (route/client/producer flags require matching --kind)" + if explicit == "route" and (client_flags or producer_flags): + return True, "--kind route conflicts with client/producer flags" + if explicit == "client" and (route_flags or producer_flags): + return True, "--kind client conflicts with route/producer flags" + if explicit == "producer" and (route_flags or client_flags): + return True, "--kind producer conflicts with route/client flags" + return False, None + + +def _cmd_find(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + cfg = _resolve_cfg(args) + try: + graph = _load_graph(cfg) + except (_IndexNotFound, _IndexStale) as exc: + env = Envelope(status="error", message=str(exc)) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Check kind contradiction first (before any backend work) + inferred = _infer_kind(args) + is_contradiction, error_msg = _check_kind_contradiction(args, inferred) + if is_contradiction: + env = Envelope(status="error", message=error_msg or "kind contradiction") + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Cap at 499 so limit+1 <= 500 (backend clamp) + # If args.limit is None, default to 20 (from argparse) + raw_limit = args.limit if args.limit is not None else 20 + limit = min(raw_limit, 499) + + # Query mode: positional present + if args.query: + # find_by_name_or_fqn is Symbol-only (MATCH (s:Symbol) WHERE s.name=$needle + # OR s.fqn=$needle). 
A positional with a non-symbol kind (explicit + # OR inferred from --http-method/--client-kind/--producer-kind/etc.) is a + # usage contract violation -> status: error envelope (NOT argparse exit), + # telling the user to drop the positional and use filter mode. + effective_kind = inferred or "symbol" + if effective_kind != "symbol": + env = Envelope( + status="error", + message=( + f"query mode (positional ) only searches Symbols, but kind " + f"'{effective_kind}' was {'inferred from domain flags' if args.kind is None else 'set via --kind'}. " + "Drop the positional and use filter mode (the domain flags) " + "for route/client/producer searches." + ), + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + return _cmd_find_query_mode(args, cfg, graph, limit) + + # Filter mode: build NodeFilter and call find_v2 + return _cmd_find_filter_mode(args, cfg, graph, inferred or "symbol", limit) + + +def _cmd_find_query_mode( + args: argparse.Namespace, + cfg, + graph, + limit: int, +) -> int: + """Find query mode: g.find_by_name_or_fqn (Symbol-only, exact name/FQN match). + + ``find_by_name_or_fqn`` runs ``MATCH (s:Symbol) WHERE s.name=$needle OR + s.fqn=$needle`` — Symbol-only, exact-only. There is no fuzzy/prefix/contains + path; ``--fuzzy`` was deferred (see plans/active/PLAN-JRAG-CLI.md Out of + scope). Query mode is gated to ``effective_kind == "symbol"`` upstream in + ``_cmd_find``, so the only ``kinds`` filter we may pass is symbol sub-kinds + derived from ``--java-kind``. + """ + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook, normalize_enum + from java_codebase_rag.jrag_render import render + + query = args.query + + # find_by_name_or_fqn is always Symbol; the only valid kinds filter is the + # symbol sub-kind derived from --java-kind (lowercase, matching s.kind). + # route/client/producer kinds were removed: they would never match Symbols. 
+ if args.java_kind: + java_kind_norm = normalize_enum(args.java_kind, kind="java_kind") + kinds = [java_kind_norm.lower()] + else: + kinds = None + + # Call find_by_name_or_fqn (exact name OR fqn match). + rows = graph.find_by_name_or_fqn( + query, + kinds=kinds, + module=args.module, + microservice=args.service, + limit=limit + 1, # +1 for truncated detection + ) + # Truncation is decided by the RAW name/FQN fetch (limit+1), BEFORE + # post-filters reduce the set — otherwise a post-filter that drops rows + # would silently clear `truncated` even though more name matches may exist + # beyond the fetch (silent wrong-results). + raw_truncated = len(rows) > limit + + # Post-filter by role/annotation/capability (SymbolHit carries these). + post_filter_active = False + if args.role: + post_filter_active = True + role_norm = normalize_enum(args.role, kind="role") + rows = [r for r in rows if (r.role or "").upper().replace("-", "_") == role_norm.upper()] + if args.exclude_role: + post_filter_active = True + exclude_role_norm = normalize_enum(args.exclude_role, kind="role") + rows = [r for r in rows if (r.role or "").upper().replace("-", "_") != exclude_role_norm.upper()] + if args.annotation: + post_filter_active = True + rows = [r for r in rows if args.annotation in (r.annotations or [])] + if args.capability: + post_filter_active = True + rows = [r for r in rows if args.capability in (r.capabilities or [])] + + # Build warnings for filters that cannot apply in query mode. SymbolHit + # carries no framework/source_layer fields; rather than silently dropping + # the user's filter, surface a warning so they know to switch to filter mode. 
+ warnings: list[str] = [] + if args.framework: + warnings.append( + "--framework ignored in query mode (applies to routes/clients/producers; use filter mode)" + ) + if args.source_layer: + warnings.append( + "--source-layer ignored in query mode (applies to routes; use filter mode)" + ) + # When post-filters apply after a capped fetch, `truncated` reflects the + # pre-filter name-match count and cannot know whether MORE filtered matches + # exist beyond the fetch — surface that honestly. + if raw_truncated and post_filter_active: + warnings.append( + "results truncated before --role/--annotation/--capability filters; " + "additional filtered matches may exist beyond the fetch" + ) + + # Display at most `limit` of the (post-filtered) rows. + display_rows = rows[:limit] + nodes = {} + for row in display_rows: + node_id = row.id + nodes[node_id] = { + "id": node_id, + "kind": "symbol", + "fqn": row.fqn, + "name": row.name, + "symbol_kind": row.kind, + "microservice": row.microservice, + "module": row.module, + "role": row.role, + } + + env = Envelope(status="ok", nodes=nodes, truncated=raw_truncated, warnings=warnings) + next_actions_hook(env) + + # Offset is not supported in query mode (find_by_name_or_fqn has no offset). + print(render(env, fmt=args.format, detail=args.detail, noun="symbol")) + return 0 + + +def _build_node_filter_or_error(filter_dict: dict): + """Build a ``NodeFilter`` from ``filter_dict``; on pydantic validation + failure return ``(None, error_envelope)`` so the caller can render a clean + ``status: error`` envelope instead of letting the ValidationError propagate + to the top-level handler (which renders "internal error" + a traceback). + + A bad enum (e.g. ``--role FOO``) should be a user-facing validation error, + not an internal crash. Returns ``(node_filter, None)`` on success. 
+ """ + import mcp_v2 + + from java_codebase_rag.jrag_envelope import Envelope + from pydantic import ValidationError + + try: + nf = mcp_v2.NodeFilter.model_validate(filter_dict) if filter_dict else mcp_v2.NodeFilter() + return nf, None + except ValidationError as exc: + parts: list[str] = [] + for err in exc.errors(): + loc = ".".join(str(x) for x in err.get("loc", []) if x != "") + msg = str(err.get("msg") or "").strip() + parts.append(f"{loc}: {msg}" if loc else msg) + message = "; ".join(parts) if parts else str(exc) + return None, Envelope(status="error", message=f"invalid filter: {message}") + + +def _cmd_find_filter_mode( + args: argparse.Namespace, + cfg, + graph, + kind: str, + limit: int, +) -> int: + """Find filter mode: build NodeFilter and call find_v2.""" + import mcp_v2 + + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook, normalize_enum, to_envelope_rows + from java_codebase_rag.jrag_render import render + + # Build NodeFilter from args + filter_dict: dict = {} + if args.service: + filter_dict["microservice"] = args.service + if args.module: + filter_dict["module"] = args.module + if args.role: + filter_dict["role"] = normalize_enum(args.role, kind="role") + if args.exclude_role: + filter_dict["exclude_roles"] = [normalize_enum(args.exclude_role, kind="role")] + if args.annotation: + filter_dict["annotation"] = args.annotation + if args.capability: + filter_dict["capability"] = args.capability + if args.fqn_prefix: + filter_dict["fqn_prefix"] = args.fqn_prefix + if args.java_kind: + filter_dict["symbol_kind"] = normalize_enum(args.java_kind, kind="java_kind") + if args.framework: + filter_dict["framework"] = normalize_enum(args.framework, kind="framework") + if args.source_layer: + filter_dict["source_layer"] = normalize_enum(args.source_layer, kind="source_layer") + if args.http_method: + filter_dict["http_method"] = args.http_method.upper() + if args.path_prefix: + filter_dict["path_prefix"] = args.path_prefix + if 
args.client_kind: + filter_dict["client_kind"] = normalize_enum(args.client_kind, kind="client_kind") + if args.calls_service: + filter_dict["target_service"] = args.calls_service + if args.calls_path_prefix: + filter_dict["target_path_prefix"] = args.calls_path_prefix + if args.producer_kind: + filter_dict["producer_kind"] = normalize_enum(args.producer_kind, kind="producer_kind") + if args.topic_prefix: + filter_dict["topic_prefix"] = args.topic_prefix + + node_filter, err_env = _build_node_filter_or_error(filter_dict) + if err_env is not None: + print(render(err_env, fmt=args.format, detail=args.detail)) + return 2 + + # Call find_v2 + out = mcp_v2.find_v2( + kind=kind, + filter=node_filter, + limit=limit + 1, # +1 for has_more_results detection + offset=args.offset, + graph=graph, + ) + + if not out.success: + env = Envelope(status="error", message=out.message) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Convert results to envelope rows. Slice to `limit`: find_v2 was called with + # limit+1, so when exactly user_limit+1 matches exist `out.results` carries + # one extra row that must be dropped (off-by-one guard). `truncated` is True + # when the backend reports more OR the +1 row is present. 
+ results = list(out.results) + truncated = bool(out.has_more_results) or len(results) > limit + display_refs = results[:limit] + nodes_dict = {ref.id: to_envelope_rows([ref])[0] for ref in display_refs} + + env = Envelope(status="ok", nodes=nodes_dict, truncated=truncated) + next_actions_hook(env) + + # Render with offset hint if truncated + next_offset = args.offset + limit if truncated else None + print(render(env, fmt=args.format, detail=args.detail, noun=kind, next_offset=next_offset)) + return 0 + + +def _cmd_inspect(args: argparse.Namespace) -> int: + import mcp_v2 + + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook, resolve_query + from java_codebase_rag.jrag_render import render + + cfg = _resolve_cfg(args) + try: + graph = _load_graph(cfg) + except (_IndexNotFound, _IndexStale) as exc: + env = Envelope(status="error", message=str(exc)) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Resolve the query + node, env = resolve_query( + args.query, + hint_kind=args.kind, + java_kind=args.java_kind, + role=args.role, + fqn_prefix=args.fqn_prefix, + cfg=cfg, + graph=graph, + ) + + if env.status != "ok": + print(render(env, fmt=args.format, detail=args.detail)) + return 2 if env.status == "error" else 0 + + # Node resolved successfully - call describe_v2 + desc_out = mcp_v2.describe_v2(id=node.id, graph=graph) + + if not desc_out.success or desc_out.record is None: + env = Envelope(status="error", message=desc_out.message or "describe failed") + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Convert NodeRecord to envelope format + record_dict = desc_out.record.model_dump() + node_id = record_dict.get("id") or node.id + env = Envelope( + status="ok", + nodes={node_id: record_dict}, + root=node_id, + file_location=env.file_location, # Preserve file_location from resolve + ) + next_actions_hook(env, root=node_id, edge_summary=record_dict.get("edge_summary")) + + # Render with inspect shape + 
print(render(env, fmt=args.format, detail=args.detail, shape="inspect")) + return 0 + + +def _backfill_service_from_filename(row: dict) -> None: + """Derive ``microservice`` / ``module`` from ``filename`` when empty. + + Kafka-topic Route nodes are created without ``microservice``/``module`` in + the graph builder, so the routes listing rendered them with no ``@service`` + (or as blank lines when the topic was also empty). The filename carries the + info reliably (``//src/...`` or + ``/src/...``) — the same path-based resolution graph_enrich + uses — so backfill from it for display without forcing a reindex. + """ + fn = str(row.get("filename") or "").strip() + if not fn: + return + parts = fn.split("/") + if "src" not in parts: + return + idx = parts.index("src") + if idx >= 1 and not (row.get("microservice") or "").strip(): + row["microservice"] = parts[0] + if idx >= 2 and not (row.get("module") or "").strip(): + row["module"] = parts[1] + + +def _cmd_routes(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import normalize_enum + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + # Normalize framework if provided + framework = normalize_enum(args.framework, kind="framework") if args.framework else None + + rows = graph.list_routes( + microservice=args.service, + framework=framework, + path_prefix=args.path_prefix, + method=args.method, + limit=limit + 1, # +1 for truncated detection + ) + for row in rows: + _backfill_service_from_filename(row) + return _render_listing(rows, limit=limit, args=args, noun="route") + + +def _cmd_clients(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import normalize_enum + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + # Normalize client_kind via lookup table (feign → feign_method, etc.) 
+ client_kind = normalize_enum(args.client_kind, kind="client_kind") if args.client_kind else None + + rows = graph.list_clients( + microservice=args.service, + client_kind=client_kind, + target_service=args.calls_service, + path_prefix=args.path_prefix, + limit=limit + 1, # +1 for truncated detection + ) + return _render_listing(rows, limit=limit, args=args, noun="client") + + +def _cmd_producers(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import normalize_enum + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + # Normalize producer_kind via lookup table (kafka → kafka_send, etc.) + producer_kind = normalize_enum(args.producer_kind, kind="producer_kind") if args.producer_kind else None + + rows = graph.list_producers( + microservice=args.service, + producer_kind=producer_kind, + topic_prefix=args.topic_prefix, + limit=limit + 1, # +1 for truncated detection + ) + return _render_listing(rows, limit=limit, args=args, noun="producer") + + +def _cmd_topics(args: argparse.Namespace) -> int: + from java_codebase_rag.jrag_envelope import Envelope, mark_truncated, next_actions_hook + from java_codebase_rag.jrag_render import render + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + # Scope producers by --producer-in if provided (else --service push-down). + producer_microservice = args.producer_in or args.service + + # Call list_producers to get producers (grouped by topic) + rows = graph.list_producers( + microservice=producer_microservice, + topic_prefix=args.topic_prefix, + limit=limit + 1, # +1 for truncated detection + ) + + # Group by topic name. Track no-topic producers so they surface as a + # warning (distinguishable from "no producers at all"). 
+ topics_dict: dict[str, dict] = {} + no_topic_count = 0 + for producer in rows: + topic = producer.get("topic") or "" + if not topic: + no_topic_count += 1 + continue + if topic not in topics_dict: + topics_dict[topic] = { + "topic": topic, + "producers": [], + "broker": producer.get("broker") or "", + } + topics_dict[topic]["producers"].append(producer) + + warnings: list[str] = [] + if no_topic_count: + warnings.append( + f"{no_topic_count} producer(s) had no topic and were excluded" + ) + # list_producers has no module kwarg (only microservice/topic_prefix); --module + # would be silently dropped — surface it (use --producer-in to scope by svc). + if getattr(args, "module", None): + warnings.append( + "--module is not applied on topics (list_producers has no module param; " + "use --producer-in to scope producers by microservice)" + ) + + # If --consumer-in is provided, resolve consumers for each topic group. + # A consumer of a topic IS a listener: the edge path is + # listener_class -[:DECLARES]-> listener_method -[:EXPOSES]-> Route(topic) + # (ASYNC_CALLS run Producer -> Route per java_ontology.py:415-416, so the + # inbound-ASYNC_CALLS traversal the original PR shipped returned empty on + # every graph — corrected here to use the EXPOSES-based resolver shared + # with `listeners --topic-prefix`.) 
+ if args.consumer_in and topics_dict: + for topic_name, topic_group in topics_dict.items(): + consumers = _resolve_topic_consumers( + graph, + topic=topic_name, + microservice=args.consumer_in, + prefix=False, # exact match on the producer's topic literal + ) + if consumers: + topic_group["consumers"] = consumers + + # Convert to list and apply truncation + topic_list = list(topics_dict.values()) + display_topics_list, truncated = mark_truncated(topic_list, limit) + + # Build envelope with topic nodes + nodes = {} + for i, topic in enumerate(display_topics_list): + node_id = f"topic:{i}" + nodes[node_id] = topic + + env = Envelope(status="ok", nodes=nodes, truncated=truncated, warnings=warnings) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="topic")) + return 0 + + +def _cmd_jobs(args: argparse.Namespace) -> int: + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + symbol_hits = graph.list_by_capability( + capability="SCHEDULED_TASK", + module=args.module, + microservice=args.service, + limit=limit + 1, # +1 for truncated detection + ) + rows = [_symbol_hit_to_dict(h) for h in symbol_hits] + return _render_listing(rows, limit=limit, args=args, noun="symbol") + + +def _resolve_topic_consumers( + graph, + *, + topic: str, + microservice: str | None = None, + prefix: bool = False, +) -> list[dict]: + """Resolve listener classes that consume a topic via EXPOSES on Route. + + The graph models the listener→topic edge path as: + listener_class -[:DECLARES]-> listener_method -[:EXPOSES]-> Route(topic) + + This is the correct consumer-resolution path for async messaging topics: + ``ASYNC_CALLS`` run ``Producer → Route`` (java_ontology.py:415-416), so + there is no inbound ``ASYNC_CALLS`` edge into Producer nodes to traverse + via ``neighbors_v2(direction="in")``. 
The ``Route.topic`` property is not + projected onto the ``NodeRef`` returned by ``neighbors_v2``, so a + single-purpose Cypher lookup is used here — the same pattern as + ``jrag_envelope._node_file_location`` (``graph._rows`` for a focused + property fetch). This is a CLI-layer compose query, not a reimplementation + of backend traversal logic. + + Args: + topic: Topic string to match (exact unless ``prefix=True``). + microservice: Optional microservice filter on the listener class. + prefix: If True, match topic as a prefix (``STARTS WITH``); + if False (default), exact equality. + + Returns: + List of consumer dicts (``id``, ``fqn``, ``kind``, ``microservice``). + """ + if not topic: + return [] + match_clause = "r.topic STARTS WITH $topic" if prefix else "r.topic = $topic" + params: dict = {"topic": topic} + ms_clause = "" + if microservice: + ms_clause = " AND cls.microservice = $ms" + params["ms"] = microservice + rows = graph._rows( # noqa: SLF001 - focused property lookup (same as _node_file_location) + f"MATCH (cls:Symbol)-[:DECLARES]->(mth:Symbol)-[:EXPOSES]->(r:Route) " + f"WHERE {match_clause}{ms_clause} " + f"RETURN DISTINCT cls.id AS cid, cls.fqn AS cfqn, cls.microservice AS cms", + params, + ) + return [ + { + "id": str(r.get("cid") or ""), + "fqn": str(r.get("cfqn") or ""), + "kind": "symbol", + "microservice": str(r.get("cms") or ""), + } + for r in rows + if r.get("cid") + ] + + +def _listener_ids_for_topic_prefix(graph, listener_ids: list[str], prefix: str) -> set[str]: + """Resolve which listener classes consume a topic with the given prefix. + + Thin wrapper over :func:`_resolve_topic_consumers` intersected with the + pre-fetched ``listener_ids`` (from ``list_by_capability``). Retained as a + separate function so ``_cmd_listeners`` can narrow the SymbolHit list in + place (the capability fetch carries SymbolHit fields the resolver does not + project). See ``_resolve_topic_consumers`` for the edge-model rationale. 
+ """ + if not listener_ids or not prefix: + return set(listener_ids) + consumers = _resolve_topic_consumers(graph, topic=prefix, prefix=True) + matching = {c["id"] for c in consumers} + return {lid for lid in listener_ids if lid in matching} + + +def _cmd_listeners(args: argparse.Namespace) -> int: + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + symbol_hits = graph.list_by_capability( + capability="MESSAGE_LISTENER", + module=args.module, + microservice=args.service, + limit=_CONSUMER_FETCH_LIMIT, # generous pre-filter fetch; truncation applies after + ) + + # --topic-prefix: narrow to listeners consuming a topic with that prefix. + # The listener class itself carries no topic; its listener method EXPOSES + # a Route whose ``topic`` property holds the consumed topic name (resolved + # or as a constant reference). See _listener_ids_for_topic_prefix. + if args.topic_prefix and symbol_hits: + matching_ids = _listener_ids_for_topic_prefix( + graph, [h.id for h in symbol_hits], args.topic_prefix + ) + symbol_hits = [h for h in symbol_hits if h.id in matching_ids] + + # Apply the user-facing limit + 1 truncation AFTER the topic filter. + capped = symbol_hits[: limit + 1] + rows = [_symbol_hit_to_dict(h) for h in capped] + return _render_listing(rows, limit=limit, args=args, noun="symbol") + + +def _cmd_entities(args: argparse.Namespace) -> int: + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + symbol_hits = graph.list_by_role( + role="ENTITY", + module=args.module, + microservice=args.service, + limit=limit + 1, # +1 for truncated detection + ) + rows = [_symbol_hit_to_dict(h) for h in symbol_hits] + return _render_listing(rows, limit=limit, args=args, noun="symbol") + + +# ============================================================================ +# PR-JRAG-3a: traversal helpers + 11 traversal command handlers. 
+# +# Every traversal is resolve-first (resolve_query), then calls a LadybugGraph +# method (or neighbors_v2 for the override axis), then renders via the +# traversal shape (envelope.root + edge rows). --offset is NOT supported on +# any traversal subparser. --limit uses +1-fetch where the method takes a +# limit; client-side slice otherwise. +# +# Backend signatures verified against source (ladybug_queries.py / mcp_v2.py / +# java_ontology.py) at PR-JRAG-3a time. Adaptations from the brief: +# * find_implementors / find_subclasses / find_injectors DO accept a +# `capability` kwarg (the brief claimed they did not); --capability is +# PUSHED DOWN on `implementations` (more efficient + matches the global +# principle "pushed down where the method takes it"). +# * OVERRIDES edge direction confirmed: overrider -> declaration (subtype +# method -> supertype method), so `out`=dispatch UP (overrides) and +# `in`=dispatch DOWN (overridden-by). Brief was correct. +# ============================================================================ + + +def _resolve_traversal_node( + args: argparse.Namespace, + *, + cfg, + graph, + hint_kind, +): + """Resolve-first frame shared by every traversal command. + + Returns ``(node, env, rc)``. On resolve failure (ambiguous / not_found / + error), renders the envelope and returns ``(None, env, rc)`` with rc=2 on + error, 0 on ambiguous/not_found (matches the inspect command convention). 
+ """ + from java_codebase_rag.jrag_envelope import resolve_query + from java_codebase_rag.jrag_render import render + + node, env = resolve_query( + args.query, + hint_kind=hint_kind, + java_kind=getattr(args, "java_kind", None), + role=getattr(args, "role", None), + fqn_prefix=getattr(args, "fqn_prefix", None), + cfg=cfg, + graph=graph, + ) + if env.status != "ok": + print(render(env, fmt=args.format, detail=args.detail)) + return None, env, 2 if env.status == "error" else 0 + return node, env, 0 + + +def _noderef_to_node_dict(ref) -> dict: + """NodeRef (pydantic, from neighbors_v2 / resolve) -> envelope node dict.""" + return ref.model_dump() + + +def _emit_traversal( + args: argparse.Namespace, + *, + root_id: str, + nodes: dict[str, dict], + edges: list[dict], + noun: str, + warnings: list[str] | None = None, + truncated: bool = False, +) -> int: + """Build the traversal envelope (root + nodes + edges) and render. + + The traversal shape requires ``envelope.root`` so the renderer uses the + traversal shape (root + edge rows). ``next_offset`` is left None on every + traversal (non-offset -> "truncated: more results - narrow your query"). + """ + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + env = Envelope( + status="ok", + nodes=dict(nodes), + edges=list(edges), + root=root_id, + warnings=warnings or [], + truncated=truncated, + ) + next_actions_hook(env, root=root_id, result_edges=edges, command=getattr(args, "command", None)) + print(render(env, fmt=args.format, detail=args.detail, noun=noun)) + return 0 + + +def _require_kind( + node, + *, + expected: str, + kinds: tuple[str, ...], + args: argparse.Namespace, + hint: str = "", +) -> int | None: + """Kind guard shared by traversal handlers (DRY for the 11x guard block). + + Returns ``None`` when ``node.kind`` is in ``kinds`` (caller proceeds). On + mismatch, prints a ``status: error`` envelope and returns 2. 
``expected`` + is the human-readable root description (e.g. ``"overrides expects a method + Symbol root"``); ``hint`` is an optional trailing suggestion (e.g. ``"Use + --kind symbol to narrow resolve."``). Callers whose kind-dispatch is more + complex (e.g. ``callers`` accepts Symbol OR Route and routes between them) + keep an inline guard. + """ + if node.kind in kinds: + return None + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + msg = f"{expected}; resolved kind is {node.kind!r}." + if hint: + msg = f"{msg} {hint}" + print(render(Envelope(status="error", message=msg), fmt=args.format, detail=args.detail)) + return 2 + + +def _warn_unapplied_scope(args: argparse.Namespace, *, reason: str) -> list[str]: + """Build warnings[] for --service/--module that cannot be applied. + + Used by hierarchy/overrides/overridden-by/flow, where the backend query + has no microservice/module predicate (structural edges / index-time data + property). The plan principle "inapplicable flags never silently ignored" + requires surfacing these as warnings rather than dropping them. + """ + warnings: list[str] = [] + if args.service: + warnings.append(f"--service is not applied on this command ({reason})") + if getattr(args, "module", None): + warnings.append(f"--module is not applied on this command ({reason})") + return warnings + + +def _warn_inapplicable_common( + args: argparse.Namespace, *, service: bool, module: bool, limit: bool +) -> list[str]: + """Warn when common flags that don't apply to a command are set. + + Companion to :func:`_warn_unapplied_scope` for the aggregate / orientation + commands (status / microservices / map / conventions) which inherit the + ``common`` parent parser (``--service`` / ``--module`` / ``--limit``) but + don't apply all of them. Each kwarg names whether THAT flag is inapplicable + for this command (``True`` -> warn if the user set it). 
The plan principle + "inapplicable flags never silently ignored" requires the warning; with the + renderer now printing ``warning:`` lines, this is visible to text consumers + too (not just ``--format json``). + """ + warnings: list[str] = [] + if service and args.service: + warnings.append("--service is not applied on this command") + if module and getattr(args, "module", None): + warnings.append("--module is not applied on this command") + if limit and getattr(args, "limit", None) is not None and args.limit != 20: + warnings.append("--limit is not applied on this command") + return warnings + + +def _cmd_callers(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + root_dict = _noderef_to_node_dict(node) + root_id = node.id + + # Route root -> find_route_callers (client-side --service post-filter). + if node.kind == "route": + route_callers = graph.find_route_callers(route_id=root_id) + warnings: list[str] = [] + if args.service: + # find_route_callers ignores microservice once route_id is set + # (microservice is only used to *resolve* the route_id when not + # given). Surface that as a warning so the user knows the filter + # was applied client-side, not pushed down. + warnings.append( + "--service is a post-filter on route callers " + "(find_route_callers ignores microservice once route_id is set)" + ) + route_callers = [ + rc for rc in route_callers if (rc.caller_microservice or "") == args.service + ] + # No backend limit on find_route_callers; client-side slice for truncation. 
+ truncated = len(route_callers) > limit + display = route_callers[:limit] + nodes: dict[str, dict] = {} + edges: list[dict] = [] + for rc in display: + caller_id = rc.caller_node_id + if rc.caller_node_kind == "client": + fqn = rc.raw_uri or rc.target_service or "(client)" + edge_type = "HTTP_CALLS" + else: + fqn = rc.topic or "(producer)" + edge_type = "ASYNC_CALLS" + nodes[caller_id] = { + "id": caller_id, + "kind": rc.caller_node_kind, + "fqn": fqn, + "microservice": rc.caller_microservice, + } + edges.append( + {"other_id": caller_id, "edge_type": edge_type, "confidence": rc.confidence} + ) + # Include the root (Route) node so the zero-callers rendering surfaces + # the route path rather than a bare "0 callers" line. + nodes[root_id] = root_dict + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="callers", warnings=warnings, truncated=truncated, + ) + + # Symbol root -> find_callers (push down --service/--module/depth/etc.). + if node.kind != "symbol": + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + env = Envelope( + status="error", + message=( + f"callers expects a Symbol or Route root; resolved node kind is " + f"{node.kind!r}. Use --kind to narrow resolve." 
+ ), + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + depth = getattr(args, "depth", 1) + min_conf = getattr(args, "min_confidence", 0.0) + exclude_external = not getattr(args, "include_external", False) + call_edges = graph.find_callers( + node.fqn, + depth=depth, + limit=limit + 1, + min_confidence=min_conf, + exclude_external=exclude_external, + module=args.module, + microservice=args.service, + ) + from java_codebase_rag.jrag_envelope import mark_truncated + + display, truncated = mark_truncated(call_edges, limit) + nodes = {} + edges = [] + for ce in display: + nodes[ce.src.id] = _symbol_hit_to_dict(ce.src) + edges.append( + {"other_id": ce.src.id, "edge_type": "CALLS", "confidence": ce.confidence} + ) + nodes[root_id] = root_dict + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="callers", truncated=truncated, + ) + + +def _cmd_callees(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + # PR-JRAG-3b: accept Symbol (CALLS), Client (HTTP_CALLS), and Producer + # (ASYNC_CALLS) roots. The Symbol path is unchanged from PR-JRAG-3a. + guard = _require_kind( + node, + expected="callees expects a Symbol, Client, or Producer root", + kinds=("symbol", "client", "producer"), + args=args, + hint="Use --kind to narrow resolve.", + ) + if guard is not None: + return guard + + from java_codebase_rag.jrag_envelope import Envelope, mark_truncated + from java_codebase_rag.jrag_render import render + + # Client root -> HTTP_CALLS out (Client -> :Route). + # Producer root -> ASYNC_CALLS out (Producer -> :Route, the kafka_topic + # Route this producer publishes to — NOT a :Producer node). 
+ if node.kind in ("client", "producer"): + import mcp_v2 + + edge_types = ["HTTP_CALLS"] if node.kind == "client" else ["ASYNC_CALLS"] + out = mcp_v2.neighbors_v2( + [node.id], direction="out", edge_types=edge_types, + limit=limit + 1, graph=graph, + ) + if not out.success: + print(render(Envelope(status="error", message=out.message or "neighbors_v2 failed"), fmt=args.format, detail=args.detail)) + return 2 + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for e in out.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + edges.append( + { + "other_id": e.other.id, + "edge_type": e.edge_type, + "confidence": e.attrs.get("confidence"), + } + ) + truncated = bool(out.has_more_results) or len(edges) > limit + if len(edges) > limit: + edges = edges[:limit] + # --include-external is accepted but does not apply on Client/Producer + # roots (the edges are to :Route, which is always in-graph; there is no + # external-exclusion analog). Surface as a warning so the flag is not + # silently dropped (plan principle: inapplicable flags never silently ignored). 
+ warnings: list[str] = [] + if getattr(args, "include_external", False): + warnings.append( + "--include-external does not apply to Client/Producer roots " + "(HTTP_CALLS/ASYNC_CALLS reach :Route, which is always in-graph)" + ) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="callees", warnings=warnings, truncated=truncated, + ) + + depth = getattr(args, "depth", 1) + min_conf = getattr(args, "min_confidence", 0.0) + exclude_external = not getattr(args, "include_external", False) + call_edges = graph.find_callees( + node.fqn, + depth=depth, + limit=limit + 1, + min_confidence=min_conf, + exclude_external=exclude_external, + module=args.module, + microservice=args.service, + ) + display, truncated = mark_truncated(call_edges, limit) + root_id = node.id + nodes = {root_id: _noderef_to_node_dict(node)} + edges = [] + for ce in display: + nodes[ce.dst.id] = _symbol_hit_to_dict(ce.dst) + edges.append( + {"other_id": ce.dst.id, "edge_type": "CALLS", "confidence": ce.confidence} + ) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="callees", truncated=truncated, + ) + + +def _cmd_hierarchy(args: argparse.Namespace) -> int: + import mcp_v2 + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + guard = _require_kind( + node, expected="hierarchy expects a type Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + warnings = _warn_unapplied_scope( + args, reason="neighbors_v2 walks structural EXTENDS/IMPLEMENTS edges with no microservice predicate" + ) + + root_id = node.id + # Fetch both directions with limit+1 for +1-fetch truncation on each axis. 
+ fetch = limit + 1 + up = mcp_v2.neighbors_v2( + [root_id], direction="out", edge_types=["EXTENDS", "IMPLEMENTS"], + limit=fetch, graph=graph, + ) + dn = mcp_v2.neighbors_v2( + [root_id], direction="in", edge_types=["EXTENDS", "IMPLEMENTS"], + limit=fetch, graph=graph, + ) + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + # Check BOTH directions: a failed `in` query must not be rendered as an + # empty subtype list. + for res in (up, dn): + if not res.success: + print(render(Envelope(status="error", message=res.message or "neighbors_v2 failed"), fmt=args.format, detail=args.detail)) + return 2 + + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + # Build up/down edges separately so the limit applies PER DIRECTION + # (Fix 5: combined-list truncation could starve `down` behind a full `up`). + up_edges: list[dict] = [] + for e in up.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + up_edges.append({"other_id": e.other.id, "edge_type": e.edge_type, "direction": "up"}) + dn_edges: list[dict] = [] + for e in dn.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + dn_edges.append({"other_id": e.other.id, "edge_type": e.edge_type, "direction": "down"}) + + # Per-direction +1-fetch truncation: each side independently drops its + # overflow row and flags truncation if it had limit+1 rows. + truncated = len(up_edges) > limit or len(dn_edges) > limit + up_display = up_edges[:limit] + dn_display = dn_edges[:limit] + display_edges = up_display + dn_display + # Drop nodes no longer referenced after per-direction truncation (keep root).
+ referenced = {root_id} | {e["other_id"] for e in display_edges} + nodes = {nid: nd for nid, nd in nodes.items() if nid in referenced} + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=display_edges, + noun="hierarchy", warnings=warnings, truncated=truncated, + ) + + +def _cmd_implementations(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + guard = _require_kind( + node, expected="implementations expects an interface Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + from java_codebase_rag.jrag_envelope import mark_truncated + + # ADAPTATION: find_implementors DOES accept a `capability` kwarg (brief + # claimed otherwise). Push --capability down (matches the global principle + # "pushed down where the method takes it"); --service/--module also pushed. 
+ impls = graph.find_implementors( + node.fqn, + microservice=args.service, + module=args.module, + capability=args.capability, + limit=limit + 1, + ) + display, truncated = mark_truncated(impls, limit) + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for hit in display: + nodes[hit.id] = _symbol_hit_to_dict(hit) + edges.append({"other_id": hit.id, "edge_type": "IMPLEMENTS"}) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="implementations", truncated=truncated, + ) + + +def _cmd_subclasses(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + guard = _require_kind( + node, expected="subclasses expects a class Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + from java_codebase_rag.jrag_envelope import mark_truncated + + subs = graph.find_subclasses( + node.fqn, + microservice=args.service, + module=args.module, + limit=limit + 1, + ) + display, truncated = mark_truncated(subs, limit) + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for hit in display: + nodes[hit.id] = _symbol_hit_to_dict(hit) + edges.append({"other_id": hit.id, "edge_type": "EXTENDS"}) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="subclasses", truncated=truncated, + ) + + +def _cmd_overrides(args: argparse.Namespace) -> int: + import mcp_v2 + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + from java_codebase_rag.jrag_envelope import Envelope + from 
java_codebase_rag.jrag_render import render + + guard = _require_kind( + node, expected="overrides expects a method Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + warnings = _warn_unapplied_scope( + args, reason="OVERRIDES is a structural method-to-method edge with no microservice predicate" + ) + + root_id = node.id + # OVERRIDES edge runs overrider -> declaration (subtype -> supertype method). + # direction="out" dispatches UP (the declarations this method overrides). + out = mcp_v2.neighbors_v2( + [root_id], direction="out", edge_types=["OVERRIDES"], + limit=limit + 1, graph=graph, + ) + if not out.success: + print(render(Envelope(status="error", message=out.message or "neighbors_v2 failed"), fmt=args.format, detail=args.detail)) + return 2 + + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for e in out.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + # No `direction` key: overrides is a flat list, not a tree. Setting + # direction="up" would trip the renderer's has_direction guard and + # mis-label these rows as `↑ supertypes:` (hierarchy). Flat is correct. 
+ edges.append({"other_id": e.other.id, "edge_type": "OVERRIDES"}) + truncated = bool(out.has_more_results) or len(edges) > limit + if len(edges) > limit: + edges = edges[:limit] + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="overrides", warnings=warnings, truncated=truncated, + ) + + +def _cmd_overridden_by(args: argparse.Namespace) -> int: + import mcp_v2 + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + guard = _require_kind( + node, expected="overridden-by expects a method Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + warnings = _warn_unapplied_scope( + args, reason="OVERRIDES is a structural method-to-method edge with no microservice predicate" + ) + + root_id = node.id + # direction="in" on OVERRIDES = virtual OVERRIDDEN_BY out (dispatch DOWN: + # from declaration to its overriders). + out = mcp_v2.neighbors_v2( + [root_id], direction="in", edge_types=["OVERRIDES"], + limit=limit + 1, graph=graph, + ) + if not out.success: + print(render(Envelope(status="error", message=out.message or "neighbors_v2 failed"), fmt=args.format, detail=args.detail)) + return 2 + + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for e in out.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + # No `direction` key — see _cmd_overrides: a `direction` value would + # route these into the hierarchy renderer (`↓ subtypes:`), mis-labeling + # a flat overridden-by list. 
+ edges.append({"other_id": e.other.id, "edge_type": "OVERRIDES"}) + truncated = bool(out.has_more_results) or len(edges) > limit + if len(edges) > limit: + edges = edges[:limit] + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="overridden-by", warnings=warnings, truncated=truncated, + ) + + +def _cmd_dependents(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + guard = _require_kind( + node, expected="dependents expects a type Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + from java_codebase_rag.jrag_envelope import mark_truncated + + inj = graph.find_injectors( + node.fqn, + microservice=args.service, + module=args.module, + limit=limit + 1, + ) + display, truncated = mark_truncated(inj, limit) + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for eh in display: + nodes[eh.src.id] = _symbol_hit_to_dict(eh.src) + edges.append( + { + "other_id": eh.src.id, + "edge_type": "INJECTS", + "mechanism": eh.mechanism, + "annotation": eh.annotation, + "field_or_param": eh.field_or_param, + } + ) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="dependents", truncated=truncated, + ) + + +def _cmd_impact(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + depth = getattr(args, "depth", 2) + + from java_codebase_rag.jrag_envelope import mark_truncated + + impacts = graph.impact_analysis(node.fqn, depth=depth, limit=limit + 1) + warnings: list[str] = [] + if args.service: + # 
impact_analysis has no microservice param (verified); filter + # client-side and surface a warning so the user knows. + warnings.append( + "--service is a post-filter on impact (impact_analysis has no microservice param)" + ) + impacts = [h for h in impacts if (h.microservice or "") == args.service] + if getattr(args, "module", None): + # impact_analysis has no module param either; warn rather than drop silently. + warnings.append( + "--module is not applied on impact (impact_analysis has no module param)" + ) + display, truncated = mark_truncated(impacts, limit) + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for hit in display: + nodes[hit.id] = _symbol_hit_to_dict(hit) + edges.append({"other_id": hit.id, "edge_type": "IMPACTS"}) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="impact", warnings=warnings, truncated=truncated, + ) + + +def _cmd_decompose(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + + guard = _require_kind( + node, expected="decompose expects an entrypoint Symbol root", kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + # trace_flow clamps depth internally to 1..3; mirror here for the help text. + depth = max(1, min(3, getattr(args, "depth", 2))) + # decompose walks a TYPE role-waterfall (CONTROLLER -> SERVICE/COMPONENT -> + # CLIENT/REPOSITORY/MAPPER) via INJECTS/EXTENDS/IMPLEMENTS, which are + # type-to-type edges. A METHOD seed has no such edges, so trace_flow would + # return only stage 0 (the seed itself). Promote a method seed to its owning + # type so the waterfall is meaningful; point the agent at `callees` for the + # method's direct call chain. (root stays the resolved method node.) 
+ seed_fqn = node.fqn + warnings: list[str] = [] + if seed_fqn and "#" in seed_fqn: + owning_type = seed_fqn.split("#", 1)[0] + warnings.append( + f"decompose is a type role-waterfall; promoted method seed " + f"'{seed_fqn}' to its owning type '{owning_type}'. " + f"Use `jrag callees {seed_fqn}` for the method's direct call chain." + ) + seed_fqn = owning_type + stages = graph.trace_flow( + seed_fqns=[seed_fqn], + depth=depth, + follow_calls=getattr(args, "follow_calls", False), + stage_limit=getattr(args, "max_stage", 20), + min_call_confidence=getattr(args, "min_confidence", 0.0), + exclude_external=not getattr(args, "include_external", False), + microservice=args.service, + module=args.module, + ) + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for stage_idx, stage in enumerate(stages): + for ss in stage: + nodes[ss.symbol.id] = _symbol_hit_to_dict(ss.symbol) + via = ss.via[0] if ss.via else None + edge_type = via.edge_type if via else ("SEED" if stage_idx == 0 else "STAGE") + edge_row = { + "other_id": ss.symbol.id, + "edge_type": edge_type, + "stage": stage_idx, + # Role carries through to the renderer so the waterfall can + # label each stage with the role allow-list it matched. + "role": ss.symbol.role or "", + } + if via and via.from_fqn: + edge_row["from_fqn"] = via.from_fqn + edges.append(edge_row) + # --limit is inherited from common but does not cap decompose (trace_flow + # is stage-limited via --max-stage, not a total edge count). Warn when the + # user explicitly set --limit away from the default so they get a signal + # rather than a silent multi-stage dump (Fix 4). 
+ if args.limit is not None and args.limit != 20: + warnings.append( + "--limit does not apply to decompose; use --max-stage to cap per-stage breadth" + ) + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="decompose", warnings=warnings, + ) + + +def _cmd_flow(args: argparse.Namespace) -> int: + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + # flow requires a Route root; force hint_kind="route". + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind="route") + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + guard = _require_kind( + node, expected="flow requires a Route root", kinds=("route",), args=args, + hint="Pass a route path (e.g. /chat/assign).", + ) + if guard is not None: + return guard + + warnings = _warn_unapplied_scope( + args, reason="trace_request_flow carries no microservice predicate; intra-codebase is an index-time data property" + ) + + max_hops = max(1, min(8, getattr(args, "max_hops", 5))) + flow_data = graph.trace_request_flow(entry_route_id=node.id, max_hops=max_hops) + + root_id = node.id + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + # Inbound: cross-service HTTP/async callers (Client/Producer two-hop). + for row in flow_data.get("inbound", []): + caller_id = str(row.get("caller_node_id") or "") + if not caller_id: + continue + kind = str(row.get("caller_node_kind") or "") + nodes[caller_id] = { + "id": caller_id, + "kind": kind, + "fqn": str(row.get("declaring_symbol_fqn") or ""), + "microservice": str(row.get("microservice") or ""), + } + edges.append( + { + "other_id": caller_id, + "edge_type": "HTTP_CALLS" if kind == "client" else "ASYNC_CALLS", + "confidence": float(row.get("confidence") or 0.0), + } + ) + # Outbound: CALLS hops from the route handler (intra-service by construction). 
+ for row in flow_data.get("outbound", []): + next_id = str(row.get("next_symbol_id") or "") + if not next_id: + continue + nodes[next_id] = { + "id": next_id, + "kind": "symbol", + "fqn": str(row.get("next_fqn") or ""), + "microservice": str(row.get("next_microservice") or ""), + } + edges.append({"other_id": next_id, "edge_type": "CALLS"}) + + # Client-side slice for truncation (trace_request_flow has no limit param). + truncated = len(edges) > limit + if truncated: + edges = edges[:limit] + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="flow", warnings=warnings, truncated=truncated, + ) + + +# ============================================================================ +# PR-JRAG-3b: compose traversals + connection + outline/imports. +# +# callees Client/Producer variant (above) re-uses _cmd_callees. The four new +# handlers below cover: dependencies (INJECTS out), connection (multi-section +# microservice view, resolve-first EXCEPTION), outline (file -> symbols), +# imports (file -> tree-sitter parse -> resolve_v2 per FQN). +# +# Backend signatures verified at PR-JRAG-3b time: +# * neighbors_v2(ids, direction, edge_types, limit=25, offset=0, ...) returns +# NeighborsOutput.results: list[Edge] where Edge.other: NodeRef, +# Edge.edge_type: str, Edge.attrs: dict (mcp_v2.py:1284). +# * find_symbols_in_file_range(graph, *, filename, start_line, end_line) +# returns list[SymbolHit]; start_line<1 returns [] (ladybug_queries.py:302). +# * parse_java(source, *, filename, verbose) -> JavaFileAst with +# explicit_imports: dict[str, str] (simple_name -> FQN) (ast_java.py:2612). +# * INJECTS is Symbol -> Symbol (java_ontology.py:216); out = types this +# symbol injects = direct dependencies. +# * HTTP_CALLS is Client -> Route (java_ontology.py:352); ASYNC_CALLS is +# Producer -> Route (java_ontology.py:386). Both confirmed. 
+# ============================================================================ + + +def _cmd_dependencies(args: argparse.Namespace) -> int: + import mcp_v2 + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + node, _renv, rrc = _resolve_traversal_node(args, cfg=cfg, graph=graph, hint_kind=args.kind) + if rrc or node is None: + return rrc + limit = _clamped_limit(args) + + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + # INJECTS is Symbol -> Symbol; Client/Producer/Route roots have no + # injection edges (the edge type only fires on type Symbols). + guard = _require_kind( + node, expected="dependencies expects a Symbol root (INJECTS is Symbol -> Symbol)", + kinds=("symbol",), args=args, + ) + if guard is not None: + return guard + + warnings = _warn_unapplied_scope( + args, reason="neighbors_v2 walks structural INJECTS edges with no microservice predicate" + ) + # --include-external is accepted for surface symmetry with callers/callees + # but is a warned no-op here (INJECTS has no external-exclusion analog at + # the neighbors_v2 layer; the edge is structural Symbol -> Symbol). 
+ if getattr(args, "include_external", False): + warnings.append( + "--include-external does not apply to dependencies " + "(INJECTS is structural Symbol -> Symbol with no external-exclusion analog)" + ) + + root_id = node.id + out = mcp_v2.neighbors_v2( + [root_id], direction="out", edge_types=["INJECTS"], + limit=limit + 1, graph=graph, + ) + if not out.success: + print(render(Envelope(status="error", message=out.message or "neighbors_v2 failed"), fmt=args.format, detail=args.detail)) + return 2 + + nodes: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for e in out.results: + nodes[e.other.id] = _noderef_to_node_dict(e.other) + # Carry the injection metadata from the edge attrs (mechanism/annotation/ + # field_or_param) so the renderer and JSON consumers see how the dep is + # injected. + edge_row = {"other_id": e.other.id, "edge_type": "INJECTS"} + for k in ("mechanism", "annotation", "field_or_param", "dst_fqn", "resolved"): + if k in e.attrs: + edge_row[k] = e.attrs[k] + edges.append(edge_row) + truncated = bool(out.has_more_results) or len(edges) > limit + if len(edges) > limit: + edges = edges[:limit] + return _emit_traversal( + args, root_id=root_id, nodes=nodes, edges=edges, + noun="dependencies", warnings=warnings, truncated=truncated, + ) + + +def _client_dict_to_node(c: dict) -> dict: + """list_clients dict -> envelope node dict (kind=client).""" + return { + "id": str(c.get("id") or ""), + "kind": "client", + "fqn": str(c.get("member_fqn") or c.get("path") or ""), + "name": str(c.get("path") or ""), + "client_kind": str(c.get("client_kind") or ""), + "target_service": str(c.get("target_service") or ""), + "method": str(c.get("method") or ""), + "path": str(c.get("path") or ""), + "microservice": str(c.get("microservice") or ""), + "module": str(c.get("module") or ""), + } + + +def _producer_dict_to_node(p: dict) -> dict: + """list_producers dict -> envelope node dict (kind=producer).""" + return { + "id": 
str(p.get("id") or ""), + "kind": "producer", + "fqn": str(p.get("member_fqn") or p.get("topic") or ""), + "name": str(p.get("topic") or ""), + "producer_kind": str(p.get("producer_kind") or ""), + "topic": str(p.get("topic") or ""), + "broker": str(p.get("broker") or ""), + "microservice": str(p.get("microservice") or ""), + "module": str(p.get("module") or ""), + } + + +def _cmd_connection(args: argparse.Namespace) -> int: + """connection — multi-section inbound:/outbound: view. + + RESOLVE-FIRST EXCEPTION: the first positional is a microservice NAME (used + literally for list_clients / list_producers / find_route_callers); resolve_v2 + is NEVER run on it (the agent spec calls this out loudly in --help). + """ + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + limit = _clamped_limit(args) + + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + microservice = args.microservice + # argparse stores --inbound/--outbound/--both into `direction` via + # action="store_const"; it is None when no flag is given. The fallback below + # is "both" (the brief names --inbound as the default direction). + direction = getattr(args, "direction", None) or "both" + http_method = (args.http_method or "").upper() or None + calls_service = args.calls_service + + show_inbound = direction in ("inbound", "both") + show_outbound = direction in ("outbound", "both") + + nodes: dict[str, dict] = {} + edges: list[dict] = [] + warnings: list[str] = [] + + # Filter predicates (applied client-side). list_clients/list_producers already + # scope by microservice; --module has no analog in those calls, so it is not + # applied and a warning is emitted instead. 
+ if args.module: + warnings.append("--module is not applied on connection (use --calls-service to narrow)") + + # --calls-service on outbound: clients are filtered STRICTLY (target_service + # == calls_service); producers have no service target (they target topics), + # so they bypass the filter and we emit a single warning so the agent knows + # the async channel wasn't narrowed. The previous `or not target_service` + # escape hatch matched unresolved clients (empty target_service, e.g. + # AuditLogClient#logAssignment) — that was silent-wrong-results. + producers_bypass_calls_service = bool(calls_service) and show_outbound + + def _http_method_match(row: dict) -> bool: + if not http_method: + return True + return (str(row.get("method") or "").upper()) == http_method + + def _calls_service_match_out_client(row: dict) -> bool: + # STRICT: a client is kept iff target_service == calls_service exactly. + # Unresolved clients (empty target_service) are EXCLUDED — they did not + # resolve to a specific target service, so we cannot confirm they call + # --calls-service and must not surface them as a match. + if not calls_service: + return True + return str(row.get("target_service") or "") == calls_service + + def _calls_service_match_in(caller_microservice: str) -> bool: + if not calls_service: + return True + return caller_microservice == calls_service + + # --- Inbound: clients/producers in OTHER services targeting --- + if show_inbound: + # HTTP: list_clients(target_service=microservice) gives every client + # declaring a call into this service. Filter out clients IN this + # microservice (those are intra-service, not inbound). 
+ http_in = graph.list_clients(target_service=microservice, limit=limit + 1) + http_in = [c for c in http_in if (c.get("microservice") or "") != microservice] + http_in = [c for c in http_in if _http_method_match(c) and _calls_service_match_in(c.get("microservice") or "")] + for c in http_in[:limit + 1]: + cid = c["id"] + nodes[cid] = _client_dict_to_node(c) + edges.append({"other_id": cid, "edge_type": "HTTP_CALLS", "section": "inbound"}) + + # Async: topic Routes consumed by this microservice's listeners are + # reached by producers in OTHER services via ASYNC_CALLS. The path is + # listener_method -[:EXPOSES]-> Route(topic) <-[:ASYNC_CALLS]- Producer + # find_route_callers gives both client and producer callers for a route, + # so we (a) enumerate this service's listener classes, (b) for each, + # resolve the Route(s) it EXPOSES, (c) call find_route_callers on each + # topic Route, (d) keep producer callers from other services. + try: + listener_hits = graph.list_by_capability( + capability="MESSAGE_LISTENER", + microservice=microservice, + limit=_CONSUMER_FETCH_LIMIT, + ) + except Exception as e: # noqa: BLE001 - best-effort multi-section view + # Don't swallow silently: surface the failure so an empty async + # inbound section is distinguishable from "no listeners". HTTP + # inbound above is unaffected; the command still returns its other + # sections. (The bare `except: listener_hits = []` this replaces + # produced silent wrong-results — status:ok with no async + no clue.) + warnings.append(f"listener lookup failed; async inbound section skipped: {e}") + listener_hits = [] + topic_route_ids: set[str] = set() + for h in listener_hits: + # listener method -> EXPOSES -> Route(topic). Resolve via a focused + # Cypher lookup (Route.id for the EXPOSES target). 
+ rows = graph._rows( # noqa: SLF001 - focused lookup, same pattern as _node_file_location + "MATCH (mth:Symbol)-[:EXPOSES]->(r:Route) WHERE mth.id = $mid RETURN r.id AS rid", + {"mid": h.id}, + ) + for r in rows: + rid = str(r.get("rid") or "") + if rid: + topic_route_ids.add(rid) + # Cache list_producers() per caller_microservice so the inbound-async + # loop issues ONE fetch per external service (not one per producer id). + producer_cache: dict[str, list[dict]] = {} + for rid in topic_route_ids: + callers = graph.find_route_callers(route_id=rid) + for c in callers: + if c.caller_node_kind != "producer": + continue + if (c.caller_microservice or "") == microservice: + continue # intra-service + if not _calls_service_match_in(c.caller_microservice or ""): + continue + pid = c.caller_node_id + if pid in nodes: + # Already rendered (e.g. duplicated via multiple topic routes) + edges.append({"other_id": pid, "edge_type": "ASYNC_CALLS", "section": "inbound", "confidence": c.confidence}) + continue + # Fetch producer dict for richer node data (cached per service). 
+ caller_ms = c.caller_microservice or "" + if caller_ms not in producer_cache: + producer_cache[caller_ms] = graph.list_producers( + microservice=caller_ms or None, limit=_CONSUMER_FETCH_LIMIT, + ) + prod_dict = next((p for p in producer_cache[caller_ms] if p.get("id") == pid), None) + if prod_dict: + nodes[pid] = _producer_dict_to_node(prod_dict) + else: + nodes[pid] = { + "id": pid, + "kind": "producer", + "fqn": c.topic or "", + "name": c.topic or "", + "topic": c.topic or "", + "broker": c.broker or "", + "microservice": c.caller_microservice or "", + } + edges.append({"other_id": pid, "edge_type": "ASYNC_CALLS", "section": "inbound", "confidence": c.confidence}) + + # --- Outbound: clients/producers IN this microservice (calling out) --- + if show_outbound: + clients_out = graph.list_clients(microservice=microservice, limit=limit + 1) + # Clients: apply --http-method AND --calls-service strictly (no empty- + # target escape; unresolved clients are EXCLUDED under --calls-service). + clients_out = [c for c in clients_out if _http_method_match(c) and _calls_service_match_out_client(c)] + for c in clients_out[:limit + 1]: + cid = c["id"] + nodes[cid] = _client_dict_to_node(c) + edges.append({"other_id": cid, "edge_type": "HTTP_CALLS", "section": "outbound"}) + + producers_out = graph.list_producers(microservice=microservice, limit=limit + 1) + # Producers bypass --calls-service (no service target on ASYNC channels); + # emit ONE warning so the agent knows the async channel wasn't narrowed. 
+ if producers_bypass_calls_service and producers_out: + warnings.append( + f"--calls-service does not filter producers (no target_service on " + f"ASYNC channels); {len(producers_out)} producer(s) kept visible" + ) + for p in producers_out[:limit + 1]: + pid = p["id"] + nodes[pid] = _producer_dict_to_node(p) + edges.append({"other_id": pid, "edge_type": "ASYNC_CALLS", "section": "outbound"}) + + # Synthesize a microservice "root" node so the renderer uses the traversal + # shape (root + edges) and the section-grouped rendering fires. The synthetic + # id is namespaced to avoid colliding with real node ids. + root_id = f"microservice:{microservice}" + nodes[root_id] = { + "id": root_id, + "kind": "microservice", + "fqn": microservice, + "name": microservice, + "microservice": microservice, + } + + # Per-section truncation: cap each section at `limit` (drop overflow rows + # and flag truncation if either side overflowed). We collected limit+1 + # rows above; slice here. + inbound_edges = [e for e in edges if e.get("section") == "inbound"] + outbound_edges = [e for e in edges if e.get("section") == "outbound"] + truncated = len(inbound_edges) > limit or len(outbound_edges) > limit + inbound_edges = inbound_edges[:limit] + outbound_edges = outbound_edges[:limit] + display_edges = inbound_edges + outbound_edges + # Drop unreferenced node ids (keep the synthetic root). + referenced = {root_id} | {e["other_id"] for e in display_edges} + nodes = {nid: nd for nid, nd in nodes.items() if nid in referenced} + + env = Envelope( + status="ok", + nodes=nodes, + edges=display_edges, + root=root_id, + warnings=warnings, + truncated=truncated, + ) + next_actions_hook(env, root=root_id, result_edges=display_edges) + print(render(env, fmt=args.format, detail=args.detail, noun="connection")) + return 0 + + +def _resolve_source_path(cfg, file_arg: str) -> Path | None: + """Resolve to an existing path: absolute, else cfg.source_root/. 
+ + Returns None when neither exists (callers render a graceful envelope). + """ + p = Path(file_arg) + if p.is_absolute() and p.is_file(): + return p + src = Path(cfg.source_root) if cfg.source_root else Path.cwd() + candidate = src / file_arg + if candidate.is_file(): + return candidate + return None + + +def _cmd_outline(args: argparse.Namespace) -> int: + """outline — list every Symbol whose declared location is in . + + Calls find_symbols_in_file_range(graph, filename=, start_line=1, + end_line=2**31-1). start_line MUST be >=1 (the backend returns [] for + start_line<1). UNBOUNDED: no --limit cap (the entire file's symbol table + is returned). --limit is accepted (inherited common flag) but does not + truncate; the agent spec calls this out in --help. + """ + from ladybug_queries import find_symbols_in_file_range + + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + filename = args.file + # find_symbols_in_file_range matches s.filename = $fn exactly. The graph + # stores filenames as POSIX-relative paths from source root (build_ast_graph + # line 534: `rel_path = abs_path_resolved.relative_to(source_root).as_posix()`). + # We pass the user's input through directly; if no match, the result is [] + # (graceful, not crash). 
+ try: + hits = find_symbols_in_file_range( + graph, + filename=filename, + start_line=1, + end_line=2**31 - 1, + ) + except Exception as exc: + env = Envelope(status="error", message=f"outline failed: {exc}") + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + nodes: dict[str, dict] = {} + for h in hits: + nodes[h.id] = _symbol_hit_to_dict(h) + + warnings: list[str] = [] + # --limit is accepted (common flag) but outline is documented unbounded; + # surface a warning when the user explicitly set --limit away from the + # default so they know it has no effect (plan principle: inapplicable flags + # never silently ignored). + if args.limit is not None and args.limit != 20: + warnings.append("--limit does not apply to outline (unbounded by design)") + + env = Envelope(status="ok", nodes=nodes, warnings=warnings) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="symbol")) + return 0 + + +def _cmd_imports(args: argparse.Namespace) -> int: + """imports — tree-sitter parse + resolve_v2 per imported FQN. + + Reads from disk (cfg.source_root / for relative paths), + parses with ast_java.parse_java, walks explicit_imports (dict: simple_name + -> FQN), then resolves each FQN via resolve_v2 against the graph. Returns + a node per import: resolved graph Symbol when resolve_v2 hits (status=one), + or an unresolved placeholder carrying the raw FQN otherwise. 
+ """ + from ast_java import parse_java + from resolve_service import resolve_v2 + + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + file_path = _resolve_source_path(cfg, args.file) + if file_path is None: + env = Envelope( + status="error", + message=( + f"file not found: {args.file!r} (looked at the literal path and at " + f"/{args.file})" + ), + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + try: + src = file_path.read_bytes() + except OSError as exc: + env = Envelope(status="error", message=f"could not read {file_path}: {exc}") + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # parse_java is robust to invalid source (returns an empty JavaFileAst on + # parse errors, never raises). It builds imports from the + # `import_declaration` tree-sitter nodes via `_import_declaration_is_static` + # (ast_java.py:905) and the scoped_identifier child walk (ast_java.py:2658). + # explicit_imports: dict[str, str] = simple_name -> FQN (non-wildcard, + # non-static); we also surface wildcard/static imports as unresolved rows so + # the agent sees the full import block. + ast = parse_java(src, filename=args.file) + nodes: dict[str, dict] = {} + edges: list[dict] = [] + warnings: list[str] = [] + # Mirror outline: --limit is accepted (common flag) but imports returns the + # full import block; surface a warning when the user explicitly set --limit + # away from the default so they know it has no effect. + if args.limit is not None and args.limit != 20: + warnings.append("--limit does not apply to imports (the full import block is returned)") + + # Static + wildcard imports: rendered as unresolved rows (resolve_v2 only + # matches type Symbols, not methods or wildcards). 
+ unresolved_imports: list[dict] = [] + for ident in ast.wildcard_imports: + unresolved_imports.append({"fqn": f"{ident}.*", "kind": "wildcard"}) + for simple, fqn in ast.file_imports.static_methods.items(): + unresolved_imports.append({"fqn": fqn, "kind": "static_method", "name": simple}) + for prefix in ast.file_imports.static_wildcards: + unresolved_imports.append({"fqn": f"{prefix}.*", "kind": "static_wildcard"}) + + # Explicit type imports: resolve each via resolve_v2. + resolved_count = 0 + unresolved_count = 0 + for simple, fqn in ast.explicit_imports.items(): + out = resolve_v2(fqn, hint_kind="symbol", graph=graph) + if out.status == "one" and out.node is not None: + ref = out.node + node_dict = _noderef_to_node_dict(ref) + node_dict["import_fqn"] = fqn + node_dict["import_simple"] = simple + nodes[ref.id] = node_dict + edges.append({"other_id": ref.id, "edge_type": "IMPORTS", "resolved": True}) + resolved_count += 1 + else: + # Use a stable synthetic id so unresolved imports round-trip JSON. + synthetic_id = f"import:{fqn}" + nodes[synthetic_id] = { + "id": synthetic_id, + "kind": "unresolved_import", + "fqn": fqn, + "name": simple, + "import_simple": simple, + "import_fqn": fqn, + } + edges.append({"other_id": synthetic_id, "edge_type": "IMPORTS", "resolved": False}) + unresolved_count += 1 + + # Append unresolved static/wildcard imports as additional rows. 
+ for entry in unresolved_imports: + fqn = entry["fqn"] + synthetic_id = f"import:{fqn}" + nodes[synthetic_id] = { + "id": synthetic_id, + "kind": "unresolved_import", + "fqn": fqn, + "name": fqn.rsplit(".", 1)[-1], + "import_kind": entry.get("kind", ""), + } + edges.append({"other_id": synthetic_id, "edge_type": "IMPORTS", "resolved": False}) + + if ast.parse_error: + warnings.append("tree-sitter reported a parse_error for this file (imports extracted best-effort)") + + env = Envelope(status="ok", nodes=nodes, edges=edges, warnings=warnings) + next_actions_hook(env, result_edges=edges) + print(render(env, fmt=args.format, detail=args.detail, noun="import")) + return 0 + + +# ============================================================================ +# PR-JRAG-4: orientation commands (microservices / map / conventions / overview) +# + semantic search. +# +# Orientation commands compose counts and listings from LadybugGraph methods +# and focused Cypher lookups (graph._rows). They render as inspect-shape +# (kv-block + nested dict sections) so the agent sees compact structured data. +# +# Search dispatches to search_v2 (mcp_v2.search_v2) after building a NodeFilter +# from flags. --fuzzy is registered on the parser but rejected IN-HANDLER with +# status: error (not argparse exit) so the envelope carries the message. 
+# ============================================================================ + + +def _cmd_microservices(args: argparse.Namespace) -> int: + """microservices — list every microservice with its resolved type count.""" + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + counts = graph.microservice_counts() + warnings = _warn_inapplicable_common(args, service=True, module=True, limit=True) + env = Envelope( + status="ok", + nodes={"microservices": {"counts": dict(counts)}}, + warnings=warnings, + ) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="microservices", shape="inspect")) + return 0 + + +def _cmd_map(args: argparse.Namespace) -> int: + """map [--by microservice|module] [--service] [--module] — counts per kind. + + ``--by`` selects the grouping axis (default microservice). ``--service`` / + ``--module`` narrow the count to one service / module (filters, independent + of the axis). Previously ``--module`` was overloaded to also switch the + axis, which made "group by ALL modules" unreachable. + """ + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + # Grouping axis: --by (validated by argparse choices). --module is a filter only. 
+ group_col = args.by + scope_clauses: list[str] = [] + params: dict = {} + if args.service: + scope_clauses.append("s.microservice = $ms") + params["ms"] = args.service + if args.module: + scope_clauses.append("s.module = $mod") + params["mod"] = args.module + scope_clause = " AND " + " AND ".join(scope_clauses) if scope_clauses else "" + + rows = graph._rows( # noqa: SLF001 - counts compose query (same pattern as _scope_counts) + f"MATCH (s:Symbol) WHERE s.resolved " + f"AND s.kind IN ['class','interface','enum','record','annotation']" + f"{scope_clause} " + f"RETURN s.{group_col} AS scope, s.kind AS kind, count(*) AS n", + params, + ) + grouped: dict[str, dict[str, int]] = {} + for r in rows: + scope = str(r.get("scope") or "(unscoped)") + kind = str(r.get("kind") or "(unknown)") + grouped.setdefault(scope, {})[kind] = int(r.get("n") or 0) + + # --service/--module are applied above (scope_clauses); --limit is not (this + # is an aggregate count, not a row fetch). + warnings = _warn_inapplicable_common(args, service=False, module=False, limit=True) + env = Envelope( + status="ok", + nodes={"map": {"group_by": group_col, "counts": grouped}}, + warnings=warnings, + ) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="map", shape="inspect")) + return 0 + + +def _cmd_conventions(args: argparse.Namespace) -> int: + """conventions [--service] — dominant roles + framework tallies.""" + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + _, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + scope_clause = "" + params: dict = {} + if args.service: + scope_clause = " AND s.microservice = $ms" + params["ms"] = args.service + + role_rows = graph._rows( # noqa: SLF001 - counts compose query + f"MATCH (s:Symbol) WHERE s.resolved AND s.role IS NOT NULL AND s.role <> ''" + f"{scope_clause} " + f"RETURN s.role AS role, count(*) AS n ORDER BY n DESC", + 
params, + ) + role_counts: dict[str, int] = {} + for r in role_rows: + role = str(r.get("role") or "") + if role: + role_counts[role] = int(r.get("n") or 0) + + # Framework tallies: a direct count of Route nodes by framework. This query + # is graph-wide; the --service scope above applies to the role tallies only. + fw_rows = graph._rows( # noqa: SLF001 - counts compose query + "MATCH (r:Route) WHERE r.framework IS NOT NULL AND r.framework <> '' " + "RETURN r.framework AS framework, count(*) AS n ORDER BY n DESC" + ) + framework_counts: dict[str, int] = {} + for r in fw_rows: + fw = str(r.get("framework") or "") + if fw: + framework_counts[fw] = int(r.get("n") or 0) + + # --service is applied above; --module/--limit are not (no module clause; + # aggregate count). + warnings = _warn_inapplicable_common(args, service=False, module=True, limit=True) + env = Envelope( + status="ok", + nodes={"conventions": {"roles": role_counts, "frameworks": framework_counts}}, + warnings=warnings, + ) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="conventions", shape="inspect")) + return 0 + + +def _overview_detect_type(subject: str, graph) -> str: + """Auto-detect the subject type for `overview`. + + Returns "route" | "microservice" | "topic". Heuristics: + * Starts with '/' → route. + * Matches a known microservice name (microservice_counts keys) → microservice. + * Else → topic (catch-all for messaging strings). 
+ """ + if subject.startswith("/"): + return "route" + try: + ms_counts = graph.microservice_counts() + except Exception: + ms_counts = {} + if subject in ms_counts: + return "microservice" + return "topic" + + +def _overview_microservice(args: argparse.Namespace, graph, microservice: str) -> int: + """overview microservice bundle: counts + routes + clients + producers.""" + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + limit = _clamped_limit(args) + routes = graph.list_routes(microservice=microservice, limit=limit + 1) + clients = graph.list_clients(microservice=microservice, limit=limit + 1) + producers = graph.list_producers(microservice=microservice, limit=limit + 1) + + bundle = { + "microservice": microservice, + "routes": len(routes), + "clients": len(clients), + "producers": len(producers), + } + # Include sample entities (entities + listeners + jobs) for the service. + try: + entities = graph.list_by_role( + role="ENTITY", microservice=microservice, limit=limit + 1 + ) + bundle["entities"] = len(entities) + except Exception: + pass + + env = Envelope( + status="ok", + nodes={f"microservice:{microservice}": { + "kind": "microservice", + "fqn": microservice, + "name": microservice, + "microservice": microservice, + "bundle": bundle, + "route_sample": [{"path": r.get("path", ""), "framework": r.get("framework", "")} for r in routes[:5]], + "client_sample": [{"fqn": c.get("member_fqn", ""), "target_service": c.get("target_service", "")} for c in clients[:5]], + "producer_sample": [{"topic": p.get("topic", ""), "producer_kind": p.get("producer_kind", "")} for p in producers[:5]], + }}, + ) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="overview", shape="inspect")) + return 0 + + +def _overview_route(args: argparse.Namespace, cfg, graph, route_path: str) -> int: + """overview route: resolve + trace_request_flow (same as `flow`).""" + from 
java_codebase_rag.jrag_envelope import Envelope, next_actions_hook, resolve_query + from java_codebase_rag.jrag_render import render + + limit = _clamped_limit(args) + node, renv = resolve_query( + route_path, hint_kind="route", java_kind=None, role=None, fqn_prefix=None, + cfg=cfg, graph=graph, + ) + if renv.status != "ok" or node is None: + print(render(renv, fmt=args.format, detail=args.detail)) + return 2 if renv.status == "error" else 0 + + if node.kind != "route": + env = Envelope( + status="error", + message=f"overview --as route expects a Route; resolved kind is {node.kind!r}.", + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + max_hops = max(1, min(8, 5)) + flow_data = graph.trace_request_flow(entry_route_id=node.id, max_hops=max_hops) + root_id = node.id + nodes_dict: dict[str, dict] = {root_id: _noderef_to_node_dict(node)} + edges: list[dict] = [] + for row in flow_data.get("inbound", []): + caller_id = str(row.get("caller_node_id") or "") + if not caller_id: + continue + kind = str(row.get("caller_node_kind") or "") + nodes_dict[caller_id] = { + "id": caller_id, "kind": kind, + "fqn": str(row.get("declaring_symbol_fqn") or ""), + "microservice": str(row.get("microservice") or ""), + } + edges.append({ + "other_id": caller_id, + "edge_type": "HTTP_CALLS" if kind == "client" else "ASYNC_CALLS", + "confidence": float(row.get("confidence") or 0.0), + }) + for row in flow_data.get("outbound", []): + next_id = str(row.get("next_symbol_id") or "") + if not next_id: + continue + nodes_dict[next_id] = { + "id": next_id, "kind": "symbol", + "fqn": str(row.get("next_fqn") or ""), + "microservice": str(row.get("next_microservice") or ""), + } + edges.append({"other_id": next_id, "edge_type": "CALLS"}) + truncated = len(edges) > limit + if truncated: + edges = edges[:limit] + env = Envelope(status="ok", nodes=nodes_dict, edges=edges, root=root_id, truncated=truncated) + next_actions_hook(env, root=root_id, result_edges=edges) + 
print(render(env, fmt=args.format, detail=args.detail, noun="overview")) + return 0 + + +def _overview_topic(args: argparse.Namespace, graph, topic: str) -> int: + """overview topic: producers + consumers for a topic string.""" + from java_codebase_rag.jrag_envelope import Envelope, next_actions_hook + from java_codebase_rag.jrag_render import render + + limit = _clamped_limit(args) + # Producers: prefix match on the full topic string first; shorter-prefix fallback below. + producers = graph.list_producers(topic_prefix=topic, limit=limit + 1) + if not producers and len(topic) >= 3: + # Try a 3-character prefix if the full-topic prefix yields nothing. + producers = graph.list_producers(topic_prefix=topic[:3], limit=limit + 1) + producers = [p for p in producers if topic in str(p.get("topic") or "")] + + # Consumers: listener classes consuming this topic via EXPOSES on Route. + consumers = _resolve_topic_consumers(graph, topic=topic, prefix=False) + if not consumers: + consumers = _resolve_topic_consumers(graph, topic=topic, prefix=True) + + topic_node = { + "kind": "topic", + "fqn": topic, + "name": topic, + "topic": topic, + "producers": [ + { + "fqn": str(p.get("member_fqn") or ""), + "topic": str(p.get("topic") or ""), + "producer_kind": str(p.get("producer_kind") or ""), + "microservice": str(p.get("microservice") or ""), + } + for p in producers[:limit] + ], + "consumers": [ + { + "id": c.get("id", ""), + "fqn": c.get("fqn", ""), + "kind": c.get("kind", "symbol"), + "microservice": c.get("microservice", ""), + } + for c in consumers[:limit] + ], + } + env = Envelope( + status="ok", + nodes={f"topic:{topic}": topic_node}, + ) + next_actions_hook(env) + print(render(env, fmt=args.format, detail=args.detail, noun="overview", shape="inspect")) + return 0 + + +def _cmd_overview(args: argparse.Namespace) -> int: + """overview [--as ...] 
— dispatch on type.""" + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + subject = args.subject + if not subject: + # Subject is optional on the parser (nargs='?') so we can emit a helpful + # explanation instead of argparse's opaque "the following arguments are + # required: subject". Prints to stderr (usage guidance) + a status:error + # envelope to stdout, exit 2. + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + msg = ( + "overview requires a : a microservice name (e.g. 'chat-core'), " + "a route path (e.g. '/api/v1/chat/events'), or a topic string " + "(e.g. 'banking.chat.audit'). Use --as {microservice,route,topic} to " + "override auto-detection." + ) + print(render(Envelope(status="error", message=msg), fmt=args.format, detail=args.detail)) + return 2 + as_type = getattr(args, "as_type", None) + if as_type is None: + as_type = _overview_detect_type(subject, graph) + + if as_type == "route": + return _overview_route(args, cfg, graph, subject) + if as_type == "microservice": + return _overview_microservice(args, graph, subject) + return _overview_topic(args, graph, subject) + + +# ============================================================================ +# Search (PR-JRAG-4) +# ============================================================================ + + +def _cmd_search(args: argparse.Namespace) -> int: + """search — semantic search via search_v2 over Lance tables. + + Builds a NodeFilter from flags, calls search_v2 with limit+1 for +1-fetch + truncation, and renders. --fuzzy is rejected IN-HANDLER (not argparse-exit) + so the error carries the canonical envelope shape. 
+ """ + import mcp_v2 + + from java_codebase_rag.jrag_envelope import Envelope, mark_truncated, next_actions_hook, normalize_enum + from java_codebase_rag.jrag_render import render + + # --fuzzy: registered on the parser (so argparse doesn't exit 2), but rejected + # IN-HANDLER with status: error (search is inherently semantic; --fuzzy is + # a no-op synonym, not a real mode toggle). + if getattr(args, "fuzzy", False): + env = Envelope( + status="error", + message="search is semantic; --fuzzy is implicit", + ) + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + cfg, graph, rc = _load_graph_or_error(args) + if rc: + return rc + + limit = min(args.limit if args.limit is not None else 20, 499) + + # Build NodeFilter from flags (same set as `find` filter mode). + filter_dict: dict = {} + if args.service: + filter_dict["microservice"] = args.service + if args.module: + filter_dict["module"] = args.module + if args.role: + filter_dict["role"] = normalize_enum(args.role, kind="role") + if args.exclude_role: + filter_dict["exclude_roles"] = [normalize_enum(args.exclude_role, kind="role")] + if args.annotation: + filter_dict["annotation"] = args.annotation + if args.capability: + filter_dict["capability"] = args.capability + if args.fqn_prefix: + filter_dict["fqn_prefix"] = args.fqn_prefix + if args.java_kind: + filter_dict["symbol_kind"] = normalize_enum(args.java_kind, kind="java_kind") + if args.framework: + filter_dict["framework"] = normalize_enum(args.framework, kind="framework") + node_filter, err_env = _build_node_filter_or_error(filter_dict) + if err_env is not None: + print(render(err_env, fmt=args.format, detail=args.detail)) + return 2 + + out = mcp_v2.search_v2( + args.query, + table=args.table, + hybrid=args.hybrid, + limit=limit + 1, # +1 for truncated detection + offset=args.offset, + path_contains=args.path_contains, + filter=node_filter, + graph=graph, + ) + + if not out.success: + env = Envelope(status="error", message=out.message or 
"search failed") + print(render(env, fmt=args.format, detail=args.detail)) + return 2 + + # Convert SearchHit list to envelope node dicts. + hit_dicts: list[dict] = [] + for hit in out.results: + d = hit.model_dump() if hasattr(hit, "model_dump") else dict(hit) + # Ensure an `id` key for envelope nodes (SearchHit carries chunk_id + + # optional symbol_id; use chunk_id as the envelope node id). + if "id" not in d: + d["id"] = d.get("chunk_id") or d.get("symbol_id") or d.get("fqn") or "" + if "kind" not in d: + d["kind"] = "search_hit" + hit_dicts.append(d) + + display, truncated = mark_truncated(hit_dicts, limit) + nodes = {n["id"]: n for n in display} if display else {} + + env = Envelope(status="ok", nodes=nodes, truncated=truncated) + next_actions_hook(env) + next_offset = args.offset + limit if truncated else None + print(render(env, fmt=args.format, detail=args.detail, noun="search", next_offset=next_offset)) + return 0 + + +def _suppress_runtime_stderr_noise() -> None: + """Silence known-benign stderr noise from the embedding/LanceDB stack. + + The CLI loads sentence_transformers + LanceDB per invocation; both emit + benign stderr noise that an agent-facing tool should not dump on the caller: + + * tqdm ``Loading weights`` progress bar (sentence_transformers model load) + * HuggingFace hub progress bars / telemetry + * torch multiprocessing ``leaked semaphore objects`` ``resource_tracker`` + UserWarning emitted at shutdown + + Real diagnostics (the top-level handler's ``traceback.format_exc()``) still + go to stderr. Env vars are set with ``setdefault`` so an explicit caller + override wins. The ``resource_tracker`` warning is raised inside a spawned + child process; under the spawn start method (macOS default) the child + re-initializes ``warnings`` and does NOT inherit the parent's + ``warnings.filterwarnings``, so we route it through ``PYTHONWARNINGS`` (env + vars ARE inherited by spawned children) as well as the parent filter. 
+ """ + for key, val in ( + ("TQDM_DISABLE", "1"), + ("TRANSFORMERS_VERBOSITY", "error"), + ("HF_HUB_DISABLE_PROGRESS_BARS", "1"), + ("HF_HUB_DISABLE_TELEMETRY", "1"), + # mcp_v2._log_fail_loud operator diagnostic — the CLI surfaces the same + # failure as a clean status:error envelope, so silence the stderr line. + ("JAVA_CODEBASE_RAG_FAIL_LOUD", "0"), + ): + os.environ.setdefault(key, val) + existing_pw = os.environ.get("PYTHONWARNINGS", "") + extra_pw = "ignore:resource_tracker:UserWarning" + if extra_pw not in existing_pw: + os.environ["PYTHONWARNINGS"] = f"{existing_pw},{extra_pw}" if existing_pw else extra_pw + import warnings + + warnings.filterwarnings("ignore", message=r"resource_tracker.*", category=UserWarning) + + +def main(argv: list[str] | None = None) -> int: + """Process-level entry. Returns the exit code. + + First line raises the FD soft limit (lancedb merge-insert opens many + handles; macOS IDE-launched soft limit is 256). Returns 0 on ok, 1 on + usage error (argparse rejects argv), 2 on handler exception. The top-level + exception handler emits a ``status: error`` envelope to stdout AND + ``traceback.format_exc()`` to stderr before returning 2 - this is a + deliberate divergence from the operator CLI which swallows tracebacks. + """ + raise_fd_limit() + _suppress_runtime_stderr_noise() + parser = build_parser() + raw = list(argv if argv is not None else sys.argv[1:]) + try: + args = parser.parse_args(raw) + except SystemExit as exc: + # argparse with exit_on_error=False raises SystemExit on -h/--help + # (code 0) and ArgumentError-propagated paths. Treat 0/None as ok and + # any other code as usage error (exit 1). + if exc.code in (0, None): + return 0 + return 1 + except argparse.ArgumentError as exc: + # exit_on_error=False routes argparse usage errors here. We deliberately + # surface them on stderr (no envelope to stdout) and exit 1 - the agent + # gets a clear "usage error" signal distinct from internal failures (2). 
+ print(f"jrag: {exc}", file=sys.stderr) + return 1 + handler = getattr(args, "handler", None) + if handler is None: + # No subcommand: print help to stderr, return usage error. + parser.print_help(sys.stderr) + return 1 + try: + return int(handler(args)) + except Exception as exc: + from java_codebase_rag.jrag_envelope import Envelope + from java_codebase_rag.jrag_render import render + + env = Envelope( + status="error", + message=f"internal error: {exc}", + ) + print(render(env, fmt=getattr(args, "format", "text"))) + print(traceback.format_exc(), file=sys.stderr) + return 2 + + +def _console_script_main() -> None: + """Real CLI entry: terminate without interpreter finalization. + + Mirrors ``java_codebase_rag.cli._console_script_main``: a pyarrow/lance + worker thread (loaded via lancedb in lifecycle commands) can outlive CPython + finalization in a one-shot CLI subprocess and trip ``PyGILState_Release`` + (SIGABRT, exit -6). Flushing + ``os._exit`` skips that racy teardown - the + command has already done its work and emitted its result. ``main()`` stays + return-based so in-process test callers keep working. + """ + force_utf8_stdio() + rc = main() + sys.stdout.flush() + sys.stderr.flush() + os._exit(rc) + + +if __name__ == "__main__": + _console_script_main() diff --git a/java_codebase_rag/jrag_envelope.py b/java_codebase_rag/jrag_envelope.py new file mode 100644 index 00000000..2ae15f92 --- /dev/null +++ b/java_codebase_rag/jrag_envelope.py @@ -0,0 +1,940 @@ +"""JRAG envelope dataclass + resolve-first mapper + enum normalization (PR-JRAG-1a). + +This is the frozen contract every later JRAG-CLI PR builds on. The envelope is a +lean ``@dataclass`` (not pydantic): backend pydantic outputs cross the boundary +via ``.model_dump()`` exactly once in :func:`to_envelope_rows`. Renderers and +``to_json()`` operate on plain dicts only. 
+ +Lazy imports: :mod:`resolve_service` and :mod:`ladybug_queries` are imported +inside :func:`resolve_query` so this module's import stays light (no torch, no +sentence_transformers, no mcp_v2). The dataclass and pure helpers +(``normalize_enum``/``mark_truncated``/``simple_name``/``to_envelope_rows``) do +not need any backend module. +""" +from __future__ import annotations + +import json +import re +from dataclasses import dataclass, field +from typing import Any, Literal + +from graph_types import NodeRef + +__all__ = [ + "Envelope", + "EnvelopeStatus", + "resolve_query", + "normalize_enum", + "mark_truncated", + "simple_name", + "to_envelope_rows", + "next_actions_hook", + "project_node", + "project_edge", + "project_envelope", + "node_key", +] + + +EnvelopeStatus = Literal["ok", "ambiguous", "not_found", "error"] + +# Explicit lookup tables for kinds whose stored literal is not a plain +# UPPER_SNAKE form of the user's input. Confirmed against java_ontology.py and +# graph_enrich.py source: +# - client_kind literals: feign_method / rest_template / web_client +# (java_ontology.VALID_CLIENT_KINDS) +# - producer_kind literals: kafka_send / stream_bridge_send +# (java_ontology.VALID_PRODUCER_KINDS) +# - source_layer literals: builtin / layer_a_meta / layer_b_ann / +# layer_b_fqn / layer_c_source (graph_enrich.route_source_layer assignments) +# +# Keys are the *normalized* form (lowercase + kebab/space -> underscore). 
+_CLIENT_KIND_TABLE: dict[str, str] = { + "feign": "feign_method", + "feign_method": "feign_method", + "rest_template": "rest_template", + "resttemplate": "rest_template", + "web_client": "web_client", + "webclient": "web_client", +} + +_PRODUCER_KIND_TABLE: dict[str, str] = { + "kafka": "kafka_send", + "kafka_send": "kafka_send", + "stream_bridge": "stream_bridge_send", + "stream_bridge_send": "stream_bridge_send", + "streambridge": "stream_bridge_send", +} + +_SOURCE_LAYER_TABLE: dict[str, str] = { + "builtin": "builtin", + "layer_a": "layer_a_meta", + "layer_a_meta": "layer_a_meta", + "layer_b_ann": "layer_b_ann", + "layer_b_fqn": "layer_b_fqn", + "layer_c": "layer_c_source", + "layer_c_source": "layer_c_source", +} + +_ENUM_LOOKUP_TABLES: dict[str, dict[str, str]] = { + "client_kind": _CLIENT_KIND_TABLE, + "producer_kind": _PRODUCER_KIND_TABLE, + "source_layer": _SOURCE_LAYER_TABLE, +} + + +@dataclass +class Envelope: + """The single output shape every jrag command emits. + + Backend pydantic outputs are converted to plain dicts at the boundary + (``to_envelope_rows``); the renderer and ``to_json()`` operate on dicts + only. ``to_dict()`` omits empty optionals so a clean status=ok envelope + stays small. + + Internal vs agent-facing: ``nodes`` is keyed by the graph node id internally + (handlers build ``nodes[h.id] = ...``), and ``to_dict()`` preserves that + id-keyed shape for debugging / internal use. ``to_json()`` — the CLI output + boundary — is id-free: it re-keys ``nodes`` to each node's natural key + (FQN / path / topic / literal), strips graph-id fields (``id`` / ``*_id``), + and collapses edge id-refs into ``target``. The CLI is resolve-first, so no + raw graph id ever reaches an agent on either the text or the JSON surface. 
+ """ + + status: EnvelopeStatus + nodes: dict[str, dict] = field(default_factory=dict) + edges: list[dict] = field(default_factory=list) + root: str | None = None + candidates: list[dict] = field(default_factory=list) + agent_next_actions: list[str] = field(default_factory=list) + warnings: list[str] = field(default_factory=list) + truncated: bool = False + file_location: str | None = None + # Used to carry the resolve ``message`` for not_found / error envelopes + # (the renderer surfaces it as ``not found: ``). None on ok/ambiguous. + message: str | None = None + + def to_dict(self) -> dict[str, Any]: + """Serialize to a JSON-ready dict, omitting empty optionals. + + Top-level collection fields are shallow-copied (``list(...)`` / + ``dict(...)``); their VALUES are shared references - mutating a node + dict in place will propagate to a prior snapshot. Callers that need + true snapshot isolation across subsequent mutation should + ``copy.deepcopy`` the result. (In practice the envelope is short-lived: + built, rendered via ``to_json()`` in the same call site, then discarded + - so shared references are not a hazard.) + """ + out: dict[str, Any] = {"status": self.status} + if self.nodes: + out["nodes"] = dict(self.nodes) + if self.edges: + out["edges"] = list(self.edges) + if self.root is not None: + out["root"] = self.root + if self.candidates: + out["candidates"] = list(self.candidates) + if self.agent_next_actions: + out["agent_next_actions"] = list(self.agent_next_actions) + if self.warnings: + out["warnings"] = list(self.warnings) + if self.truncated: + out["truncated"] = True + if self.file_location is not None: + out["file_location"] = self.file_location + if self.message is not None: + out["message"] = self.message + return out + + def to_json(self) -> str: + """Serialize to the AGENT-FACING id-free JSON string. 
+ + This is the CLI output boundary: it does NOT delegate to + :meth:`to_dict` (which stays id-keyed for internal/debug use and is + unit-tested as such). Instead it builds a fresh, id-free dict: + + * ``nodes`` is re-keyed from raw graph ids to each node's natural key + via :func:`node_key`; when ``node_key`` returns ``None`` (no + identity field — e.g. status/orientation rollup nodes) the existing + dict key is kept unchanged. Each node value is stripped of graph-id + fields (``id`` / ``*_id``). + * each edge's ``other_id`` / ``dst_id`` / ``target_id`` / ``term_id`` + (only ``other_id`` is ever emitted by handlers) collapses to a + single ``"target"`` holding the referenced node's natural key. A + dangling ref (no matching node) keeps its literal value — never + null, never silently dropped. + * ``root`` becomes the root node's natural key (omitted when absent). + * ``candidates`` are stripped of graph-id fields. + * Envelope-level scalars (``status`` / ``warnings`` / ``truncated`` / + ``file_location`` / ``message`` / ``agent_next_actions``) pass + through unchanged. + + Builds NEW dicts throughout (no in-place mutation of ``self``). + """ + return json.dumps(self._to_idfree_dict()) + + def _to_idfree_dict(self) -> dict[str, Any]: + """Build the id-free agent-facing dict (see :meth:`to_json`).""" + # 1. id -> natural key map (falls back to the existing key when node_key + # returns None, preserving literal keys like "index"/"microservices"). + id_to_key: dict[str, str] = {} + used: set[str] = set() + for id_key, node in self.nodes.items(): + natural = node_key(node) + if natural is None: + # No semantic identity. Keep the existing dict key ONLY when it + # is not a raw graph id (e.g. the literal "index" / "microservices" + # rollup keys). If it IS a raw id (40-hex SHA or a prefixed hash + # form like r:phantom: / ucs:), synthesize an opaque + # positional key so no graph id ever leaks as a JSON key. 
+ if _looks_like_raw_graph_id(id_key): + key = f"node-{len(used)}" + else: + key = id_key + else: + key = natural + # Collision suffix: first occurrence unsuffixed, then #2, #3, ... + if key in used: + base = key + n = 2 + while key in used: + key = f"{base}#{n}" + n += 1 + used.add(key) + id_to_key[id_key] = key + + out: dict[str, Any] = {"status": self.status} + + if self.nodes: + out["nodes"] = { + id_to_key[nid]: _strip_graph_id_fields(dict(node)) + for nid, node in self.nodes.items() + } + if self.edges: + out["edges"] = [self._edge_to_idfree(e, id_to_key) for e in self.edges] + if self.root is not None: + # The root's natural key (falls back to the raw root id only if the + # root node isn't in self.nodes — a defensive no-op in practice). + out["root"] = id_to_key.get(self.root, self.root) + if self.candidates: + out["candidates"] = [_strip_graph_id_fields(dict(c)) for c in self.candidates] + if self.agent_next_actions: + out["agent_next_actions"] = list(self.agent_next_actions) + if self.warnings: + out["warnings"] = list(self.warnings) + if self.truncated: + out["truncated"] = True + if self.file_location is not None: + out["file_location"] = self.file_location + if self.message is not None: + out["message"] = self.message + return out + + @staticmethod + def _edge_to_idfree(edge: dict[str, Any], id_to_key: dict[str, str]) -> dict[str, Any]: + """Copy an edge, collapsing id-ref variants into one ``target`` key. + + Mirrors :func:`jrag_render._node_id`'s variant list. Only ``other_id`` + is emitted by handlers in practice; the others are defensive. The + remaining raw id-ref keys are dropped; all other edge attrs pass through. 
+ """ + out: dict[str, Any] = {} + ref_value: str | None = None + for k, v in edge.items(): + if k in ("other_id", "dst_id", "target_id", "term_id"): + if ref_value is None and isinstance(v, str) and v: + ref_value = v + # skip (don't copy the raw id-ref key) + elif not _is_graph_id_field(k): + out[k] = v + if ref_value is not None: + out["target"] = id_to_key.get(ref_value, ref_value) + return out + + +def simple_name(node_dict: dict[str, Any]) -> str: + """Simple name = ``fqn.rsplit('.', 1)[-1]``. + + ``NodeRef`` carries no ``name`` field; the rendering layer derives a short + label from the FQN on demand. Empty/missing FQN returns "". + """ + fqn = str(node_dict.get("fqn") or "") + if not fqn: + return "" + return fqn.rsplit(".", 1)[-1] + + +def node_key(node: dict[str, Any]) -> str | None: + """Derive a stable, agent-meaningful, NON-graph-id key for a node. + + Used by :meth:`Envelope.to_json` to re-key ``nodes`` (away from raw graph + ids) and to translate edge ``other_id`` refs. Returns ``None`` when no + identity field is derivable, in which case :meth:`Envelope.to_json` keeps + the existing dict key unchanged (this preserves already-id-free literal + keys such as ``"index"`` / ``"microservices"`` / ``"map"`` / ``"conventions"`` + that status / orientation commands build). + + Precedence (first non-empty wins): + * ``fqn`` -> symbols AND route roots. Route roots come from + ``NodeRef.model_dump()`` whose ``fqn`` already + carries ``"METHOD path"`` (no separate path/method + fields), so this single branch covers them. + * ``member_fqn`` -> clients: ``member_fqn->target_service`` (disambiguates + a client member from a symbol of the same name). + * ``topic`` -> producers/topics: ``topic:`` (the ``topic:`` + prefix matches the existing _cmd_topics key shape). + * ``name`` -> fallback for any other named node. 
+ * ``file`` -> unresolved/phantom routes carry no fqn/path/topic/name + but DO carry a composed ``file`` location; keying by + it avoids leaking the raw graph id (e.g. + ``r:phantom:``) when no semantic id exists. + * else -> ``None`` (caller keeps the existing dict key — safe + only when that key is already a non-id literal, e.g. + the ``"index"`` / ``"microservices"`` rollup keys). + """ + fqn = str(node.get("fqn") or "").strip() + if fqn: + return fqn + member_fqn = str(node.get("member_fqn") or "").strip() + if member_fqn: + target = str(node.get("target_service") or "").strip() + return f"{member_fqn}->{target}" if target else member_fqn + topic = str(node.get("topic") or "").strip() + if topic: + return f"topic:{topic}" + name = str(node.get("name") or "").strip() + if name: + return name + file_loc = str(node.get("file") or "").strip() + if file_loc: + return file_loc + return None + + +def to_envelope_rows(pydantic_results: list[Any]) -> list[dict[str, Any]]: + """Pydantic -> dict boundary: ``.model_dump()`` each item exactly once. + + Accepts pydantic models (``.model_dump()``) or plain dicts (passthrough). + Any other type raises ``TypeError`` rather than silently coercing - the + boundary is a single-shape conversion, not a best-effort adapter, and a + non-dict/non-pydantic item signals a backend-contract bug we want to + surface immediately. + """ + out: list[dict[str, Any]] = [] + for item in pydantic_results: + if hasattr(item, "model_dump"): + out.append(item.model_dump()) + elif isinstance(item, dict): + out.append(item) + else: + raise TypeError( + f"to_envelope_rows: expected pydantic model or dict, got {type(item).__name__}" + ) + return out + + +def mark_truncated(rows: list[Any], limit: int) -> tuple[list[Any], bool]: + """+1-fetch trick. + + Pass ``limit+1`` to the backend; this helper drops the overflow row and + reports whether truncation occurred. ``limit`` must be ``>= 0``. 
+ """ + if limit < 0: + raise ValueError(f"mark_truncated: limit must be >= 0, got {limit}") + truncated = len(rows) > limit + if not truncated: + return list(rows), False + return list(rows[:limit]), True + + +def normalize_enum(value: str, *, kind: str) -> str: + """Normalize a user-supplied enum to the graph's stored literal form. + + * role / capability: case + kebab -> UPPER_SNAKE (the stored literals are + uppercase; e.g. ``Controller``/``controller`` -> ``CONTROLLER``). + * framework / java_kind: case + kebab -> lower_snake (stored lowercase). + * client_kind / producer_kind / source_layer: routed through the explicit + lookup tables above (the stored literals are lowercase_snake with + non-obvious suffixes: ``feign`` -> ``feign_method``, ``kafka`` -> + ``kafka_send``, ``layer-a`` -> ``layer_a_meta``). + + Empty input returns empty. Unknown lookup values fall through to the + UPPER_SNAKE path so callers see *something* (validation against the + graph's ``VALID_*`` set happens at the command layer). + """ + raw = (value or "").strip() + if not raw: + return raw + table = _ENUM_LOOKUP_TABLES.get(kind) + if table is not None: + if raw in table: + return table[raw] + norm = raw.lower().replace("-", "_").replace(" ", "_") + if norm in table: + return table[norm] + # Fall through to UPPER_SNAKE for unknown values; the command layer + # validates against VALID_CLIENT_KINDS / VALID_PRODUCER_KINDS / the + # source_layer set and emits an actionable error envelope. + # framework / java_kind (symbol_kind) literals are stored LOWERCASE - both + # in the graph (Route.framework, Symbol.kind) and in the NodeFilter Literal + # types (mcp_v2.Framework / DeclarationSymbolKind). Uppercasing them broke + # `routes --framework`, `find --java-kind` filter mode, and crashed + # `search --framework` with a pydantic ValidationError. role / capability + # stay UPPER_SNAKE (those ARE stored uppercase). 
+ if kind in ("framework", "java_kind"): + return raw.lower().replace("-", "_").replace(" ", "_") + return raw.upper().replace("-", "_").replace(" ", "_") + + +def _matches_post_filters( + node: NodeRef, + *, + java_kind: str | None, + role: str | None, + fqn_prefix: str | None, +) -> bool: + """Client-side post-filter on a resolved node (PR-JRAG-1a resolve-first).""" + if java_kind is not None: + want = normalize_enum(java_kind, kind="java_kind") + # symbol_kind is stored LOWERCASE (DeclarationSymbolKind: class/method/...); + # normalize_enum now returns lowercase for java_kind, so compare on the + # lowercased actual (was upper-vs-upper, which only worked by accident). + actual = (node.symbol_kind or "").lower().replace("-", "_") + if actual != want: + return False + if role is not None: + want = normalize_enum(role, kind="role") + actual = (node.role or "").upper().replace("-", "_") + if actual != want: + return False + if fqn_prefix is not None: + if not (node.fqn or "").startswith(fqn_prefix): + return False + return True + + +def _candidate_to_dict(node: NodeRef, reason: str) -> dict[str, Any]: + """Build a candidate dict for the ambiguous envelope, carrying ``reason``. + + No ``file`` / ``score`` fields — ambiguous candidates are not file pointers + or ranked matches, they are *narrowing* hints (PR-JRAG-1a renderer spec). + """ + return { + "id": node.id, + "fqn": node.fqn, + "kind": node.kind, + "name": simple_name({"fqn": node.fqn}), + "microservice": node.microservice, + "module": node.module, + "role": node.role, + "symbol_kind": node.symbol_kind, + "reason": reason, + } + + +def _constructor_owner_fqn(node: NodeRef) -> str | None: + """If ``node`` is a constructor, return its owning class FQN; else None. + + A constructor's FQN is ``#(args)`` where the member + name equals the class's simple name (``com.x.Foo#Foo(...)``). ``symbol_kind`` + may be ``"constructor"`` or, on older nodes, ``"method"`` — the FQN shape is + authoritative. 
Used by the class-vs-constructor auto-pick in + :func:`resolve_query` so ``inspect/callers/callees `` does not + bounce to "ambiguous" just because the class shares its name with its ctor. + """ + fqn = (node.fqn or "").strip() + if "#" not in fqn: + return None + head, rest = fqn.split("#", 1) + member = rest.split("(", 1)[0].strip() + class_simple = head.rsplit(".", 1)[-1] + if member and member == class_simple: + return head + return None + + +def _node_file_location(graph: Any, node_id: str) -> str | None: + """Fetch ``filename:start_line`` for a resolved node from the graph. + + ``NodeRef`` does not carry ``filename`` / ``start_line`` (graph_types.NodeRef + only has id/kind/fqn/symbol_kind/microservice/module/role); the resolved + node's location is fetched separately via a single-column Cypher lookup. + """ + rows = graph._rows( # noqa: SLF001 - same pattern as mcp_v2._load_node_record + "MATCH (n) WHERE n.id = $id " + "RETURN n.filename AS filename, n.start_line AS start_line LIMIT 1", + {"id": node_id}, + ) + if not rows: + return None + row = rows[0] + filename = str(row.get("filename") or "").strip() + if not filename: + return None + start_line = row.get("start_line") + if start_line: + try: + return f"{filename}:{int(start_line)}" + except (TypeError, ValueError): + return filename + return filename + + +def resolve_query( + identifier: str, + *, + hint_kind: Literal["symbol", "route", "client", "producer"] | None, + java_kind: str | None, + role: str | None, + fqn_prefix: str | None, + cfg: Any, + graph: Any | None = None, +) -> tuple[NodeRef | None, Envelope]: + """Resolve-first mapper: runs ``resolve_v2`` and maps its contract to the envelope. + + * ``one`` -> apply post-filters (``java_kind`` / ``role`` / ``fqn_prefix``) + to the resolved node. If pass: ``(node, env ok)`` with + ``env.file_location`` set from the node's ``filename`` + ``start_line`` + and ``env.root = node.id``. If fail: ``(None, env not_found)``. 
+ * ``many`` -> apply post-filters to candidates. One survivor: treat as ``one``. + No survivors: ``(None, env not_found)``. One type plus only its own + constructors: pick the type. Else ``(None, env ambiguous)``, capped at 10 with ``reason``. + * ``none`` -> ``(None, env not_found)`` with a message mentioning + ``jrag search``. + + ``cfg`` is a ``ResolvedOperatorConfig`` (typed loosely to keep this module + cocoindex-free and to avoid importing the operator config layer here). + ``graph`` is optional for testability; in production the caller passes the + graph it loaded via :func:`jrag._load_graph`. + """ + # Lazy imports - keeps build_parser() / `jrag --help` free of resolve/ladybug. + from resolve_service import resolve_v2 + + if graph is None: + from ladybug_queries import LadybugGraph + + graph = LadybugGraph.get(str(cfg.ladybug_path)) + + out = resolve_v2(identifier, hint_kind=hint_kind, graph=graph) + + if out.status == "one" and out.node is not None: + node = out.node + if _matches_post_filters(node, java_kind=java_kind, role=role, fqn_prefix=fqn_prefix): + env = Envelope(status="ok", root=node.id) + loc = _node_file_location(graph, node.id) + if loc is not None: + env.file_location = loc + return node, env + return None, Envelope( + status="not_found", + message=( + f"No matches for {identifier!r} after applying --java-kind/--role/--fqn-prefix " + "filters; use `jrag search ` for ranked fuzzy lookup." 
+ ), + ) + + if out.status == "many" and out.candidates: + survivors = [ + c for c in out.candidates + if _matches_post_filters(c.node, java_kind=java_kind, role=role, fqn_prefix=fqn_prefix) + ] + if len(survivors) == 1: + node = survivors[0].node + env = Envelope(status="ok", root=node.id) + loc = _node_file_location(graph, node.id) + if loc is not None: + env.file_location = loc + return node, env + if not survivors: + # Every `many` candidate was rejected by the post-filters — there is + # nothing left to disambiguate, so this is not_found, NOT an empty + # ambiguous list (which would render as "0 ambiguous matches" with no + # narrowing value). Same message as the `one` post-filter-fail branch. + return None, Envelope( + status="not_found", + message=( + f"No matches for {identifier!r} after applying --java-kind/--role/--fqn-prefix " + "filters; use `jrag search ` for ranked fuzzy lookup." + ), + ) + # Class-vs-constructor auto-pick: a class and its constructor share a + # simple name, so resolve_v2 returns "many" for ANY + # `inspect/callers/callees/decompose/dependencies `. When the + # survivors are exactly ONE type (FQN with no '#') plus one-or-more + # constructors OF THAT SAME TYPE, auto-pick the type. The constructor + # stays reachable via its explicit FQN or `--java-kind constructor`. + # Two genuinely-different types (same simple name across services) still + # surface as ambiguous — we never silently guess across distinct classes. 
+ if len(survivors) >= 2: + type_survivors = [c for c in survivors if "#" not in (c.node.fqn or "")] + member_survivors = [c for c in survivors if "#" in (c.node.fqn or "")] + if len(type_survivors) == 1 and member_survivors: + type_fqn = (type_survivors[0].node.fqn or "").strip() + if all(_constructor_owner_fqn(c.node) == type_fqn for c in member_survivors): + node = type_survivors[0].node + env = Envelope(status="ok", root=node.id) + loc = _node_file_location(graph, node.id) + if loc is not None: + env.file_location = loc + return node, env + capped = survivors[:10] + env = Envelope( + status="ambiguous", + candidates=[_candidate_to_dict(c.node, c.reason) for c in capped], + ) + return None, env + + # status == "none" (or "one"/"many" with missing data — treat as not_found). + raw_msg = out.message or f"No matches for {identifier!r}." + # Always surface the CLI-specific `jrag search` hint (resolve_v2's built-in + # message references the MCP `search(query=...)` form, which is wrong for + # the agent-facing CLI). + if "jrag search" not in raw_msg: + raw_msg = f"{raw_msg} Use `jrag search ` for ranked fuzzy lookup." + return None, Envelope(status="not_found", message=raw_msg) + + +def next_actions_hook( + envelope: Envelope, + root: str | None = None, + edge_summary: dict[str, Any] | None = None, + result_edges: list[dict[str, Any]] | None = None, + command: str | None = None, +) -> list[str]: + """Populate ``envelope.agent_next_actions`` via :mod:`jrag_hints` (PR-JRAG-4). + + Every command that produces edges or an edge_summary calls this hook. The + hook extracts the root node's FQN from ``envelope.nodes[root]`` and delegates + to :func:`jrag_hints.next_actions`, which maps edge labels → ``jrag + `` hints (≤5, zero-direction suppressed, dot-keys covered). The result + is assigned to ``envelope.agent_next_actions`` (auto-omitted from + ``to_dict()`` when empty — see :meth:`Envelope.to_dict`). 
+ + Skipped (returns ``[]``) when: + * ``root`` is ``None`` (listing / find / outline commands — no single root). + * The root node is absent from ``envelope.nodes`` (defensive). + * The root node's ``fqn`` is empty/missing. + * The root node is a synthetic kind (``microservice`` / ``topic`` / + ``unresolved_import``) — hints targeting a synthetic id would never + resolve and would mislead the agent. + + Args: + envelope: Output envelope; ``agent_next_actions`` is set in place on success. + root: The root node id (for commands that resolve a single node). + edge_summary: The edge_summary from describe_v2 (inspect command only). + result_edges: Traversal edge rows, used when ``edge_summary`` is ``None``. + command: The jrag subcommand just run, forwarded as ``current_command`` + so the traversal path can drop the self-hint. + + Returns: + The list of hint strings assigned to ``envelope.agent_next_actions`` + (empty when the hook was a no-op for this call). + """ + if root is None: + return [] + root_node = envelope.nodes.get(root) + if root_node is None: + return [] + root_fqn = str(root_node.get("fqn") or "").strip() + if not root_fqn: + return [] + # Suppress hints for synthetic roots (microservice connection view, topic + # grouping, unresolved imports) — these would produce ``jrag callees `` + # style hints that would never resolve. + kind = str(root_node.get("kind") or "") + if kind in ("microservice", "topic", "unresolved_import"): + return [] + from java_codebase_rag.jrag_hints import next_actions + + envelope.agent_next_actions = next_actions( + root_fqn=root_fqn, + edge_summary=edge_summary, + result_edges=result_edges if result_edges is not None else list(envelope.edges), + current_command=command, + ) + return envelope.agent_next_actions + + +# --------------------------------------------------------------------------- +# Output detail projection (PR-JRAG-6). +# +# ``--detail brief|normal|full`` is ORTHOGONAL to ``--format text|json``.
The +# renderer calls :func:`project_envelope` once, then BOTH the JSON path and the +# text renderers consume the trimmed dict — so ``--format json --detail brief`` +# and ``--format text --detail brief`` go through the SAME field set. +# +# Detail was previously decided per-handler at node-dict construction +# (``_symbol_hit_to_dict`` trimmed; ``SearchHit.model_dump()`` carried the full +# snippet), which coupled "how much" to "which format" and made JSON dump 50- +# line snippets + 10 empty fields while text showed only ``Name @service``. +# Inverting to "carry full, trim at one seam" makes the two axes independent. +# +# Key-sets are CATEGORY-based (intersected with each node's present keys), so +# they are kind-agnostic and auto-handle new node kinds: a route at ``normal`` +# shows the same categories of fields as a symbol at ``normal``. +# --------------------------------------------------------------------------- + +# Raw location columns carried by SymbolHit; folded into the display field +# ``file`` by :func:`_compose_file`. They are NOT display fields themselves. +_RAW_LOCATION_KEYS = frozenset( + {"filename", "start_line", "end_line", "start_byte", "end_byte"} +) + +# Identity only == the keys the text renderers' display_name / tiered_name read. +# Reproduces today's terse text output exactly at ``brief``. ``reason`` is +# candidate-structural (the ambiguous narrowing hint), so it survives at every +# level — a candidate without its reason is useless. +# +# NOTE: ``id`` is intentionally ABSENT. Graph node ids (40-hex SHAs) are an +# internal join key, never an agent-facing identifier — the CLI is resolve-first +# (agents pass FQN / simple name / route / topic). :func:`project_node` drops +# ``id`` and every ``*_id`` graph foreign key at every detail level via +# :func:`_is_graph_id_field`; see the boundary-strip rule there. 
+_BRIEF_NODE_KEYS: frozenset[str] = frozenset( + { + "kind", + "fqn", + "name", + "microservice", + "path", + "method", + "topic", + "member_fqn", + "target_service", + "broker", + "client_kind", + "producer_kind", + "import_simple", + "import_fqn", + "import_kind", + "resolved", + "reason", + } +) + +# brief + location / classification / ranking. ``file`` is the composed +# ``filename:start_line`` display field (see :func:`_compose_file`). +_NORMAL_NODE_KEYS: frozenset[str] = _BRIEF_NODE_KEYS | frozenset( + {"module", "role", "symbol_kind", "framework", "file", "score"} +) + +# Edge attrs the text renderers read at the default level (target id variants +# across backends + the grouping/confidence keys). +_BRIEF_EDGE_KEYS: frozenset[str] = frozenset( + { + "other_id", + "dst_id", + "target_id", + "term_id", + "edge_type", + "stored_edge_type", + "label", + "type", + "confidence", + "direction", + "section", + "stage", + "resolved", + } +) + +# brief + the cheap edge attrs (injection mechanism, role label, origin fqn). +_NORMAL_EDGE_KEYS: frozenset[str] = _BRIEF_EDGE_KEYS | frozenset( + {"mechanism", "role", "from_fqn"} +) + + +def _is_empty(value: Any) -> bool: + """True for values that carry no information: ``None`` / ``""`` / ``[]`` / ``{}``. + + ``False`` and ``0`` / ``0.0`` are NOT empty (they are meaningful: an + unresolved ``resolved=False`` flag, a ``0.0`` confidence). Only None and + zero-length containers are dropped. + """ + if value is None: + return True + if isinstance(value, (str, list, dict)) and len(value) == 0: + return True + return False + + +def _is_graph_id_field(key: str) -> bool: + """True for raw graph node-id fields stripped at the CLI boundary. + + Boundary-strip rule for the agent-facing surface: the CLI is resolve-first + (agents pass FQN / simple name / route / topic — never a raw id), so no + graph-internal id or graph foreign-key column reaches text or JSON. 
The rule + is ``key == "id" or key.endswith("_id")``, which catches ``id``, + ``parent_id`` (SymbolHit), ``chunk_id`` / ``symbol_id`` (SearchHit), and + ``member_id`` (raw list_clients/list_producers rows), plus any future graph + FK. No agent-meaningful field in this domain uses the ``_id`` suffix — + topics are keyed by name, routes by path — so the suffix rule is safe. + + Applied in :func:`project_node` (fields) and :meth:`Envelope.to_json` + (boundary reshape); the internal envelope + :meth:`Envelope.to_dict` stay + id-keyed for join/debug use. + """ + return key == "id" or key.endswith("_id") + + +def _strip_graph_id_fields(node: dict[str, Any]) -> dict[str, Any]: + """Return a copy of ``node`` with graph-id fields removed (see :func:`_is_graph_id_field`). + + RECURSES into nested dicts and list-of-dicts values so ids embedded in + sub-records are stripped too — e.g. the ``data`` sub-dict on a + ``NodeRecord.model_dump()`` (the ``inspect`` envelope node) carries its own + ``id`` / ``parent_id``; a top-level-only strip would leave them. Scalars and + non-dict lists pass through unchanged. + """ + out: dict[str, Any] = {} + for k, v in node.items(): + if _is_graph_id_field(k): + continue + out[k] = _strip_nested_ids(v) + return out + + +def _looks_like_raw_graph_id(key: str) -> bool: + """Heuristic: does this string look like a raw graph node id? + + Used by :meth:`Envelope._to_idfree_dict` to decide whether to synthesize an + opaque positional key when a node has no semantic identity (see + :func:`node_key` returning ``None``). Matches 40-hex SHA-1 ids and the + prefixed hash forms the graph builder emits (``r:phantom:``, + ``ucs:``, ``sym:``, ``chunk:``). Meaningful literal keys + (``"index"``, ``"microservices"``) and handler-built synthetics + (``microservice:``, ``import:``, ``topic:``) do NOT match. 
+ """ + if not key: + return False + if _SHA1_RE.fullmatch(key): + return True + # prefixed hash forms: "::" + if ":" in key: + head = key.split(":", 1)[0] + tail = key.rsplit(":", 1)[-1] + if head in _GRAPH_ID_PREFIXES and _HEX_TAIL_RE.search(tail): + return True + return False + + +_GRAPH_ID_PREFIXES = frozenset({"r", "ucs", "sym", "chunk", "route", "member"}) +_SHA1_RE = re.compile(r"[0-9a-f]{40}") +_HEX_TAIL_RE = re.compile(r"[0-9a-f]{8,}") + + +def _strip_nested_ids(value: Any) -> Any: + """Recursively strip graph-id fields from nested dicts / list-of-dicts.""" + if isinstance(value, dict): + return _strip_graph_id_fields(value) + if isinstance(value, list): + return [_strip_nested_ids(item) for item in value] + return value + + +def _drop_empty(node: dict[str, Any]) -> dict[str, Any]: + """Drop keys whose value is ``None`` / ``""`` / ``[]`` / ``{}``. + + Extends the "omit empty optionals" rule from :meth:`Envelope.to_dict` DOWN + into each node/edge dict so JSON stops serializing ``"symbol_id": null`` / + ``"role": null`` (the "10 empty fields" complaint). Applied at every detail + level — no consumer benefits from empty fields, and the text renderers + already skip missing keys, so this only changes JSON (for the better). + """ + return {k: v for k, v in node.items() if not _is_empty(v)} + + +def _compose_file(node: dict[str, Any]) -> dict[str, Any]: + """Fold raw ``filename`` + ``start_line`` into a display ``file`` field. + + SymbolHit-derived nodes carry ``filename`` / ``start_line`` (raw graph + columns) that are not display fields. Compose them into one + ``"filename:start_line"`` string (or just ``filename`` when no line) so the + ``normal`` tier can show location as a single stable field, then drop the + raw location columns. Returns the node unchanged (minus raw columns) when + no ``filename`` is present. Returns a new dict; the input is not mutated. 
+ """ + filename = str(node.get("filename") or "").strip() + if not filename: + return {k: v for k, v in node.items() if k not in _RAW_LOCATION_KEYS} + start_line = node.get("start_line") + try: + file_value = f"{filename}:{int(start_line)}" if start_line not in (None, "") else filename + except (TypeError, ValueError): + file_value = filename + out = {k: v for k, v in node.items() if k not in _RAW_LOCATION_KEYS} + out["file"] = file_value + return out + + +def project_node(node: dict[str, Any], detail: str) -> dict[str, Any]: + """Project a node dict to the field set for ``detail``. + + * ``"full"`` -> keep every present key (still :func:`_compose_file` + + :func:`_drop_empty`, so raw location columns become ``file`` and empties + vanish). + * ``"normal"`` -> keep ``_NORMAL_NODE_KEYS`` (identity + location + + classification + ranking). This is the default and the fix for the + "text too terse" complaint: adds ``file`` / ``score`` / ``role`` / + ``module``. + * ``"brief"`` -> keep ``_BRIEF_NODE_KEYS`` (identity only == today's text). + + ``file`` is composed before selection so it is available at ``normal`` / + ``full``. Empty values are dropped at every level. Graph-id fields (``id``, + ``*_id``) are stripped at every level via :func:`_strip_graph_id_fields` — + the CLI is resolve-first, so raw graph ids never reach the agent. Returns a + new dict. + """ + composed = _compose_file(node) + if detail == "full": + selected = composed + else: + allow = _NORMAL_NODE_KEYS if detail == "normal" else _BRIEF_NODE_KEYS + selected = {k: v for k, v in composed.items() if k in allow} + return _drop_empty(_strip_graph_id_fields(selected)) + + +def project_edge(edge: dict[str, Any], detail: str) -> dict[str, Any]: + """Project an edge row to the attr set for ``detail`` (mirrors :func:`project_node`). + + * ``"full"`` -> all attrs. + * ``"normal"`` -> ``_NORMAL_EDGE_KEYS`` (adds ``mechanism`` / ``role`` / + ``from_fqn`` over brief). 
+ * ``"brief"`` -> ``_BRIEF_EDGE_KEYS`` (target id + label + confidence + + grouping keys == what the text renderers read today). + """ + if detail == "full": + selected = edge + else: + allow = _NORMAL_EDGE_KEYS if detail == "normal" else _BRIEF_EDGE_KEYS + selected = {k: v for k, v in edge.items() if k in allow} + return _drop_empty(selected) + + +def project_envelope(envelope: Envelope, detail: str) -> Envelope: + """Return a new Envelope with nodes/edges/candidates projected to ``detail``. + + The single projection seam: :func:`jrag_render.render` calls this once, + then both the JSON path (``to_json``) and the text renderers consume the + result. Envelope-level fields (``status`` / ``root`` / ``warnings`` / + ``truncated`` / ``file_location`` / ``message`` / ``agent_next_actions``) + are passed through unchanged — they are not node-level and have no detail + axis. ``detail`` is validated up front so a typo raises instead of + silently behaving like ``full``. + """ + if detail not in ("brief", "normal", "full"): + raise ValueError( + f"project_envelope: detail must be brief|normal|full, got {detail!r}" + ) + return Envelope( + status=envelope.status, + nodes={nid: project_node(n, detail) for nid, n in envelope.nodes.items()}, + edges=[project_edge(e, detail) for e in envelope.edges], + root=envelope.root, + candidates=[project_node(c, detail) for c in envelope.candidates], + agent_next_actions=list(envelope.agent_next_actions), + warnings=list(envelope.warnings), + truncated=envelope.truncated, + file_location=envelope.file_location, + message=envelope.message, + ) diff --git a/java_codebase_rag/jrag_hints.py b/java_codebase_rag/jrag_hints.py new file mode 100644 index 00000000..687e46e1 --- /dev/null +++ b/java_codebase_rag/jrag_hints.py @@ -0,0 +1,164 @@ +"""JRAG edge-label → CLI-command hint mapper (PR-JRAG-4). + +This is the **net-new** module that powers ``envelope.agent_next_actions``. 
It +maps the graph's edge labels (CALLS, IMPLEMENTS, EXTENDS, INJECTS, OVERRIDES, +OVERRIDDEN_BY, HTTP_CALLS, ASYNC_CALLS — plus composed dot-keys like +``DECLARES.CALLS`` and ``OVERRIDDEN_BY.DECLARES_CLIENT``) to the +``jrag`` command an agent should run next for the resolved root. + +Public surface: :func:`next_actions` — keyword-only, returns ``list[str]`` of +``jrag `` hint strings (≤5, de-duped, zero-direction suppressed). +The function imports :data:`java_ontology.EDGE_SCHEMA` **lazily inside the body** +so :func:`java_codebase_rag.jrag.build_parser` stays pure (no backend imports at +module import time — the sentinel test pins this). +""" +from __future__ import annotations + +from typing import Any + +__all__ = ["next_actions"] + + +# Edge label → {direction: jrag_command} map. +# +# Confirmed against java_ontology.EDGE_SCHEMA (java_ontology.py:179) and the +# traversal command surface (PR-JRAG-3a/3b). ``OVERRIDDEN_BY`` is a virtual +# label (the stored edge is ``OVERRIDES``; the describe-time rollup surfaces the +# inbound axis as ``OVERRIDDEN_BY`` for method Symbols — see NodeRecord.edge_summary +# docs at mcp_v2.py:469). HTTP_CALLS / ASYNC_CALLS only fire ``out`` because the +# ``callees`` command dispatches on Client/Producer roots to traverse those edges +# outbound; there is no inbound-only command for them (callers on a Route covers +# the inbound case via a different code path and a different root kind). +_LABEL_COMMANDS: dict[str, dict[str, str]] = { + "CALLS": {"in": "callers", "out": "callees"}, + "IMPLEMENTS": {"in": "implementations", "out": "hierarchy"}, + "EXTENDS": {"in": "subclasses", "out": "hierarchy"}, + "INJECTS": {"in": "dependents", "out": "dependencies"}, + "OVERRIDES": {"out": "overrides"}, + "OVERRIDDEN_BY": {"in": "overridden-by"}, + "HTTP_CALLS": {"out": "callees"}, + "ASYNC_CALLS": {"out": "callees"}, +} + +# Cap on returned hints (brief: ≤5). Matches the Envelope.agent_next_actions +# contract and mcp_hints' own cap. 
+_MAX_HINTS = 5 + + +def _candidate_labels(label: str) -> list[str]: + """Return the lookup candidates for a (possibly composed) edge label. + + For a plain label (``"CALLS"``): ``["CALLS"]``. + For ``"OVERRIDDEN_BY.DECLARES_CLIENT"``: the prefix ``"OVERRIDDEN_BY"`` is + the semantic axis → looked up first. + For ``"DECLARES.CALLS"``: ``"DECLARES"`` is a rollup prefix with no direct + command, so the suffix ``"CALLS"`` is the actionable label. + The full label is always tried first (covers ``"DECLARES_CLIENT"`` if it + ever appears un-split). + """ + if "." not in label: + return [label] + parts = label.split(".") + # Full label first (handles un-split composed forms), then prefix, then suffix. + return [label, parts[0], parts[-1]] + + +def _lookup_cmd(label: str) -> dict[str, str] | None: + """Look up a (possibly composed) label in the command map. + + Tries the full label, the dot-prefix, and the dot-suffix. Returns the first + match or ``None``. + """ + for cand in _candidate_labels(label): + cmds = _LABEL_COMMANDS.get(cand) + if cmds is not None: + return cmds + return None + + +def next_actions( + *, + root_fqn: str, + edge_summary: dict[str, Any] | None = None, + result_edges: list[dict[str, Any]], + graph: Any = None, # noqa: ARG001 — reserved for future use (brief contract) + current_command: str | None = None, +) -> list[str]: + """Build ``agent_next_actions`` hints for a resolved root. + + * When ``edge_summary`` is provided (``inspect`` path): iterate each + ``(label, counts)`` and emit ``jrag `` for direction ``d`` **only + when ``counts[d] > 0``** (zero-suppression). Composed dot-keys are covered + via :func:`_lookup_cmd`. + * When ``edge_summary`` is ``None`` (traversal path): fall back to the set of + ``edge_type`` labels present in ``result_edges``. Per-direction counts are + unavailable, so zero-suppression cannot apply — we emit both directions for + each recognized label. 
(The traversal command already filtered to one + direction; the hints surface the *other* edges the root has, encouraging + orthogonal exploration.) + + De-dups and caps at ``_MAX_HINTS`` (5). ``graph`` is accepted for forward + compatibility but not read — all needed data comes from ``edge_summary`` or + ``result_edges``. + """ + if not root_fqn: + return [] + + # Lazy import so build_parser() stays pure (PR-JRAG-4 sentinel test). + # EDGE_SCHEMA is the canonical label set; we use it to skip labels we don't + # recognize (avoids emitting hints for spurious / future edge types the + # command map doesn't cover). + from java_ontology import EDGE_SCHEMA + + # Known virtual labels not in EDGE_SCHEMA (describe-time rollup constructs). + _VIRTUAL_LABELS = frozenset({"OVERRIDDEN_BY"}) + + def _is_known_label(label: str) -> bool: + base = label.split(".")[0] + return base in EDGE_SCHEMA or base in _VIRTUAL_LABELS + + hints: list[str] = [] + seen: set[str] = set() + + def _add(cmd: str) -> None: + hint = f"jrag {cmd} {root_fqn}" + if hint not in seen: + seen.add(hint) + hints.append(hint) + + if edge_summary is not None: + # inspect path: zero-suppress per direction using counts. + for label, counts in edge_summary.items(): + if not _is_known_label(str(label)): + continue + cmds = _lookup_cmd(str(label)) + if cmds is None: + continue + counts_dict = counts if isinstance(counts, dict) else {} + in_n = int(counts_dict.get("in", 0) or 0) + out_n = int(counts_dict.get("out", 0) or 0) + if in_n > 0 and "in" in cmds: + _add(cmds["in"]) + if out_n > 0 and "out" in cmds: + _add(cmds["out"]) + else: + # traversal path: infer from result_edges labels. + # No per-direction counts → emit both directions for recognized labels, + # then drop the self-hint (the command an agent just ran). The inverse + # direction (e.g. `callees` after `callers`) is the useful exploration + # signal and is kept; only the exact command just run is redundant. 
+ # ``current_command`` is the jrag subcommand name (``args.command``). + labels_seen: set[str] = set() + for edge in result_edges or []: + et = str(edge.get("edge_type") or "").strip() + if et and _is_known_label(et): + labels_seen.add(et.split(".")[0]) + for label in sorted(labels_seen): # sorted: stable hint order under the cap + cmds = _LABEL_COMMANDS.get(label) + if cmds is None: + continue + for d in ("in", "out"): + if d in cmds and cmds[d] != current_command: + _add(cmds[d]) + + return hints[:_MAX_HINTS] diff --git a/java_codebase_rag/jrag_render.py b/java_codebase_rag/jrag_render.py new file mode 100644 index 00000000..8187d785 --- /dev/null +++ b/java_codebase_rag/jrag_render.py @@ -0,0 +1,652 @@ +"""JRAG text rendering (PR-JRAG-1a). + +Fresh-built renderer (``cli_format.py`` is styling-primitives only — glyphs and +ANSI — it ships no renderers). The default output is compact text; ``--format +json`` emits the envelope verbatim via :meth:`Envelope.to_json`. + +This module imports only the envelope module (which itself imports no heavy +backend modules), so it stays import-safe under the ``build_parser`` lazy +invariant. +""" +from __future__ import annotations + +from typing import Any + +from java_codebase_rag.jrag_envelope import Envelope, project_envelope, simple_name + +__all__ = ["render", "tiered_name", "display_name"] + + +# Edge labels that carry a ``confidence`` column (CALLS-family). ``conf:`` is +# rendered only for these (PR-JRAG-1a renderer spec). Confirmed against +# java_ontology.EDGE_SCHEMA: CALLS / HTTP_CALLS / ASYNC_CALLS each carry an +# ``EdgeAttr("confidence", "DOUBLE", ...)``; the structural edges +# (EXTENDS/IMPLEMENTS/INJECTS/DECLARES/OVERRIDES/EXPOSES/DECLARES_CLIENT/ +# DECLARES_PRODUCER) do not all carry confidence, and even where they do, the
+# CALLS-family is what the agent-facing ``conf:`` road-sign is reserved for.
+_CALLS_FAMILY_EDGES = frozenset({"CALLS", "HTTP_CALLS", "ASYNC_CALLS"}) + +# Route node kinds → short text tag so the routes listing distinguishes HTTP +# endpoints from Kafka topics (otherwise they mash together with no indicator). +# Only route kinds are tagged; symbol/client/producer rows carry other kinds (or +# none) and are left untagged. +_ROUTE_KIND_TAGS: dict[str, str] = {"kafka_topic": "kafka", "http_endpoint": "http"} + +# Identity keys already represented in a listing line (display_name + @service + +# kind tag). At ``--detail full`` the per-row kv-block skips these (they are in +# the header line) and renders every OTHER key, so full listing == per-row +# inspect block. ``id`` is absent here AND stripped by the envelope projector's +# graph-id-field rule (see jrag_envelope._strip_graph_id_fields) — listed for +# documentation of the identity set, but the projector is the authoritative +# strip seam. Must agree with the identity half of the envelope projector's +# ``_BRIEF_NODE_KEYS`` (see jrag_envelope.py). +_LISTING_LINE_KEYS: frozenset[str] = frozenset( + { + "kind", + "fqn", + "name", + "microservice", + "path", + "method", + "topic", + "member_fqn", + "target_service", + "broker", + "client_kind", + "producer_kind", + "import_simple", + "import_fqn", + "import_kind", + } +) + +# Fixed left-to-right order for the inline extras appended at ``--detail normal`` +# (only the non-empty ones are rendered). Equals the envelope projector's +# ``_NORMAL_NODE_KEYS - _BRIEF_NODE_KEYS``. +_NORMAL_INLINE_EXTRAS: tuple[str, ...] = ( + "module", + "role", + "symbol_kind", + "framework", + "file", + "score", +) + +# Edge attrs the edge line already renders (label/confidence); at ``--detail +# full`` these are skipped when appending the remaining attrs inline. 
+_EDGE_LINE_KEYS: frozenset[str] = frozenset( + {"other_id", "dst_id", "target_id", "term_id", "edge_type", "stored_edge_type", "label", "type", "confidence"} +) + + +def _next_action_lines(envelope: Envelope) -> list[str]: + """Build up to 2 ``next: `` lines from ``agent_next_actions``. + + Cap at 2 to keep text-mode output token-lean (consistent with the ambiguous + renderer at :func:`_render_ambiguous`); JSON carries all ≤5. Returns an empty + list when ``agent_next_actions`` is empty (commands with no root produce no + hints → nothing appended). + """ + return [f"next: {hint}" for hint in envelope.agent_next_actions[:2]] + + +def display_name(node: dict[str, Any]) -> str: + """Best short label for a node across all kinds (symbol + route/client/producer). + + Listing rows and traversal targets carry different identifying fields per + kind; this picks the most informative one rather than assuming every node + has an FQN (routes have ``path``/``method``; clients/producers have + ``member_fqn`` + ``topic``/``target_service``). Precedence: + + * explicit ``name`` -> symbols (SymbolHit carries one) + * ``member_fqn`` -> the member making the call/emit, with + ``→ topic`` / ``→ target_service`` when present + * ``path`` -> ``METHOD path`` (route) or ``path`` (client) + * ``topic`` -> bare topic (producer without a member) + * ``fqn`` -> fqn-derived simple name (classes/methods) + + Returns ``""`` only when nothing identifiable is present. + + For a method symbol (``pkg.Class#method(args)``) the label is + ``Class#method``, NOT the bare ``name``: ``getId`` / ``process`` / ``create`` + collide across classes, so a traversal/listing row reduced to the bare + method name is ambiguous (the SlaService callees example — four ``getId``, + five ``process``). The declaring class is identity-level disambiguation, so + it folds into the label at every detail tier (brief included). 
``name`` (the + clean method name, no args) is preferred when present; the FQN-derived method + name is the fallback when ``name`` is absent. + """ + fqn = str(node.get("fqn") or "").strip() + if "#" in fqn: + head, _, tail = fqn.partition("#") + cls = head.rsplit(".", 1)[-1] + method = str(node.get("name") or "").strip() or tail.split("(", 1)[0] + if cls and method: + return f"{cls}#{method}" + name = str(node.get("name") or "").strip() + if name: + return name + member_fqn = str(node.get("member_fqn") or "").strip() + if member_fqn: + base = member_fqn.rsplit(".", 1)[-1] + topic = str(node.get("topic") or "").strip() + if topic: + return f"{base} → {topic}" + target = str(node.get("target_service") or "").strip() + if target: + return f"{base} → {target}" + return base + path = str(node.get("path") or "").strip() + if path: + method = str(node.get("method") or "").strip() + return f"{method} {path}" if method else path + topic = str(node.get("topic") or "").strip() + if topic: + return topic + # Symbol / fallback: fqn-derived simple name. + return simple_name(node) + + +def tiered_name(node_id: str, nodes: dict[str, dict]) -> str: + """Tiered label: ``display_name @service`` -> display_name -> FQN -> id. + + ``display_name`` covers symbols (fqn) AND route/client/producer nodes + (path/member_fqn/topic). ``@service`` is appended when ``microservice`` is + present; if the node still yields no label, the raw FQN (then the id) is + returned so a traversal target is never rendered empty. + """ + node = nodes.get(node_id) or {} + name = display_name(node) + service = str(node.get("microservice") or "").strip() + if name and service: + return f"{name} @{service}" + if name: + return name + fqn = str(node.get("fqn") or "").strip() + return fqn or node_id + + +def _node_id(edge: dict) -> str: + """Pull the *other-end* node id out of an edge row across backend variants. 
+ + ``neighbors_v2`` returns ``other_id``; traversal LadybugGraph methods return + one of ``dst_id`` / ``target_id`` / ``term_id``. We try them in order. + """ + for key in ("other_id", "dst_id", "target_id", "term_id"): + val = edge.get(key) + if isinstance(val, str) and val: + return val + return "" + + +def _edge_label(edge: dict) -> str: + for key in ("edge_type", "stored_edge_type", "label", "type"): + val = edge.get(key) + if isinstance(val, str) and val: + return val + return "" + + +def _truncated_hint(*, next_offset: int | None) -> str: + if next_offset is not None: + return f"truncated: more results — use --offset {next_offset}" + return "truncated: more results — narrow your query" + + +def _render_error(envelope: Envelope) -> str: + msg = envelope.message or (envelope.warnings[0] if envelope.warnings else "error") + return f"error: {msg}" + + +def _render_not_found(envelope: Envelope) -> str: + msg = envelope.message or "not found" + return f"not found: {msg}" + + +def _render_listing(envelope: Envelope, *, noun: str, detail: str = "normal") -> str: + lines: list[str] = [] + for _node_id, node in envelope.nodes.items(): + # Listing omits FQN (PR-JRAG-1a test 11): display_name + @service only. + # display_name handles routes (METHOD path) / clients / producers, which + # carry no FQN — simple_name would render them blank. + name = display_name(node) + if not name: + # Unresolved brownfield routes can carry empty path+topic+member; + # fall back to the file basename (then a placeholder) so the row + # never renders as a blank line or a bare ``@service``. The + # projector composes raw filename+start_line into ``file``, so check + # both ``file`` and the raw ``filename`` (present pre-projection / + # when no start_line was carried). 
+ label = "" + for key in ("file", "filename"): + raw = str(node.get(key) or "").strip() + if raw: + base = raw.rsplit(":", 1)[0] if raw.rsplit(":", 1)[-1].isdigit() else raw + label = base.rsplit("/", 1)[-1] + break + name = label or "(no identifier)" + service = str(node.get("microservice") or "").strip() + tag = _ROUTE_KIND_TAGS.get(str(node.get("kind") or "")) + parts: list[str] = [f"[{tag}]", name] if tag else [name] + line = " ".join(parts) + if service: + line += f" @{service}" + # PR-JRAG-3b: distinguish unresolved imports from resolved graph nodes + # in TEXT mode. Without this marker, `imports ` renders resolved + # Symbols and unresolved placeholders identically (only JSON carries + # the resolved flag), leaving a text-mode agent unable to tell which + # imports resolved. The marker is gated on the synthetic + # `kind="unresolved_import"` set by _cmd_imports. + if node.get("kind") == "unresolved_import": + line += " (unresolved)" + # detail > brief: surface the fields the terse line drops. The projector + # has already trimmed the node to the requested field set, so we only + # decide PRESENTATION. normal = append inline location/classification/ + # ranking extras to the SAME line (one line per row — the fix for "text + # too terse": adds module/role/file/score). full = per-row inspect block + # of every non-identity key (signature/annotations/snippet/...). + if detail == "normal": + extras = [ + f"{key}={node[key]}" + for key in _NORMAL_INLINE_EXTRAS + if key in node and node[key] not in ("", None) + ] + if extras: + line += " " + " ".join(extras) + lines.append(line) + if detail == "full": + rest = {k: v for k, v in node.items() if k not in _LISTING_LINE_KEYS} + if rest: + lines.extend(_render_inspect_block(rest, 1)) + if not lines: + lines.append(f"0 {noun}".rstrip()) + return "\n".join(lines) + + +def _node_normal_extras(node: dict[str, Any]) -> str: + """Inline ``key=value`` extras for a node at ``normal`` detail. 
+ + Mirrors :func:`_render_listing`'s normal-tier inline append exactly (same + ``_NORMAL_INLINE_EXTRAS`` key list / format) so a traversal row and a listing + row show the SAME fields at the same level, and so text matches the field + set JSON carries at ``normal``. Returns ``""`` when none of the extras are + present (the line is left unchanged). + """ + extras = [ + f"{key}={node[key]}" + for key in _NORMAL_INLINE_EXTRAS + if key in node and node[key] not in ("", None) + ] + return (" " + " ".join(extras)) if extras else "" + + +def _node_full_rows(node: dict[str, Any], indent: int) -> list[str]: + """Indented kv-block of a node's non-identity fields at ``full`` detail. + + Mirrors :func:`_render_listing`'s full-tier block: identity keys + (``_LISTING_LINE_KEYS`` — already represented in the label/@service line) are + skipped, the rest recurse via :func:`_render_inspect_block` so + signature / annotations / modifiers / package render as readable nested kv + lines. Returns ``[]`` when the node has no content fields. + """ + rest = {k: v for k, v in node.items() if k not in _LISTING_LINE_KEYS} + return _render_inspect_block(rest, indent) if rest else [] + + +def _format_edge_rows(edge: dict, nodes: dict[str, dict], *, detail: str = "normal") -> list[str]: + """Format an edge as one header line plus (at ``full``) a per-edge block. + + Shared across all render modes (flat + grouped). The header is + `` `` plus ``conf=N.NN`` for CALLS-family edges. The caller is + responsible for any grouping header above these rows. + + NODE-level detail is honored symmetrically with :func:`_render_listing` + (PR-JRAG-6 fixed listings but not traversals; this closes that gap so + ``jrag callees`` text carries the same fields as its JSON, and ``--detail + full`` is no longer a no-op): + + * ``brief`` -> header only (label + conf). Identity only; the label already + carries the declaring class for methods via :func:`display_name`. 
+ * ``normal`` -> header + edge ``mechanism`` + the target node's + ``_NORMAL_INLINE_EXTRAS`` (module/role/symbol_kind/framework/file/score) + inline — the same inline append listings use. + * ``full`` -> header + every remaining EDGE attr inline (annotation / + field_or_param / from_fqn / …) + a per-edge indented block of the target + node's content fields (signature/annotations/modifiers/...). + """ + target_id = _node_id(edge) + target = nodes.get(target_id) or {} + label = tiered_name(target_id, nodes) if target_id else "(missing)" + line = f" {label}" + edge_type = _edge_label(edge) + # conf: only on CALLS-family edges (PR-JRAG-1a test 12). + if edge_type in _CALLS_FAMILY_EDGES: + conf = edge.get("confidence") + if conf is not None: + try: + line += f" conf={float(conf):.2f}" + except (TypeError, ValueError): + pass + if detail == "normal": + mech = edge.get("mechanism") + if mech not in ("", None): + line += f" mechanism={mech}" + line += _node_normal_extras(target) + return [line] + if detail == "full": + for key in edge: + if key in _EDGE_LINE_KEYS: + continue + val = edge.get(key) + if val in ("", None): + continue + line += f" {key}={val}" + rows = [line] + rows.extend(_node_full_rows(target, 2)) + return rows + return [line] + + +def _render_traversal(envelope: Envelope, *, noun: str, detail: str = "normal") -> str: + lines: list[str] = [] + root_id = envelope.root or "" + if root_id: + # root: tiered name (Class / Class#method + @service). At normal the + # root node's module/role/file/score append inline; at full a kv-block + # renders under it — the SAME detail contract as an edge-target row, so + # the resolved-subject line carries the same fields JSON shows (parity + # with the listing/edge detail work; pre-fix the root was always bare). 
+ root_node = envelope.nodes.get(root_id, {}) + root_label = tiered_name(root_id, envelope.nodes) + if detail == "normal": + lines.append(f"root: {root_label}{_node_normal_extras(root_node)}") + elif detail == "full": + lines.append(f"root: {root_label}") + lines.extend(_node_full_rows(root_node, 1)) + else: + lines.append(f"root: {root_label}") + if not envelope.edges: + # Zero-results line for a traversal: "0 {noun} {root_fqn} @{root_svc}". + # The fqn + service come from the root node (the resolved subject). + parts = [f"0 {noun}".rstrip()] + root_node = envelope.nodes.get(root_id, {}) + root_fqn = str(root_node.get("fqn") or "").strip() + root_svc = str(root_node.get("microservice") or "").strip() + if root_fqn: + parts.append(root_fqn) + if root_svc: + parts.append(f"@{root_svc}") + lines.append(" ".join(parts)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + # Grouped rendering fires ONLY when the producer attached the grouping + # key (hierarchy sets `direction`; decompose sets `stage`; connection sets + # `section`). Other traversals (callers/callees/dependents/...) leave all + # three unset and fall through to the flat list below — current behavior + # unchanged (Fix 1). + has_stages = any(e.get("stage") is not None for e in envelope.edges) + has_direction = any(e.get("direction") for e in envelope.edges) + has_section = any(e.get("section") for e in envelope.edges) + + if has_section: + # connection: group under inbound:/outbound: headers. Edges carry a + # `section` key set to "inbound" or "outbound" by _cmd_connection. + # Unknown section values are rendered under their literal name so the + # agent sees the data even if a future caller adds a new section.
+ in_sec = [e for e in envelope.edges if e.get("section") == "inbound"] + out_sec = [e for e in envelope.edges if e.get("section") == "outbound"] + other = [e for e in envelope.edges if e.get("section") not in ("inbound", "outbound")] + if in_sec: + lines.append("inbound:") + for e in in_sec: + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + if out_sec: + lines.append("outbound:") + for e in out_sec: + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + for e in other: + section = str(e.get("section") or "") + if section: + lines.append(f"{section}:") + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + if has_stages: + # decompose role-waterfall: group edges under `stage N` headers. + # The role on each edge (carried from StageSymbol) labels the stage + # when homogeneous; otherwise we just number it. + stage_order: list[int] = [] + by_stage: dict[int, list[dict]] = {} + for e in envelope.edges: + s = int(e.get("stage") or 0) + if s not in by_stage: + by_stage[s] = [] + stage_order.append(s) + by_stage[s].append(e) + for s in stage_order: + stage_edges = by_stage[s] + roles = {str(e.get("role") or "").upper() for e in stage_edges if e.get("role")} + if s == 0: + header = "stage 0 (seed):" + elif len(roles) == 1: + header = f"stage {s} ({next(iter(roles)).lower()}):" + else: + header = f"stage {s}:" + lines.append(header) + for e in stage_edges: + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + if has_direction: + # hierarchy tree: group under ↑ supertypes / ↓ subtypes headers. 
+ up = [e for e in envelope.edges if e.get("direction") == "up"] + dn = [e for e in envelope.edges if e.get("direction") == "down"] + if up: + lines.append("↑ supertypes:") + for e in up: + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + if dn: + lines.append("↓ subtypes:") + for e in dn: + lines.extend(_format_edge_rows(e, envelope.nodes, detail=detail)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + # Flat: callers / callees / implementations / subclasses / overrides / + # overridden-by / dependents / impact / flow (current behavior). + for edge in envelope.edges: + lines.extend(_format_edge_rows(edge, envelope.nodes, detail=detail)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + +def _inspect_inline(val: Any) -> str: + """One-line rendering for a leaf value or a collapsed list/dict item. + + Scalars render as themselves; a list of scalars joins with ``, ``; a dict + collapses to ``k: v, k: v`` (used for list-of-dict sample items, which are + short). Empty list/dict render as ``[]`` / ``{}``. + """ + if isinstance(val, list): + return ", ".join(_inspect_inline(x) for x in val) if val else "[]" + if isinstance(val, dict): + return ", ".join(f"{k}: {_inspect_inline(v)}" for k, v in val.items()) if val else "{}" + if isinstance(val, str): + return val + return str(val) + + +def _is_dict_list(v: Any) -> bool: + """True for a non-empty list whose every item is a dict (rendered as blocks).""" + return isinstance(v, list) and bool(v) and all(isinstance(x, dict) for x in v) + + +def _render_inspect_block(node: dict[str, Any], indent: int) -> list[str]: + """Recursively render a dict's keys as indented kv lines. + + dict -> header + recurse (so ``counts: {svc: {kind: n}}`` nests fully); + non-empty list-of-dicts -> header + one ``- `` line per entry + (sample lists like ``client_sample``/``route_sample``); other lists and + scalars -> inline. 
Replaces the old single-level renderer that printed + nested dicts and list-of-dicts as Python ``repr()``. + """ + pad = " " * indent + out: list[str] = [] + for key in sorted(node.keys(), key=str): + val = node[key] + if isinstance(val, dict) and val: + out.append(f"{pad}{key}:") + out.extend(_render_inspect_block(val, indent + 1)) + elif _is_dict_list(val): + out.append(f"{pad}{key}:") + for item in val: + out.append(f"{pad} - {_inspect_inline(item)}") + else: + out.append(f"{pad}{key}: {_inspect_inline(val)}") + return out + + +def _render_inspect(envelope: Envelope) -> str: + """kv-block renderer for nodes carrying one or more nested dict sections. + + Generic: ANY dict-typed value on a node renders as a header line plus + indented sorted sub-keys, recursing fully. This is the dispatch signal for + the inspect shape (PR-JRAG-1a status uses it for ``counts`` / ``edges``; + PR-JRAG-3 ``inspect`` uses it for ``edge_summary`` and other rollups). The + ``edge_summary`` key is NOT special here - it is reserved for real edge + data in PR-JRAG-3 and is one of many possible section sources. + """ + lines: list[str] = [] + for _node_id, node in envelope.nodes.items(): + # ALL dict keys alphabetical (PR-JRAG-1a test 13); nested dicts and + # list-of-dicts recurse via _render_inspect_block instead of repr(). + lines.extend(_render_inspect_block(node, 0)) + lines.extend(_next_action_lines(envelope)) + return "\n".join(lines) + + +def _render_ambiguous(envelope: Envelope, *, noun: str) -> str: + count = len(envelope.candidates) + header = f"{count} ambiguous matches for {noun!r}" if noun else f"{count} ambiguous matches" + lines = [header, "Narrow with --kind --java-kind --role --fqn-prefix:"] + for cand in envelope.candidates: + # Ambiguous candidates carry reason; NO file / score (PR-JRAG-1a test 14). 
+ # display_name only — graph id is NOT a fallback (the envelope projector + # strips id/parent_id at every detail level; an unidentified candidate + # renders with "(no identifier)" rather than leaking a raw SHA). + name = display_name(cand) or "(no identifier)" + service = str(cand.get("microservice") or "").strip() + reason = str(cand.get("reason") or "").strip() + line = f" {name}" + if service: + line += f" @{service}" + if reason: + line += f" ({reason})" + lines.append(line) + # <=2 next: hints; no auto-pick (PR-JRAG-1a renderer spec). + for hint in envelope.agent_next_actions[:2]: + lines.append(f"next: {hint}") + return "\n".join(lines) + + +def _render_scalar(envelope: Envelope) -> str: + if envelope.message is not None: + return envelope.message + if envelope.warnings: + return "\n".join(envelope.warnings) + return envelope.status + + +def _render_text_shape(envelope: Envelope, *, noun: str, shape: str | None, detail: str = "normal") -> str: + if envelope.status == "error": + return _render_error(envelope) + if envelope.status == "not_found": + return _render_not_found(envelope) + if envelope.status == "ambiguous": + return _render_ambiguous(envelope, noun=noun) + # status == "ok": dispatch on EXPLICIT shape hint first, then envelope + # structure. The shape hint is the only path to ``_render_inspect`` - + # listing nodes typically carry dict-valued fields after ``.model_dump()`` + # (Symbol nodes have ``source_range`` / ``annotations`` / ``capabilities`` + # / ``metadata`` etc.), so inferring inspect from "any node has a dict + # value" would silently mis-render listings as inspect (FQN alphabetical). + # Inspect is declared by the caller, never guessed from node contents. + # + # Traversal shape: a root subject is set (the resolved node the edges are + # relative to). This is true even when the traversal produced zero edges + # — the zero-edges traversal line is "0 {noun} {root_fqn} @{root_svc}", NOT
+ # + # Precedence: explicit ``shape="inspect"`` wins over ``root``/listing + # by intent (callers declare what they want); then ``root`` wins over + # listing (a root signals "edges are the story"). + # + # detail: the envelope passed in is ALREADY projected (see :func:`render`), + # so each renderer sees only the keys for its detail level. ``detail`` is + # threaded in only to choose PRESENTATION (inline vs block / which edge + # attrs to print) — the field-set decision was made once, up front, by + # :func:`project_envelope`. ``_render_inspect`` needs no ``detail`` kwarg: + # it renders whatever keys survived projection (few at brief, all at full). + if shape == "inspect": + return _render_inspect(envelope) + if envelope.root is not None: + return _render_traversal(envelope, noun=noun, detail=detail) + # Listing shape: zero or more node rows. Empty listing renders "0 ". + if envelope.nodes or noun: + return _render_listing(envelope, noun=noun, detail=detail) + return _render_scalar(envelope) + + +def render( + envelope: Envelope, + *, + fmt: str = "text", + detail: str = "normal", + noun: str = "", + next_offset: int | None = None, + shape: str | None = None, +) -> str: + """Dispatch on ``fmt`` (text default; json emits the projected envelope). + + ``detail`` (``brief`` / ``normal`` / ``full``, default ``normal``) is + ORTHOGONAL to ``fmt``: the envelope is projected to the requested field set + ONCE via :func:`project_envelope`, then BOTH the JSON path (``to_json``) + and the text renderers consume the projected result. So ``--format json + --detail brief`` and ``--format text --detail brief`` go through the same + field set. ``brief`` reproduces today's terse text; ``normal`` adds + ``module``/``role``/``symbol_kind``/``framework``/``file``/``score`` (the + fix for "text too terse"); ``full`` keeps everything (incl. ``snippet`` / + ``signature`` / ``annotations``) and drops empty fields at all levels. 
+ + ``noun`` is the human-readable noun for the result kind (e.g. ``"callers"``, + ``"matches"``); used in zero-results and ambiguous headers. ``next_offset`` + selects the truncated hint: ``None`` -> ``narrow your query`` (no offset + support on this command); a number -> ``use --offset `` (find/search). + + ``shape`` is the EXPLICIT render-shape hint. The only accepted value today + is ``"inspect"`` (kv-block + indented alphabetical sections); callers that + need it declare it (PR-JRAG-1a ``status``, future PR-JRAG-1b/3 ``inspect``). + ``None`` falls back to structural inference: ``root`` -> traversal, + ``nodes``/``noun`` -> listing, else scalar. Listing nodes frequently carry + dict-valued fields after ``.model_dump()``, so inspect is NEVER inferred + from node contents - only an explicit ``shape="inspect"`` routes there. + """ + projected = project_envelope(envelope, detail) + if fmt == "json": + return projected.to_json() + body = _render_text_shape(projected, noun=noun, shape=shape, detail=detail) + if projected.truncated: + hint = _truncated_hint(next_offset=next_offset) + body = f"{body}\n{hint}" if body else hint + # Warnings are rendered in text mode (one ``warning:`` line each) so an + # agent running without ``--format json`` still sees inapplicable-flag / + # post-filter notices. Without this the warnings[] field was JSON-only and + # the "inapplicable flags never silently ignored" spec was effectively + # unenforced for text consumers. 
+ if projected.warnings: + warning_lines = "\n".join(f"warning: {w}" for w in projected.warnings) + body = f"{body}\n{warning_lines}" if body else warning_lines + return body diff --git a/mcp_v2.py b/mcp_v2.py index 2ec9a2a7..883f3014 100644 --- a/mcp_v2.py +++ b/mcp_v2.py @@ -27,24 +27,41 @@ from pydantic import BaseModel, ConfigDict, Field, TypeAdapter, ValidationError, model_validator, validate_call from sentence_transformers import SentenceTransformer +from graph_types import ( + NodeRef, + StructuredHint, + _hints_or_skip, + _node_ref_from_row, + _resolve_node_kind, + _to_structured_hints, + set_hints_enabled, +) from index_common import SBERT_MODEL from java_codebase_rag.config import resolved_sbert_model_for_process_env -from java_ontology import EDGE_SCHEMA, ResolveReason +from java_ontology import EDGE_SCHEMA from ladybug_queries import LadybugGraph, OVERRIDE_AXIS_COMPOSED_EDGE_TYPES -from mcp_hints import generate_hints, MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION +from mcp_hints import MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION from search_lancedb import TABLES, run_search -# Module-level flag set by server.py at startup from resolved config. 
-_hints_enabled: bool = True - - -def set_hints_enabled(enabled: bool) -> None: - global _hints_enabled - _hints_enabled = enabled - - -def _hints_or_skip(tool: str, payload: dict) -> tuple[list, list]: - return generate_hints(tool, payload) if _hints_enabled else ([], []) +__all__ = [ + "search_v2", + "find_v2", + "describe_v2", + "neighbors_v2", + "resolve_v2", + "SearchOutput", + "FindOutput", + "DescribeOutput", + "NeighborsOutput", + "ResolveOutput", + "ResolveCandidate", + "ResolveStatus", + "NodeRef", + "NodeFilter", + "EdgeFilter", + "StructuredHint", + "set_hints_enabled", +] DeclarationSymbolKind = Literal["class", "interface", "enum", "record", "annotation", "method", "constructor"] @@ -119,11 +136,18 @@ def _hints_or_skip(tool: str, payload: dict) -> tuple[list, list]: def _log_fail_loud(category: str) -> None: - """Increment process-local fail-loud counter and emit one stderr line (PR-FRAME-3).""" + """Increment process-local fail-loud counter and emit one stderr line (PR-FRAME-3). + + The stderr line is gated on ``JAVA_CODEBASE_RAG_FAIL_LOUD`` (default ``"1"`` = + emit) so the MCP server keeps its operator diagnostic while the agent-facing + ``jrag`` CLI (which surfaces the same failure as a clean status:error + envelope) can run it with the diagnostic silenced. + """ with _fail_loud_lock: _fail_loud_counts[category] = _fail_loud_counts.get(category, 0) + 1 n = _fail_loud_counts[category] - print(f"[filter-frame] fail-loud category={category} count={n}", file=sys.stderr, flush=True) + if os.environ.get("JAVA_CODEBASE_RAG_FAIL_LOUD", "1") != "0": + print(f"[filter-frame] fail-loud category={category} count={n}", file=sys.stderr, flush=True) def filter_frame_counters() -> dict[str, int]: @@ -213,12 +237,8 @@ def _role_axes_mutually_exclusive(self) -> EdgeFilter: _EDGEFILTER_FIELD_ORDER: tuple[str, ...] 
= tuple(EdgeFilter.model_fields.keys()) -class StructuredHint(BaseModel): - label: str = "" - tool: Literal["search", "find", "describe", "neighbors", "resolve"] - args: dict[str, Any] - actionable: bool = True - reason: str = "" +# StructuredHint is now defined in graph_types.py and imported above + # Populated EdgeFilter field -> EDGE_SCHEMA attribute name used in Cypher pushdown. _EDGEFILTER_FIELD_TO_ATTR: dict[str, str] = { @@ -385,8 +405,7 @@ def _edgefilter_applicability_error(edge_types: list[str], ef: EdgeFilter) -> st return None -def _to_structured_hints(raw: list[Any]) -> list[StructuredHint]: - return [StructuredHint(label=h.label, tool=h.tool, args=h.args, actionable=h.actionable, reason=h.reason) for h in raw] +# _to_structured_hints is now defined in graph_types.py and imported above def _coerce_edge_filter( @@ -446,15 +465,7 @@ class SearchHit(BaseModel): role: str | None = None -class NodeRef(BaseModel): - id: str - kind: Literal["symbol", "route", "client", "producer", "unresolved_call_site"] - fqn: str - name: str | None = None - symbol_kind: str | None = None - microservice: str | None = None - module: str | None = None - role: str | None = None +# NodeRef is now defined in graph_types.py and imported above class NodeRecord(BaseModel): @@ -553,110 +564,11 @@ class NeighborsOutput(BaseModel): hints_structured: list[StructuredHint] = Field(default_factory=list, description=MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION) -ResolveStatus = Literal["one", "many", "none"] - -_RESOLVE_CANDIDATE_CAP = 10 - -_RESOLVE_REASON_PRIORITY: dict[ResolveReason, int] = { - "exact_id": 0, - "exact_fqn": 1, - "route_method_path": 1, - "client_target_path": 1, - "producer_topic_prefix": 1, - "fqn_suffix": 2, - "route_template": 2, - "short_name": 3, - "client_target": 3, - "producer_topic": 3, -} - -_SYMBOL_RESOLVE_RETURN = ( - "s.id AS id, s.fqn AS fqn, s.microservice AS microservice, " - "s.module AS module, s.role AS role, s.kind AS symbol_kind" -) - 
-_ROUTE_RESOLVE_RETURN = ( - "r.id AS id, r.kind AS kind, r.framework AS framework, r.method AS method, " - "r.path AS path, r.path_template AS path_template, r.path_regex AS path_regex, " - "r.topic AS topic, r.broker AS broker, r.feign_name AS feign_name, r.feign_url AS feign_url, " - "r.microservice AS microservice, r.module AS module, r.filename AS filename, " - "r.start_line AS start_line, r.end_line AS end_line, r.resolved AS resolved" -) - -_CLIENT_RESOLVE_RETURN = ( - "c.id AS id, c.client_kind AS client_kind, c.target_service AS target_service, " - "c.method AS method, c.path AS path, c.path_template AS path_template, " - "c.path_regex AS path_regex, c.member_fqn AS member_fqn, c.member_id AS member_id, " - "c.microservice AS microservice, c.module AS module, c.filename AS filename, " - "c.start_line AS start_line, c.end_line AS end_line, c.resolved AS resolved, " - "c.source_layer AS source_layer" -) - -_PRODUCER_RESOLVE_RETURN = ( - "p.id AS id, p.producer_kind AS producer_kind, p.topic AS topic, p.broker AS broker, " - "p.direction AS direction, p.member_fqn AS member_fqn, p.member_id AS member_id, " - "p.microservice AS microservice, p.module AS module, p.filename AS filename, " - "p.start_line AS start_line, p.end_line AS end_line, p.resolved AS resolved, " - "p.source_layer AS source_layer" -) - -_RESOLVE_PRE_DEDUP_LIMIT = 50 - - -class ResolveCandidate(BaseModel): - model_config = ConfigDict(extra="forbid") - - node: NodeRef - score: float - reason: ResolveReason - - -class ResolveOutput(BaseModel): - model_config = ConfigDict(extra="forbid") - - success: bool - status: ResolveStatus - node: NodeRef | None = None - candidates: list[ResolveCandidate] = Field(default_factory=list) - message: str | None = None - resolved_identifier: str | None = None - advisories: list[str] = Field(default_factory=list, description="Pure informational text with no tool call suggestion") - hints_structured: list[StructuredHint] = Field(default_factory=list, 
description=MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION) +# Re-exported from resolve_service.py (imported at end of module to avoid circular import) +# resolve_v2, ResolveOutput, ResolveCandidate, ResolveStatus are imported below -def _node_kind_from_id( - id_str: str, -) -> Literal["symbol", "route", "client", "producer", "unresolved_call_site"]: - if id_str.startswith("ucs:"): - return "unresolved_call_site" - if id_str.startswith("sym:"): - return "symbol" - if id_str.startswith("route:") or id_str.startswith("r:"): - return "route" - if id_str.startswith("client:") or id_str.startswith("c:"): - return "client" - if id_str.startswith("producer:") or id_str.startswith("p:"): - return "producer" - raise ValueError(f"Unknown id prefix for `{id_str}`") - - -def _resolve_node_kind( - graph: LadybugGraph, - node_id: str, -) -> Literal["symbol", "route", "client", "producer", "unresolved_call_site"]: - try: - return _node_kind_from_id(node_id) - except ValueError: - pass - if graph._rows("MATCH (n:Symbol) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 - return "symbol" - if graph._rows("MATCH (n:Route) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 - return "route" - if graph._rows("MATCH (n:Client) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 - return "client" - if graph._rows("MATCH (n:Producer) WHERE n.id = $id RETURN n.id AS id LIMIT 1", {"id": node_id}): # noqa: SLF001 - return "producer" - raise ValueError(f"Unknown id prefix for `{node_id}`") +# _node_kind_from_id and _resolve_node_kind are now defined in graph_types.py and imported above def _chunk_id_from_row(row: dict[str, Any]) -> str: @@ -735,38 +647,7 @@ def _symbol_where_from_filter(f: NodeFilter) -> tuple[str, dict[str, Any]]: return where, params -def _node_ref_from_row(kind: Literal["symbol", "route", "client", "producer"], row: dict[str, Any]) -> NodeRef: - symbol_kind: str | None = None - if kind == "symbol": - 
fqn = str(row.get("fqn") or "") - role = str(row.get("role") or "") or None - symbol_kind_val = str(row.get("symbol_kind") or row.get("kind") or "").strip() - symbol_kind = symbol_kind_val or None - elif kind == "route": - method = str(row.get("method") or "") - path = str(row.get("path_template") or row.get("path") or "") - fqn = f"{method} {path}".strip() - role = None - elif kind == "client": - method = str(row.get("method") or "") - target = str(row.get("target_service") or "") - path = str(row.get("path_template") or row.get("path") or "") - fqn = f"{target} {method} {path}".strip() - role = None - else: - topic = str(row.get("topic") or "") - broker = str(row.get("broker") or "") - fqn = f"{topic} {broker}".strip() - role = None - return NodeRef( - id=str(row.get("id") or ""), - kind=kind, - fqn=fqn, - symbol_kind=symbol_kind, - microservice=str(row.get("microservice") or "") or None, - module=str(row.get("module") or "") or None, - role=role, - ) +# _node_ref_from_row is now defined in graph_types.py and imported above def _load_node_record( @@ -1188,390 +1069,6 @@ def describe_v2( return DescribeOutput(success=False, message=str(exc), advisories=[]) -def _resolve_validate_identifier(raw: str) -> tuple[str | None, str | None]: - trimmed = raw.strip() - if not trimmed: - detail = "empty string" if raw == "" else "whitespace only" - return None, f"Invalid identifier: {detail}" - return trimmed, None - - -def _resolve_kinds_to_search( - hint_kind: Literal["symbol", "route", "client", "producer"] | None, -) -> list[Literal["symbol", "route", "client", "producer"]]: - if hint_kind is None: - return ["symbol", "route", "client", "producer"] - return [hint_kind] - - -def _resolve_parse_route_method_path(identifier: str) -> tuple[str, str] | None: - parts = identifier.split(None, 1) - if len(parts) != 2: - return None - method, path = parts[0].upper(), parts[1].strip() - if not method.isalpha() or not path.startswith("/"): - return None - return method, path - - 
-def _resolve_parse_microservice_route(identifier: str) -> tuple[str, str, str] | None: - parts = identifier.split(None, 2) - if len(parts) != 3: - return None - microservice, method, path = parts[0], parts[1].upper(), parts[2].strip() - if not method.isalpha() or not path.startswith("/"): - return None - return microservice, method, path - - -def _resolve_symbol_candidates( - g: LadybugGraph, - identifier: str, -) -> list[tuple[NodeRef, ResolveReason, int]]: - out: list[tuple[NodeRef, ResolveReason, int]] = [] - lim = _RESOLVE_PRE_DEDUP_LIMIT - - rows = g._rows( # noqa: SLF001 - f"MATCH (s:Symbol) WHERE s.id = $id RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", - {"id": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("symbol", row), "exact_id", len(identifier))) - - rows = g._rows( # noqa: SLF001 - f"MATCH (s:Symbol) WHERE s.fqn = $fqn RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", - {"fqn": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("symbol", row), "exact_fqn", len(identifier))) - - suffix = f".{identifier}" - rows = g._rows( # noqa: SLF001 - f"MATCH (s:Symbol) WHERE s.fqn = $ident OR s.fqn ENDS WITH $suffix " - f"RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", - {"ident": identifier, "suffix": suffix, "lim": lim}, - ) - for row in rows: - fqn = str(row.get("fqn") or "") - spec = len(fqn) - out.append((_node_ref_from_row("symbol", row), "fqn_suffix", spec)) - - rows = g._rows( # noqa: SLF001 - f"MATCH (s:Symbol) WHERE s.name = $name RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", - {"name": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("symbol", row), "short_name", len(identifier))) - - return out - - -def _resolve_route_candidates( - g: LadybugGraph, - identifier: str, -) -> list[tuple[NodeRef, ResolveReason, int]]: - out: list[tuple[NodeRef, ResolveReason, int]] = [] - lim = _RESOLVE_PRE_DEDUP_LIMIT - - rows = g._rows( # noqa: SLF001 - f"MATCH (r:Route) WHERE r.id = 
$id RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", - {"id": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("route", row), "exact_id", len(identifier))) - - ms_route = _resolve_parse_microservice_route(identifier) - if ms_route is not None: - microservice, method, path = ms_route - rows = g._rows( # noqa: SLF001 - f"MATCH (r:Route) WHERE r.microservice = $ms AND r.method = $method " - f"AND (r.path = $path OR r.path_template = $path) " - f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", - {"ms": microservice, "method": method, "path": path, "lim": lim}, - ) - for row in rows: - spec = len(path) - out.append((_node_ref_from_row("route", row), "route_method_path", spec)) - - method_path = _resolve_parse_route_method_path(identifier) - if method_path is not None: - method, path = method_path - rows = g._rows( # noqa: SLF001 - f"MATCH (r:Route) WHERE r.method = $method " - f"AND (r.path = $path OR r.path_template = $path) " - f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", - {"method": method, "path": path, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("route", row), "route_method_path", len(path))) - - if identifier.startswith("/"): - rows = g._rows( # noqa: SLF001 - f"MATCH (r:Route) WHERE r.path = $path OR r.path_template = $path " - f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", - {"path": identifier, "lim": lim}, - ) - for row in rows: - path_val = str(row.get("path_template") or row.get("path") or "") - out.append((_node_ref_from_row("route", row), "route_template", len(path_val))) - - return out - - -def _resolve_client_candidates( - g: LadybugGraph, - identifier: str, -) -> list[tuple[NodeRef, ResolveReason, int]]: - out: list[tuple[NodeRef, ResolveReason, int]] = [] - lim = _RESOLVE_PRE_DEDUP_LIMIT - - rows = g._rows( # noqa: SLF001 - f"MATCH (c:Client) WHERE c.id = $id RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", - {"id": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("client", 
row), "exact_id", len(identifier))) - - if " " in identifier: - target, path_prefix = identifier.split(" ", 1) - target = target.strip() - path_prefix = path_prefix.strip() - if target and path_prefix: - rows = g._rows( # noqa: SLF001 - f"MATCH (c:Client) WHERE c.target_service = $target " - f"AND (c.path STARTS WITH $path OR c.path_template STARTS WITH $path) " - f"RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", - {"target": target, "path": path_prefix, "lim": lim}, - ) - for row in rows: - spec = len(path_prefix) - out.append((_node_ref_from_row("client", row), "client_target_path", spec)) - elif not identifier.startswith("/"): - rows = g._rows( # noqa: SLF001 - f"MATCH (c:Client) WHERE c.target_service = $target RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", - {"target": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("client", row), "client_target", len(identifier))) - - return out - - -def _resolve_producer_candidates( - g: LadybugGraph, - identifier: str, -) -> list[tuple[NodeRef, ResolveReason, int]]: - out: list[tuple[NodeRef, ResolveReason, int]] = [] - lim = _RESOLVE_PRE_DEDUP_LIMIT - - rows = g._rows( # noqa: SLF001 - f"MATCH (p:Producer) WHERE p.id = $id RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", - {"id": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("producer", row), "exact_id", len(identifier))) - - rows = g._rows( # noqa: SLF001 - f"MATCH (p:Producer) WHERE p.topic = $topic RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", - {"topic": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("producer", row), "producer_topic", len(identifier))) - - if not identifier.startswith("/"): - rows = g._rows( # noqa: SLF001 - f"MATCH (p:Producer) WHERE p.topic STARTS WITH $topic RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", - {"topic": identifier, "lim": lim}, - ) - for row in rows: - out.append((_node_ref_from_row("producer", row), "producer_topic_prefix", 
len(identifier))) - - return out - - -def _resolve_dedupe_candidates( - raw: list[tuple[NodeRef, ResolveReason, int]], -) -> list[tuple[NodeRef, ResolveReason, int]]: - best: dict[str, tuple[NodeRef, ResolveReason, int]] = {} - for node, reason, specificity in raw: - prev = best.get(node.id) - if prev is None: - best[node.id] = (node, reason, specificity) - continue - prev_pri = _RESOLVE_REASON_PRIORITY[prev[1]] - new_pri = _RESOLVE_REASON_PRIORITY[reason] - if new_pri < prev_pri or (new_pri == prev_pri and specificity > prev[2]): - best[node.id] = (node, reason, specificity) - return list(best.values()) - - -def _resolve_rank_candidates( - deduped: list[tuple[NodeRef, ResolveReason, int]], -) -> list[ResolveCandidate]: - ordered = sorted( - deduped, - key=lambda item: (_RESOLVE_REASON_PRIORITY[item[1]], -item[2], item[0].id), - ) - total = len(ordered) - return [ - ResolveCandidate( - node=node, - reason=reason, - score=(1.0 - (idx / total)) if total else 0.0, - ) - for idx, (node, reason, _spec) in enumerate(ordered) - ] - - -def _resolve_assert_invariants(out: ResolveOutput) -> None: - if not out.success: - assert out.status == "none" - assert out.node is None - assert not out.candidates - assert out.message - return - if out.status == "one": - assert out.node is not None - assert not out.candidates - elif out.status == "many": - assert out.node is None - assert len(out.candidates) >= 2 - elif out.status == "none": - assert out.node is None - assert not out.candidates - assert out.message - - -def _resolve_seeds_for_hints(identifier: str) -> tuple[str | None, str | None]: - path_prefix_seed: str | None = None - method_path = _resolve_parse_route_method_path(identifier) - if method_path is not None: - path_prefix_seed = method_path[1] - else: - ms_route = _resolve_parse_microservice_route(identifier) - if ms_route is not None: - path_prefix_seed = ms_route[2] - elif identifier.startswith("/"): - path_prefix_seed = identifier - - target_service_seed: str | None = 
None - if " " in identifier: - target, _path_prefix = identifier.split(" ", 1) - target = target.strip() - if target: - target_service_seed = target - elif not identifier.startswith("/"): - target_service_seed = identifier - - return path_prefix_seed, target_service_seed - - -def _resolve_finalize_success( - trimmed: str, - hint_kind: Literal["symbol", "route", "client", "producer"] | None, - matches: list[ResolveCandidate], -) -> ResolveOutput: - if not matches: - out = ResolveOutput( - success=True, - status="none", - message=( - "No matches for identifier; use search(query=...) for ranked fuzzy lookup." - ), - resolved_identifier=trimmed, - ) - elif len(matches) == 1: - out = ResolveOutput( - success=True, - status="one", - node=matches[0].node, - resolved_identifier=trimmed, - ) - else: - out = ResolveOutput( - success=True, - status="many", - candidates=matches, - resolved_identifier=trimmed, - ) - - path_prefix_seed, target_service_seed = _resolve_seeds_for_hints(trimmed) - hint_payload = { - "status": out.status, - "resolved_identifier": trimmed, - "candidates": out.candidates, - "hint_kind": hint_kind, - "path_prefix_seed": path_prefix_seed, - "target_service_seed": target_service_seed, - } - raw_struct, raw_advisories = _hints_or_skip("resolve", hint_payload) - out = out.model_copy(update={ - "advisories": raw_advisories, - "hints_structured": _to_structured_hints(raw_struct), - }) - _resolve_assert_invariants(out) - return out - - -def resolve_v2( - identifier: str, - hint_kind: Literal["symbol", "route", "client", "producer"] | None = None, - graph: LadybugGraph | None = None, -) -> ResolveOutput: - try: - trimmed, err = _resolve_validate_identifier(identifier) - if err is not None: - out = ResolveOutput( - success=False, - status="none", - message=err, - advisories=[], - resolved_identifier=None, - ) - _resolve_assert_invariants(out) - return out - - assert trimmed is not None - if "*" in trimmed or "?" 
in trimmed: - out = ResolveOutput( - success=False, - status="none", - message=( - "Wildcards (* and ?) are not supported in resolve; " - "use search(query=...) for ranked text search." - ), - advisories=[], - resolved_identifier=trimmed, - ) - _resolve_assert_invariants(out) - return out - - g = graph or LadybugGraph.get() - raw: list[tuple[NodeRef, ResolveReason, int]] = [] - for kind in _resolve_kinds_to_search(hint_kind): - if kind == "symbol": - raw.extend(_resolve_symbol_candidates(g, trimmed)) - elif kind == "route": - raw.extend(_resolve_route_candidates(g, trimmed)) - elif kind == "client": - raw.extend(_resolve_client_candidates(g, trimmed)) - else: - raw.extend(_resolve_producer_candidates(g, trimmed)) - - deduped = _resolve_dedupe_candidates(raw) - ranked = _resolve_rank_candidates(deduped) - capped = ranked[:_RESOLVE_CANDIDATE_CAP] - return _resolve_finalize_success(trimmed, hint_kind, capped) - except Exception as exc: - out = ResolveOutput( - success=False, - status="none", - message=str(exc), - advisories=[], - resolved_identifier=None, - ) - _resolve_assert_invariants(out) - return out # Per-edge-type attribute columns selected by the generic (flat-label) neighbors @@ -2079,3 +1576,12 @@ def neighbors_v2( raise except Exception as exc: return NeighborsOutput(success=False, message=str(exc), advisories=[], requested_edge_types=[]) + + +# Re-export resolve symbols from resolve_service.py (imported here to avoid circular import) +from resolve_service import ( # noqa: E402 + ResolveCandidate, + ResolveOutput, + ResolveStatus, + resolve_v2, +) diff --git a/plans/active/PLAN-JRAG-CLI.md b/plans/active/PLAN-JRAG-CLI.md new file mode 100644 index 00000000..6cedc2c5 --- /dev/null +++ b/plans/active/PLAN-JRAG-CLI.md @@ -0,0 +1,1171 @@ +# Plan: JRAG CLI — agent-facing command-line interface + +Status: **active (planning)**. This plan implements +[`propose/JRAG-CLI-PROPOSE.md`](../../propose/JRAG-CLI-PROPOSE.md). 
+ +> **Grounded against current source (2026-07-04), then adversarially reviewed +> by a 5-subagent fan-out.** Every backend function, line range, and packaging +> claim was verified against `master` AND pressure-tested. The review caught 6 +> blockers and ~7 highs that are folded in below (see "Revision log"). The +> proposal should be relocated to `propose/active/` to match `AGENTS.md` hygiene +> (out of scope for this plan; tracked at the end). + +Depends on: nothing external. PR-JRAG-0a and PR-JRAG-0b are independent prep +refactors that unblock PR-JRAG-5 and PR-JRAG-1a respectively. + +## Revision log (from the 5-reviewer fan-out) + +Folded corrections (each verified against source): +- `outline` used `start_line=0` → `find_symbols_in_file_range` returns `[]` + (guard rejects `<1`); now `start_line=1`. +- `overrides` (dispatch UP) was mapped to `override_axis_traversal_for`, which + dispatches DOWN; now `neighbors_v2(out, ["OVERRIDES"])`. +- `overridden-by` was mapped to `override_axis_rollup_for`, which returns counts, + not nodes; now `neighbors_v2(in, ["OVERRIDES"])` (= virtual `OVERRIDDEN_BY` out). +- `--offset` was global but **no `LadybugGraph` method takes `offset`** (only + `find_v2`/`search_v2`/`neighbors_v2` do); now scoped to `find`/`search` only. +- PR-JRAG-5 now updates `tests/test_agent_skills_static.py` (hardcoded skill-dir + set) and `run_update`'s tuple unpacking (return-type change). +- **Reversed the `resolve_operator_config` avoidance** — it does NOT import + cocoindex (verified `config.py:387-465`); reusing it + `apply_to_os_environ()` + is required for `search` to load the YAML-configured embedding model. +- Added `raise_fd_limit()` to `main`, pydantic→dict `model_dump()` at the envelope + boundary, missing-index/ontology error envelopes, traceback-to-stderr. 
+- `find_route_callers`'s `microservice`/`method` kwargs are no-ops once `route_id` + is set and it has no `limit` → `--service` is now a client-side post-filter + + warning; truncation via client-side slice. +- Enum normalization now uses explicit lookup tables for `client_kind`/ + `producer_kind`/`source_layer` (their backend literals are lowercase snake + + suffix, not UPPER_SNAKE). +- `callees` Producer target is `:Route` (`kafka_topic`), not `:Producer`. +- `flow` outbound intra-service is an index-time data property, not a query + guarantee — reworded + the test validates the fixture's CALLS edges. +- PR-JRAG-1 and PR-JRAG-3 were each ~2× a reviewable size → split into 1a/1b and + 3a/3b (9 PRs total). + +## Goal + +- Ship a new `jrag` console script (separate from the `java-codebase-rag` + operator CLI) that gives an AI coding agent **one command per engineering + intent**, taking human-readable identifiers (FQN / simple name / route path / + topic) and never raw node IDs. +- Make every common agent task achievable in one call by **internalizing + resolve** (`resolve_v2`) as the first step of every ``-accepting + command, mapping its `one` / `many` / `none` contract onto a single output + envelope. +- Build the CLI as a **thin compose-and-render layer** over the existing + backend — `resolve_v2`, the MCP v2 handlers (`find_v2` / `search_v2` / + `describe_v2` / `neighbors_v2`), `LadybugGraph` query methods, and + `run_search`. No backend query logic is reimplemented. +- v1 loads the index **in-process** per call (no daemon), reusing the operator's + index directory and config resolver. + +## Principles (do not relitigate in review) + +These were locked during the propose (`propose/JRAG-CLI-PROPOSE.md` §1, §2, §10). +If a reviewer wants to revisit one, they revisit the propose, not this plan. + +- **Names in, names out; resolve-first.** Every traversal/inspect command takes + a ``; `resolve_v2` runs internally. Raw node IDs are never required or + accepted. 
On `many` → return candidates and stop; on `none` → `not_found`. + Auto-pick is forbidden. +- **Disambiguation flags narrow resolve, post-filter not push-down.** `--kind` + maps to `resolve_v2`'s `hint_kind` (a true resolve input). `--java-kind`, + `--role`, `--fqn-prefix` are **client-side post-filters** on resolve's + node/candidate set — `resolve_v2(identifier, hint_kind, graph)` takes nothing + else (`mcp_v2.py:1487`). If a post-filter collapses `many`→`one`, proceed; if + it still leaves `many`, return the narrowed candidates. +- **Reuse, do not reimplement.** `find` → `find_v2` (`mcp_v2.py:990`); `search` + → `search_v2` (`mcp_v2.py:907`); `inspect` → `describe_v2` (`mcp_v2.py:1088`); + `callees` for Client/Producer → `resolve_v2` + `neighbors_v2(direction="out", + edge_types=["HTTP_CALLS"|"ASYNC_CALLS"])` (`mcp_v2.py:1732`); `dependencies` + → `neighbors_v2(direction="out", edge_types=["INJECTS"])`; `overrides` → + `neighbors_v2(out, ["OVERRIDES"])`; `overridden-by` → `neighbors_v2(in, + ["OVERRIDES"])`. Traversal commands with no composed path call the + `LadybugGraph` method directly. **Config resolution reuses + `resolve_operator_config` + `apply_to_os_environ`** (see Architecture). +- **`neighbors` is removed as a surface concept.** Every edge traversal gets a + named engineering command. Agents never pass `direction` / `edge_types`. +- **One envelope; text default; JSON opt-in.** Default rendering is compact text; + `--format json` emits the envelope verbatim. This is a **deliberate divergence** + from the operator CLI's `sys.stdout.isatty()` heuristic + (`java_codebase_rag/cli.py:218-220` → pprint-when-TTY / JSON-when-piped, no + flag); `jrag` is agent-facing (non-TTY), so text-default-with-flag is the new + convention. +- **`--help` is the spec.** Names guessable, grouped; flag/kind contradictions + hard-error (`status: error`); inapplicable flags never silently ignored. +- **No ontology bump, no re-index** (`ontology_version` stays 17). 
**No daemon + in v1.** **No cocoindex dependency** (the CLI never imports cocoindex; config + resolution reuses the path layer, which is cocoindex-free). + +## Architecture (where the CLI lives) + +- **CLI module(s): inside the existing `java_codebase_rag` package**, as sibling + modules to `cli.py`. Rationale: `java_codebase_rag` is already the one shipped + package (`pyproject.toml:61`), so adding `.py` files inside it ships them with + **zero packaging change** beyond one `[project.scripts]` line. Modules: + - `java_codebase_rag/jrag.py` — argparse builder, `main(argv)`, and + `_console_script_main()` (the `os._exit` wrapper the operator CLI uses at + `cli.py:1031` — `jrag` loads lancedb + ladybug, so it needs the same wrapper). + - `java_codebase_rag/jrag_envelope.py` — the `Envelope` dataclass, the + resolve-first mapper, enum normalization (+ lookup tables), the +1-fetch + `truncated` helper, and the pydantic→dict boundary. + - `java_codebase_rag/jrag_render.py` — text rendering. Built fresh + (`cli_format.py` is styling-primitives only — glyphs + ANSI, no renderers). + - `java_codebase_rag/jrag_hints.py` — the **net-new** edge-label → CLI-command + mapper for `agent_next_actions` (PR-JRAG-4). +- **Extracted resolve module: at repo root** as `resolve_service.py`, sibling to + `mcp_v2.py` (PR-JRAG-0b). Shipped via `py-modules`. `mcp_v2.py` re-exports + `resolve_v2` / `ResolveOutput` / `ResolveCandidate` / `ResolveStatus`. +- **Index + config resolution reuses the operator's resolver, exactly.** Call + `resolve_operator_config(source_root=, cli_index_dir=args.index_dir)` + (same as `_resolved_from_ns` at `cli.py:237-244`), then `cfg.apply_to_os_environ()` + — this sets `SBERT_MODEL` so `jrag search` loads the YAML-configured embedding + model, not the default (without it, `run_search` reads the default model via + `resolved_sbt_model_for_process_env`, `config.py:120-129` → silently wrong + results). Pass `cfg.ladybug_path` to `LadybugGraph.get(...)`. 
**Verified + cocoindex-free**: `resolve_operator_config` (`config.py:387-465`) only builds a + `cocoindex.db` Path string; it never imports cocoindex. (The earlier "may pull + cocoindex glue" rationale was wrong and is deleted.) +- **`main()` robustness:** first line `raise_fd_limit()` (from + `java_codebase_rag._fdlimit`; the operator `main()` does this at `cli.py:1004` + — lancedb's merge-insert opens many handles and macOS GUI/IDE soft limit is + 256). `_load_graph` calls `LadybugGraph.exists(ladybug_path)` first; on `False` + → `status: error, message="No index at . Run: java-codebase-rag init + --source-root "`; wraps `LadybugGraph.get()` in `try/except RuntimeError` + → ontology-mismatch rebuild hint (`ladybug_queries.py:372-378`). The top-level + handler emits the `status: error` envelope to stdout AND + `traceback.format_exc()` to stderr before returning 2 (the operator CLI + swallows tracebacks — `cli.py:1024-1028` — do NOT copy that). +- **Pydantic→dict boundary:** every backend handler returns pydantic v2 models + (`FindOutput`, `DescribeOutput`, `NeighborsOutput`, `ResolveOutput`, + `SearchOutput`). The envelope holds plain `dict` (`nodes: dict[str, dict]`, + `edges: list[dict]`, `candidates: list[dict]`). Conversion is via + `.model_dump()` **at the envelope boundary, once**; the renderer and + `to_json()` operate on dicts only. +- **Lazy imports:** `ladybug_queries`, `mcp_v2`, `search_lancedb`, + `resolve_service`, and `resolve_operator_config` are imported **inside command + handlers**, and `build_parser()` imports no backend modules at all — so + `jrag --help` stays fast (matches `cli.py:1-4`, `build_parser` at `:796`). PR-4 + pins this with a `sys.modules` sentinel. 
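The envelope module described in the Architecture section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: the field subset, the rule that every falsy optional is dropped from `to_dict()`, and the `FakeNodeRef` stand-in (used instead of a real pydantic model so the sketch needs only the standard library) are all hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class Envelope:
    """Illustrative subset of the envelope fields named in this plan."""

    status: str = "ok"
    nodes: dict[str, dict] = field(default_factory=dict)
    edges: list[dict] = field(default_factory=list)
    root: str | None = None
    candidates: list[dict] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)
    truncated: bool = False

    def to_dict(self) -> dict[str, Any]:
        # Assumption: "empty optional" means any falsy value except `status`.
        out: dict[str, Any] = {"status": self.status}
        for key, value in asdict(self).items():
            if key != "status" and value:
                out[key] = value
        return out

    def to_json(self) -> str:
        return json.dumps(self.to_dict())


def mark_truncated(rows: list, limit: int) -> tuple[list, bool]:
    # The caller fetched with limit + 1; one extra row means more results exist.
    return rows[:limit], len(rows) > limit


def to_envelope_rows(results: list) -> list[dict]:
    # Boundary conversion: pydantic v2 models expose model_dump().
    return [r.model_dump() for r in results]


class FakeNodeRef:
    """Hypothetical stand-in for a pydantic model; only model_dump() is used."""

    def __init__(self, id: str, fqn: str) -> None:
        self.id, self.fqn = id, fqn

    def model_dump(self) -> dict:
        return {"id": self.id, "fqn": self.fqn}


if __name__ == "__main__":
    fetched = [FakeNodeRef(f"n{i}", f"com.example.C{i}") for i in range(3)]
    rows, truncated = mark_truncated(to_envelope_rows(fetched), limit=2)
    env = Envelope(nodes={r["id"]: r for r in rows}, truncated=truncated)
    print(env.to_json())
```

Keeping the conversion in one helper means the renderer and `to_json()` only ever see plain dicts, which is the property the plan asks the boundary to guarantee.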
+ +## PR breakdown - overview + +| PR | Scope | Ontology bump | Areas of concern | Test buckets | Depends on | +| --- | --- | --- | --- | --- | --- | +| **PR-JRAG-0a** | Single source of truth for shipped skill/agent docs: `scripts/sync_agent_artifacts.py` syncs the **shipped subtrees only** (`skills/explore-codebase/` + `agents/*.md`) and asserts equality; drift test gates it. | none | `skills/README.md` is dev-only (NOT in `install_data` today) — the sync must mirror only what `package-data` ships, or it will copy+ship README.md. Publish is manual — sync runs from the publish runbook. | `tests/test_install_data_sync.py` | — | +| **PR-JRAG-0b** | Extract `resolve_v2` + its pipeline + resolve-only models into root `resolve_service.py`; `mcp_v2.py` re-exports. | none | `NodeRef` (`mcp_v2.py:449`) is shared by `Edge.other` (and constructed-from in `describe_v2`) — it STAYS in `mcp_v2.py`; only resolve-specific models move. Gate: existing resolve tests in `test_mcp_v2.py` + `test_mcp_hints.py`. | `tests/test_resolve_service.py` | — | +| **PR-JRAG-1a** | Entry point + envelope + render foundation + resolve-first + `status`. The frozen contract every later PR depends on. | none | Defines the envelope + resolve-first + render contract. `+1-fetch` truncated; enum lookup tables (client_kind/producer_kind/source_layer are lowercase-snake+suffix, NOT UPPER_SNAKE); pydantic→dict `model_dump()` boundary; `resolve_operator_config` + `apply_to_os_environ` + `raise_fd_limit` + missing-index error envelopes; `--offset` is NOT global. | `test_jrag_envelope.py`, `test_jrag_render.py`, `test_jrag_status.py` | PR-JRAG-0b | +| **PR-JRAG-1b** | `find` (query-mode via `find_by_name_or_fqn` + `--fuzzy` fallback; filter-mode via `find_v2` + `NodeFilter`; kind-inference; contradiction-error; **`--offset` supported here**) + `inspect` (`describe_v2` + `edge_summary`). 
| none | `find` has two modes (positional `` vs pure flags) — different backends; `--limit` effectively capped at 499 (so `limit+1` fits the 500 backend clamp); `NodeRef` has no `name` → renderer derives it from FQN. Both wire a no-op `next_actions` hook for PR-4. | `test_jrag_locate.py` | PR-JRAG-1a | +| **PR-JRAG-2** | Listing tier: `routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` + globals. | none | `--offset` NOT supported (no offset param on `list_*`); `topics` are `:Producer` rows (no `:Topic` node) and `--consumer-in` resolves via `neighbors_v2(producer_ids, "in", ["ASYNC_CALLS"])`; enum lookup tables; truncated via +1-fetch (cap 499). | `test_jrag_listing.py` | PR-JRAG-1a | +| **PR-JRAG-3a** | Direct-backend traversals: `callers`, `callees`(symbol), `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `impact`, `decompose`, `flow`. | none | `--offset` NOT supported; `overrides`/`overridden-by` via `neighbors_v2` (not the rollup/traversal fns — those go the wrong way / return counts); `find_route_callers` `--service` is a client-side post-filter + warning (kwarg ignored once `route_id` set) and has no `limit` → client-side slice; `flow` intra-service is an index-time data property; `--include-external` symmetric on callers+callees. | `test_jrag_traversal_direct.py` | PR-JRAG-1a | +| **PR-JRAG-3b** | Compose commands + file inspection: `callees`(client/producer via `resolve_v2`+`neighbors_v2`), `dependencies` (`neighbors_v2` out INJECTS), `connection` (microservice positional; `--inbound`/`--outbound`/`--both`/`--http-method`/`--calls-service`), `outline` (`find_symbols_in_file_range`, `start_line=1`), `imports` (tree-sitter `import_declaration` + `resolve_v2`). | none | `callees` Producer target is `:Route` (`kafka_topic`), not `:Producer`; `outline`/`imports` have no `limit` → documented unbounded; `connection` first positional is a microservice name (resolve-first exception). 
| `test_jrag_traversal_compose.py` | PR-JRAG-3a | +| **PR-JRAG-4** | Orientation + search + `agent_next_actions` + packaging: `microservices`, `map`, `conventions`, `overview` (`--as`); `search` (`search_v2`, `--offset`, `--table all`, `--hybrid`, `--fuzzy` rejected in-handler); `jrag_hints.next_actions` (edge_summary optional, zero-direction suppression, dot-keys); wire `next_actions` into all commands; README; version bump; token-budget assertion; `build_parser` lazy sentinel. | none | `search` reuses `search_v2` (map flags → `NodeFilter`); `next_actions` is NET-NEW (`mcp_hints` maps every edge to `tool="neighbors"`); `edge_summary` is `None` for traversal roots → fall back to `result_edges`; `--fuzzy` must be registered + rejected in-handler (not argparse-exit) to yield `status: error`. | `test_jrag_orientation.py`, `test_jrag_token_budget.py` | PR-JRAG-1a, PR-JRAG-3b | +| **PR-JRAG-5** | Agent host integration: `Surface` dimension + `ArtifactManifest`; `select_surface` wizard step + `--surface mcp\|cli` (default `mcp`); marker-file `detect_configured_hosts` fix (NamedTuple return + `run_update` unpacking); surface-conditional `resolve_mcp_command` (incl. interactive prompt); ship CLI skill + subagent; update `test_agent_skills_static.py`, `AGENTS.md`, `skills/README.md`, README three-layer section. | none | Installer coupling — 4 functions + 2 tests + 3 docs touched; `deploy_artifacts`/`refresh_artifacts` gain `surface="mcp"` kw default (back-compat with 8 direct-call tests); CLI-only install must not regress `update`; depends on PR-JRAG-0a. | `tests/test_installer_surface.py` (+ updates to `tests/test_installer.py`, `tests/test_agent_skills_static.py`) | PR-JRAG-0a (hard), PR-JRAG-4 (soft) | + +Landing order: **0a → 0b → 1a → 1b → 2 → 3a → 3b → 4 → 5**. +- **0a** and **0b** are independent (different files); may land in either order. +- **1a** depends on **0b**; **1b** depends on **1a**. 
+- **2** and **3a** depend on **1a** (envelope/render/resolve-first); independent + of **1b** and of each other — may land in parallel after 1a. +- **3b** depends on **3a** (traversal patterns + resolve-first reuse). +- **4** depends on **1a** (hard) and **3b** (hard — `agent_next_actions` suggests + traversal commands that must exist). +- **5** depends on **0a** (hard) and **4** (soft — the CLI skill mirrors the + shipped grammar). + +## Resolved design decisions + +| Topic | Decision | +| --- | --- | +| CLI location | Inside `java_codebase_rag/` package (sibling modules to `cli.py`); entry `jrag = "java_codebase_rag.jrag:_console_script_main"`. Zero packaging change beyond the script line. | +| Resolve extraction target | Root-level `resolve_service.py` (sibling to `mcp_v2.py`); `NodeRef` stays in `mcp_v2.py`; `mcp_v2` re-exports resolve symbols. | +| Config resolution | Reuse `resolve_operator_config` + `apply_to_os_environ()` (cocoindex-free, verified); pass `cfg.ladybug_path` to `LadybugGraph.get()`. | +| Envelope model | Lean `@dataclass` (not pydantic — avoids validation overhead); backend pydantic outputs converted via `model_dump()` at the boundary; `to_json()` via `json.dumps`. Omits empty optionals. | +| `truncated` | +1-fetch trick: pass `limit+1`; `truncated = len(rows) > limit`; drop the +1th row. `total_count` / "M of N" deferred. | +| `--offset` scope | NOT global. Supported only on `find`/`search` (they route through `find_v2`/`search_v2`, which accept `offset`). Traversal/listing commands (direct `LadybugGraph` methods — none take `offset`) emit `truncated: more results — narrow your query` instead. `--limit` on find/search effectively capped at 499 (so `limit+1` fits the 500 backend clamp on `list_*`). | +| Disambiguation flags | Only `--kind` is a resolve input (`hint_kind`); `--java-kind`/`--role`/`--fqn-prefix` post-filter the resolve node/candidate set client-side. 
| +| `--service` push-down vs post-filter | Pushed down where the method takes `microservice` (`find_callers`, `find_callees`, `find_implementors`, `find_subclasses`, `find_injectors`, `list_*`, `trace_flow`). NOT pushed (client-side post-filter + `warnings[]`) on: `impact` (no param), `find_route_callers` (kwarg ignored once `route_id` set). | +| Enum normalization | `normalize_enum()` for `role`/`capability`/`framework`/`java_kind` (case + kebab→UPPER_SNAKE). Explicit **lookup tables** for the lowercase-snake+suffix kinds: `client_kind` (`feign`→`feign_method`, `rest-template`→`rest_template`, `web-client`→`web_client`), `producer_kind` (`kafka`→`kafka_send`, `stream-bridge`→`stream_bridge_send`), `source_layer` (`builtin`→`builtin`, `layer-a`→`layer_a_meta`, `layer-b-ann`→`layer_b_ann`, `layer-b-fqn`→`layer_b_fqn`, `layer-c`→`layer_c_source`; confirm literals against `java_ontology`/`graph_enrich` at impl). | +| Text rendering | Built fresh in `jrag_render.py`; inspect renderer sorts ALL dict keys alphabetically (snapshot stability); `simple_name(node) = node.fqn.rsplit('.', 1)[-1]` (NodeRef has no `name`); `conf:` only on CALLS-family; zero-vs-`not_found` distinct; ambiguous candidates carry `reason`. | +| `overrides` / `overridden-by` | `overrides` → `neighbors_v2([id], "out", ["OVERRIDES"])` (overrider→declaration = dispatch UP); `overridden-by` → `neighbors_v2([id], "in", ["OVERRIDES"])` (= virtual `OVERRIDDEN_BY` out). The `override_axis_*` functions are NOT used for these listings (wrong direction / counts-only); `override_axis_rollup_for` feeds `inspect`'s `edge_summary` only. | +| `agent_next_actions` | NEW mapper in `jrag_hints.py`; `next_actions(*, root, edge_summary=None, result_edges, graph)`; for traversal commands pass `edge_summary=None` (fall back to `result_edges`); for each `(label, counts)` emit only where `counts[d] > 0`; ≤5; covers dot-keys. 
| +| `file_location` | Populated by `resolve_query` from the resolved node's `filename` + `start_line` when `status="one"`; omitted otherwise. | +| Output format | `--format text\|json`, default `text`. New convention (diverges from operator CLI isatty). | +| Daemon / `jrag source` / raw IDs | Deferred / not shipped / never required. | + +--- + +# PR-JRAG-0a — Single source of truth for shipped agent artifacts + +**Goal:** collapse the byte-identical, hand-synced dual copies of the skill and +agent docs into **one canonical dev source** with a derived `install_data` copy, +so PR-JRAG-5 does not create four hand-synced copies when the CLI variants land. + +**Key facts (verified):** `skills/explore-codebase/SKILL.md` and +`agents/explorer-rag-enhanced.md` are byte-identical to their +`java_codebase_rag/install_data/...` counterparts. They ship via +`[tool.setuptools.package-data] "java_codebase_rag" = +["install_data/skills/**/*", "install_data/agents/**/*"]` (`pyproject.toml:85-86`) +and are read at runtime by `_read_package_artifact` +(`java_codebase_rag/installer.py:550`) via +`importlib.resources.files("java_codebase_rag.install_data")`. No build-time +generation; no `MANIFEST.in`. **`skills/README.md` exists ONLY in dev-root** (not +shipped) — the sync must mirror only the shipped subtrees, not the whole +`skills/` directory, or it will copy+ship README.md. Publishing is manual. + +## File-by-file changes + +### 1. New `scripts/sync_agent_artifacts.py` +- Sync ONLY the package-data-shipped subtrees: `skills/explore-codebase/` → + `java_codebase_rag/install_data/skills/explore-codebase/` and + `agents/*.md` → `java_codebase_rag/install_data/agents/`. Do **not** mirror + `skills/README.md` (dev-only index). +- After copying, assert every shipped destination file is byte-equal to its + source; exit non-zero with a diff on mismatch. +- `--check` mode: verify only (no copy), for CI / pre-commit. + +### 2. 
`.agents/skills/publish-pip/SKILL.md` — runbook update +- Insert the sync step before `.venv/bin/python -m build`: invoke + `.venv/bin/python scripts/sync_agent_artifacts.py` (fail the publish on drift). + +### 3. `tests/test_install_data_sync.py` (new) +- `test_install_data_artifacts_in_sync_with_dev_source` — `--check` passes at HEAD. +- `test_sync_script_detects_drift` — mutate a dev source byte, assert `--check` + exits non-zero and names the file; restore via tempfile shadowing. + +## Definition of done (PR-JRAG-0a) + +- [ ] `scripts/sync_agent_artifacts.py` mirrors only shipped subtrees (excludes + `skills/README.md`); `--check` passes at HEAD. +- [ ] `publish-pip` runbook invokes the sync step before `python -m build`. +- [ ] `tests/test_install_data_sync.py` present and passing. +- [ ] No change to shipped artifact contents (the shipped set is unchanged). +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `chore(install): single-source agent artifacts + sync check (PR-JRAG-0a)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | Write `scripts/sync_agent_artifacts.py` (shipped-subtree copy + `--check`; exclude `skills/README.md`) | `scripts/sync_agent_artifacts.py` | `--check` passes at HEAD; README.md not mirrored | +| 2 | Add drift + detection tests | `tests/test_install_data_sync.py` | both pass | +| 3 | Wire sync into the publish runbook | `.agents/skills/publish-pip/SKILL.md` | step before `python -m build` | +| 4 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-0b — Extract `resolve_v2` into `resolve_service.py` + +**Goal:** lift the resolve pipeline out of `mcp_v2.py` into a neutral-named, +transport-agnostic root module so the CLI's resolve-first layer imports +`resolve_service` and cannot silently re-implement the pipeline. 
+ +**Key facts (verified):** `mcp_v2.py` imports **zero MCP SDK** (only local +`mcp_hints`, plus `ladybug_queries`, `search_lancedb`, `java_ontology`, +`index_common`, `java_codebase_rag.config`). `resolve_v2(identifier, hint_kind, +graph) -> ResolveOutput` is at `mcp_v2.py:1487`; `ResolveOutput` at `:602`, +`ResolveCandidate` at `:594`, `ResolveStatus = Literal["one","many","none"]` at +`:544`. `NodeRef` (`:449`) is shared by `Edge.other` (`:488`) and constructed-from +in `describe_v2` — it stays in `mcp_v2.py`. + +## File-by-file changes + +### 1. New `resolve_service.py` (repo root) +- Move `resolve_v2` + its private pipeline (identifier parse → candidate + collectors → dedupe → rank → finalize) into this module. +- Move `ResolveOutput`, `ResolveCandidate`, `ResolveStatus` here. +- **Import** `NodeRef` from `mcp_v2` (do not move — shared by non-resolve models). + Import `ResolveReason` from `java_ontology`; `LadybugGraph` from `ladybug_queries`. + +### 2. `mcp_v2.py` — re-export + deduplicate +- `from resolve_service import resolve_v2, ResolveOutput, ResolveCandidate, ResolveStatus` + so every existing call site is unchanged. Remove the duplicated private helpers. + +### 3. `pyproject.toml` +- Add `resolve_service` to `[tool.setuptools] py-modules` (`:62-79`). + +### 4. `tests/test_resolve_service.py` (new) +- Direct-import parity tests (see below). + +## Tests for PR-JRAG-0b + +1. `test_resolve_service_importable_and_one_match` — + `from resolve_service import resolve_v2, ResolveOutput`; unique FQN → `status=="one"`. +2. `test_resolve_service_many_returns_candidates` +3. `test_resolve_service_none_is_not_found` +4. **Must-still-pass:** `tests/test_mcp_v2.py` and `tests/test_mcp_hints.py` (the + two existing resolve-symbol importers) unchanged. + +## Definition of done (PR-JRAG-0b) + +- [ ] `resolve_v2`/`ResolveOutput`/`ResolveCandidate`/`ResolveStatus` in `resolve_service.py`; `NodeRef` remains in `mcp_v2.py`. 
+- [ ] `mcp_v2.py` re-exports; no call site changed; still imports zero MCP SDK. +- [ ] `resolve_service` in `py-modules`. +- [ ] New + existing resolve tests green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] Sentinel: `grep -nE "^from mcp import|^import mcp|FastMCP" mcp_v2.py` returns 0. +- [ ] PR title: `refactor(resolve): extract resolve_v2 to resolve_service.py (PR-JRAG-0b)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | Create `resolve_service.py`; move resolve symbols + pipeline; import `NodeRef` from `mcp_v2` | `resolve_service.py`, `mcp_v2.py` | `resolve_v2` callable from new module | +| 2 | Re-export from `mcp_v2`; delete duplicated helpers | `mcp_v2.py` | existing imports still resolve | +| 3 | Add `resolve_service` to `py-modules` | `pyproject.toml` | `pip install -e .` ships it | +| 4 | Parity tests + ruff + full suite | `tests/test_resolve_service.py`, repo | green | + +--- + +# PR-JRAG-1a — Entry point + envelope/render foundation + resolve-first + status + +**Goal:** the frozen contract every later PR builds on: the `jrag` console +script, the `Envelope` + resolve-first layer + text renderer, the config/index +loader (with error envelopes), and `status`. No command logic beyond `status`. + +## File-by-file changes + +### 1. `java_codebase_rag/jrag_envelope.py` (new) +- `@dataclass class Envelope`: `status` (`Literal["ok","ambiguous","not_found","error"]`), + `nodes: dict[str, dict]`, `edges: list[dict]`, `root: str | None`, + `candidates: list[dict]`, `agent_next_actions: list[str]`, `warnings: list[str]`, + `truncated: bool`, `file_location: str | None`. `to_dict()` omits empty + optionals; `to_json()` = `json.dumps(to_dict())`. +- `resolve_query(identifier, *, hint_kind, java_kind, role, fqn_prefix, cfg) -> + tuple[NodeRef | None, Envelope]`: + - `g = LadybugGraph.get(cfg.ladybug_path)` (caller passes the loaded graph). + - Calls `resolve_v2(identifier, hint_kind=hint_kind, graph=g)`. 
+ - `"one"`: apply post-filters (`java_kind`/`role`/`fqn_prefix`) to the node; if + pass → `(node, env ok)` and set `env.file_location` from `node.filename` + + `node.start_line`; if fail → `(None, not_found)`. + - `"many"`: post-filter candidates; if one survives → treat as one; else + `(None, ambiguous)` with capped-at-10 candidates each carrying `reason`. + - `"none"`: `(None, not_found)` with `message` mentioning `jrag search`. +- `normalize_enum(value, *, kind)` — case+kebab→UPPER_SNAKE for + role/capability/framework/java_kind; routes client_kind/producer_kind/source_layer + through their lookup tables (see Resolved decisions). +- `mark_truncated(rows, limit) -> tuple[list, bool]` — +1-fetch helper. +- `simple_name(node_dict) -> str` — `fqn.rsplit('.', 1)[-1]`. +- `to_envelope_rows(pydantic_results)` — `.model_dump()` each (the boundary). + +### 2. `java_codebase_rag/jrag_render.py` (new) +- `render(envelope, *, fmt, noun="")` dispatches (`text` default; `json`→`to_json`). +- Shapes: `_render_listing`, `_render_traversal` (`root:` + edge rows; `conf:` + only on CALLS-family), `_render_graph` (`d=N`), `_render_inspect` (kv-block + + indented `edge_summary`, ALL keys alphabetical), `_render_ambiguous` (count + + narrowing legend + `reason`; no file/score; ≤2 `next:` hints; no auto-pick), + `_render_scalar`. +- `tiered_name(node_id, nodes)` — simple name → `name @service` → FQN (via `simple_name`). +- Zero-results: `0 @`; `not_found`: `not found: `. +- Non-offset commands: `truncated: more results — narrow your query`. Offset + commands (`find`/`search`): `truncated: more results — use --offset `. + +### 3. `java_codebase_rag/jrag.py` (new) +- `build_parser()` — argparse + subparsers (`dest="command"`). Globals per + command via a parent parser: `--service`, `--module`, `--limit` (default 20; + 10 fan-out), `--index-dir`, `--format text|json`, `--brief`, `--fields`, + `--count`/`--exists`. 
**`--offset` is added ONLY to `find`/`search` subparsers + (PR-1b/PR-4), not as a global.** No backend imports at module top. +- `_load_graph(cfg) -> LadybugGraph` — `LadybugGraph.exists(cfg.ladybug_path)` + first → on False raise `_IndexNotFound` (caught in `main` → actionable + envelope); else `LadybugGraph.get(cfg.ladybug_path)` wrapped in + `try/except RuntimeError` (ontology mismatch → `_IndexStale`). +- `_resolve_cfg(args) -> ResolvedOperatorConfig` — + `cfg = resolve_operator_config(source_root=discover_project_root(Path.cwd()), + cli_index_dir=args.index_dir)`; `cfg.apply_to_os_environ()`; return `cfg`. + (Lazy import of `resolve_operator_config` + `discover_project_root`.) +- `main(argv=None) -> int` — first line `raise_fd_limit()`; parse; dispatch; the + top-level handler emits `status: error` envelope to stdout AND + `traceback.format_exc()` to stderr; returns 2 on error / 1 on usage / 0 on ok. +- `_console_script_main()` — `os._exit(main())` wrapper. +- `status` command — `cfg` + `LadybugGraph`; render `meta()` + counts (ontology + version, index dir, freshness, loaded counts, source root). + +### 4. `pyproject.toml` +- Add `[project.scripts]` `jrag = "java_codebase_rag.jrag:_console_script_main"`. + +### 5. `README.md` — preview subsection (`## jrag (agent CLI, preview)`). + +## Tests for PR-JRAG-1a + +`tests/test_jrag_envelope.py`: +1. `test_envelope_to_dict_omits_empty_optionals` +2. `test_pydantic_results_converted_via_model_dump` — pass a pydantic `NodeRef`, + assert the envelope holds a plain dict. +3. `test_resolve_query_one_proceeds_and_sets_file_location` +4. `test_resolve_query_many_returns_candidates_with_reason` +5. `test_resolve_query_many_post_filter_collapses_to_one` +6. `test_resolve_query_none_is_not_found_with_search_hint` +7. `test_normalize_enum_role_uppercase` (`controller`/`Controller`/`CONTROLLER`→`CONTROLLER`) +8. `test_normalize_enum_client_kind_lookup` (`feign`→`feign_method`, `rest-template`→`rest_template`) +9. 
`test_normalize_enum_producer_kind_lookup` (`kafka`→`kafka_send`) +10. `test_mark_truncated_flags_and_clips` + +`tests/test_jrag_render.py`: +11. `test_render_listing_omits_fqn` +12. `test_render_traversal_conf_only_on_calls` +13. `test_render_inspect_edge_summary_alphabetical` +14. `test_render_ambiguous_lists_reason_no_file` +15. `test_render_zero_results_vs_not_found_distinct` +16. `test_render_truncated_narrow_query_for_non_offset_commands` +17. `test_render_truncated_offset_hint_for_offset_commands` +18. `test_render_json_emits_envelope_verbatim` +19. `test_simple_name_derived_from_fqn` (NodeRef has no `name`) + +`tests/test_jrag_status.py`: +20. `test_status_reports_ontology_version_and_counts` (ontology 17). +21. `test_missing_index_returns_actionable_error` — point at an empty dir → + `status: error`, message mentions `java-codebase-rag init`. +22. `test_offset_is_not_a_global_flag` — `jrag callers --offset 5` → usage error + (offset not registered on traversal commands). + +Plus one subprocess smoke test: `.venv/bin/jrag status` exits 0; `.venv/bin/jrag --help` +completes and lists `status`. + +## Definition of done (PR-JRAG-1a) + +- [ ] `jrag.py`/`jrag_envelope.py`/`jrag_render.py` present; `[project.scripts] jrag` added. +- [ ] `resolve_operator_config` + `apply_to_os_environ` reused; `raise_fd_limit()` in `main`. +- [ ] Missing-index + ontology-mismatch → actionable `status: error` envelopes; + top-level handler logs traceback to stderr. +- [ ] Pydantic→dict `model_dump()` boundary; envelope omits empty optionals. +- [ ] Enum lookup tables for client_kind/producer_kind/source_layer. +- [ ] `--offset` is NOT global (only find/search get it later). +- [ ] All named tests green; full suite green; `jrag --help` fast. 
+- [ ] Sentinels: `grep -nE "^from mcp import|^import mcp" java_codebase_rag/jrag*.py` → 0; + `grep -n "import cocoindex\|java_index_flow_lancedb" java_codebase_rag/jrag*.py` → 0; + `python -c "import java_codebase_rag.jrag as j; j.build_parser()"` imports no torch/sentence_transformers (check `sys.modules`). +- [ ] PR title: `feat(cli): jrag entry point + envelope/render + status (PR-JRAG-1a)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | Envelope dataclass + model_dump boundary + lean-omit | `jrag_envelope.py` | tests 1–2 pass | +| 2 | `resolve_query` + post-filter + `file_location`; `normalize_enum`+tables; `mark_truncated`; `simple_name` | `jrag_envelope.py` | tests 3–10 pass | +| 3 | Renderer (all shapes + tiered_name + truncated variants) | `jrag_render.py` | tests 11–19 pass | +| 4 | `jrag.py`: parser (no global offset) + `_resolve_cfg` (reuse operator config) + `_load_graph` (exists + error envelopes) + `main` (`raise_fd_limit`, stdout+stderr handler) + `_console_script_main` + `status` | `jrag.py` | tests 20–22 pass | +| 5 | `[project.scripts] jrag`; README preview | `pyproject.toml`, `README.md` | `jrag --help` works; link resolves | +| 6 | ruff + full suite + subprocess smoke + sentinels | repo | clean + green | + +--- + +# PR-JRAG-1b — `find` + `inspect` + +**Goal:** the first real commands, over `find_v2` / `find_by_name_or_fqn` / +`describe_v2`. Both leave a no-op `next_actions` hook for PR-4. + +## File-by-file changes + +### 1. `java_codebase_rag/jrag.py` — add `find` + `inspect` +- `find` has **two modes**: + - **Query mode** (positional ``): call + `g.find_by_name_or_fqn(query, kinds=, module=..., microservice=..., limit=limit+1)`. + `--fuzzy` enables a jrag-side fallback (exact → prefix → contains on the + identifier string) if exact returns nothing (NOT semantic; `find_by_name_or_fqn` + has no fuzzy param). 
`--role`/`--java-kind`/`--exclude-role`/`--annotation`/ + `--capability`/`--framework`/`--source-layer` post-filter the rows. + - **Filter mode** (no positional): build a `NodeFilter` from flags and call + `find_v2(kind, filter, limit=limit+1, offset=args.offset, graph=g)`. + - **Kind inference** (when `--kind` omitted): `--http-method`/`--path-prefix`⇒route, + `--client-kind`/`--calls-service`/`--calls-path-prefix`⇒client, + `--producer-kind`/`--topic-prefix`⇒producer, else symbol. A domain flag + contradicting explicit `--kind` → `status: error` naming the pair. + - `--offset` IS supported (passes to `find_v2`); render offset-hint truncated. + `--limit` effectively capped at 499. + - Flag→`NodeFilter`/post-filter map (all proposal §5 find flags handled): + `--role`→role, `--exclude-role`→exclude_roles, `--annotation`→annotation, + `--capability`→capability, `--fqn-prefix`→fqn_prefix, `--java-kind`→symbol_kind, + `--framework`→framework, `--source-layer`→source_layer, `--http-method`→http_method, + `--path-prefix`→path_prefix, `--client-kind`→client_kind, `--calls-service`→target_service, + `--calls-path-prefix`→target_path_prefix, `--producer-kind`→producer_kind, + `--topic-prefix`→topic_prefix. +- `inspect ` — `resolve_query(...)`; on one, `describe_v2(id=node.id, + graph=g)`; place `NodeRecord.model_dump()` (incl. `edge_summary`) in `nodes`; + render inspect. Call `next_actions_hook(...)` (no-op stub for now). +- Both call `next_actions_hook(envelope, root, edge_summary=None, result_edges=...)` + defined as a no-op in `jrag_envelope` (PR-4 fills it). + +## Tests for PR-JRAG-1b + +`tests/test_jrag_locate.py` (bank-chat fixture): +1. `test_find_by_fqn_exact` (query mode) +2. `test_find_filter_mode_by_role` (filter mode, `--role controller`) +3. `test_find_by_capability` (`--capability scheduled-task`, symbol inferred) +4. `test_find_kind_inference_from_http_method` (route inferred) +5. 
`test_find_kind_contradiction_is_error` (`--kind symbol --http-method GET`) +6. `test_find_fuzzy_falls_back_to_prefix` +7. `test_find_annotation_flag_filters` +8. `test_find_exclude_role_flag_filters` +9. `test_find_offset_paginates` (`--offset` works on find) +10. `test_find_limit_capped_under_500` (`--limit 600` → behaves as ≤499) +11. `test_inspect_returns_edge_summary_with_composed_keys` (`OVERRIDDEN_BY` virtual key) +12. `test_inspect_ambiguous_returns_candidates` +13. `test_inspect_populates_file_location` + +## Definition of done (PR-JRAG-1b) + +- [ ] `find` (both modes) + `inspect` implemented; all §5 find flags mapped. +- [ ] `--offset` on `find`; `--limit` cap-at-499 documented. +- [ ] `next_actions_hook` stub present (no-op). +- [ ] All named tests green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `feat(cli): jrag find + inspect (PR-JRAG-1b)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | `find` query mode (`find_by_name_or_fqn` + `--fuzzy` fallback + post-filters) | `jrag.py` | tests 1,6,7,8 pass | +| 2 | `find` filter mode (`find_v2` + NodeFilter + kind-inference + contradiction + offset) | `jrag.py` | tests 2–5,9,10 pass | +| 3 | `inspect` (`describe_v2` + edge_summary + file_location) + `next_actions_hook` stub | `jrag.py`, `jrag_envelope.py` | tests 11–13 pass | +| 4 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-2 — Listing tier + +**Goal:** all-nodes-of-a-kind commands. Globals except `--offset` (not supported +here — `list_*` methods take no offset). + +## File-by-file changes + +### 1. `java_codebase_rag/jrag.py` — listing subcommands +Each builds kwargs, calls the `LadybugGraph` method with `limit+1` (capped so +`limit+1 ≤ 500`), `mark_truncated`, renders listing. Enum flags via lookup tables. +- `routes` → `g.list_routes(microservice=..., framework=..., path_prefix=..., method=..., limit=...)`. 
+- `clients` → `g.list_clients(microservice=..., client_kind=..., target_service=<--calls-service>, path_prefix=..., limit=...)`. +- `producers` → `g.list_producers(microservice=..., producer_kind=..., topic_prefix=..., limit=...)`. +- `topics` → group `list_producers(topic_prefix=...)` by topic name. `--producer-in` + scopes producers by their `microservice`; `--consumer-in ` resolves + consumers via `neighbors_v2(producer_ids, direction="in", edge_types=["ASYNC_CALLS"])` + across producers sharing the topic, filtered to ``. (No `:Topic` node.) +- `jobs` → `g.list_by_capability("SCHEDULED_TASK", ...)`. +- `listeners` → `g.list_by_capability("MESSAGE_LISTENER", ...)` + optional `--topic-prefix`. +- `entities` → `g.list_by_role("ENTITY", ...)`. + +## Tests for PR-JRAG-2 + +`tests/test_jrag_listing.py`: +1. `test_routes_returns_route_kind` +2. `test_clients_filters_by_calls_service` +3. `test_producers_filter_by_topic_prefix` +4. `test_topics_groups_producers_by_topic` (no `:Topic` node assumed) +5. `test_topics_consumer_in_uses_neighbors_in_async_calls` +6. `test_jobs_lists_scheduled_task` +7. `test_listeners_lists_message_listener` +8. `test_entities_lists_entity_role` +9. `test_listing_service_scope_pushes_down` +10. `test_listing_truncated_fires_at_limit` (+1-fetch) +11. `test_listing_client_kind_enum_lookup` (`--client-kind feign` → `feign_method`) +12. `test_listing_rejects_offset` (`--offset` not registered → usage error) + +## Definition of done (PR-JRAG-2) + +- [ ] All 7 listing commands; globals supported; `--offset` rejected. +- [ ] `topics --consumer-in` via `neighbors_v2(in, ASYNC_CALLS)`; client/producer + kinds via lookup tables. +- [ ] All named tests green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `feat(cli): jrag listing tier (PR-JRAG-2)`. 
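The `limit+1` fetch paired with `mark_truncated`, which the listing tier relies on, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the shipped helper: `HARD_CAP` and `fetch_limit` are hypothetical names introduced here, and the value 500 is taken from the `limit+1 ≤ 500` note in the listing section.

```python
from __future__ import annotations

from typing import Sequence, TypeVar

T = TypeVar("T")

# Assumption: the backend row ceiling, inferred from the "limit+1 <= 500" note.
HARD_CAP = 500


def fetch_limit(limit: int) -> int:
    """Rows to request from the backend: one more than the effective limit.

    The effective limit is clamped to HARD_CAP - 1 so the extra probe row
    never pushes the request past the cap.
    """
    return min(limit, HARD_CAP - 1) + 1


def mark_truncated(rows: Sequence[T], limit: int) -> tuple[list[T], bool]:
    """Clip rows to `limit` and report whether the probe row was present."""
    truncated = len(rows) > limit
    return list(rows[:limit]), truncated


if __name__ == "__main__":
    # Hypothetical backend result: 21 rows came back for a limit of 20.
    rows, truncated = mark_truncated(list(range(21)), 20)
    print(len(rows), truncated)
```

The design choice is that the caller never needs a separate count query: one extra row is enough to decide whether the `truncated:` line should be rendered.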
+ +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | 7 listing subcommands + flags (no offset) | `jrag.py` | tests 1–4,6–9 pass | +| 2 | `topics` producer-grouped + `--consumer-in` via neighbors_v2 | `jrag.py` | tests 4,5 pass | +| 3 | Enum lookups + truncation + offset-rejection | `jrag.py` | tests 10–12 pass | +| 4 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-3a — Direct-backend traversals + +**Goal:** the traversals that call `LadybugGraph` methods (or `neighbors_v2` for +the override axis) directly. `--offset` NOT supported. + +## File-by-file changes + +### 1. `java_codebase_rag/jrag.py` — traversal subcommands +Each: `resolve_query(...)` → on one, call backend → envelope → render. `--limit` +via +1-fetch where the method takes `limit`; client-side slice otherwise. +- `callers` — Symbol → `g.find_callers(node.fqn, *, depth, limit=limit+1, + min_confidence, exclude_external=not --include-external, module, microservice)`; + Route → `g.find_route_callers(route_id=node.id)` then **client-side** filter by + `--service` on `RouteCaller.caller_microservice` (+ `warnings[]` like `impact`) + and client-side slice for truncation (no backend `limit`). +- `callees` (Symbol) → `g.find_callees(...)` (`--include-external` symmetric). +- `hierarchy` → `neighbors_v2([id], "in", ["EXTENDS","IMPLEMENTS"])` + `"out"`; + render `↑`/`↓` tree. +- `implementations` → `g.find_implementors(node.fqn, *, microservice, module, + limit=limit+1)`; `--capability` is a **client-side post-filter** on returned + implementors' capabilities (the method has no `capability` kwarg). +- `subclasses` → `g.find_subclasses(...)`. +- `overrides` → `neighbors_v2([id], "out", ["OVERRIDES"])` (dispatch UP: + overrider→declaration). +- `overridden-by` → `neighbors_v2([id], "in", ["OVERRIDES"])` (= virtual + `OVERRIDDEN_BY` out; dispatch DOWN). +- `dependents` → `g.find_injectors(node.fqn, *, microservice, module, limit=limit+1)`. 
+- `impact` → `g.impact_analysis(node.fqn, *, depth, limit=limit+1)`; `--service` + client-side post-filter + `warnings[]`. +- `decompose` → `g.trace_flow(seed_fqns=[node.fqn], *, depth=clamp(1..3), + follow_calls=--follow-calls, stage_limit=--max-stage, microservice, module, + min_call_confidence, exclude_external)`; role-waterfall render. +- `flow` → requires Route root; `g.trace_request_flow(entry_route_id=node.id, + max_hops=clamp(1..8))`. Inbound = cross-service callers; outbound follows CALLS + hops. **Intra-service is an index-time property** (CALLS edges are intra-service + by construction; the query has no microservice predicate) — the test validates + the fixture's data, not a query constraint. +- All call `next_actions_hook(...)` (no-op stub until PR-4). + +## Tests for PR-JRAG-3a + +`tests/test_jrag_traversal_direct.py`: +1. `test_callers_symbol_uses_find_callers` +2. `test_callers_route_service_is_post_filter_with_warning` (`--service` filters + client-side + emits warning; not pushed down) +3. `test_callees_symbol_uses_find_callees` +4. `test_callers_and_callees_support_include_external` (symmetric) +5. `test_hierarchy_renders_tree_both_directions` +6. `test_implementations_uses_find_implementors` +7. `test_implementations_capability_post_filter` +8. `test_subclasses_uses_find_subclasses` +9. `test_overrides_dispatches_up_via_neighbors_out_overrides` +10. `test_overridden_by_dispatches_down_via_neighbors_in_overrides` +11. `test_dependents_uses_find_injectors` +12. `test_impact_runs_fleet_wide_without_service` +13. `test_impact_service_post_filter_emits_warning` +14. `test_decompose_renders_role_waterfall` +15. `test_flow_outbound_intra_service_on_fixture` (validates fixture CALLS edges) +16. `test_traversal_resolve_ambiguous_stops` +17. `test_traversal_rejects_offset` + +## Definition of done (PR-JRAG-3a) + +- [ ] 11 direct traversals implemented; `overrides`/`overridden-by` via `neighbors_v2`. 
+- [ ] `--service` post-filter + warning on `callers` Route and `impact`; + `--include-external` symmetric; `--capability` post-filter on `implementations`. +- [ ] `--offset` rejected; `flow` intra-service framed as a data property. +- [ ] All named tests green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `feat(cli): jrag direct-backend traversals (PR-JRAG-3a)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | Symbol/Route callers + callees-symbol + include-external + route-caller post-filter | `jrag.py` | tests 1–4 pass | +| 2 | hierarchy + implementations(+cap) + subclasses | `jrag.py` | tests 5–8 pass | +| 3 | overrides/overridden-by (neighbors_v2) + dependents | `jrag.py` | tests 9–11 pass | +| 4 | impact (+post-filter) + decompose + flow (data-property framing) | `jrag.py` | tests 12–15 pass | +| 5 | resolve-stop + offset-rejection + next_actions hook | `jrag.py` | tests 16,17 pass | +| 6 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-3b — Compose commands + file inspection + +**Goal:** the `neighbors_v2`-compose traversals + `connection` + `outline`/`imports`. + +## File-by-file changes + +### 1. `java_codebase_rag/jrag.py` +- `callees` — Symbol (handled in 3a); Client → `resolve_v2` gave the node → + `neighbors_v2([node.id], "out", ["HTTP_CALLS"], limit=limit+1, graph=g)` reaching + the `:Route`; Producer → `neighbors_v2([...], "out", ["ASYNC_CALLS"])` reaching + the `:Route` (`kafka_topic`) that consumes this producer's topic. `--include-external`. +- `dependencies` → `neighbors_v2([node.id], "out", ["INJECTS"], limit=limit+1, graph=g)`. +- `connection ` — first positional is a microservice NAME (resolve-first + exception; documented loudly). `--inbound` (default), `--outbound`, `--both`; + `--http-method` (filter routes); `--calls-service`. 
Inbound: clients/producers + targeting this service (`list_clients(target_service=...)` + producers whose + ASYNC_CALLS consumers are external) + `find_route_callers` for hit routes. + Outbound: this service's clients/producers and the routes/topics they call. + Render `inbound:`/`outbound:` sections. +- `outline ` → `find_symbols_in_file_range(graph=g, filename=file, + start_line=1, end_line=2**31-1)` (1-based; `<1` returns `[]`). Documented + unbounded (no `limit`). +- `imports ` → tree-sitter Java parse (`ast_java` grammar); walk + `import_declaration` nodes (cf. `_import_declaration_is_static`, `ast_java.py:905`); + resolve each imported FQN via `resolve_v2`; render with resolved node refs. +- All call `next_actions_hook(...)` (no-op stub until PR-4). + +## Tests for PR-JRAG-3b + +`tests/test_jrag_traversal_compose.py`: +1. `test_callees_client_reaches_route_via_http_calls` (Client root → `:Route`) +2. `test_callees_producer_reaches_route_topic_via_async_calls` (Producer root → `:Route` of `kafka_topic`) +3. `test_dependencies_composes_neighbors_out_injects` +4. `test_connection_inbound_lists_external_callers` +5. `test_connection_outbound_lists_this_service_clients` +6. `test_connection_both_default` +7. `test_connection_http_method_filter` +8. `test_connection_first_positional_is_microservice_not_query` +9. `test_outline_lists_file_symbols` (`start_line=1`) +10. `test_outline_empty_for_missing_file` (graceful, not crash) +11. `test_imports_resolves_graph_nodes` +12. `test_outline_and_import_reject_offset_or_document_unbounded` + +## Definition of done (PR-JRAG-3b) + +- [ ] `callees` Client/Producer + `dependencies` + `connection` + `outline` + `imports`. +- [ ] `callees` Producer target documented as `:Route` (`kafka_topic`); + `outline` uses `start_line=1`; unbounded documented. +- [ ] All named tests green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. 
+- [ ] PR title: `feat(cli): jrag compose traversals + connection + outline/imports (PR-JRAG-3b)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | `callees` Client/Producer (neighbors_v2 → :Route) + `dependencies` | `jrag.py` | tests 1–3 pass | +| 2 | `connection` (positional svc; inbound/outbound/both; http-method) | `jrag.py` | tests 4–8 pass | +| 3 | `outline` (start_line=1) + `imports` (tree-sitter + resolve) | `jrag.py` | tests 9–11 pass | +| 4 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-4 — Orientation + search + `agent_next_actions` + packaging + +**Goal:** orientation bundle, semantic search, the new edge→command hint mapper +(wired into all commands), README finalize, token-budget guard, build_parser sentinel. + +## File-by-file changes + +### 1. `java_codebase_rag/jrag.py` — orientation + search +- `microservices` → `g.microservice_counts()`. +- `map [--service] [--module]` → counts per kind per service/module. +- `conventions [--service]` → dominant roles + framework tallies. +- `overview [--as ...]` → dispatch on type: + microservice bundle / route flow / topic producers+consumers. +- `search ` → build `NodeFilter` from flags; `search_v2(query, + table=<--table>, hybrid=<--hybrid>, limit=limit+1, offset=args.offset, + path_contains=<--path-contains>, filter=filter, graph=g)`. `--table all` → + java+sql+yaml. `--offset` supported. **`--fuzzy` rejected in-handler** → + `status: error, message="search is semantic; --fuzzy is implicit"` (register + the flag, do not let argparse exit 2). + +### 2. `java_codebase_rag/jrag_hints.py` (new) +- `next_actions(*, root, edge_summary=None, result_edges, graph) -> list[str]` (≤5). + For each `(label, counts)` in `edge_summary.items()`: emit `jrag ` + for direction `d` **only when `counts[d] > 0`** (zero-suppression). 
Label→cmd + map: CALLS in→callers / out→callees; IMPLEMENTS in→implementations / out→hierarchy; + EXTENDS in→subclasses / out→hierarchy; INJECTS in→dependents / out→dependencies; + OVERRIDES out→overrides; OVERRIDDEN_BY in→overridden-by; HTTP_CALLS/ASYNC_CALLS + out→callees. Composed dot-keys (`DECLARES.*`, `OVERRIDDEN_BY.*`) handled via the + same label sets `mcp_hints` recognizes; canonical labels from `EDGE_SCHEMA` + (`java_ontology.py:174`). When `edge_summary is None` (traversal roots), fall + back to `result_edges` labels. De-dup; cap 5. `` from `root.fqn`. + Import `EDGE_SCHEMA` lazily inside the function (keep `build_parser` pure). + +### 3. `java_codebase_rag/jrag_envelope.py` — fill the `next_actions_hook` +- Replace the no-op stub with a call to `jrag_hints.next_actions(...)`; every + command's existing hook call now populates `envelope.agent_next_actions` + (omitted when empty). + +### 4. `README.md` — full `## jrag — agent CLI` section (replace preview). +### 5. `pyproject.toml` — version bump (release prep; manual publish out of scope). +### 6. `tests/test_jrag_token_budget.py` (new) — token-budget guard (§14). + +## Tests for PR-JRAG-4 + +`tests/test_jrag_orientation.py`: +1. `test_microservices_lists_counts` +2. `test_map_returns_non_empty_counts_per_service` +3. `test_conventions_reports_dominant_roles` +4. `test_overview_microservice_bundle` +5. `test_overview_route_uses_flow` +6. `test_overview_topic_lists_producers_and_consumers` +7. `test_overview_as_overrides_polymorphic_inference` +8. `test_search_returns_ranked_hits` +9. `test_search_hybrid_calls_hybrid_path` +10. `test_search_table_all_runs_three_tables` +11. `test_search_offset_paginates` +12. `test_search_fuzzy_rejected_in_handler_as_status_error` +13. `test_next_actions_valid_runnable_commands_capped_at_5` +14. `test_next_actions_zero_direction_suppressed` (a leaf `INJECTS in:0,out:3` → + no `jrag dependents` suggestion; `jrag dependencies` suggested) +15. 
`test_next_actions_covers_composed_dot_keys` (`OVERRIDDEN_BY.DECLARES_CLIENT`) +16. `test_next_actions_falls_back_to_result_edges_when_no_edge_summary` +17. `test_next_actions_omitted_when_empty` +18. `test_build_parser_imports_no_backend_modules` (`sys.modules` has no + torch/sentence_transformers/mcp_v2 after `build_parser()`) + +`tests/test_jrag_token_budget.py`: +19. `test_no_default_output_exceeds_token_ceiling` + +## Definition of done (PR-JRAG-4) + +- [ ] Orientation + `search` (offset, table all, hybrid, fuzzy-rejected) implemented. +- [ ] `jrag_hints.next_actions` ships; wired into all commands via the hook; + ≤5; zero-direction suppressed; dot-keys covered; falls back to result_edges. +- [ ] `build_parser` lazy-import sentinel green; README full section; version bumped. +- [ ] Token-budget assertion green. +- [ ] All named tests green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `feat(cli): jrag orientation + search + hints + packaging (PR-JRAG-4)`. + +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | Orientation (microservices/map/conventions/overview) | `jrag.py` | tests 1–7 pass | +| 2 | `search` over `search_v2` (filter, table all, offset, fuzzy-reject) | `jrag.py` | tests 8–12 pass | +| 3 | `jrag_hints.next_actions` (zero-suppress, dot-keys, fallback) + fill hook | `jrag_hints.py`, `jrag_envelope.py` | tests 13–17 pass | +| 4 | `build_parser` sentinel test + README + version bump | `jrag.py`, `README.md`, `pyproject.toml` | test 18 passes; links resolve | +| 5 | Token-budget guard | `tests/test_jrag_token_budget.py` | test 19 passes | +| 6 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-5 — Agent host integration (install branching, skill, subagent) + +**Goal:** `java-codebase-rag install` chooses an MCP or CLI surface; ship a +CLI-flavored skill + subagent; fix the `update` regression for CLI-only installs. 
+ +**Key facts (verified):** `HostConfig` (`installer.py:43-73`) is paths-only; +`HOSTS` (`:75-94`) registers claude-code/qwen-code/gigacode; `deploy_artifacts` +(`:558`) and `refresh_artifacts` (`:1049`) hardcode parallel 3-artifact lists; +`detect_configured_hosts` (`:1001`) returns `list[tuple[HostConfig, str]]` +(host, scope) and scans MCP entries only (via `_has_java_codebase_rag_entry` +`:1027`), writing no marker → CLI-only install invisible to `update` (fatal at +`:1312-1315`); `run_update` unpacks the 2-tuple at `:1321`; `resolve_mcp_command` +(`:424`) hard-fails (`SystemExit(2)` at `:447`) when the MCP binary is missing +(non-interactive) and its interactive prompt hardcodes `java-codebase-rag-mcp` +(`:453,467,470`); `_refresh_mcp_config` (`:1167`) calls `resolve_mcp_command` at +`:1189` but is reached only on the MCP manifest path. README (`:150`) says "Pick +one of two options (not both)". + +## File-by-file changes + +### 1. `java_codebase_rag/installer.py` +- **`Surface = Literal["mcp", "cli"]`**; `HostConfig` unchanged (surface is + orthogonal). Introduce a `ConfiguredHost` NamedTuple `(host, scope, surface)`; + `detect_configured_hosts` returns `list[ConfiguredHost]` (read the marker file; + fall back to the MCP-entry scan + `surface="mcp"` for back-compat with + pre-marker installs). +- **`ArtifactManifest`** keyed by surface, iterated by both `deploy_artifacts` + (`:558`) and `refresh_artifacts` (`:1049`): + - `mcp` → [(mcp-config), (skill: explore-codebase), (agent: explorer-rag-enhanced)] + - `cli` → [(skill: explore-codebase-cli), (agent: explorer-rag-cli)] (no MCP entry) +- `deploy_artifacts` and `refresh_artifacts` gain `surface: Surface = "mcp"` + (keyword-only default; preserves back-compat with the 8 direct-call sites in + `tests/test_installer.py`). +- **`run_update` loop** (`installer.py:1321`): unpack `(host, scope, surface)` and + pass `surface=surface` to `refresh_artifacts`.
+- **`select_surface`** wizard step in `run_install` (`:1454-1575`) at/with + `select_hosts` (`:1513`). On re-run (`handle_rerun`, `:950`), `select_surface` + pre-fills from the marker file and offers keep/switch. +- **Marker file** `.java-codebase-rag.hosts`: written at install (host/scope/surface + set); read by `detect_configured_hosts`. +- **`resolve_mcp_command`** (`:424`) surface-conditional: on `cli`, resolve the + `jrag` binary and parameterize the interactive prompt (`:453,467,470`) + + `shutil.which` target; skip the MCP-binary `SystemExit(2)` (`:447`). On `mcp`, + today's behavior. (`_refresh_mcp_config` is MCP-manifest-only — never reached on + CLI surface — make that explicit with a comment.) + +### 2. Non-interactive flag +- `--surface mcp|cli` (default `mcp`) on the `install` subparser alongside + `--agent`/`--scope`/`--model` (`java_codebase_rag/cli.py:844-867`). + +### 3. CLI skill + subagent (dev-root canonical; sync via PR-JRAG-0a) +- `skills/explore-codebase-cli/SKILL.md` + `agents/explorer-rag-cli.md`. Run + `scripts/sync_agent_artifacts.py`. + +### 4. Tests + docs +- **`tests/test_agent_skills_static.py`**: add `explore-codebase-cli` to + `EXPECTED_SKILL_DIRS`; gate the MCP-vocabulary static-validation tests + (tool-ref/kind/edge allowlists) to `explore-codebase` only (they don't apply + to the CLI skill's shell vocabulary). +- **`tests/test_installer.py`**: the 8 direct `deploy_artifacts`/`refresh_artifacts` + callers keep working via the `surface="mcp"` default; add CLI-surface cases. +- `AGENTS.md:17-18,59-60`, `skills/README.md:10,13,33-34`, `README.md:174` + (three-layer section): add the CLI variants. + +## Tests for PR-JRAG-5 + +`tests/test_installer_surface.py`: +1. `test_surface_cli_deploys_cli_skill_and_agent_no_mcp_entry` +2. `test_surface_mcp_reproduces_today_behavior` +3. `test_marker_file_round_trips_host_scope_surface` +4. `test_detect_configured_hosts_returns_configured_host_namedtuple` (3-field) +5. 
`test_update_after_cli_only_install_refreshes_cli_skill` (no fatal exit) +6. `test_run_update_unpacks_surface_and_passes_to_refresh` +7. `test_resolve_mcp_command_resolves_jrag_on_cli_surface` (no `SystemExit(2)`; + prompt + which target are `jrag`) +8. `test_deploy_refresh_surface_defaults_to_mcp_back_compat` (existing direct + callers unchanged) +9. `test_handle_rerun_prefills_surface_from_marker` +10. `test_artifact_manifest_single_source_for_deploy_and_refresh` + +Plus: `tests/test_agent_skills_static.py` updated and green. + +## Definition of done (PR-JRAG-5) + +- [ ] `Surface` + `ArtifactManifest` (both entry points iterate it); `surface="mcp"` default. +- [ ] `ConfiguredHost` NamedTuple; `run_update` unpacks surface; marker file round-trips. +- [ ] `detect_configured_hosts` reads marker → CLI-only install visible to `update`. +- [ ] `resolve_mcp_command` surface-conditional (CLI resolves `jrag`; prompt parameterized). +- [ ] `select_surface` + `--surface` flag; `handle_rerun` pre-fills from marker. +- [ ] CLI skill + subagent shipped (sync via PR-JRAG-0a); `test_agent_skills_static.py` updated. +- [ ] `AGENTS.md`, `skills/README.md`, README three-layer section updated. +- [ ] All named tests + updated `test_installer.py`/`test_agent_skills_static.py` green; full suite green. +- [ ] `.venv/bin/ruff check .` clean. +- [ ] PR title: `feat(install): --surface mcp|cli branching + CLI skill/subagent (PR-JRAG-5)`. 
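The marker-file round trip behind `ConfiguredHost` might look something like the sketch below. Several things here are assumptions rather than the plan's design: the plan does not specify the on-disk format of `.java-codebase-rag.hosts`, so JSON is used for illustration; `host` is a plain name string standing in for the real `HostConfig`; and `write_marker`/`read_marker` are hypothetical helper names.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Literal, NamedTuple

Surface = Literal["mcp", "cli"]

MARKER_NAME = ".java-codebase-rag.hosts"


class ConfiguredHost(NamedTuple):
    host: str  # assumption: host name; the real field would hold a HostConfig
    scope: str
    surface: Surface


def write_marker(directory: Path, hosts: list[ConfiguredHost]) -> Path:
    """Persist the host/scope/surface set (JSON is an assumed format)."""
    path = directory / MARKER_NAME
    payload = [h._asdict() for h in hosts]
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def read_marker(directory: Path) -> list[ConfiguredHost]:
    """Load the marker; an absent file yields an empty list.

    A missing `surface` key falls back to "mcp", mirroring the plan's
    default for pre-marker installs.
    """
    path = directory / MARKER_NAME
    if not path.exists():
        return []
    rows = json.loads(path.read_text(encoding="utf-8"))
    return [
        ConfiguredHost(r["host"], r["scope"], r.get("surface", "mcp"))
        for r in rows
    ]
```

In the plan itself, an empty result is where `detect_configured_hosts` would fall back to the MCP-entry scan; the sketch stops short of that step.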
+ +## Implementation step list + +| # | Step | File(s) | Done when | +| --- | --- | --- | --- | +| 1 | `Surface` + `ConfiguredHost` NamedTuple + `ArtifactManifest`; refactor deploy/refresh (surface kw default) | `installer.py` | tests 8,10 pass; mcp parity | +| 2 | Marker file write + `detect_configured_hosts` reads it (3-field return) | `installer.py` | tests 3,4 pass | +| 3 | `run_update` unpacks surface → refresh | `installer.py` | tests 5,6 pass | +| 4 | `select_surface` wizard + `--surface` flag + `handle_rerun` prefill | `installer.py`, `cli.py` | tests 1,2,9 pass | +| 5 | `resolve_mcp_command` surface-conditional (incl. prompt) | `installer.py` | test 7 passes | +| 6 | Author CLI skill + subagent; sync; update `test_agent_skills_static.py` + docs | `skills/`, `agents/`, tests, `AGENTS.md`, `skills/README.md`, `README.md` | artifacts in sync; tests green | +| 7 | ruff + full suite | repo | clean + green | + +--- + +# PR-JRAG-6 — Output detail orthogonality (`--detail`) + +**Goal:** split the concern currently conflated by `--format {text,json}` into two +orthogonal axes. `--format` keeps its meaning (representation: text vs json). A new +`--detail {brief,normal,full}` axis controls **how much of each node/edge** is +materialized, and **both modes honor it through one projection seam**. This fixes +the two halves of the design complaint: text is too terse (no `file`/`score`), +and JSON is too verbose (dumps the full `snippet` + every `None` field). + +**Root cause (verified):** detail is decided per-handler at node-dict construction, +not by the renderer. `_symbol_hit_to_dict` (`jrag.py:96`) hand-trims `SymbolHit` to +8 fields (drops `filename`/`start_line`/`signature`/`annotations`/`capabilities`); +`SearchHit.model_dump()` (`jrag.py` `_cmd_search`) carries the full `snippet`+`score` +— so text (via `_render_listing` `jrag_render.py:144`) drops both, while JSON +(`Envelope.to_json`) dumps the snippet verbatim. 
`Envelope.to_dict()` +(`jrag_envelope.py:104`) strips empty optionals only at the **top level**, so empty +fields inside node dicts (`symbol_id: null`, `role: null`) serialize in JSON. The +prior `--brief`/`--fields`/`--count`/`--exists` flags (removed in `45318ae`) were +registered but never read — the lesson: a detail flag must flow through ONE seam. + +**Key decision:** invert "trim at construction" → **carry full, trim at one seam**. +Handlers build the fullest node the backend gives; a single projector trims to the +requested level; both `to_json` (via projection) and the text renderers consume the +projected dict. This makes `--format json --detail brief` and `--format text --detail +brief` go through the **same field set**. + +**Decisions (locked with user):** +- Default `--detail normal` (directly fixes the "text too terse" complaint; `normal` + is still one line/row, no snippet, so token budget stays close). `--detail brief` + is the escape hatch reproducing today's exact text. +- `--fields` allowlist **deferred** (avoid scope creep + the dead-flag trap; can be + layered on the same projector later). + +## File-by-file changes + +### 1. `java_codebase_rag/jrag_envelope.py` — projection layer +- Three category key-sets (intersected with each node's present keys, so they are + kind-agnostic and auto-handle new kinds): + - `_BRIEF_NODE_KEYS` — identity only == the fields `display_name`/`tiered_name` + consult (`id, kind, fqn, name, microservice, path, method, topic, member_fqn, + target_service, broker, client_kind, producer_kind, import_simple, import_fqn`). + - `_NORMAL_NODE_KEYS = _BRIEF | {module, role, symbol_kind, framework, file, score}`. + - full = sentinel: keep every present key. 
+- `_compose_file(node)`: derive `"file" = "filename:start_line"` (or `"filename"`) + from the `SymbolHit`-carried `filename`+`start_line` so `normal` can show location + as one stable field; drops the raw `filename`/`start_line`/`end_line`/`start_byte`/ + `end_byte` from the output (they are not display fields). +- `_drop_empty(node)`: extend the "omit empty optionals" rule from `to_dict()` DOWN + into each node dict — drop `None`/`""`/`[]`/`{}` valued keys (fixes the JSON + "10 empty fields"). Applied at every detail level. +- `project_node(node, detail)`, `project_edge(edge, detail)`, `project_envelope(env, + detail) -> Envelope`: return trimmed **copies** (nodes, edges, candidates + projected; `status`/`root`/`warnings`/`truncated`/`file_location`/`message`/ + `agent_next_actions` passed through). `Envelope.to_dict()`/`to_json()` stay + **verbatim** (no `detail` param) — projection is a separate transform applied by + the renderer, preserving their meaning and existing tests. +- Edge key-sets: brief = `{other_id, edge_type, confidence, direction, section, + stage}` (what today's text reads); normal += `{mechanism, role, from_fqn}`; + full = all keys. + +### 2. `java_codebase_rag/jrag_render.py` — thread `detail` through +- `render(envelope, *, fmt="text", detail="normal", noun="", next_offset=None, + shape=None)`: apply `project_envelope(envelope, detail)` ONCE, then dispatch on + `fmt`. json path: `projected.to_json()`. text path: pass `detail` down. +- `_render_listing(envelope, *, noun, detail)`: identity line unchanged for `brief`; + `normal` appends inline ` module:M role:R kind:K framework:F file:L score:S` (only + the non-empty ones, fixed order); `full` appends an indented kv-block of all + remaining keys (reusing `_render_inspect_block` minus the identity keys). 
+- `_format_edge_line(edge, nodes, *, detail)`: brief = ` label conf=X.XX`; + normal += ` mechanism:M`; full appends an indented block of every edge attr + (`confidence`/`mechanism`/`annotation`/`field_or_param`/`from_fqn`/…). +- `_render_inspect` needs **no** `detail` param — projection already trimmed the + node; it renders whatever keys survived (few at `brief`, all at `full`). +- `_render_traversal`/`_render_ambiguous` pass `detail` through to `_format_edge_line` + / edge formatting. + +### 3. `java_codebase_rag/jrag.py` — flag, defaults, carry full +- Add `--detail {brief,normal,full}` (default `normal`) to the `common` parent parser + right after `--format` (`jrag.py:160-163`). +- `set_defaults(detail="full")` on the inspect-shape subparsers: `status`, `inspect`, + `microservices`, `map_cmd`, `conventions`, `overview` (their purpose IS detail; the + flag still overrides). Implemented by adding `detail="full"` to each existing + `.set_defaults(handler=...)` call. +- Thread `detail` through every render call: replace `fmt=args.format` → + `fmt=args.format, detail=args.detail` (45 call sites; all are `render(...)` calls — + verified). The top-level `main()` error render uses `getattr(args,"format",...)` + and is left on the `normal` default (error envelopes have no nodes → projection + no-op). +- `_symbol_hit_to_dict` (`jrag.py:96`): carry the **full** `SymbolHit` — add + `filename, start_line, end_line, signature, annotations, capabilities, modifiers, + package, parent_id, resolved` (so `normal` shows `file` and `full` is genuinely + rich for the ~15 commands that route through it). +- `_client_dict_to_node`/`_producer_dict_to_node`: left as-is. Their explicit field + sets are already reasonable, and `_drop_empty` cleans the empty ones (the actual + JSON complaint). 
A raw-extras merge was considered and deferred — it risks + subtly changing `display_name` precedence for edge-case (empty-path) clients, and + the gain (one or two extra fields at `full`) is marginal. + +## Tests for PR-JRAG-6 + +`tests/test_jrag_envelope.py` (projector): +1. `test_project_node_brief_keeps_identity_drops_extras` +2. `test_project_node_normal_adds_location_and_ranking` +3. `test_project_node_full_keeps_everything` +4. `test_project_node_drops_empty_fields_at_all_levels` (no `None`/`""`/`[]`/`{}`) +5. `test_compose_file_from_filename_and_start_line` +6. `test_project_envelope_passes_through_status_root_warnings_truncated` +7. `test_project_edge_brief_normal_full_attr_sets` + +`tests/test_jrag_render.py` (orthogonality + text levels): +8. `test_json_and_text_share_field_set_at_each_detail` (the core orthogonality + assertion — same projected keys behind both) +9. `test_listing_normal_appends_file_role_score_inline` +10. `test_listing_full_appends_indented_block` +11. `test_edge_line_normal_appends_mechanism` +12. `test_search_text_normal_shows_score_not_snippet` (regression for the complaint) +13. `test_search_json_normal_omits_snippet_drops_empty_fields` + +Updated existing tests: +- `test_render_json_emits_envelope_verbatim` → assert via `detail="full"` (full + + projection-invariant data == verbatim); note projection is the new json path. +- `test_render_inspect_edge_summary_alphabetical` → pass `detail="full"` (inspect + defaults to full in production; projection would otherwise trim `edge_summary`). +- `test_render_listing_*` / `test_render_traversal_*` → default is now `normal`, but + their nodes carry only identity fields, so output is unchanged; add a one-line + comment that they rely on identity-only data being projection-invariant. + +## Definition of done (PR-JRAG-6) + +- [ ] Projection layer in `jrag_envelope.py` (`project_node`/`project_edge`/ + `project_envelope`/`_drop_empty`/`_compose_file`); `to_dict`/`to_json` unchanged. 
+- [ ] `render()` applies projection once; json + text share the projected dict.
+- [ ] `--detail` flag (default `normal`); inspect/orientation `set_defaults(detail="full")`.
+- [ ] All 45 render calls thread `detail=args.detail`.
+- [ ] `_symbol_hit_to_dict` carries full `SymbolHit`; client/producer dict
+  builders unchanged (explicit fields + `_drop_empty` is sufficient).
+- [ ] `normal` text shows `file`/`score`/`role`/`module` inline; `full` shows blocks.
+- [ ] JSON drops empty node-internal fields at all levels.
+- [ ] Named projector + renderer tests green; updated existing render tests green.
+- [ ] PR-JRAG-4 token-budget assertion re-pinned under `normal` default (risk #20).
+- [ ] `skills/explore-codebase-cli/SKILL.md` + `agents/explorer-rag-cli.md` document
+  `--detail`; `scripts/sync_agent_artifacts.py` run; drift test green.
+- [ ] `.venv/bin/ruff check .` clean; full suite green.
+- [ ] PR title: `feat(cli): --detail orthogonality (brief|normal|full) for text+json (PR-JRAG-6)`.
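A minimal sketch of the node projection seam planned for `jrag_envelope.py`, written against the key-sets and helper names listed in this plan. The function bodies are assumptions for illustration, not the shipped implementation; in particular, applying `_compose_file` at every level (including `full`) is a guess at the intended behaviour.

```python
from __future__ import annotations

from typing import Any

# Identity-only keys, as enumerated in the plan for the `brief` level.
_BRIEF_NODE_KEYS = frozenset({
    "id", "kind", "fqn", "name", "microservice", "path", "method", "topic",
    "member_fqn", "target_service", "broker", "client_kind", "producer_kind",
    "import_simple", "import_fqn",
})
# `normal` adds location and ranking fields on top of identity.
_NORMAL_NODE_KEYS = _BRIEF_NODE_KEYS | {
    "module", "role", "symbol_kind", "framework", "file", "score",
}
# Raw location fields that are folded into the single `file` display field.
_RAW_LOCATION_KEYS = ("filename", "start_line", "end_line", "start_byte", "end_byte")


def _is_empty(value: Any) -> bool:
    # None, "", [] and {} count as empty; 0, 0.0 and False do not.
    return value is None or (isinstance(value, (str, list, dict)) and len(value) == 0)


def _drop_empty(node: dict[str, Any]) -> dict[str, Any]:
    """Return a copy without empty-valued keys."""
    return {k: v for k, v in node.items() if not _is_empty(v)}


def _compose_file(node: dict[str, Any]) -> dict[str, Any]:
    """Return a copy with `file` = "filename:start_line" (or just "filename")."""
    out = {k: v for k, v in node.items() if k not in _RAW_LOCATION_KEYS}
    filename = node.get("filename")
    if filename:
        start_line = node.get("start_line")
        out["file"] = f"{filename}:{start_line}" if start_line is not None else filename
    return out


def project_node(node: dict[str, Any], detail: str = "normal") -> dict[str, Any]:
    """Trim a node dict to the requested detail level; the input is not mutated."""
    if detail not in ("brief", "normal", "full"):
        raise ValueError(f"unknown detail level: {detail!r}")
    projected = _drop_empty(_compose_file(node))
    if detail == "full":
        return projected  # sentinel: keep every present key
    keep = _BRIEF_NODE_KEYS if detail == "brief" else _NORMAL_NODE_KEYS
    # Intersect with the keys actually present, so the sets stay kind-agnostic.
    return {k: v for k, v in projected.items() if k in keep}
```

Because the projector returns a copy and never mutates its input, `render()` can apply it once and hand the same dict to both the json and text paths.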
+
+## Implementation step list
+
+| # | Step | File(s) | Done when |
+| --- | --- | --- | --- |
+| 1 | Projection layer (`project_node`/`edge`/`envelope`, `_drop_empty`, `_compose_file`, key-sets) | `jrag_envelope.py` | tests 1–7 pass |
+| 2 | `render()` applies projection; thread `detail` into listing/edge renderers | `jrag_render.py` | tests 8–13 pass; existing render tests green |
+| 3 | `--detail` flag + `set_defaults(detail="full")` on inspect-shape subparsers | `jrag.py` | `--help` shows flag; inspect defaults full |
+| 4 | Thread `detail=args.detail` through 45 render calls (`replace_all`) | `jrag.py` | grep shows 0 un-threaded calls |
+| 5 | Enrich `_symbol_hit_to_dict` to carry full `SymbolHit` (client/producer unchanged) | `jrag.py` | `full` shows signature/file/annotations |
+| 6 | Re-pin token-budget assertion under `normal`; update skill/agent docs + sync | tests, `skills/`, `agents/` | token test green; drift test green |
+| 7 | ruff + full suite | repo | clean + green |
+
+---
+
+# Cross-PR risks and mitigations
+
+| # | Risk | Severity | Mitigation |
+| --- | --- | --- | --- |
+| 1 | PR-JRAG-0b extraction orphans `NodeRef` / breaks `mcp_v2` models | High | `NodeRef` stays in `mcp_v2.py`; only resolve models move; re-export; `test_mcp_v2.py`+`test_mcp_hints.py` gate. |
+| 2 | Envelope/resolve-first contract churns after PR-1a | High | PR-1a is dedicated to the frozen contract; later PRs start only after it lands; signature frozen by tests. |
+| 3 | `--offset` silently dropped/TypeErrors on traversal/listing | High | `--offset` registered ONLY on `find`/`search`; other commands reject it (test 22 / 17 / 12) and emit "narrow your query". |
+| 4 | `jrag search` loads the wrong embedding model | High | Reuse `resolve_operator_config` + `apply_to_os_environ()` (sets `SBERT_MODEL`); test on a YAML-overridden-model fixture. |
+| 5 | lancedb EMFILE flakiness | Medium | `raise_fd_limit()` is the first line of `main()`. |
+| 6 | Pydantic objects leak into the dict envelope | Medium | `.model_dump()` at the boundary (one place); renderer + `to_json` are dict-only (test 2). |
+| 7 | Missing/stale index → opaque error | Medium | `LadybugGraph.exists()` pre-check + ontology-mismatch hint (test 21). |
+| 8 | `overrides`/`overridden-by` go the wrong way | High | Both via `neighbors_v2` on the stored `OVERRIDES` edge (out=UP, in=DOWN); tests 9,10. |
+| 9 | `find_route_callers` `--service` silently ignored + no truncation | Medium | Client-side post-filter + warning + slice (test 2). |
+| 10 | `callees` Producer target mis-typed | Low | Documented as `:Route` (`kafka_topic`); test 2 in PR-3b. |
+| 11 | `flow` intra-service claimed as query-enforced | Low | Framed as index-time data property; test validates fixture (PR-3a test 15). |
+| 12 | `agent_next_actions` suggests zero-result / wrong commands | Medium | Zero-direction suppression (PR-4 test 14); dot-keys covered; ≤5; fallback to result_edges. |
+| 13 | PR-4↔PR-3 wiring leaves traversal commands without `next_actions` | Medium | PR-4 hard-depends on PR-3b; commands leave a `next_actions_hook` from PR-1b. |
+| 14 | `jrag --help` slow (torch/sentence_transformers loaded) | Medium | `build_parser()` imports no backend modules; PR-4 `sys.modules` sentinel (test 18). |
+| 15 | Snapshot flake on inspect rendering | Low | Inspect renderer sorts all dict keys alphabetically. |
+| 16 | CLI-only install strands `update` | High | Marker file + `ConfiguredHost` 3-field return + `run_update` unpacking (PR-5 tests 3–6). |
+| 17 | PR-5 breaks `test_agent_skills_static.py` / direct-call installer tests | High | Update `EXPECTED_SKILL_DIRS`; `surface="mcp"` kw default (PR-5 tests 8,10). |
+| 18 | Dual-copy artifacts drift when CLI skill/subagent land | Medium | PR-0a first (single-source + drift test); PR-5 depends on it. |
+| 19 | Enum kinds (`client_kind`/`producer_kind`/`source_layer`) reject | Medium | Lookup tables (not case conversion); tests 8,9,11. |
+| 20 | Token budget regresses as fields accrete | Low | PR-4 token-budget assertion on the fixture. PR-JRAG-6 re-pins it under the new `normal` default; `brief` is the escape hatch if `normal` blows the ceiling. |
+| 21 | PR-JRAG-6 default `normal` churns every command's text output / renderer tests | Medium | `normal` adds inline fields only when present; existing render-test nodes carry identity-only data (projection-invariant), so most pass unchanged. Inspect-shape commands floor at `full` via `set_defaults`. Token-budget + render tests explicitly re-pinned (PR-JRAG-6 step 6). |
+| 22 | PR-JRAG-6 reintroduces a dead flag (`--detail` registered, unread) | Medium | `detail` flows through ONE projection seam (`render` → `project_envelope`), not 30 handlers; the 45 render calls are threaded via a single `replace_all`; projector unit tests assert each level changes output. |
+
+# Out of scope
+
+- **Daemon**; negative/absence filters; `diff-impact`/`changed`; `todos`/`unreferenced`;
+  `drift`; batch input; `--role` multi-value; `total_count`/"M of N" pagination
+  (only +1-fetch); a dedicated `LadybugGraph.client_calls_route`/`producer_calls_topic`
+  method (v1 composes `neighbors_v2`); standalone `jrag resolve`; `jrag source`;
+  moving operator lifecycle commands into `jrag`; ontology bump/re-index; the
+  actual PyPI publish (PR-JRAG-4 bumps version only).
+- **`--fuzzy` on `find`** (faithful name-prefix/name-contains fallback). The backend
+  `find_by_name_or_fqn` is Symbol-only and exact-only
+  (`MATCH (s:Symbol) WHERE s.name=$needle OR s.fqn=$needle`); `NodeFilter` only has
+  `fqn_prefix` (FQN STARTS WITH), with no name-prefix/contains anywhere. Implementing
+  the brief's exact→prefix→contains fallback would require backend changes that are
+  out of scope for the thin-CLI PRs.
The `--fuzzy` flag was removed from the `find` + subparser; tracked as a GitHub follow-up issue for the real implementation. + +# Whole-plan done definition + +1. `pip install java-codebase-rag` provides `jrag`; `--help` lists orientation / + locate / listings / traversal / inspection / search / health groups. +2. Every ``-accepting command honors resolve-first (`one`→run, `many`→ + candidates+stop, `none`→`not_found`); raw IDs never required. +3. Every command emits the canonical envelope (`--format json`) + token-lean text + by default; `truncated` via +1-fetch (or "narrow" for non-offset commands); + `agent_next_actions` ≤5. `--detail brief|normal|full` (default `normal`) is + orthogonal to `--format` — both modes honor the same projected field set. +4. `--offset` works only on `find`/`search`; all other commands reject it. +5. `jrag search` loads the YAML-configured embedding model (via `apply_to_os_environ`). +6. `java-codebase-rag install --surface cli` deploys the CLI skill + subagent and + `update` refreshes them (no fatal exit); `--surface mcp` reproduces today. +7. No ontology bump, no re-index, no cocoindex dependency in the CLI; full suite green. +8. Propose → `propose/completed/`; plan → `plans/completed/` once all PRs land. + +# Tracking + +- `PR-JRAG-0a`: _pending_ +- `PR-JRAG-0b`: _pending_ +- `PR-JRAG-1a`: _pending_ (blocked by PR-JRAG-0b) +- `PR-JRAG-1b`: _pending_ (blocked by PR-JRAG-1a) +- `PR-JRAG-2`: _pending_ (blocked by PR-JRAG-1a) +- `PR-JRAG-3a`: _pending_ (blocked by PR-JRAG-1a) +- `PR-JRAG-3b`: _pending_ (blocked by PR-JRAG-3a) +- `PR-JRAG-4`: _pending_ (blocked by PR-JRAG-1a, PR-JRAG-3b) +- `PR-JRAG-5`: _pending_ (blocked by PR-JRAG-0a; soft-depends on PR-JRAG-4) +- `PR-JRAG-6`: _pending_ (blocked by PR-JRAG-1a; touches envelope/render/jrag.py) + +# Notes + +- **Proposal relocation:** `propose/JRAG-CLI-PROPOSE.md` sits at `propose/` root; + `AGENTS.md` says in-flight proposes live in `propose/active/`. 
Relocate as part + of opening PR-JRAG-0a (or when the propose merges). +- **Companion `AGENT-PROMPTS-JRAG-CLI.md`:** not yet written; generate on request + (one prompt per PR, modeled on `plans/completed/AGENT-PROMPTS-INIT-INCREMENT-PERF.md`). diff --git a/propose/JRAG-CLI-PROPOSE.md b/propose/JRAG-CLI-PROPOSE.md new file mode 100644 index 00000000..c19d3e4b --- /dev/null +++ b/propose/JRAG-CLI-PROPOSE.md @@ -0,0 +1,647 @@ +# JRAG CLI — Agent-Facing Command-Line Interface + +**Status**: Proposal — not yet implemented. +**Author**: Dmitry + Computer +**Date**: 2026-07-03 + +--- + +## Revision summary (what changed in this pass) + +This revision corrects the proposal against the actual codebase (`ladybug_queries.py`, `mcp_v2.py`, `java_ontology.py`, `server.py`, `java_codebase_rag/cli.py`, `config.py`, `pyproject.toml`). Headline changes: + +- **Daemon deferred.** v1 is **in-process** (loads the index per call, like the MCP today). The unix-socket daemon is a post-v1 milestone, only if measured cold-start latency justifies it. PRs reordered so user-facing commands land first. +- **Naming fixes** (the `--help` is the agent's only documentation — names must be guessable): + - `injectors` → **`dependents`** (keep `dependencies`); symmetric actor/target pair matching `callers`/`callees`. + - `target` → **folded into `callees`** (a client *calls* a route, a producer *calls* a topic; `callees` already means "what this calls"). `callees` now dispatches by kind, mirroring `callers`. Standalone alternative `destination` is recorded as an open question. + - `trace` → **`decompose`** (resolves the `trace`/`flow` collision — "trace" conventionally means an end-to-end distributed trace, which is `flow`'s job; `trace` actually returns a static role-waterfall decomposition). *(Confirmed.)* +- **`find` stays flat, not split.** Real fix is kind-inference + hard-error on flag/kind contradiction + `--help` grouped by kind. 
`--target-service` → **`--calls-service`** (kills a one-hyphen collision with global `--service`). +- **Factual corrections:** the operator CLI is `java-codebase-rag` (there is no `user-rag`); MCP has **5** tools incl. `resolve` (the CLI's resolve-first contract *is* `resolve` internalized); `--index-dir` defaults to `/.java-codebase-rag` (`JAVA_CODEBASE_RAG_INDEX_DIR`), not an invented `~/.jrag`; `diff-impact`/`changed` are not "no backend" — the operator `analyze-pr` already does diff-based blast radius; `flow` outbound is intra-service only (cross-service is inbound); enum storage is `UPPER_SNAKE` (CLI normalizes); `OVERRIDDEN_BY` is a virtual key, not a stored edge; `NodeFilter` has no `kind` field. +- **Added §14 — token efficiency.** Output size is a first-class constraint for an agent-facing CLI (every byte enters the context window): per-command default field projection, `--brief`/`--fields`/`--count`, fan-out-scaled limits, de-dup, lean envelope, and a token-budget test assertion. +- **Default output format flipped to text** (JSON opt-in). JSON-default conflicted with the token-efficiency principle; text-default matches `gh`/`kubectl` and minimizes context-window cost. The envelope remains the canonical schema (Appendix A). +- **Text-rendering hardened via 3-agent adversarial review** (grounded in the backend). §6 now specifies: tiered endpoint disambiguation (simple name → `name @service` → FQN), direction conventions (`hierarchy` tree `↑`/`↓`, `connection` `inbound:`/`outbound:`, multi-hop `d=N`), root pinning, CALLS-family-only confidence, zero-vs-`not_found` distinction, candidate `reason` rendering, ASCII delimiters. Appendix A corrected: dropped phantom envelope-level `confidence`; `truncated` specified via +1-fetch (PR-JRAG-1); candidates annotated (cap-at-10, no file/score); edge `confidence` noted CALLS-family-only from `attrs`. +- **Install / skill / subagent integration + refactor verdicts** (grounded in the installer and the service layer). 
Added **PR-JRAG-5** (agent host integration: `Surface` dimension on `HostConfig`, a global MCP-vs-CLI wizard step, a CLI-flavored skill+subagent, an `ArtifactManifest`, a marker-file `detect_configured_hosts` fix, surface-conditional `resolve_mcp_command`) and **two prep refactors** (PR-JRAG-0a single-source shipped artifacts; PR-JRAG-0b `resolve_v2`→`resolve_service.py` extraction to kill the one duplication trap). Corrected the transport-edge inaccuracy: `neighbors_v2`'s generic flat-label path already reaches `:Route`/topic — `callees ` composes `resolve_v2`+`neighbors`, **no new query for v1**. Confirmed there is no smeared-logic problem (`mcp_v2.py` imports zero MCP SDK); the CLI builds on the existing `LadybugGraph`-direct precedent (`pr_analysis.py`). + +--- + +## TL;DR + +- The MCP gives agents a graph-navigation primitive (`search`, `find`, `describe`, `neighbors`, **`resolve`** — five tools). It is the right shape for a reasoning loop. The CLI is a *different product* for a different caller: an AI coding agent that speaks in names, not IDs, and needs one command per intent. +- The CLI is **not** a wrapper around the MCP. It is a named-intent surface built on the same `LadybugGraph` backend (`ladybug_queries.py`), designed so every common agent task is achievable in one call, without a prior resolve step. It **internalizes the MCP `resolve` tool** (its `one`/`many`/`none` contract becomes the CLI's resolve-first step). +- **`neighbors` is removed entirely.** Every edge traversal gets a named command (`callers`, `callees`, `hierarchy`, `dependents`, `dependencies`, `decompose`, …). No agent should ever reason about edge labels or directions. +- **Resolve-first contract.** Every command that accepts a `` runs an internal locate step first. The agent passes a name, FQN, route path, or topic name. If exactly one node matches → the command runs. If ambiguous → candidates are returned and the command stops. No raw node IDs are required or accepted. 
+- **Same repo, new PyPI entry point `jrag`** — separate from the existing `java-codebase-rag` operator CLI. **v1 is in-process** (no daemon); the index is loaded per call, reusing the operator's index directory. +- **v1 scope**: orientation, locate, direct listings, graph traversal, file inspection, search, and a lightweight `status`. `diff-impact`, `changed`, `unreferenced`, `todos`, and the **daemon** are explicitly deferred. +- **4 command PRs + PR-JRAG-5 (agent host integration) + 2 prep refactors**: locate, listing, traversal, orientation+search+packaging, then install branching / skill / subagent. Daemon is a separate post-v1 milestone. + +--- + +## §1 — Frame: what is the CLI, really? + +The MCP's job is to expose the raw graph shape to an LLM reasoning loop (`search` / `find` / `describe` / `neighbors` / `resolve`). The CLI's job is different: **give an AI coding agent one command per engineering intent, using the vocabulary the agent already has from reading code.** + +An agent reading a stack trace knows `com.acme.orders.OrderController`. It does not know `sym_a3f7b9`. Making the agent call `find` first to get an ID, then pass that ID to a traversal command, is the MCP's two-step pattern translated badly into a CLI. It is wrong for this surface. + +The frame is: **the jrag CLI is an intent-named command surface where every positional argument is a human-readable identifier, and every command name is an engineering question.** + +This frame rules out: +- Raw node IDs as required inputs (the resolve step is always internal — it *is* `resolve_v2`, the existing resolve pipeline, called directly as a transport-agnostic function). +- A `neighbors` command (it encodes graph topology, not engineering intent). +- Commands that are purely operational (`init`, `increment`, `reprocess`, `meta`, `analyze-pr`, `diagnose-ignore`) — those remain in the **`java-codebase-rag`** operator CLI. 
+- A standalone `resolve` command (resolution is not an agent intent; it is infrastructure, already implicit in every command). + +--- + +## §2 — Design principles + +1. **One command per engineering intent.** An agent should never need two commands to answer "who calls `OrderService.save`?". +2. **Names in, names out.** Every command accepts the identifier the agent already has from code context (FQN, simple name, `GET /path`, topic name). No raw IDs. +3. **Resolve-first, fail loud on ambiguity.** If a query matches multiple nodes, return candidates and stop — never silently pick one. Agents must narrow, not guess. (This is the MCP `resolve` contract, surfaced as a CLI guarantee.) +4. **Disambiguation flags on every traversal command.** `--kind`, `--java-kind`, `--role`, `--fqn-prefix` narrow the resolve step on any command that accepts ``. +5. **Global flags for scope, not per-command invention.** `--service`, `--module`, `--limit`, `--offset` apply uniformly. Per-command scope flags (e.g. `--consumer-in` on `topics`) are only added when the command has two orthogonal scope axes. +6. **Named commands map to named backend functions where one exists — and say so when they don't.** `jrag callers` → `find_callers`/`find_route_callers`. `jrag decompose` → `trace_flow`. `jrag flow` → `trace_request_flow`. `jrag impact` → `impact_analysis`. `jrag callees` for a client/producer composes `resolve_v2` + `neighbors_v2` (the generic flat-label path already reaches `:Route`/topic — **no new query for v1**); `jrag dependencies` (INJECTS-out) composes `neighbors(direction="out", edge_types=["INJECTS"])`. A dedicated `LadybugGraph` method for the transport edge is post-v1 polish. Flagged honestly in §9, not hidden behind "thin extraction." +7. **Compact text output by default; JSON envelope is the canonical schema.** Every command's data follows one envelope model (Appendix A). 
The default rendering is compact text — token-efficient for agent context windows; `--format json` emits the envelope verbatim for structured or pipeline use. `agent_next_actions` (capped at 5) replaces the MCP `StructuredHint` surface. +8. **`edge_summary` always present in `jrag inspect`.** This is the documented pivot from "what is this node" to "which traversal command to call next". Losing it would break the locate → inspect → walk workflow. (Verified: `describe_v2` produces `edge_summary` for **all four** kinds — symbol, route, client, producer — so this is safe.) +9. **`--help` is the spec.** Because agents discover the surface by reading help, command/flag names must be guessable and grouped, and inapplicable flag combinations must error loudly, not silently misbehave. + +--- + +## §3 — Global flags + +Every command accepts these flags. They map directly to `NodeFilter` fields and index resolution. + +``` +--service # NodeFilter.microservice +--module # NodeFilter.module (maven module; first-class, distinct from --service) +--limit N # default: 20 +--offset N # default: 0 +--index-dir # default: /.java-codebase-rag (see below) +--format text|json # default: text (token-efficient); json emits the canonical envelope +--brief # name+fqn+one discriminator only (see §14) +--fields # opt-in field projection (see §14) +--count | --exists # scalar result instead of node records (see §14) +``` + +**Index resolution reuses the operator's index, it does not invent a new one.** `--index-dir` defaults through the **same** resolver the operator CLI uses (`config.py:_resolve_index_dir_path`): explicit `--index-dir` → `JAVA_CODEBASE_RAG_INDEX_DIR` env → `index_dir` in `.java-codebase-rag.yml` → `/.java-codebase-rag`. Project discovery walks up from cwd to find `.java-codebase-rag.yml` / the index dir. There is **no** `~/.jrag` directory and no `JRAG_*` env var — the CLI reads exactly the index that `java-codebase-rag init` built. + +`--module` is not a post-filter. 
It maps to the stored `module` attribute on every node kind, exactly as it does in `NodeFilter` in the MCP. Agents can scope to a maven module independently of microservice boundaries. + +**Where `--service` is a real backend filter vs. a post-filter varies by command** (see §10): commands whose backend takes a `microservice` param (`callers`, `callees`, `implementations`, `subclasses`, `dependents`, …) push it down; `impact` does not take one, so `--service` is a client-side post-filter there (with a warning). + +--- + +## §4 — Resolve contract + +Applies to every command that accepts `` (all traversal, inspect, and orientation commands). The CLI runs `resolve_v2` internally and maps its `one`/`many`/`none` statuses onto the envelope: + +``` +resolve_v2 status envelope status behavior +------------------ ------------------ --------------------------------------------- +"one" ok proceed with the single resolved node +"many" ambiguous stop, return candidates[], hint to narrow +"none" not_found stop, message: "No node matches ''. + Try: jrag search " +``` + +`` accepts any identifier form `resolve_v2` understands: simple name, FQN, `GET /orders/{id}` (method+path), topic name. The full disambiguation flag set is available on every ``-accepting command: + +``` +--kind symbol|route|client|producer # node table discriminator (hint_kind in resolve_v2) +--java-kind class|interface|method|enum|record|annotation|constructor +--role controller|service|repository|entity|config|mapper|dto|component|client|other +--fqn-prefix com.acme.orders +``` + +These are **disambiguation inputs**, not traversal-result filters — they narrow the resolve step only. (Filtering traversal *results* by role is a separate, deferred concern; see §8.) + +**Enum casing.** Roles and capabilities are stored `UPPER_SNAKE` (`CONTROLLER`, `SCHEDULED_TASK`). 
The CLI accepts **either** form on input and normalizes (case-insensitive + kebab↔underscore), so `--role controller`, `--role Controller`, and `--role CONTROLLER` are equivalent. This normalization is net-new code in the CLI; no such layer exists today. + +--- + +## §5 — Command surface + +### Orientation + +``` +jrag microservices + # list all indexed microservices with node counts per kind + +jrag map [--service svc] [--module mod] + # structural density overview: node counts per kind per service/module + +jrag conventions [--service svc] + # auto-detected architectural patterns from the graph (dominant roles, framework) + +jrag overview [--as microservice|route|topic] + # orientation bundle depending on target type: + # microservice → connection summary + controller/endpoint count + Feign client list + entity list + scheduled-job list + # route → flow from entry + all downstream callers/producers (intra-service) + # topic → producers list + consumers list + # --as escapes polymorphic inference when a name could match >1 type +``` + +### Locate + +`find` accepts a positional query OR pure flags. It is the **cross-kind structured-filter escape hatch** — the `gh search` to the listing tier's `gh issue list`. For "list all of one kind" prefer the listing commands below over `find --kind `. + +**Kind inference + hard-error (the real fix for "too many flags"):** ~13 of `find`'s flags apply to exactly one kind. When `--kind` is omitted, the CLI infers it from the domain flags passed (`--http-method`⇒route, `--client-kind`⇒client, `--producer-kind`⇒producer, `--role`/`--java-kind`/`--capability`⇒symbol). A domain flag that **contradicts** an explicit `--kind` is a hard error (`status: error`, naming the pair). Inapplicable flags are never silently ignored — silent-ignore on a green `status: ok` is the worst failure mode for an agent. 
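The inference rule can be sketched as a lookup from each domain flag to the single kind it applies to. This is a hedged illustration only: `infer_kind`, `FlagKindError`, and the `_FLAG_KIND` table are hypothetical names, flags are assumed to arrive as an argparse-style dict with `None` for unset values, and treating two domain flags that imply different kinds (with no `--kind`) as an error is an assumption the proposal does not spell out.

```python
from __future__ import annotations

from typing import Any

# Domain flag (argparse dest name) -> the one kind it applies to.
# `framework` is deliberately absent: it is valid for both symbol and route.
_FLAG_KIND = {
    "http_method": "route",
    "path_prefix": "route",
    "client_kind": "client",
    "calls_service": "client",
    "calls_path_prefix": "client",
    "producer_kind": "producer",
    "topic_prefix": "producer",
    "role": "symbol",
    "java_kind": "symbol",
    "capability": "symbol",
}


class FlagKindError(ValueError):
    """A domain flag contradicts the kind it is being used with."""


def _cli_name(dest: str) -> str:
    return "--" + dest.replace("_", "-")


def infer_kind(explicit_kind: str | None, flags: dict[str, Any]) -> str | None:
    """Return the kind to query, or None when nothing narrows it."""
    implied: dict[str, list[str]] = {}
    for dest, value in flags.items():
        kind = _FLAG_KIND.get(dest)
        if kind is not None and value is not None:
            implied.setdefault(kind, []).append(dest)

    if explicit_kind is not None:
        # Hard error: never silently ignore an inapplicable flag.
        for kind, dests in implied.items():
            if kind != explicit_kind:
                raise FlagKindError(
                    f"{_cli_name(dests[0])} applies to kind={kind}, "
                    f"which conflicts with --kind {explicit_kind}"
                )
        return explicit_kind

    if len(implied) > 1:
        # Assumption: mixed domain flags without --kind are also rejected.
        pairs = ", ".join(f"{_cli_name(d[0])} (kind={k})" for k, d in implied.items())
        raise FlagKindError(f"flags imply different kinds: {pairs}")
    return next(iter(implied), None)
```

A handler would call this before building the filter and map `FlagKindError` onto an error envelope whose message names the conflicting pair.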
+ +``` +jrag find [] + +GLOBAL SCOPE: + --service scope to microservice + --module scope to maven module + --limit N --offset N + +APPLIES TO ALL KINDS: + --kind symbol|route|client|producer + --fqn-prefix + --fuzzy exact → prefix → contains on the identifier string + (NOT semantic similarity; use jrag search for that) + --annotation + +SYMBOL ONLY (kind=symbol): + --java-kind class|interface|method|enum|record|annotation|constructor + --role controller|service|repository|entity|config|mapper|dto|component|client|other + --exclude-role [,] + --capability scheduled-task|message-listener|http-client|message-producer|exception-handler + --framework spring-mvc|webflux + --source-layer # see legend: builtin / layer-a / layer-b-ann / layer-b-fqn / layer-c + +ROUTE ONLY (kind=route): + --http-method GET|POST|PUT|DELETE|PATCH + --path-prefix /api/ + --framework spring-mvc|webflux + +CLIENT ONLY (kind=client): + --client-kind feign|rest-template|web-client + --calls-service # service this client calls (≠ global --service) + --calls-path-prefix /items/ # path prefix this client calls + +PRODUCER ONLY (kind=producer): + --producer-kind kafka|stream-bridge + --topic-prefix order. + +KIND INFERENCE (when --kind omitted): + --http-method / --path-prefix ⇒ route + --client-kind / --calls-service / --calls-path-prefix ⇒ client + --producer-kind / --topic-prefix ⇒ producer + --role / --java-kind / --capability ⇒ symbol + A domain flag conflicting with explicit --kind is an error. +``` + +Notes: +- `--role` is single-valued today (matches `NodeFilter.role: Role | None`). Multi-value is a backend change, deferred. +- `--fuzzy`'s prefix stage overlaps `--fqn-prefix`; if both are set, `--fqn-prefix` wins for the prefix stage (documented, not undefined). +- `--source-layer` values are opaque provenance codes; `--help` carries a one-line legend (they encode which inference layer produced a node: built-in / annotation-driven / FQN-driven / convention). 
+- `--capability http-client` (a symbol-side annotation view) overlaps `--kind client` (the dedicated client-node view); help states the distinction explicitly. + +### Direct listings + +All nodes of a kind, no query. All accept global `--service`, `--module`, `--limit`, `--offset`. + +``` +jrag routes [--http-method GET|POST|PUT|DELETE|PATCH] [--path-prefix /api/] [--framework spring-mvc|webflux] +jrag clients [--calls-service ] [--client-kind feign|rest-template|web-client] +jrag producers [--topic-prefix order.] +jrag topics [--producer-in ] [--consumer-in ] +jrag jobs +jrag listeners [--topic-prefix order.] +jrag entities +``` + +### Graph traversal + +All traversal commands share the resolve contract from §4. Disambiguation flags (`--kind`, `--java-kind`, `--role`, `--fqn-prefix`) are available on every command. + +``` +jrag decompose [--fqn-prefix ...] + [--depth 2] [--follow-calls] [--max-stage N] + # service-internal call-chain DECOMPOSITION by role layers + # (Controller → Service/Component → Client/Repository/Mapper), seeded from entrypoint roles. + # Static structural waterfall, NOT a runtime/distributed trace. + # backend: trace_flow(); --depth is per-stage hop count (clamped 1..3) + +jrag flow [--fqn-prefix ...] + [--max-hops 5] + # request reachability for a Route node: + # inbound → cross-service callers (Feign/RestTemplate clients + Kafka/StreamBridge producers) + # outbound → INTRA-service method CALLS hops only (does NOT descend into downstream services) + # backend: trace_request_flow(); --max-hops clamped 1..8 + +jrag impact [--kind ...] [--java-kind ...] [--role ...] [--fqn-prefix ...] + [--depth 2] + # transitive reverse reachability: what breaks if this node changes + # (INJECTS | IMPLEMENTS | EXTENDS, depth-bounded). Distinct from `callers` (direct, CALLS-edge). + # backend: impact_analysis() — takes NO microservice param; --service is a client-side post-filter + # and emits a warning when it excludes cross-service nodes. 
+ +jrag callers [--kind symbol|route] [--fqn-prefix ...] + [--depth N] [--min-confidence 0.8] + # dispatches by resolved kind: + # Symbol → find_callers() (CALLS-in, intra-service) + # Route → find_route_callers() (HTTP_CALLS + ASYNC_CALLS in) + +jrag callees [--fqn-prefix ...] + [--min-confidence 0.8] [--include-external] [--depth N] + # dispatches by resolved kind: + # Symbol → find_callees() (CALLS-out; direct method callees) + # Client → HTTP_CALLS-out → Route (the endpoint this Feign/RestTemplate/WebClient calls) + # Producer→ ASYNC_CALLS-out → topic (the topic this KafkaTemplate/StreamBridge publishes to) + # NOTE: --exclude-role is NOT supported (find_callees has only exclude_external: JDK/Spring/Lombok + # FQN filtering). Client/Producer cases compose `resolve_v2` (name→id) + `neighbors_v2(id, "out", ["HTTP_CALLS"|"ASYNC_CALLS"])` (the generic flat-label path already reaches the `:Route`/topic) — + # a dedicated `LadybugGraph.client_calls_route`/`producer_calls_topic` method is post-v1 polish (§9). + +jrag hierarchy [--kind ...] [--java-kind ...] [--fqn-prefix ...] + [--depth N] + # full inheritance tree, both directions: EXTENDS + IMPLEMENTS in and out + +jrag implementations [--fqn-prefix ...] [--capability ...] + # interface → all implementing classes (IMPLEMENTS-in) + # backend: find_implementors() + +jrag subclasses [--fqn-prefix ...] + # class → all subclasses (EXTENDS-in) + # backend: find_subclasses() + +jrag overrides [--fqn-prefix ...] + # method → what it overrides (dispatch UP to superclass declaration) + # backend: override_axis_traversal_for() + +jrag overridden-by [--fqn-prefix ...] + # method → what overrides it (dispatch DOWN to concrete implementations) + # backend: override_axis_rollup_for() + +jrag dependents [--fqn-prefix ...] + # bean type → who injects/depends-on it (INJECTS-in) + # backend: find_injectors() + # (renamed from `injectors` for symmetry with `dependencies`) + +jrag dependencies [--fqn-prefix ...] 
+ # bean/component → what it injects/depends-on (INJECTS-out) + # backend: NONE today — composed via neighbors(direction="out", edge_types=["INJECTS"]). + # Note: returns less edge detail than `dependents` (find_injectors gives mechanism/annotation/field). + +jrag connection [--inbound] [--outbound] [--both] + [--http-method ...] [--calls-service ...] + # cross-service connectivity map: who calls this service / who this service calls. + # First positional is a microservice NAME, not a resolve-first (the one exception — + # documented loudly in --help). `--calls-service` replaces the old `--target-service`. +``` + +### File inspection + +``` +jrag outline + # class/method structure of a source file + +jrag imports + # import statements in a file, with resolved graph node references where available + # (distinct from `dependencies`, which is DI/INJECTS — `imports` is source-level) +``` + +> **Scope note:** there is intentionally **no `jrag source `** to read a method body. The workflow is `inspect` → read `file_location` → the agent reads the file with its own file tools. `outline`/`imports` exist because they return graph-resolved structure (imports linked to nodes), which a raw file read does not. + +### Inspection & search + +``` +jrag inspect [--kind ...] [--java-kind ...] [--role ...] [--fqn-prefix ...] + # full node record + edge_summary (all incident labels, in/out counts, incl. composed keys) + # same resolve contract as traversal commands. edge_summary is required. + +jrag search + --table java|sql|yaml|all + --hybrid + --path-contains + --role ... --exclude-role ... + --annotation ... + --capability ... + --fqn-prefix ... + --java-kind ... + # semantic/vector similarity search — use when find returns nothing. + # Does NOT accept --fuzzy (already semantic by design). +``` + +### Daemon — deferred + +A persistent unix-socket daemon (`jrag daemon start|stop|status|list`, transparent auto-start) is **out of scope for v1**. 
v1 loads the index in-process per call, exactly as the MCP server does today. The daemon is revisited as a post-v1 milestone only if cold-start latency on large estates measurably justifies it (see §9, §11). + +### Health + +``` +jrag status + # in-process index health: ontology version, loaded index count, index freshness, source root. + # (No daemon in v1; a read-only check that the index the operator built is current.) +``` + +--- + +## §6 — Output model & format + +Every command's result follows **one envelope data model** — the canonical schema of record (Appendix A). The **default output format is compact text**: the envelope rendered for a reader, token-efficient for agent context windows. `--format json` emits the envelope verbatim for structured or pipeline consumption. + +### Envelope (canonical schema) + +```json +{ + "status": "ok | ambiguous | not_found | error", + "nodes": { "": { ...all node fields... } }, + "edges": [ + { "from": "", "to": "", "label": "CALLS", "confidence": 0.9 } + ], + "root": "", + "agent_next_actions": [ + "jrag callers OrderService.save", + "jrag inspect OrderService" + ], + "warnings": [], + "truncated": false, + "file_location": "OrderController.java:42" +} +``` + +- `status` ∈ {`ok`, `ambiguous`, `not_found`, `error`}. **`truncated` is a boolean only**, not a status value — a capped result set is always `status: ok` with `truncated: true`; that is the agent's signal to use `--offset`. +- `agent_next_actions` is capped at 5. It **replaces** the MCP `StructuredHint{tool,args,actionable,reason}` surface. Trade-off: plain command strings are directly runnable by the agent (no args to re-assemble); the structured tool+args form is dropped. This requires a new **edge-label → CLI-command** mapper (the existing hint engine maps edges to MCP tools, not CLI commands). +- Listing commands (`routes`, `clients`, etc.) omit `root` and `edges`. +- `edge_summary` appears only in `jrag inspect` output, nested under the node record. 
It covers all incident edge labels and in/out counts, including composed keys (`DECLARES.EXPOSES`, `OVERRIDDEN_BY.DECLARES_CLIENT`, etc.). + +### Text rendering (default) + +The default is compact **text**; `--format json` emits the envelope verbatim. **Parsing model:** the agent reads raw bytes with no reflow, so identifiers are **never** truncated or wrapped (a truncated FQN is a footgun — the agent re-emits it and resolve fails); columns are single-space aligned, one record per line. + +**Output shapes:** +- **Listing** (`routes`, `clients`, …) → table, header once, one row per result. **FQN omitted** (the next command re-resolves on a name + `--service`) — the single biggest token lever. +- **Single-anchor traversal** (`callers`, `callees`, `dependents`, `dependencies`, `implementations`, `subclasses`, `overrides`, `overridden-by`) → a `root:` line, then one row per result where each row *is* the edge (`name role service file:line [LABEL] conf:0.9`). No separate `edges:` block — every edge shares the root. +- **Graph traversal** (`flow`, `decompose`, `impact --depth≥2`, `hierarchy`) → `root:` line, `nodes:` block, then `edges:` block. +- **`inspect`** → `key: value` record (full FQN here) with `edge_summary:` as an indented sub-block. +- **`--count` / `--exists`** → bare scalar (`42` / `true` / `false`); exit 0 on every `ok` incl. `0`/`false`. `--exists` is read as a string, not a bash exit code (conflating "failed" with "false" is a silent-correctness footgun). + +**Conventions (full per-kind templates finalized in `/plan`):** +- **Endpoint disambiguation.** Edge/row endpoints use the *shortest form unique within this result's node set*: simple name → `name @service` → FQN. Common case stays short; collisions escalate only as far as needed. (`@service` is a name matching the `--service` flag — never an opaque ID. 
This preserves the resolve-first guarantee on the *output* side, where result-set name collisions would otherwise re-introduce the very ambiguity JSON avoids via IDs.) +- **Direction.** Arrow = stored edge direction (caller→callee, child→parent, implementer→interface, injector→bean), documented once in `--help`. `hierarchy` → indented tree with `↑`/`↓` relative to root; `connection` → `inbound:` / `outbound:` section headers; multi-hop traversals → `d=N` depth column. +- **Root identity.** Resolved root = first line, `* ` prefix, `(root)` suffix; the marker survives `--brief` (correctness over brevity). +- **Confidence.** `conf:0.9` is shown **only on CALLS-family edges** (`callers`/`callees`/`flow`/route-callers); structural edges (EXTENDS/IMPLEMENTS/INJECTS) carry none. **Envelope-level aggregate `confidence` is dropped** — no backend produces it (Appendix A). +- **Zero results vs not_found.** `ok` + zero is never empty stdout: it prints `0 @` (the resolved FQN also hedges against silent-wrong-resolve). `not_found` → `not found: `. +- **Ambiguous candidates.** Header states the count **and lists the narrowing flags** (`--service | --fqn-prefix | --kind | --java-kind | --role`); each line: `name FQN java-kind role service — ` (`reason` = `ResolveReason`, e.g. `exact_fqn` vs `short_name` — a cheap "how to narrow" signal). No `file:line` (`NodeRef` lacks file fields) and no `score` (positional rank, redundant with order). Up to two pre-filled `next:` narrowing commands; `probable:` may prefix the strongest but **auto-pick is forbidden** (§4 resolve-first is non-negotiable). +- **`truncated`.** Tier-1 v1 via the **+1-fetch trick** (`LIMIT limit+1`; `truncated = rows_fetched > limit`, zero cost): text renders `truncated: more results — use --offset `. `M of N` / `total_count` is deferred — it needs a separate COUNT query (and would also surface the silent resolve candidate cap-at-10). 
+- **`edge_summary`** → indented, labels padded to the longest key, alphabetical, zero-zero rows omitted, explicit `in:0` kept. +- **Projection.** `--fields` always uses a `key: value` record block; built-in listings use tables (≤4 short-token columns) and switch to record block at ≥5 fields or any long/whitespace field. +- **Delimiters.** ASCII by default for byte-efficiency (`->LABEL->`); Unicode (`—LABEL→`) permitted for human review; one form pinned in test snapshots. + +`--brief` / `--fields` / `--count` layer on top of either format. Text is the default because every byte of stdout enters the agent's context window (§14); `--format json` is one flag away. + +--- + +## §7 — Use-case re-walk + +Simulated agent: AI coding agent, ~15-service Spring Boot / Kafka / Feign fleet, 50k+ LoC services. + +| # | Use case | Commands | Chain | +|---|---|---|---| +| UC1 | Bug: "orders after 6pm don't trigger inventory updates" — find the producer path | 2 | `jrag flow "POST /orders"` (outbound intra-service hops incl. 
the Kafka send) → `jrag callees ` (topic it publishes to) | +| UC2 | Safe refactor: add parameter to `OrderService.calculateTotal` — blast radius | 2 | `jrag impact OrderService.calculateTotal` (transitive) → `jrag callers OrderService.calculateTotal` (direct) | +| UC3 | Check if method implements an interface (affects blast radius) | 2 | `jrag inspect OrderService.calculateTotal` → `jrag implementations PricingStrategy` (if edge_summary shows IMPLEMENTS) | +| UC4 | Find existing Feign multi-service join pattern to copy | 3 | `jrag find --calls-service inventory-service --service reporting-service` (kind=client inferred) → `jrag outline ReportingController.java` → `jrag decompose "ReportingController#joinEndpoint" --follow-calls` | +| UC5 | Incident: NPE from `InventoryClient#checkAvailability` in payment-service | 3 | `jrag inspect "InventoryClient#checkAvailability" --service payment-service` → `jrag callees "InventoryClient#checkAvailability"` (route it calls) → `jrag callers "InventoryClient#checkAvailability"` | +| UC6 | Onboarding to reporting-service (cold start) | 2 | `jrag overview reporting-service` → `jrag routes --service reporting-service` | +| UC7 | Kafka topology: who produces and consumes `order.created` | 2 | `jrag overview order.created` → `jrag decompose --follow-calls` (per consumer) | +| UC8 | PR review: 3 files changed in order-service — blast radius | 3 | `jrag impact OrderController --service order-service --depth 3` + `impact OrderService …` + `impact OrderRepository …` (each emits the post-filter warning) | +| UC9 | Scheduled job audit: all `@Scheduled` jobs fleet-wide | 1 | `jrag find --capability scheduled-task` (kind=symbol inferred; no `--service` = fleet-wide) | +| UC10 | Security review: endpoints missing `@PreAuthorize` | 2 | `jrag routes` (fleet-wide) → inspect each for annotation fields; `jrag find --annotation @PreAuthorize` as secondary cross-check | +| UC11 | Architecture conventions: what patterns does payment-service use? 
| 1 | `jrag conventions --service payment-service` | +| UC12 | Find all Feign clients calling inventory-service across the fleet | 1 | `jrag clients --calls-service inventory-service` | +| UC13 | Find the route handler for `GET /orders/{id}` | 1 | `jrag find "GET /orders/{id}" --kind route` | +| UC14 | Inheritance tree: full hierarchy of `AbstractOrderProcessor` | 1 | `jrag hierarchy AbstractOrderProcessor` | +| UC15 | Dependency injection: what does `OrderService` inject? | 1 | `jrag dependencies OrderService --role service` | +| UC16 | Where does `KafkaOrderProducer` actually publish to? | 1 | `jrag callees KafkaOrderProducer` ( folded target → topic) | +| UC17 | Fleet-wide: list all Kafka topics, filter by consumer service | 1 | `jrag topics --consumer-in inventory-service` | +| UC18 | Cross-service map: what calls payment-service inbound? | 1 | `jrag connection payment-service --inbound` | +| UC19 | Which methods does `OrderController` override from a parent? | 1 | `jrag overrides OrderController --java-kind class` | +| UC20 | Structural size sanity before touching a service | 1 | `jrag map --service order-service` | + +**Summary:** 17 of 20 use cases resolve in 1–2 commands. The 3-command cases (UC4, UC5, UC8) involve genuine multi-step investigation. No use case requires a prior call just to obtain an ID. + +**Important semantic correction (UC1/UC4/UC5/UC7):** `flow`'s *outbound* side is intra-service only — it does **not** descend into downstream services. Cross-service reachability is on `flow`'s *inbound* side (who calls this route from outside) and via `callees ` (where this outbound infra sends). Use cases that previously implied downstream cross-service hops now route through `callees` for the transport hop. + +**Awkward cases:** +- **UC10** (absent annotation): no negative filter (`--without-annotation`) in v1. Fleet-wide routes listing + annotation inspection is the only path. Known gap (§8). 
+- **UC8** (multi-symbol impact): 3 separate `impact` calls (no batch mode in v1). Note the operator `analyze-pr` already does single-diff blast radius — `jrag diff-impact` wrapping it is a natural post-v1 addition. + +--- + +## §8 — What this deliberately does NOT do (v1) + +| Feature | Why deferred / skipped | +|---|---| +| **Daemon** (unix-socket, auto-start) | Heaviest, riskiest piece; no infra exists today; §11 lists socket-recovery races. v1 is in-process; daemon revisited only if cold-start latency justifies it. | +| Negative/absence filters (`--without-annotation`, `--unreferenced`) | Non-trivial backend query shape; not addressed by existing functions; deferred. (UC10 gap.) | +| `diff-impact` / `changed` (git diff → symbols) | **Not** "no backend" — the operator `analyze-pr` already does diff-based blast radius. Deferred from the *agent* surface only because wrapping it cleanly (symbol-level, not file-level) is its own design; a natural post-v1 addition. | +| `todos` / `unreferenced` listing commands | Not needed for v1 agent workflows; addable without API breaks. | +| Batch/multi-identifier input | Each command takes one resolved node; batching is N sequential calls for v1. | +| `drift` detection | Explicitly a later milestone. | +| Raw node IDs as primary input | Agents never construct internal IDs; the resolve contract covers all identifier forms. | +| Standalone `jrag resolve` | Resolution is infrastructure (the `resolve` tool), implicit in every command. | +| `jrag source ` (read method body) | Out of scope — `inspect` → `file_location` → agent's own file tools covers it. | +| Operator commands (`init`, `increment`, `reprocess`, `meta`, `analyze-pr`, `diagnose-ignore`, `tables`, `unresolved-calls`, `install`, `update`, `erase`) | Remain in the `java-codebase-rag` operator CLI. `install`/`update` gain a `--surface mcp\|cli` branch + surface-keyed artifact set in PR-JRAG-5; the lifecycle commands themselves don't move. 
| + +--- + +## §9 — Migration plan — 5 PRs + prep refactors + deferred daemon + +**Two prep refactors (land first, independent of CLI commands):** +- **PR-JRAG-0a — Single source of truth for shipped agent artifacts.** Today `skills/explore-codebase/SKILL.md` and `agents/explorer-rag-enhanced.md` each exist in two byte-identical, hand-synced copies (dev path + `java_codebase_rag/install_data/...` shipped via `package_data`). Collapse to one dev source; generate the shipped copy at build time (or read the dev copy directly). **Must land before the CLI skill/subagent exist** — otherwise the CLI variant creates four hand-synced copies. Small, zero behavior change. +- **PR-JRAG-0b — Extract `resolve_v2` into `resolve_service.py`.** `mcp_v2.py` already imports zero MCP SDK, and `resolve_v2(identifier, hint_kind, graph=g)` is a transport-agnostic pure function (the test suite already calls it exactly this way). Lift `resolve_v2` + its ~370-line pipeline (identifier parse → four candidate collectors → dedupe → rank → finalize) + the `ResolveOutput`/`ResolveCandidate` models into a neutral-named module; `mcp_v2` re-exports. **This removes the single real duplication trap** — without it, the CLI's resolve-first layer would either re-implement the pipeline (silent drift) or import an `mcp_`-named module. Mechanical, protected by existing tests (they assert on output shapes, not internals). Land as the opening step of PR-JRAG-1, or standalone the same week. + +**PR-JRAG-1: Entry point + locate tier (in-process)** +- Add the `jrag` console script to `pyproject.toml` (`[project.scripts]`); build the shared resolve-first library (wraps `resolve_v2` → envelope status mapping); implement `jrag find` with kind-inference + contradiction-error + grouped `--help`; `jrag inspect` with full `edge_summary`; `jrag status`. Index loaded in-process via the existing `config.py` resolver — **no daemon**. 
+- **Honest `truncated`**: the boolean isn't surfaced by the backend today — implement via the **+1-fetch trick** (`LIMIT limit+1`; `truncated = rows_fetched > limit`, zero extra cost); text renders `truncated: more results — use --offset `. `total_count` / 'M of N' deferred (needs a COUNT query; would also surface the silent resolve candidate cap-at-10). +- Test: find by FQN exact, by `--role`, by `--capability`; kind-inference from flags; hard-error on `--kind symbol --http-method GET`; inspect returns `edge_summary` with composed keys; ambiguous → candidates (reason rendered, no file/score); `--index-dir` resolves to the operator's index; `truncated` fires correctly via +1-fetch. + +**PR-JRAG-2: Listing tier** +- `jrag routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` with their flags + globals. +- Test: each returns nodes of the correct kind; `--service`/`--module` scope correctly; `truncated: true` fires when limit is hit. + +**PR-JRAG-3: Traversal tier** +- `callers`, `callees`, `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `dependencies`, `impact`, `decompose`, `flow`, `connection`; plus `outline`, `imports`. +- **Backend work (honest, not "thin extraction"):** (a) `callees` for Client/Producer composes `resolve_v2` (name→client/producer id) + `neighbors_v2(id, "out", ["HTTP_CALLS"|"ASYNC_CALLS"])` — the generic flat-label traversal branch already reaches `:Route`/topic nodes today, so **no new query is required for v1**; a dedicated `LadybugGraph.client_calls_route`/`producer_calls_topic` method (mirroring `find_route_callers`) is post-v1 polish for symmetry/testability. (b) `dependencies` composes `neighbors(direction="out", edge_types=["INJECTS"])`. 
+- Test: each command exercises its backend; resolve ambiguity stops traversal; `callers` and `callees` dispatch correctly by resolved kind; `flow` outbound is intra-service (assert no cross-service descent); `impact --service` post-filter emits its warning. + +**PR-JRAG-4: Orientation + search + packaging** +- `microservices`, `map`, `conventions`, `overview` (with `--as`); `search` (incl. `--hybrid` → BM25+vector); README; finalize the PyPI entry point; `agent_next_actions` generation (new edge→command mapper). +- Test: `overview` returns the correct bundle per target type; `search --hybrid` calls the BM25+vector path; `map` returns non-empty counts for every indexed service; `agent_next_actions` are valid runnable commands capped at 5. + +**PR-JRAG-5: Agent host integration (install branching, skill, subagent)** +- New wizard step `select_surface` — "MCP or CLI?" — applied **globally** to all selected agent hosts (one surface per install; per-host variation deferred). Non-interactive flag `--surface mcp|cli`, default `mcp`. This *enforces* what the README today only *warns* ("do not mix multiple mechanisms on the same agent — duplicate context confuses tool selection"). +- Ship a **CLI-flavored skill** (`explore-codebase-cli`) + **CLI-flavored subagent** (`explorer-rag-cli`), mirroring today's MCP pair (`explore-codebase` + `explorer-rag-enhanced`). Two separate documents, not one mode-switching skill: the MCP and CLI tool vocabularies differ (MCP tool calls vs `jrag` shell invocations), and a dual-vocabulary skill would carry exactly the "duplicate context" cost the README warns against. The CLI skill teaches the §5 command grammar, the §4 resolve contract, and §6 text output. +- **`Surface` dimension** on the existing `HostConfig`/`HOSTS` registry (`installer.py:43-95`) — host × surface = artifact set. `HostConfig` today abstracts paths only; surface is added orthogonal to it (not a host-capability flag). 
+- **`ArtifactManifest`** — replace the two hardcoded 3-artifact lists in `deploy_artifacts` (`installer.py:558`) and `refresh_artifacts` (`installer.py:1049`) with a single manifest iterated by both, keyed by surface. Kills the existing deploy/refresh duplication as a side benefit. +- **Fix `detect_configured_hosts` (`installer.py:1001`)** — today it discovers hosts *exclusively* by scanning for the `java-codebase-rag` MCP entry, so a CLI-only install (skill, no MCP entry) is invisible to `update`, which then exits fatal ("No configured agent hosts found"). Write a marker file (`.java-codebase-rag.hosts`, recording host/scope/surface chosen at install); discovery reads it. **Forced into this PR** — shipping the branch without it is a known `update` regression. +- **`resolve_mcp_command` (`installer.py:424`)** becomes surface-conditional: the CLI surface resolves the `jrag` binary, not the MCP server (today it hard-fails if the MCP binary is missing, which would block a CLI install). +- Depends on PR-JRAG-0a (single-source artifacts) so the new skill/subagent land in a one-copy world, not a four-copy world. +- Test: `--surface cli` deploys the CLI skill/subagent and writes no MCP entry; `--surface mcp` reproduces today's behavior; `update` after a CLI-only install refreshes the CLI skill (pre-fix this exited fatal); the marker file round-trips host/scope/surface; `resolve_mcp_command` resolves `jrag` on the CLI surface. + +**Deferred milestone — Daemon (post-v1):** unix-socket daemon with transparent auto-start, `jrag daemon stop|status|list`, multi-index registry. Taken on only if PR-JRAG-1..5 ship and cold-start latency on a large estate is measured to be a problem. + +--- + +## §10 — Decisions taken + +1. **Same repo, new `jrag` PyPI entry point.** Lives in `HumanBean17/java-codebase-rag`, alongside (not replacing) the `java-codebase-rag` operator CLI. *(Corrected: the operator CLI is `java-codebase-rag`; there is no `user-rag`.)* +2. 
**`neighbors` removed.** Every edge traversal gets a named command. No agent reasons about `direction` or `edge_types`. +3. **Resolve-first: `` not ``.** All traversal/inspect commands take a human-readable query; the resolve step (`resolve_v2`, called directly as a transport-agnostic function — extracted in PR-JRAG-0b) is internal and invisible. Raw node IDs are never required. +4. **Disambiguation flags on all `` commands.** `--kind`, `--java-kind`, `--role`, `--fqn-prefix` narrow resolve, not traversal results. +5. **`--service` semantics vary by command and are documented.** Pushed to the backend where the function takes `microservice` (`callers`, `callees`, `implementations`, `subclasses`, `dependents`, …); client-side post-filter with warning on `impact` (whose `impact_analysis()` takes none). +6. **`--module` is a first-class global flag** mapping to `NodeFilter.module`, distinct from `--service`. +7. **`--symbol-kind` → `--java-kind`.** Avoids the triple-"kind" overload. *(Note: the underlying `NodeFilter` field is `symbol_kind`/`symbol_kinds`; `java_kind` is the CLI flag name only.)* +8. **`connection` replaces `boundary`/`contract`/`service-map`.** Self-describing. +9. **`microservices` replaces `services`.** Avoids confusion with Spring `@Service`. +10. **`callers` and `callees` both dispatch by resolved kind.** `callers`: Symbol→`find_callers`, Route→`find_route_callers`. `callees`: Symbol→`find_callees`, Client→HTTP_CALLS-out, Producer→ASYNC_CALLS-out. *(New: `callees` now absorbs the old `target` command.)* +11. **`overrides` / `overridden-by` are separate commands** (direction ambiguity in one command would be a silent correctness risk). +12. **`dependents` (INJECTS-in) / `dependencies` (INJECTS-out).** *(Renamed from `injectors` for symmetry + guessability.)* +13. **`target` folded into `callees`** rather than kept standalone (decided). 
A client/producer *calls* its route/topic; consistent with `callees` = "what this calls" and with `callers`' kind-dispatch. (`destination` was the standalone alternative; rejected in favor of the fold.) +14. **`trace` → `decompose`** (decided). Resolves the trace/flow collision: "trace" implies end-to-end (which is `flow`); `decompose` honestly names the static role-waterfall. +15. **`find` stays flat** with kind-inference + contradiction-error + grouped help. Not split by kind (the listing tier already owns kind-specific access). +16. **`--target-service` → `--calls-service`** (and `--target-path-prefix` → `--calls-path-prefix`). Eliminates a one-hyphen near-collision with global `--service`. +17. **`flow --max-hops` not `flow --depth`.** Distinct from `decompose --depth` (per-stage hops vs stage count). +18. **Daemon deferred; v1 in-process.** Agents never manage a process in v1. +19. **`agent_next_actions` (≤5) replaces MCP `StructuredHint`.** Requires a new edge→CLI-command mapper. +20. **`edge_summary` required in `inspect`**, incl. composed keys. Verified available for all four kinds. +21. **`truncated` is a boolean only** (dropped from the `status` enum). Capped results are `status: ok` + `truncated: true`. +22. **Enum casing normalized.** CLI accepts lowercase/kebab or UPPER_SNAKE; maps to stored UPPER_SNAKE. Roles include `client`; `other` exposed (used by `--exclude-role`). +23. **No `jrag source`.** `inspect` → `file_location` → agent's file tools. 
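Decision 22 (enum casing) is small enough to sketch. The snippet below is illustrative only: `normalize_enum` and `CAPABILITIES` are hypothetical names, and the stored UPPER_SNAKE spellings are assumed to be the kebab-case capability values from the `find` help text with hyphens turned into underscores.

```python
from typing import FrozenSet

# Assumed stored spellings: the `--capability` help values in UPPER_SNAKE form.
CAPABILITIES: FrozenSet[str] = frozenset({
    "SCHEDULED_TASK",
    "MESSAGE_LISTENER",
    "HTTP_CLIENT",
    "MESSAGE_PRODUCER",
    "EXCEPTION_HANDLER",
})


def normalize_enum(raw: str, allowed: FrozenSet[str]) -> str:
    """Map lowercase/kebab or UPPER_SNAKE input to the stored UPPER_SNAKE value."""
    canonical = raw.strip().replace("-", "_").upper()
    if canonical not in allowed:
        # Reject unknown values loudly rather than passing them to the backend.
        raise ValueError(f"unknown value {raw!r}; expected one of {sorted(allowed)}")
    return canonical
```

Under these assumptions `scheduled-task`, `Scheduled_Task`, and `SCHEDULED_TASK` all normalize to the same stored value, and an unrecognized value fails before any query runs.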
+ +--- + +## §11 — Risks and how we mitigate + +| Risk | Mitigation | +|---|---| +| Resolve ambiguity too frequent — agent narrows too often | `--fqn-prefix` + `--service` on every traversal command collapses most collisions; use-case re-walk shows 15/20 cases need 0 narrowing | +| `callers`/`callees` kind-dispatch wrong — symbol resolves to wrong kind | `--kind symbol\|route` explicit override; ambiguous cases surface candidates, not wrong results | +| `impact --service` post-filter silently misleads on cross-service blast radius | Warning in `warnings[]`: "impact ran fleet-wide; results filtered to --service. Cross-service nodes excluded." | +| `callees` for Client/Producer composes two calls | `resolve_v2` + `neighbors_v2` generic path already reaches `:Route`/topic (no new query for v1); dedicated `LadybugGraph` method is post-v1 polish. Real risk is the compose returning less edge detail than the Symbol path — documented, acceptable | +| `dependencies` returns less detail than `dependents` | Documented (neighbors-composed vs `find_injectors` EdgeHit); acceptable for v1 | +| `flow` outbound cross-service expectation | Help text states outbound is intra-service; cross-service is inbound + `callees `; use cases corrected | +| `agent_next_actions` suggests a non-existent/wrong command | New edge→command mapper must be tested against every edge label incl. 
composed keys (PR-JRAG-4) | +| `--source-layer` values opaque to agents | One-line legend in `--help` | +| `--calls-service` vs `--service` still confused | Distinct names + help cross-reference; grouped help separates global scope from client-call flags | +| ~~`edge_summary` missing for some kinds~~ | **Not a risk** — verified: `describe_v2`/`edge_counts_for` is kind-agnostic; `edge_summary` exists for all four kinds (composed dot-keys are symbol-only, which is correct) | +| `truncated` signal doesn't exist in the backend today (only `has_more_results`, fed to hints) | +1-fetch trick in PR-JRAG-1 (`LIMIT limit+1`); `total_count`/'M of N' deferred (needs COUNT) | +| Text edge names re-introduce ambiguity the resolve-first contract kills | Tiered endpoint rendering (simple → `name @service` → FQN) keyed to within-result uniqueness (§6) | +| `update` strands CLI-only installs after a `pip install --upgrade` | `detect_configured_hosts` reads a marker file (`.java-codebase-rag.hosts`) instead of scanning MCP entries (PR-JRAG-5); test covers CLI-only install → `update` refresh | +| CLI re-implements the resolve pipeline → silent drift | `resolve_v2` extracted to `resolve_service.py` (PR-JRAG-0b); both MCP and CLI import the same function | + +*(Daemon-related risks — stale PID, socket unavailable, auto-start races — are deferred with the daemon itself.)* + +--- + +## §12 — Open questions ([TBD]) + +1. **Daemon trigger threshold.** No daemon in v1; revisit only with measured cold-start latency data from a large estate after PR-JRAG-1..4 ship. +2. **`--role` multi-value.** Deferred enhancement: stay single-valued in v1 (matches `NodeFilter.role`); add multi-value (list) as a backend follow-up if agents hit the "controllers OR services" wall. +3. **Default field sets per command.** The token-efficiency projections in §14 are illustrative; the exact default field list per command is finalized in the `/plan` per-PR contracts. +4. 
**Discovery signal for CLI-only installs.** Marker file (`.java-codebase-rag.hosts`, recording host/scope/surface) vs. scanning for skill/agent files. Recommended: marker file — explicit, survives skill renames, round-trips the surface choice. (Decides a detail of PR-JRAG-5.) + +--- + +## §13 — Schema / Ontology / Re-index impact + +- **Ontology bump: not required.** The CLI is a read-only surface over the existing graph. +- **Re-index required: no.** It consumes the index that `java-codebase-rag init`/`increment` already produces. +- **Config/tool surface changes:** one new `[project.scripts]` entry (`jrag`); new CLI module; enum-casing normalization layer; new edge→command hint mapper. No change to the MCP, the operator CLI, `NodeFilter`, or the ontology. + +--- + +## §14 — Token efficiency (CLI outputs) + +Every byte of CLI output enters the agent's context window, so output size is a first-class design constraint, not an afterthought. Defaults favor small, intent-matched payloads; verbosity is opt-in. + +1. **Per-command default field projection.** Each command returns a curated default field set for its intent, not the full node record. **FQN is omitted from listings and traversal rows** (the next command re-resolves on a name + `--service`); full FQN appears only in `inspect`, ambiguous candidates, and `--fields`: + - `routes` → `{method, path, handler, service, file:line}` + - `callers` / `callees` / `dependents` / `dependencies` → `{name, role, service, file:line, confidence}` + - `impact` → `{name, role, service, depth, confidence}`, ranked + - `inspect` → full record + `edge_summary` (the one "tell me everything" command) + - `ambiguous` candidate lists → `{name, fqn, java-kind, role, service, reason}` + - `--fields ` opts in to specific fields; `--full` returns everything. +2. **`--brief`.** Name + FQN + one discriminator (role/kind) only — for scanning/confirmation. Default for candidate lists. +3. 
**`--count` / `--exists`.** Bare scalar output (`42` / `true` / `false`) — no records. Exit 0 on every `ok` (including `0` / `false`); `--exists` is read as a string, not a bash exit code (conflating "failed" with "false" is a silent-correctness footgun). +4. **Fan-out-scaled default `--limit`.** High-fanout commands (`impact --depth≥2`, `callees`/`callers --depth≥2`) default lower (e.g. 10); listings default 20. `truncated: true` always signals more. +5. **De-duplication.** A node reached via multiple paths appears once in `nodes` (the MCP `neighbors` `dedup_calls` behavior carries over); `edges` still lists every path. +6. **Ranked output.** Results ranked by confidence/relevance so the agent can take top-K and stop. `agent_next_actions` suggests narrowing over paging for semantic search (results degrade past page 1). +7. **Text is the default format; JSON is opt-in.** Compact text (one line per result, header-once tables) minimizes tokens by default; `--format json` emits the canonical envelope (Appendix A) for structured/pipeline use. Flipped from JSON-default because defaulting to the verbose format conflicted with token-efficiency-first (§6). +8. **Lean envelope.** Omit empty optional fields (`warnings`, `edges`, `agent_next_actions`) rather than emitting `[]` — saves tokens on the common success path. +9. **IDs in edges, records in `nodes`.** Edges carry `from`/`to` IDs; `nodes` is the ID→record map, so multi-edge results don't duplicate node data. (Inlining endpoint names in every edge was rejected as duplicative.) + +**Validation:** a token-budget assertion in the test suite — no command's *default* output exceeds a ceiling (e.g. 2k tokens) on the bank-chat fixture — guards against regression as fields accrete. + +--- + +## Appendix A — Output envelope schema (canonical model) + +Emitted verbatim by `--format json`; the default text rendering (§6) is a compact view of this same model. 
+ +```json +{ + "$schema": "http://json-schema.org/draft-07/schema", + "type": "object", + "required": ["status"], + "properties": { + "status": { "type": "string", "enum": ["ok", "ambiguous", "not_found", "error"] }, + "nodes": { "type": "object", "additionalProperties": { "type": "object" } }, + "edges": { + "type": "array", + "items": { + "type": "object", + "required": ["from", "to", "label"], + "properties": { + "from": { "type": "string" }, + "to": { "type": "string" }, + "label": { "type": "string" }, + "confidence": { "type": "number", "description": "CALLS-family edges only (callers/callees/flow/route-callers), sourced from edge.attrs['confidence']; absent on EXTENDS/IMPLEMENTS/INJECTS" } + } + } + }, + "root": { "type": "string" }, + "candidates": { "type": "array", "items": { "type": "object" }, "description": "Capped at 10. Items: {node: NodeRef, score (positional rank), reason: ResolveReason}. Text renders name/FQN/java-kind/role/service/reason — not file (NodeRef lacks it), not score" }, + "agent_next_actions": { "type": "array", "maxItems": 5, "items": { "type": "string" } }, + "warnings": { "type": "array", "items": { "type": "string" } }, + "truncated": { "type": "boolean", "description": "v1: +1-fetch trick (LIMIT limit+1; truncated = rows_fetched > limit). total_count / 'M of N' deferred (needs COUNT; would also cover candidate cap-at-10)" }, + "file_location": { "type": "string", "description": "filename:line — composed from root's record; rendered only when root is set" } + } +} +``` + +`edge_summary` (inspect only) is nested under the node record. Labels seen include stored edges (`CALLS`, `HTTP_CALLS`, `ASYNC_CALLS`, `IMPLEMENTS`, `EXTENDS`, `OVERRIDES`, `INJECTS`, `DECLARES`, `EXPOSES`, `DECLARES_CLIENT`, `DECLARES_PRODUCER`) plus virtual/composed keys for symbols (`DECLARES.EXPOSES`, `OVERRIDDEN_BY`, `OVERRIDDEN_BY.DECLARES_CLIENT`, …). Note `OVERRIDDEN_BY` is a **virtual** key (reverse of stored `OVERRIDES`), not a stored edge. 
+ +```json +"edge_summary": { + "CALLS": { "in": 14, "out": 3 }, + "DECLARES.EXPOSES": { "in": 0, "out": 2 }, + "OVERRIDDEN_BY": { "in": 0, "out": 1 }, + "OVERRIDDEN_BY.DECLARES_CLIENT": { "in": 0, "out": 1 } +} +``` + +## Appendix B — Backend mapping (verified) + +| CLI command | Backend (`ladybug_queries.py`) | Status | +|---|---|---| +| `find` | `find_by_name_or_fqn` / `list_by_role` / `list_by_annotation` / `list_by_capability` + `resolve_v2` | exists | +| `inspect` | `describe_v2` (+ `edge_counts_for`, `member_edge_rollup_for`, `override_axis_rollup_for`) | exists | +| `decompose` | `trace_flow` | exists | +| `flow` | `trace_request_flow` (outbound is intra-service) | exists | +| `impact` | `impact_analysis` (no microservice param) | exists | +| `callers` | `find_callers` / `find_route_callers` | exists | +| `callees` (Symbol) | `find_callees` | exists | +| `callees` (Client/Producer) | — | **new query needed** | +| `hierarchy` | `neighbors` (EXTENDS+IMPLEMENTS, both) | exists | +| `implementations` | `find_implementors` | exists | +| `subclasses` | `find_subclasses` | exists | +| `overrides` / `overridden-by` | `override_axis_traversal_for` / `override_axis_rollup_for` | exists | +| `dependents` | `find_injectors` | exists | +| `dependencies` | `neighbors(direction="out", edge_types=["INJECTS"])` | **composed (no dedicated fn)** | +| `connection` | `list_clients` / `list_producers` + route-caller queries | exists | +| `routes`/`clients`/`producers`/`topics` | `list_routes` / `list_clients` / `list_producers` (+ topics) | exists | +| `outline` / `imports` | `find_symbols_in_file_range` + source parse | exists | +| `status` | `meta` / `microservice_counts` / `module_counts` | exists | diff --git a/pyproject.toml b/pyproject.toml index 05167540..ee340f99 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "java-codebase-rag" -version = "0.6.7" +version = "0.7.0" description = "MCP server for 
semantic + structural search over Java codebases" readme = "README.md" requires-python = ">=3.11" @@ -59,6 +59,7 @@ Issues = "https://github.com/HumanBean17/java-codebase-rag/issues" [project.scripts] java-codebase-rag = "java_codebase_rag.cli:_console_script_main" java-codebase-rag-mcp = "server:main" +jrag = "java_codebase_rag.jrag:_console_script_main" [tool.setuptools] packages = ["java_codebase_rag"] @@ -68,6 +69,7 @@ py-modules = [ "build_ast_graph", "chunk_heuristics", "graph_enrich", + "graph_types", "index_common", "java_index_flow_lancedb", "java_index_v1_common", @@ -77,6 +79,7 @@ py-modules = [ "mcp_v2", "path_filtering", "pr_analysis", + "resolve_service", "search_lancedb", "server", ] diff --git a/resolve_service.py b/resolve_service.py new file mode 100644 index 00000000..82a4a7ac --- /dev/null +++ b/resolve_service.py @@ -0,0 +1,502 @@ +"""Resolve service for mapping identifiers to graph nodes. + +Transport-agnostic resolve pipeline extracted from mcp_v2.py for reuse +by the CLI layer. Provides resolve_v2(identifier, hint_kind, graph) -> ResolveOutput. 
+""" + +from __future__ import annotations + +from typing import Literal + +from pydantic import BaseModel, ConfigDict, Field + +from graph_types import ( + NodeRef, + StructuredHint, + _hints_or_skip, + _node_ref_from_row, + _to_structured_hints, + set_hints_enabled, +) +from java_ontology import ResolveReason +from ladybug_queries import LadybugGraph +from mcp_hints import MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION + +__all__ = [ + "resolve_v2", + "ResolveOutput", + "ResolveCandidate", + "ResolveStatus", + "set_hints_enabled", +] + + +ResolveStatus = Literal["one", "many", "none"] + +_RESOLVE_CANDIDATE_CAP = 10 + +_RESOLVE_REASON_PRIORITY: dict[ResolveReason, int] = { + "exact_id": 0, + "exact_fqn": 1, + "route_method_path": 1, + "client_target_path": 1, + "producer_topic_prefix": 1, + "fqn_suffix": 2, + "route_template": 2, + "short_name": 3, + "client_target": 3, + "producer_topic": 3, +} + +_SYMBOL_RESOLVE_RETURN = ( + "s.id AS id, s.fqn AS fqn, s.microservice AS microservice, " + "s.module AS module, s.role AS role, s.kind AS symbol_kind" +) + +_ROUTE_RESOLVE_RETURN = ( + "r.id AS id, r.kind AS kind, r.framework AS framework, r.method AS method, " + "r.path AS path, r.path_template AS path_template, r.path_regex AS path_regex, " + "r.topic AS topic, r.broker AS broker, r.feign_name AS feign_name, r.feign_url AS feign_url, " + "r.microservice AS microservice, r.module AS module, r.filename AS filename, " + "r.start_line AS start_line, r.end_line AS end_line, r.resolved AS resolved" +) + +_CLIENT_RESOLVE_RETURN = ( + "c.id AS id, c.client_kind AS client_kind, c.target_service AS target_service, " + "c.method AS method, c.path AS path, c.path_template AS path_template, " + "c.path_regex AS path_regex, c.member_fqn AS member_fqn, c.member_id AS member_id, " + "c.microservice AS microservice, c.module AS module, c.filename AS filename, " + "c.start_line AS start_line, c.end_line AS end_line, c.resolved AS resolved, " + "c.source_layer AS source_layer" +) + 
+_PRODUCER_RESOLVE_RETURN = ( + "p.id AS id, p.producer_kind AS producer_kind, p.topic AS topic, p.broker AS broker, " + "p.direction AS direction, p.member_fqn AS member_fqn, p.member_id AS member_id, " + "p.microservice AS microservice, p.module AS module, p.filename AS filename, " + "p.start_line AS start_line, p.end_line AS end_line, p.resolved AS resolved, " + "p.source_layer AS source_layer" +) + +_RESOLVE_PRE_DEDUP_LIMIT = 50 + + +class ResolveCandidate(BaseModel): + model_config = ConfigDict(extra="forbid") + + node: NodeRef + score: float + reason: ResolveReason + + +class ResolveOutput(BaseModel): + model_config = ConfigDict(extra="forbid") + + success: bool + status: ResolveStatus + node: NodeRef | None = None + candidates: list[ResolveCandidate] = Field(default_factory=list) + message: str | None = None + resolved_identifier: str | None = None + advisories: list[str] = Field(default_factory=list, description="Pure informational text with no tool call suggestion") + hints_structured: list[StructuredHint] = Field(default_factory=list, description=MCP_HINTS_STRUCTURED_FIELD_DESCRIPTION) + + +def _resolve_validate_identifier(raw: str) -> tuple[str | None, str | None]: + trimmed = raw.strip() + if not trimmed: + detail = "empty string" if raw == "" else "whitespace only" + return None, f"Invalid identifier: {detail}" + return trimmed, None + + +def _resolve_kinds_to_search( + hint_kind: Literal["symbol", "route", "client", "producer"] | None, +) -> list[Literal["symbol", "route", "client", "producer"]]: + if hint_kind is None: + return ["symbol", "route", "client", "producer"] + return [hint_kind] + + +def _resolve_parse_route_method_path(identifier: str) -> tuple[str, str] | None: + parts = identifier.split(None, 1) + if len(parts) != 2: + return None + method, path = parts[0].upper(), parts[1].strip() + if not method.isalpha() or not path.startswith("/"): + return None + return method, path + + +def _resolve_parse_microservice_route(identifier: str) -> 
tuple[str, str, str] | None: + parts = identifier.split(None, 2) + if len(parts) != 3: + return None + microservice, method, path = parts[0], parts[1].upper(), parts[2].strip() + if not method.isalpha() or not path.startswith("/"): + return None + return microservice, method, path + + +def _resolve_symbol_candidates( + g: LadybugGraph, + identifier: str, +) -> list[tuple[NodeRef, ResolveReason, int]]: + out: list[tuple[NodeRef, ResolveReason, int]] = [] + lim = _RESOLVE_PRE_DEDUP_LIMIT + + rows = g._rows( # noqa: SLF001 + f"MATCH (s:Symbol) WHERE s.id = $id RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", + {"id": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("symbol", row), "exact_id", len(identifier))) + + rows = g._rows( # noqa: SLF001 + f"MATCH (s:Symbol) WHERE s.fqn = $fqn RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", + {"fqn": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("symbol", row), "exact_fqn", len(identifier))) + + # Method FQN without arg signature (e.g. "pkg.Cls#method"): the stored method + # fqn is "pkg.Cls#method(Type,Type)", so an argless identifier misses the + # exact match above. Prefix-match on "(" so the agent doesn't + # have to type the exact "(Type,Type)" signature. Multiple overloads → the + # resolve "many" path surfaces them honestly as ambiguous candidates. 
+ if "#" in identifier and "(" not in identifier: + rows = g._rows( # noqa: SLF001 + f"MATCH (s:Symbol) WHERE s.fqn STARTS WITH $mp " + f"RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", + {"mp": identifier + "(", "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("symbol", row), "fqn_suffix", len(identifier) + 1)) + + suffix = f".{identifier}" + rows = g._rows( # noqa: SLF001 + f"MATCH (s:Symbol) WHERE s.fqn = $ident OR s.fqn ENDS WITH $suffix " + f"RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", + {"ident": identifier, "suffix": suffix, "lim": lim}, + ) + for row in rows: + fqn = str(row.get("fqn") or "") + spec = len(fqn) + out.append((_node_ref_from_row("symbol", row), "fqn_suffix", spec)) + + rows = g._rows( # noqa: SLF001 + f"MATCH (s:Symbol) WHERE s.name = $name RETURN {_SYMBOL_RESOLVE_RETURN} LIMIT $lim", + {"name": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("symbol", row), "short_name", len(identifier))) + + return out + + +def _resolve_route_candidates( + g: LadybugGraph, + identifier: str, +) -> list[tuple[NodeRef, ResolveReason, int]]: + out: list[tuple[NodeRef, ResolveReason, int]] = [] + lim = _RESOLVE_PRE_DEDUP_LIMIT + + rows = g._rows( # noqa: SLF001 + f"MATCH (r:Route) WHERE r.id = $id RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", + {"id": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("route", row), "exact_id", len(identifier))) + + ms_route = _resolve_parse_microservice_route(identifier) + if ms_route is not None: + microservice, method, path = ms_route + rows = g._rows( # noqa: SLF001 + f"MATCH (r:Route) WHERE r.microservice = $ms AND r.method = $method " + f"AND (r.path = $path OR r.path_template = $path) " + f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", + {"ms": microservice, "method": method, "path": path, "lim": lim}, + ) + for row in rows: + spec = len(path) + out.append((_node_ref_from_row("route", row), "route_method_path", spec)) + + method_path = 
_resolve_parse_route_method_path(identifier) + if method_path is not None: + method, path = method_path + rows = g._rows( # noqa: SLF001 + f"MATCH (r:Route) WHERE r.method = $method " + f"AND (r.path = $path OR r.path_template = $path) " + f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", + {"method": method, "path": path, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("route", row), "route_method_path", len(path))) + + if identifier.startswith("/"): + rows = g._rows( # noqa: SLF001 + f"MATCH (r:Route) WHERE r.path = $path OR r.path_template = $path " + f"RETURN {_ROUTE_RESOLVE_RETURN} LIMIT $lim", + {"path": identifier, "lim": lim}, + ) + for row in rows: + path_val = str(row.get("path_template") or row.get("path") or "") + out.append((_node_ref_from_row("route", row), "route_template", len(path_val))) + + return out + + +def _resolve_client_candidates( + g: LadybugGraph, + identifier: str, +) -> list[tuple[NodeRef, ResolveReason, int]]: + out: list[tuple[NodeRef, ResolveReason, int]] = [] + lim = _RESOLVE_PRE_DEDUP_LIMIT + + rows = g._rows( # noqa: SLF001 + f"MATCH (c:Client) WHERE c.id = $id RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", + {"id": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("client", row), "exact_id", len(identifier))) + + if " " in identifier: + target, path_prefix = identifier.split(" ", 1) + target = target.strip() + path_prefix = path_prefix.strip() + if target and path_prefix: + rows = g._rows( # noqa: SLF001 + f"MATCH (c:Client) WHERE c.target_service = $target " + f"AND (c.path STARTS WITH $path OR c.path_template STARTS WITH $path) " + f"RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", + {"target": target, "path": path_prefix, "lim": lim}, + ) + for row in rows: + spec = len(path_prefix) + out.append((_node_ref_from_row("client", row), "client_target_path", spec)) + elif not identifier.startswith("/"): + rows = g._rows( # noqa: SLF001 + f"MATCH (c:Client) WHERE c.target_service = $target 
RETURN {_CLIENT_RESOLVE_RETURN} LIMIT $lim", + {"target": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("client", row), "client_target", len(identifier))) + + return out + + +def _resolve_producer_candidates( + g: LadybugGraph, + identifier: str, +) -> list[tuple[NodeRef, ResolveReason, int]]: + out: list[tuple[NodeRef, ResolveReason, int]] = [] + lim = _RESOLVE_PRE_DEDUP_LIMIT + + rows = g._rows( # noqa: SLF001 + f"MATCH (p:Producer) WHERE p.id = $id RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", + {"id": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("producer", row), "exact_id", len(identifier))) + + rows = g._rows( # noqa: SLF001 + f"MATCH (p:Producer) WHERE p.topic = $topic RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", + {"topic": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("producer", row), "producer_topic", len(identifier))) + + if not identifier.startswith("/"): + rows = g._rows( # noqa: SLF001 + f"MATCH (p:Producer) WHERE p.topic STARTS WITH $topic RETURN {_PRODUCER_RESOLVE_RETURN} LIMIT $lim", + {"topic": identifier, "lim": lim}, + ) + for row in rows: + out.append((_node_ref_from_row("producer", row), "producer_topic_prefix", len(identifier))) + + return out + + +def _resolve_dedupe_candidates( + raw: list[tuple[NodeRef, ResolveReason, int]], +) -> list[tuple[NodeRef, ResolveReason, int]]: + best: dict[str, tuple[NodeRef, ResolveReason, int]] = {} + for node, reason, specificity in raw: + prev = best.get(node.id) + if prev is None: + best[node.id] = (node, reason, specificity) + continue + prev_pri = _RESOLVE_REASON_PRIORITY[prev[1]] + new_pri = _RESOLVE_REASON_PRIORITY[reason] + if new_pri < prev_pri or (new_pri == prev_pri and specificity > prev[2]): + best[node.id] = (node, reason, specificity) + return list(best.values()) + + +def _resolve_rank_candidates( + deduped: list[tuple[NodeRef, ResolveReason, int]], +) -> list[ResolveCandidate]: + 
ordered = sorted( + deduped, + key=lambda item: (_RESOLVE_REASON_PRIORITY[item[1]], -item[2], item[0].id), + ) + total = len(ordered) + return [ + ResolveCandidate( + node=node, + reason=reason, + score=(1.0 - (idx / total)) if total else 0.0, + ) + for idx, (node, reason, _spec) in enumerate(ordered) + ] + + +def _resolve_assert_invariants(out: ResolveOutput) -> None: + if not out.success: + assert out.status == "none" + assert out.node is None + assert not out.candidates + assert out.message + return + if out.status == "one": + assert out.node is not None + assert not out.candidates + elif out.status == "many": + assert out.node is None + assert len(out.candidates) >= 2 + elif out.status == "none": + assert out.node is None + assert not out.candidates + assert out.message + + +def _resolve_seeds_for_hints(identifier: str) -> tuple[str | None, str | None]: + path_prefix_seed: str | None = None + method_path = _resolve_parse_route_method_path(identifier) + if method_path is not None: + path_prefix_seed = method_path[1] + else: + ms_route = _resolve_parse_microservice_route(identifier) + if ms_route is not None: + path_prefix_seed = ms_route[2] + elif identifier.startswith("/"): + path_prefix_seed = identifier + + target_service_seed: str | None = None + if " " in identifier: + target, _path_prefix = identifier.split(" ", 1) + target = target.strip() + if target: + target_service_seed = target + elif not identifier.startswith("/"): + target_service_seed = identifier + + return path_prefix_seed, target_service_seed + + +def _resolve_finalize_success( + trimmed: str, + hint_kind: Literal["symbol", "route", "client", "producer"] | None, + matches: list[ResolveCandidate], +) -> ResolveOutput: + if not matches: + out = ResolveOutput( + success=True, + status="none", + message=( + "No matches for identifier; use search(query=...) for ranked fuzzy lookup." 
+ ), + resolved_identifier=trimmed, + ) + elif len(matches) == 1: + out = ResolveOutput( + success=True, + status="one", + node=matches[0].node, + resolved_identifier=trimmed, + ) + else: + out = ResolveOutput( + success=True, + status="many", + candidates=matches, + resolved_identifier=trimmed, + ) + + path_prefix_seed, target_service_seed = _resolve_seeds_for_hints(trimmed) + hint_payload = { + "status": out.status, + "resolved_identifier": trimmed, + "candidates": out.candidates, + "hint_kind": hint_kind, + "path_prefix_seed": path_prefix_seed, + "target_service_seed": target_service_seed, + } + raw_struct, raw_advisories = _hints_or_skip("resolve", hint_payload) + out = out.model_copy(update={ + "advisories": raw_advisories, + "hints_structured": _to_structured_hints(raw_struct), + }) + _resolve_assert_invariants(out) + return out + + +def resolve_v2( + identifier: str, + hint_kind: Literal["symbol", "route", "client", "producer"] | None = None, + graph: LadybugGraph | None = None, +) -> ResolveOutput: + try: + trimmed, err = _resolve_validate_identifier(identifier) + if err is not None: + out = ResolveOutput( + success=False, + status="none", + message=err, + advisories=[], + resolved_identifier=None, + ) + _resolve_assert_invariants(out) + return out + + assert trimmed is not None + if "*" in trimmed or "?" in trimmed: + out = ResolveOutput( + success=False, + status="none", + message=( + "Wildcards (* and ?) are not supported in resolve; " + "use search(query=...) for ranked text search." 
+ ), + advisories=[], + resolved_identifier=trimmed, + ) + _resolve_assert_invariants(out) + return out + + g = graph or LadybugGraph.get() + raw: list[tuple[NodeRef, ResolveReason, int]] = [] + for kind in _resolve_kinds_to_search(hint_kind): + if kind == "symbol": + raw.extend(_resolve_symbol_candidates(g, trimmed)) + elif kind == "route": + raw.extend(_resolve_route_candidates(g, trimmed)) + elif kind == "client": + raw.extend(_resolve_client_candidates(g, trimmed)) + else: + raw.extend(_resolve_producer_candidates(g, trimmed)) + + deduped = _resolve_dedupe_candidates(raw) + ranked = _resolve_rank_candidates(deduped) + capped = ranked[:_RESOLVE_CANDIDATE_CAP] + return _resolve_finalize_success(trimmed, hint_kind, capped) + except Exception as exc: + out = ResolveOutput( + success=False, + status="none", + message=str(exc), + advisories=[], + resolved_identifier=None, + ) + _resolve_assert_invariants(out) + return out diff --git a/scripts/sync_agent_artifacts.py b/scripts/sync_agent_artifacts.py new file mode 100644 index 00000000..4df3ac85 --- /dev/null +++ b/scripts/sync_agent_artifacts.py @@ -0,0 +1,204 @@ +#!/usr/bin/env python3 +"""Sync agent and skill artifacts from dev source to install_data. 
+ +This script maintains a single source of truth for shipped agent artifacts: +- Dev source: skills/explore-codebase/, skills/explore-codebase-cli/, and agents/ +- Shipped: the matching subtrees under java_codebase_rag/install_data/ (skills/explore-codebase/, skills/explore-codebase-cli/, agents/) + +Usage: + python scripts/sync_agent_artifacts.py # Copy dev → install_data + python scripts/sync_agent_artifacts.py --check # Verify only (CI mode) + +Exit codes: + 0: All files in sync + 1: Files out of sync (when --check), copy verification failed, or an error occurred +""" + +from __future__ import annotations + +import argparse +import difflib +import filecmp +import shutil +import sys +from pathlib import Path + + +# Mapping of source (dev) paths to destination (install_data) paths +# Only these subtrees are shipped - skills/README.md is explicitly excluded +SYNC_MAP: list[tuple[Path, Path]] = [ + (Path("skills/explore-codebase"), Path("java_codebase_rag/install_data/skills/explore-codebase")), + (Path("skills/explore-codebase-cli"), Path("java_codebase_rag/install_data/skills/explore-codebase-cli")), + (Path("agents"), Path("java_codebase_rag/install_data/agents")), +] + + +def collect_files(src_dir: Path, dst_dir: Path) -> list[tuple[Path, Path]]: + """Collect (source, destination) file pairs for a subtree. + + Only files are included; directories are skipped. Path.is_file() follows + symlinks, so a symlink that points to a file is included as well. + """ + if not src_dir.is_dir(): + raise RuntimeError(f"Source directory missing: {src_dir}") + + pairs: list[tuple[Path, Path]] = [] + for src_file in src_dir.rglob("*"): + if not src_file.is_file(): + continue + # Compute relative path from source root + rel_path = src_file.relative_to(src_dir) + dst_file = dst_dir / rel_path + pairs.append((src_file, dst_file)) + + return pairs + + +def verify_byte_equality(src_file: Path, dst_file: Path) -> bool: + """Check if two files are byte-identical. + + Returns True if identical, False otherwise. 
+ """ + if not dst_file.exists(): + return False + return filecmp.cmp(src_file, dst_file, shallow=False) + + +def show_diff(src_file: Path, dst_file: Path) -> str: + """Generate a unified diff between two files.""" + src_lines = src_file.read_text(encoding="utf-8").splitlines(keepends=True) + dst_lines = dst_file.read_text(encoding="utf-8").splitlines(keepends=True) + + # Inputs keep their line endings (keepends=True), so rely on the default + # lineterm; lineterm="" would run the ---/+++/@@ header lines together. + return "".join( + difflib.unified_diff( + dst_lines, + src_lines, + fromfile=str(dst_file), + tofile=str(src_file), + ) + ) + + +def sync_all(check_only: bool, repo_root: Path | None = None) -> int: + """Sync all artifacts from dev source to install_data. + + Args: + check_only: If True, verify only without copying. + repo_root: Repository root directory (defaults to the parent of the script's directory). + + Returns: + Exit code (0 for success, 1 for any mismatch). + """ + if repo_root is None: + repo_root = Path(__file__).resolve().parent.parent + else: + repo_root = repo_root.resolve() + + all_pairs: list[tuple[Path, Path]] = [] + for src_rel, dst_rel in SYNC_MAP: + src_dir = repo_root / src_rel + dst_dir = repo_root / dst_rel + all_pairs.extend(collect_files(src_dir, dst_dir)) + + if not all_pairs: + print("No files to sync - check source directories exist", file=sys.stderr) + return 1 + + # Check for drift + out_of_sync: list[tuple[Path, Path, str]] = [] + missing: list[tuple[Path, Path]] = [] + + for src_file, dst_file in all_pairs: + if not dst_file.exists(): + missing.append((src_file, dst_file)) + continue + + if not verify_byte_equality(src_file, dst_file): + out_of_sync.append((src_file, dst_file, "content differs")) + + # Check for extra files in destination that shouldn't be there + all_dst_files = {dst for _, dst in all_pairs} + for src_rel, dst_rel in SYNC_MAP: + dst_dir = repo_root / dst_rel + if dst_dir.exists(): + for dst_file in dst_dir.rglob("*"): + if dst_file.is_file() and dst_file not in all_dst_files: + out_of_sync.append((Path(""), dst_file, "extra file in install_data")) + + 
if check_only: + # --check mode: report issues and exit non-zero if any + if not (missing or out_of_sync): + print("✓ All agent artifacts in sync") + return 0 + + print("Agent artifacts out of sync:", file=sys.stderr) + for src_file, dst_file, reason in out_of_sync: + if reason == "extra file in install_data": + print(f" - {dst_file} (extra file)", file=sys.stderr) + else: + print(f" - {dst_file} (differs from source)", file=sys.stderr) + if src_file.exists() and dst_file.exists(): + diff = show_diff(src_file, dst_file) + if diff: + print(" Diff:", file=sys.stderr) + for line in diff.splitlines(): + print(f" {line}", file=sys.stderr) + + for src_file, dst_file in missing: + print(f" - {dst_file} (missing)", file=sys.stderr) + + return 1 + + # Copy mode: ensure destination directories exist and copy files + for src_rel, dst_rel in SYNC_MAP: + dst_dir = repo_root / dst_rel + dst_dir.mkdir(parents=True, exist_ok=True) + + for src_file, dst_file in all_pairs: + dst_file.parent.mkdir(parents=True, exist_ok=True) + shutil.copy2(src_file, dst_file) + + # Verify after copy + copy_errors: list[tuple[Path, Path]] = [] + for src_file, dst_file in all_pairs: + if not verify_byte_equality(src_file, dst_file): + copy_errors.append((src_file, dst_file)) + + if copy_errors: + print("Copy verification failed for:", file=sys.stderr) + for src_file, dst_file in copy_errors: + print(f" {src_file} → {dst_file}", file=sys.stderr) + return 1 + + print(f"✓ Synced {len(all_pairs)} agent artifact(s)") + return 0 + + +def main() -> int: + """CLI entry point.""" + # Success messages use '✓' (U+2713); on Windows cp1252 stdout that crashes + # with UnicodeEncodeError. Force UTF-8 (no-op on Unix). Standalone reconfigure + # rather than importing java_codebase_rag._stdio — this dev script is stdlib-only. 
+ for _stream in (sys.stdout, sys.stderr): + _reconfigure = getattr(_stream, "reconfigure", None) + if _reconfigure is not None: + _reconfigure(encoding="utf-8", errors="replace") + parser = argparse.ArgumentParser( + description="Sync agent artifacts from dev source to install_data" + ) + parser.add_argument( + "--check", + action="store_true", + help="Verify only without copying (for CI)" + ) + args = parser.parse_args() + + try: + return sync_all(check_only=args.check, repo_root=Path.cwd()) + except Exception as e: + print(f"Error: {e}", file=sys.stderr) + return 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/search_lancedb.py b/search_lancedb.py index bc1543eb..5ca1fa64 100644 --- a/search_lancedb.py +++ b/search_lancedb.py @@ -534,6 +534,15 @@ def _search_one_table( for r in rows: r["_kind"] = kind r["_hybrid"] = False + # Populate `_score` from `_distance` so the SearchHit.score reflects + # relevance. The hybrid branch sets `_score` from `_relevance_score` + # above; without this, non-hybrid (default) search left `_score` unset + # and mcp_v2._row_to_search_hit fell back to 0.0 for EVERY hit — + # ranking still worked (the sort key uses `_distance` directly) but the + # exposed score was always 0.0, making results look unranked. + d = r.get("_distance") + if d is not None: + r["_score"] = l2_distance_to_score(float(d)) r["start"] = coerce_position_field(r.get("start")) r["end"] = coerce_position_field(r.get("end")) return rows diff --git a/skills/README.md b/skills/README.md index af37beba..6a8c8f39 100644 --- a/skills/README.md +++ b/skills/README.md @@ -1,18 +1,35 @@ -# skills/ — RAG navigation skill for the java-codebase-rag MCP +# skills/ — RAG navigation skills for java-codebase-rag -One self-contained skill for navigating indexed Java codebases via the 5-tool MCP (`search` / `find` / `describe` / `neighbors` / `resolve`). Skills are agent-side prompt scaffolding — they are **not** a second MCP API and **not** CLI subcommands. 
+Two self-contained skills for navigating indexed Java codebases — one per +**surface** (MCP server vs `jrag` CLI). Skills are agent-side prompt scaffolding +— they are **not** a second MCP API and **not** CLI subcommands. + +## Surfaces (PR-JRAG-5) + +`java-codebase-rag install` picks one of two surfaces: + +- **`--surface mcp`** (default) — registers the stdio MCP server (5 tools: + `search` / `find` / `describe` / `neighbors` / `resolve`) and deploys the + **`explore-codebase`** skill + **`explorer-rag-enhanced`** subagent. +- **`--surface cli`** — deploys the **`explore-codebase-cli`** skill + + **`explorer-rag-cli`** subagent, documenting the `jrag` console-script shell + vocabulary (one command per engineering intent; no MCP entry registered). + +Pick one surface per project — running both strands the agent in two +vocabularies. ## Layout ``` skills/ - README.md ← this file - explore-codebase/SKILL.md ← complete MCP operating manual + README.md ← this file + explore-codebase/SKILL.md ← complete MCP operating manual (mcp surface) + explore-codebase-cli/SKILL.md ← `jrag` CLI operating manual (cli surface; PR-JRAG-5) ``` -## `explore-codebase` +## `explore-codebase` (MCP surface) -The comprehensive operating manual. Includes: +The comprehensive MCP operating manual. Includes: - **Five-tool reference** — `search`, `find`, `describe`, `neighbors`, `resolve` with full argument shapes - **Node kinds** — Symbol, Route, Client, Producer @@ -23,15 +40,29 @@ The comprehensive operating manual. Includes: - **Navigation patterns** — 12 common intent-to-tool-chain mappings - **Ontology glossary** — roles, capabilities, symbol kinds, frameworks, match types +## `explore-codebase-cli` (CLI surface; PR-JRAG-5) + +The operating manual for the `jrag` CLI — same graph underneath, but the +agent drives shell commands (`jrag callers`, `jrag inspect`, `jrag search`, +…). Internalizes resolve so every `` command is "names in, names out". 
+ +Includes: command groups (orientation / locate / listings / traversal / +inspection), common flags, resolve-first contract, traversal reference, +ontology glossary, recovery playbook, workflow patterns. + ## Relationship to `docs/AGENT-GUIDE.md` and `agents/` -`docs/AGENT-GUIDE.md` is the **single source of truth** for the MCP operating manual. Three delivery mechanisms all carry the same content: +`docs/AGENT-GUIDE.md` is the **single source of truth** for the MCP operating manual. Three delivery mechanisms all carry the same MCP content: | Mechanism | How to use | | --------- | ---------- | | **`docs/AGENT-GUIDE.md`** copy-paste block | Paste the `BEGIN`/`END` block into your project's `AGENTS.md` / `CLAUDE.md`. Always-on. Best for hosts without skill or subagent loading. | -| **`explore-codebase` skill** | Loaded on demand by hosts with skill discovery (Claude Code, Qwen Code, Cursor). One skill to rule them all. | -| **`agents/explorer-rag-enhanced.md`** subagent | Copy into your project's `.claude/agents/` for Claude Code subagent discovery. The agent combines RAG graph navigation with file-system search. | +| **`explore-codebase` skill** | Loaded on demand by hosts with skill discovery (Claude Code, Qwen Code, Cursor). One skill to rule them all. (MCP surface.) | +| **`agents/explorer-rag-enhanced.md`** subagent | Copy into your project's `.claude/agents/` for Claude Code subagent discovery. The agent combines RAG graph navigation with file-system search. (MCP surface.) | + +For the CLI surface, the parallel pair is **`explore-codebase-cli`** (skill) + +**`agents/explorer-rag-cli.md`** (subagent) — driven via the `jrag` shell CLI +rather than the MCP tools. Do not mix multiple mechanisms on the same agent — duplicate context confuses tool selection. 
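As a rough illustration of the surface split described in this README, the mapping from a `--surface` value to the artifacts it deploys can be written as a small lookup table. This is a minimal sketch, not the installer's code: `SurfaceArtifacts`, `SURFACE_ARTIFACTS`, and `artifacts_for` are hypothetical names. Only the skill names, subagent names, and the MCP-registration behavior come from the text above.

```python
from typing import NamedTuple


class SurfaceArtifacts(NamedTuple):
    """Hypothetical record of what one surface deploys (illustration only)."""

    skill: str
    subagent: str
    registers_mcp_server: bool


# Values are taken from the "Surfaces" section; the structure is an assumption.
SURFACE_ARTIFACTS = {
    "mcp": SurfaceArtifacts("explore-codebase", "explorer-rag-enhanced", True),
    "cli": SurfaceArtifacts("explore-codebase-cli", "explorer-rag-cli", False),
}


def artifacts_for(surface: str = "mcp") -> SurfaceArtifacts:
    """Return the artifacts for a surface; "mcp" is the documented default."""
    try:
        return SURFACE_ARTIFACTS[surface]
    except KeyError:
        raise ValueError(
            f"unknown surface {surface!r}; expected 'mcp' or 'cli'"
        ) from None
```

Keeping both surfaces in one table makes the "pick one surface per project" rule easy to see: each key yields exactly one skill and one subagent, and only `mcp` registers a server entry.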
diff --git a/skills/explore-codebase-cli/SKILL.md b/skills/explore-codebase-cli/SKILL.md new file mode 100644 index 00000000..97c2b6f3 --- /dev/null +++ b/skills/explore-codebase-cli/SKILL.md @@ -0,0 +1,251 @@ +--- +name: explore-codebase-cli +description: "MUST BE USED PROACTIVELY. Universal read-only codebase exploration via the `jrag` CLI — one command per engineering intent (callers, callees, routes, clients, producers, impact, search, inspect, flow, overview). Use for any exploration: locating code, tracing dependencies, finding patterns, 'where is X', 'who calls Y', 'find all controllers', 'trace the flow from A to B'. Combines graph navigation with file-system search (grep, glob, file reading). Do NOT use when the answer is already in open context or for a single known file — read that file directly." +--- + +# /explore-codebase-cli — Universal codebase exploration via `jrag` + +Read-only exploration combining **graph navigation through the `jrag` CLI** with **broad file-system search**. This is the CLI surface of java-codebase-rag; it loads the same index used by the MCP server but exposes one shell command per engineering intent instead of five MCP tools. + +## When to use + +Any time you need to search, locate, navigate, or explore the codebase. **Do NOT use when** the answer is already in open context or for a single known file — read that file directly. + +## Core Principles + +1. **Read-only.** Never edit, write, or modify any file. +2. **Names in, names out.** Every `` is human-readable (FQN / simple name / route path / topic). Raw node IDs are never required. +3. **One command per intent.** `jrag` collapses resolve + walk into one call. Pick the command that matches the intent; do not chain resolve→describe→neighbors manually. +4. **Stop when answered.** Don't prefetch unrelated subgraphs or directories. 
+ +## Why `jrag` (CLI) vs `java-codebase-rag` (MCP) + +| Aspect | `jrag` CLI | MCP server (`java-codebase-rag-mcp`) | +| --- | --- | --- | +| Surface | Shell — one command per intent | 5 stdio MCP tools (`search` / `find` / `describe` / `neighbors` / `resolve`) | +| Resolve | **Internalized** — every `` command runs `resolve_v2` first | Explicit — agent calls `resolve` then `describe` / `neighbors` | +| Output | Compact text by default; `--format json` for the envelope; `--detail brief\|normal\|full` (orthogonal to format) | JSON-RPC envelope | +| Host fit | Any agent that can run shell commands | MCP-aware hosts (Claude Code, Claude Desktop, Qwen Code, GigaCode) | +| Index | Reuses the operator's `~/.java-codebase-rag` / `.java-codebase-rag/` index | Same | + +Pick **one** surface per project — running both strands the agent in two vocabularies. This skill is for the CLI surface. + +## Prerequisite: index must exist + +`jrag` is a thin compose-and-render layer over the existing index. If the project has not been indexed, every command exits 2 with an actionable envelope: + +``` +status: error +message: No index at . Run: java-codebase-rag init --source-root +``` + +Verify with `jrag status` first when in doubt. + +## Tool Inventory + +### `jrag` command groups + +Run `jrag --help` for the canonical list. 
Groups (PR-JRAG-1a..4): + +| Group | Commands | +| --- | --- | +| **Orientation** | `status`, `microservices`, `map`, `conventions`, `overview` | +| **Locate** | `find`, `search` | +| **Listings** | `routes`, `clients`, `producers`, `topics`, `jobs`, `listeners`, `entities` | +| **Traversal** | `callers`, `callees`, `hierarchy`, `implementations`, `subclasses`, `overrides`, `overridden-by`, `dependents`, `impact`, `flow`, `dependencies`, `connection` | +| **Inspection** | `inspect`, `outline`, `imports` | + +### Common flags (every command) + +``` +--service Filter by microservice +--module Filter by module +--limit Cap on results (default 20; 10 for fan-out commands) +--format text|json Output format (default: text) +--detail brief|normal|full Output detail (default: normal) — orthogonal to --format; + both modes honor it. brief=name @service; normal=+module/role/ + file/score; full=+signature/annotations/snippet. inspect and the + orientation commands (status/microservices/map/conventions/overview) + default to full. +--index-dir Index directory override (default: discovered from cwd) +``` + +`--offset` is supported **only** on `find` and `search` (they route through `find_v2` / `search_v2` which accept it). Other commands emit `truncated: more results — narrow your query` when capped. + +### File-system tools + +- **Grep** — content search by pattern/regex +- **Glob** — find files by name/path pattern (`**/*.java`, `**/*Controller*.java`, `**/application*.yml`) +- **Read** — read files (`offset`/`limit` for large files) + +### Other: **Bash** (read-only: `git log`, `git blame`, `ls`, `find`), **WebSearch**/**WebFetch** (external lookups) + +--- + +## Decision Framework + +| User asks… | First `jrag` command | Follow-up | +| ---------- | -------------------- | --------- | +| "Is the index fresh?" 
| `jrag status` | — | +| Identifier-shaped string (FQN / simple name) | `jrag inspect ` | `callers` / `callees` | +| Fuzzy / NL "where is X" | `jrag search ""` | `inspect ` | +| All controllers in service S | `jrag find --role CONTROLLER --service S` | `callees` | +| Interfaces in service S | `jrag find --java-kind interface --service S` | `implementations` | +| HTTP / messaging entry points | `jrag routes [--framework …] [--method …]` | `inspect ` | +| Outbound HTTP clients | `jrag clients [--calls-service …]` | `callees ` | +| Outbound async producers | `jrag producers [--topic-prefix …]` | `callees ` | +| Topics + consumers/producers | `jrag topics [--topic-prefix …]` | — | +| Who calls method M? | `jrag callers ` | `inspect ` | +| What does M call? | `jrag callees ` | `inspect ` | +| Who hits this route? | `jrag callers ` | — | +| Who implements interface T? | `jrag implementations ` | — | +| Subtypes of class C? | `jrag subclasses ` | — | +| Overriding methods? | `jrag overrides ` (dispatch UP) | — | +| Methods that override me? | `jrag overridden-by ` | — | +| Who injects T? | `jrag dependencies ` | — | +| Who depends on T? | `jrag dependents ` | — | +| Blast-radius of changing X? | `jrag impact ` (bounded fan-in) | `Grep` fallback | +| Trace request flow A→B | `jrag flow ` | `connection ` | +| File outline | `jrag outline ` | `inspect ` | +| File imports | `jrag imports ` | — | +| "Explain service S" | `jrag overview ` | `routes` / `clients` / `producers` | +| "Explain route /topic" | `jrag overview ` | `flow` | +| Find files matching pattern | `Glob` | `Read` | +| Search for text in files | `Grep` | `Read` | +| Who changed X and when? | Bash: `git log`/`git blame` | — | +| "How is this configured?" 
| `Glob` + `Grep` for config keys; `jrag search "" --table yaml` | `Read` sections | + +**Escalation:** ① Most targeted command first → ② Fall back gracefully (`callers` empty → `Grep`) → ③ Cross-validate (CLI vs file disagree → **trust the file** — index may be stale). + +**Rules of thumb:** Structure beats vector for exact questions (`find` / `inspect` + traversal); vector beats structure for fuzzy discovery (`search`); file-system beats stale index. + +--- + +## Resolve-first contract (every `` command) + +Every `jrag` command that takes a `` runs `resolve_v2` internally and maps the contract onto the envelope: + +| `resolve_v2` status | `jrag` behavior | +| --- | --- | +| `one` | Run the traversal/listing against the resolved node. | +| `many` | Return the candidate list and stop. **No auto-pick.** Disambiguate with `--kind`, `--role`, `--fqn-prefix`, etc. | +| `none` | Emit `status: not_found` envelope (exit 2). Fall back to `search` or `Grep`. | + +You never need to look up a raw node ID. Pass an FQN, simple name, `sym:`/`route:`/`client:`/`producer:` id (from a prior call), route path, topic, etc. + +### Disambiguation flags + +Only `--kind` is a true resolve input (`hint_kind`). The other narrowing flags (`--role`, `--java-kind`, `--fqn-prefix`, `--service`, `--module`) post-filter the resolve result client-side. If a post-filter collapses `many` → `one`, the command proceeds; if it still leaves `many`, the narrowed candidates are returned. + +--- + +## Output envelope + +`--format` (text|json) and `--detail` (brief|normal|full) are **orthogonal**: +`--format` picks the representation, `--detail` picks how much of each node/edge is +shown, and **both modes honor the same detail level** through one projection seam. + +- Default is `text` + `normal`: a one-line-per-row listing that includes + `name @service module=… role=… file=… score=…` (the cheap, high-value fields). + `inspect` and the orientation commands default to `full` (their purpose is detail). 
+- `--detail brief` reproduces the ultra-terse `name @service` line (escape hatch). +- `--detail full` adds an indented block per row (`signature`, `annotations`, + `snippet` for search, `data`/`edge_summary` for inspect). +- `--format json` emits the **projected** envelope (same field set as the text at + that detail level). Empty fields are dropped at every level (no `null` noise). + +`--format json` envelope shape (fields omitted when empty): + +```json +{ + "status": "ok|not_found|error", + "nodes": {"": {...}}, + "edges": [{...}], + "candidates": [{...}], + "truncated": false, + "agent_next_actions": ["jrag callers ", "..."], + "file_location": {"filename": "...", "start_line": 123} +} +``` + +- `truncated` is computed via +1-fetch on `find`/`search` (pass `--limit`, observe `truncated`, narrow or page with `--offset`); other commands emit `truncated: more results — narrow your query` when capped (no `--offset`). +- `agent_next_actions` is a CLI-native hint list (≤5) mapping the current result's edge labels to the next `jrag` command — use it as a starting point, not a directive. +- `file_location` is populated only on `one`-hit resolve (carries the resolved node's `filename` + `start_line`). + +--- + +## Traversal direction reference + +`jrag` abstracts away `direction` and `edge_types` — you name the intent, it picks the edges. 
For reference, the mapping is: + +| Intent (command) | Underlying edges | +| --- | --- | +| `callers` | `CALLS` direction=in | +| `callees` | `CALLS` direction=out | +| `hierarchy` | `EXTENDS` + `IMPLEMENTS` direction=out | +| `implementations` | `IMPLEMENTS` direction=in | +| `subclasses` | `EXTENDS` direction=in | +| `overrides` | `OVERRIDES` direction=out (subtype → supertype) | +| `overridden-by` | `OVERRIDES` direction=in (virtual `OVERRIDDEN_BY` out) | +| `dependencies` | `INJECTS` direction=out | +| `dependents` | `INJECTS` direction=in | +| `impact` | bounded fan-in: `CALLS`/`INJECTS`/`IMPLEMENTS`/`EXTENDS` direction=in (depth ≤2) | +| `flow ` | `trace_request_flow`: `EXPOSES`/`HTTP_CALLS`/`ASYNC_CALLS`/`CALLS` | +| `connection A B` | bounded search over the same edge set between A and B | + +### Node id prefixes (from prior results) + +`sym:` (Symbol), `route:`/`r:` (Route), `client:`/`c:` (Client), `producer:`/`p:` (Producer). Pass these verbatim if you have them; otherwise use the human-readable name. + +### Symbol FQN shape + +`.[.]#(,,…)`. Generics erased, no spaces after commas. No-arg: `()`. Constructor: `#(...)`. + +--- + +## Ontology glossary + +**Roles:** `CONTROLLER` | `SERVICE` | `REPOSITORY` | `COMPONENT` | `CONFIG` | `ENTITY` | `CLIENT` | `MAPPER` | `DTO` | `OTHER`. + +**Capabilities:** `MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`. + +**Symbol kinds:** `class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`. + +**Route frameworks:** `spring_mvc`, `webflux`. Route *kinds*: `http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`, `jms_destination`, `stream_binding`. + +**Client kinds:** `feign_method`, `rest_template`, `web_client`. **Producer kinds:** `kafka_send`, `stream_bridge_send`. **Source layers (client/producer):** `builtin`, `layer_a_meta`, `layer_b_ann`, `layer_b_fqn`, `layer_c_source`. 
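The traversal direction reference earlier in this skill can be read as a plain lookup from command name to edge types and direction. The sketch below restates the single-walk rows of that table in Python. It is an illustration under assumptions, not how `jrag` is implemented: `Traversal`, `TRAVERSALS`, and `inverse_of` are hypothetical names.

```python
from typing import NamedTuple


class Traversal(NamedTuple):
    """Hypothetical (edge types, direction) pair for one traversal command."""

    edge_types: tuple[str, ...]
    direction: str  # "in" or "out"


# Rows copied from the traversal table. `impact`, `flow`, and `connection`
# are left out because the table describes them as bounded searches or
# traces rather than one edge walk.
TRAVERSALS = {
    "callers": Traversal(("CALLS",), "in"),
    "callees": Traversal(("CALLS",), "out"),
    "hierarchy": Traversal(("EXTENDS", "IMPLEMENTS"), "out"),
    "implementations": Traversal(("IMPLEMENTS",), "in"),
    "subclasses": Traversal(("EXTENDS",), "in"),
    "overrides": Traversal(("OVERRIDES",), "out"),
    "overridden-by": Traversal(("OVERRIDES",), "in"),
    "dependencies": Traversal(("INJECTS",), "out"),
    "dependents": Traversal(("INJECTS",), "in"),
}


def inverse_of(command: str) -> list[str]:
    """List commands that walk the same edge types in the opposite direction."""
    edges, direction = TRAVERSALS[command]
    return [
        name
        for name, (other_edges, other_direction) in TRAVERSALS.items()
        if other_edges == edges and other_direction != direction
    ]
```

Seen this way, most commands come in pairs (`callers`/`callees`, `dependencies`/`dependents`, `overrides`/`overridden-by`), which helps when an answer needs the opposite direction of a walk that was just run.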
+ +--- + +## Recovery Playbook + +**After two failed attempts on the same intent, stop and report command, args, and result snippet.** + +| Symptom | Fix | +| ------- | --- | +| `status: error` "No index at …" | Run `java-codebase-rag init --source-root ` then retry | +| `status: not_found` | Try `jrag search ""`; or `find --fqn-prefix …`; fallback `Grep` | +| `many` candidates returned | Add `--kind`/`--role`/`--fqn-prefix`/`--service`; re-run | +| `find` returns too much | Add `--service`, `--fqn-prefix`, `--path-prefix`, `--topic-prefix` | +| Empty `search` | Try `--table all`; `find --fqn-prefix`; `Grep` directly | +| `truncated: true` | Narrow the query, or page with `--offset` (`find`/`search` only) | +| Empty results across commands | Index missing/stale → `Grep`/`Glob`/`Read`; ask operator to rebuild (`java-codebase-rag reprocess`) | +| CLI vs file disagree | **Trust the file**; report stale index | +| `--offset` rejected | Only `find`/`search` accept it; other commands narrow via filters | +| Wrong node picked | Resolve must be ambiguous — pass `--kind` to narrow | + +--- + +## Workflow Patterns + +**"Explain feature X":** `jrag search "X"` → pick 1–3 hits → `jrag inspect ` → targeted traversal (`callees`/`implementations`) → stop when answered. + +**"Where is X used?":** `jrag inspect ` (resolves) → `jrag callers ` and `jrag dependents ` → `Grep` fallback → report all sites with file:line. + +**"Find all Y":** Structural → `jrag find --role [--service ]`. Textual → `Grep`. Broad → `Glob` + `Grep`. Summarize, don't dump. + +**"Trace flow from A to B":** `jrag flow ` to trace the request → `jrag connection A B` to confirm a path → `Grep` gaps → report with file:line. + +**"How is this configured?":** `Glob` for `**/application*.yml` → `Grep` for the key → `Read` sections → `jrag search "" --table yaml` supplement. 
+ +**"Orient in a new service":** `jrag overview ` (bundle) → `jrag conventions --service ` (dominant roles) → `jrag map --service ` (counts) → `jrag routes --service ` (entry points). diff --git a/tests/README.md b/tests/README.md index 5abb96cb..b66abab2 100644 --- a/tests/README.md +++ b/tests/README.md @@ -41,7 +41,7 @@ cd /path/to/java-codebase-rag ## CI merge gate and fixture tiers -**Merge gate (mechanical):** [`.github/workflows/test.yml`](../.github/workflows/test.yml) always runs the `test` job on every pull request and on every push to `master`. When any **source** path changes (Python, deps, `pytest.ini`, `mcp.json.example`, `.gitignore`, workflows, non-markdown under `tests/` or `automation/`), it runs `pytest tests` with `JAVA_CODEBASE_RAG_RUN_HEAVY=0`. Documentation-only changes (`propose/`, `plans/`, `skills/`, `.agents/`, `docs/`, `**/*.md`, etc.) still produce a green `test` check but skip pytest. Branch protection on `master` requires the `test` status check to pass before merge and disables force-push. Break-glass policy: `enforce_admins: false` so the sole maintainer can bypass for emergency hotfixes — explain the bypass in the merge commit. +**Merge gate (mechanical):** [`.github/workflows/test.yml`](../.github/workflows/test.yml) always runs the `test` job on every pull request and on every push to `master`. When any **source** path changes (Python, deps, `pytest.ini`, `mcp.json.example`, `.gitignore`, workflows, non-markdown under `tests/`), it runs `pytest tests` with `JAVA_CODEBASE_RAG_RUN_HEAVY=0`. Documentation-only changes (`propose/`, `plans/`, `skills/`, `.agents/`, `docs/`, `**/*.md`, etc.) still produce a green `test` check but skip pytest. Branch protection on `master` requires the `test` status check to pass before merge and disables force-push. Break-glass policy: `enforce_admins: false` so the sole maintainer can bypass for emergency hotfixes — explain the bypass in the merge commit. 
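The merge-gate rule above amounts to a path classification: pytest runs as soon as one changed path is not documentation-only. A minimal sketch of that decision follows, assuming a simplified rule set. `DOC_PREFIXES`, `is_docs_only`, and `should_run_pytest` are hypothetical names; the actual filters live in `.github/workflows/test.yml` and may differ, since the paragraph's list ends with "etc.".

```python
from pathlib import PurePosixPath

# Documentation-only prefixes named in the merge-gate paragraph. The real
# workflow may list more, so this tuple is an assumption.
DOC_PREFIXES = ("propose/", "plans/", "skills/", ".agents/", "docs/")


def is_docs_only(path: str) -> bool:
    """True when a changed path counts as documentation in this sketch."""
    return PurePosixPath(path).suffix == ".md" or path.startswith(DOC_PREFIXES)


def should_run_pytest(changed_paths: list[str]) -> bool:
    """Run pytest as soon as any changed path is not documentation-only."""
    return any(not is_docs_only(p) for p in changed_paths)
```

Under this sketch a PR touching only `docs/` and `*.md` files still gets a green `test` check without running pytest, while a single changed `.py` file or `.gitignore` triggers the full `pytest tests` run.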
**Iteration subset (convention):** During implementation, authors name a `pytest` file subset inside each per-PR execution prompt (for example in `plans/AGENT-PROMPTS-*.md`). The repo **[`plan-prompts`](../.agents/skills/plan-prompts/SKILL.md)** skill (`.agents/skills/plan-prompts/`) requires a **`## Tests to run (iteration loop)`** section in that scaffold, placed **after Deliverables and before Tests**. Reviewers follow the repo **[`pr-review`](../.agents/skills/pr-review/SKILL.md)** skill (`.agents/skills/pr-review/`): pasted subset command + exit code, plus a green `test` CI link from the merge gate documented above (full pytest for code changes; pytest skipped for docs-only PRs). Canonical skill sources live under `.agents/skills/` (symlink as `.cursor` or `.claude` locally if your editor expects those paths); you may copy them into `~/.cursor/skills/` if your Cursor setup loads personal skills only. See [`propose/completed/TEST-SUITE-FAST-LOOP-PROPOSE.md`](../propose/completed/TEST-SUITE-FAST-LOOP-PROPOSE.md) and [`plans/completed/PLAN-TEST-SUITE-FAST-LOOP.md`](../plans/completed/PLAN-TEST-SUITE-FAST-LOOP.md). 
diff --git a/tests/bank-chat-system/chat-core/chat-engine/src/main/java/com/bank/chat/engine/kafka/FollowUpKafkaPublisher.java b/tests/bank-chat-system/chat-core/chat-engine/src/main/java/com/bank/chat/engine/kafka/FollowUpKafkaPublisher.java index aaf5f507..f6498eaa 100644 --- a/tests/bank-chat-system/chat-core/chat-engine/src/main/java/com/bank/chat/engine/kafka/FollowUpKafkaPublisher.java +++ b/tests/bank-chat-system/chat-core/chat-engine/src/main/java/com/bank/chat/engine/kafka/FollowUpKafkaPublisher.java @@ -24,6 +24,7 @@ public void publishOperatorNotification(InternalEvent event) { kafkaTemplate.send(ChatTopics.OPERATOR_NOTIFICATIONS, event.getConversationId(), event); } + @CodebaseProducer(topic = "banking.chat.compliance.review", producerKind = "kafka_send") public void publishComplianceReview(InternalEvent event) { kafkaTemplate.send(ChatTopics.COMPLIANCE_REVIEW, event.getConversationId(), event); } diff --git a/tests/test_agent_skills_static.py b/tests/test_agent_skills_static.py index 13bf69bd..f8f88b42 100644 --- a/tests/test_agent_skills_static.py +++ b/tests/test_agent_skills_static.py @@ -37,8 +37,12 @@ SKILLS_DIR = Path(__file__).resolve().parent.parent / "skills" SKILL_NAME = "explore-codebase" -EXPECTED_SKILL_DIRS = {"explore-codebase"} +# PR-JRAG-5: the CLI surface ships its own skill (explore-codebase-cli) with a +# shell vocabulary, not the MCP vocabulary. The static-validation tests in this +# file (tool-ref/kind/edge allowlists) gate to the MCP skill (SKILL_NAME) only. +EXPECTED_SKILL_DIRS = {"explore-codebase", "explore-codebase-cli"} SKILL_PATH = SKILLS_DIR / SKILL_NAME / "SKILL.md" +CLI_SKILL_PATH = SKILLS_DIR / "explore-codebase-cli" / "SKILL.md" def _parse_frontmatter(text: str) -> dict[str, str]: @@ -126,6 +130,31 @@ def test_frontmatter_has_name_and_description(self): ) +class TestCliSkillFrontmatter: + """PR-JRAG-5: the explore-codebase-cli skill ships its own frontmatter. 
+ + The MCP-vocabulary static-validation tests below (tool-ref / kind / edge + allowlists) do NOT apply to this skill — it documents the `jrag` shell + vocabulary, not the 5-tool MCP. Only frontmatter + existence are checked + here. + """ + + def test_cli_skill_file_exists(self): + assert CLI_SKILL_PATH.is_file(), f"Missing {CLI_SKILL_PATH}" + + def test_cli_frontmatter_has_name_and_description(self): + text = CLI_SKILL_PATH.read_text(encoding="utf-8") + fm = _parse_frontmatter(text) + assert "name" in fm, "CLI SKILL.md missing frontmatter 'name'" + assert fm["name"] == "explore-codebase-cli", ( + f"name={fm['name']!r}, expected 'explore-codebase-cli'" + ) + assert "description" in fm, "CLI SKILL.md missing frontmatter 'description'" + assert len(fm["description"]) >= 20, ( + f"description too short ({len(fm['description'])} chars)" + ) + + class TestMCPToolReferences: """Tool names in skill body must be valid MCP navigation tools.""" diff --git a/tests/test_install_data_sync.py b/tests/test_install_data_sync.py new file mode 100644 index 00000000..cbfd6028 --- /dev/null +++ b/tests/test_install_data_sync.py @@ -0,0 +1,249 @@ +"""Tests for agent artifacts sync script. + +Validates that: +- Dev source and install_data copies stay in sync +- The sync script detects drift correctly +""" + +from __future__ import annotations + +import subprocess +import sys +import tempfile +from pathlib import Path + + + +# Paths relative to repo root +SYNC_SCRIPT = Path("scripts/sync_agent_artifacts.py") + + +def run_sync_script(*, check: bool = False, cwd: Path | None = None) -> subprocess.CompletedProcess[str]: + """Run the sync script and return the result. + + Args: + check: Pass --check flag (verify only, no writes) + cwd: Working directory (defaults to repo root if None) + + Returns: + CompletedProcess with stdout/stderr captured as text. 
+ """ + repo_root = Path(__file__).resolve().parent.parent + if cwd is None: + cwd = repo_root + + cmd = [sys.executable, str(repo_root / SYNC_SCRIPT)] + if check: + cmd.append("--check") + + return subprocess.run( + cmd, + cwd=cwd, + capture_output=True, + text=True, + encoding="utf-8", # Script emits UTF-8 (✓ marker); decode as such, not the locale ANSI codepage (cp1252 on Windows). + ) + + +def test_install_data_artifacts_in_sync_with_dev_source(): + """Baseline: --check passes at HEAD (dev source and install_data are byte-equal).""" + result = run_sync_script(check=True) + + assert result.returncode == 0, ( + f"Sync check failed - artifacts out of sync.\n" + f"stdout: {result.stdout}\n" + f"stderr: {result.stderr}" + ) + + assert "✓ All agent artifacts in sync" in result.stdout, ( + f"Expected success message not found in stdout.\n" + f"stdout: {result.stdout}" + ) + + +def _seed_dev_source(tmp_path: Path, *, cli_skill_content: str = "# test") -> None: + """Create the canonical dev source tree the SYNC_MAP expects. + + The sync script walks ``SYNC_MAP`` source dirs; PR-JRAG-5 added + ``skills/explore-codebase-cli`` to that map, so synthetic temp workspaces + used by the drift tests must seed it too. 
+ """ + tmp_agents = tmp_path / "agents" + tmp_agents.mkdir(parents=True, exist_ok=True) + (tmp_agents / "explorer-rag-enhanced.md").write_text("# test") + + tmp_skills = tmp_path / "skills" / "explore-codebase" + tmp_skills.mkdir(parents=True, exist_ok=True) + (tmp_skills / "SKILL.md").write_text("# test") + + tmp_cli_skills = tmp_path / "skills" / "explore-codebase-cli" + tmp_cli_skills.mkdir(parents=True, exist_ok=True) + (tmp_cli_skills / "SKILL.md").write_text(cli_skill_content) + + +def _seed_install_data(tmp_path: Path, *, extra: list[Path] | None = None) -> None: + """Create the matching install_data tree (no drift) for the SYNC_MAP.""" + tmp_install_agents = tmp_path / "java_codebase_rag" / "install_data" / "agents" + tmp_install_agents.mkdir(parents=True, exist_ok=True) + (tmp_install_agents / "explorer-rag-enhanced.md").write_text("# test") + + tmp_install_mcp_skill = ( + tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase" + ) + tmp_install_mcp_skill.mkdir(parents=True, exist_ok=True) + (tmp_install_mcp_skill / "SKILL.md").write_text("# test") + + tmp_install_cli_skill = ( + tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase-cli" + ) + tmp_install_cli_skill.mkdir(parents=True, exist_ok=True) + (tmp_install_cli_skill / "SKILL.md").write_text("# test") + + for path in extra or []: + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text("# this should not be here") + + +def test_sync_script_detects_drift(): + """Verify --check exits non-zero when dev source and install_data differ. + + This test: + 1. Copies a real dev source file to a temp dir + 2. Mutates a byte in the temp copy + 3. Points the sync script at the mutated tree via cwd override + 4. Asserts --check exits non-zero AND names the offending file + 5. 
Restores by temp dir auto-cleanup (no repo mutation) + """ + repo_root = Path(__file__).resolve().parent.parent + + # Copy a real file (agents/explorer-rag-enhanced.md) to temp workspace + real_dev_file = repo_root / "agents" / "explorer-rag-enhanced.md" + real_skill_file = repo_root / "skills" / "explore-codebase" / "SKILL.md" + + with tempfile.TemporaryDirectory() as tmpdir: + tmp_path = Path(tmpdir) + + # Create the agents directory structure in temp + tmp_agents = tmp_path / "agents" + tmp_agents.mkdir() + + # Copy real file to temp and mutate it + tmp_file = tmp_agents / "explorer-rag-enhanced.md" + tmp_file.write_bytes(real_dev_file.read_bytes()) + + # Mutate the content (replace the first character with "X"; if the file is empty, write "X") + original_content = tmp_file.read_text(encoding="utf-8") + if original_content: + mutated_content = "X" + original_content[1:] + else: + mutated_content = "X" + tmp_file.write_text(mutated_content, encoding="utf-8") + + # Create skills/explore-codebase directory (unchanged, for completeness) + tmp_skills = tmp_path / "skills" / "explore-codebase" + tmp_skills.mkdir(parents=True) + (tmp_skills / "SKILL.md").write_bytes(real_skill_file.read_bytes()) + + # PR-JRAG-5: SYNC_MAP also walks skills/explore-codebase-cli — seed it.
+ tmp_cli_skills = tmp_path / "skills" / "explore-codebase-cli" + tmp_cli_skills.mkdir(parents=True) + (tmp_cli_skills / "SKILL.md").write_text("# test") + + # Also create the install_data directory structure in temp + # so the script has something to compare against + tmp_install = tmp_path / "java_codebase_rag" / "install_data" / "agents" + tmp_install.mkdir(parents=True) + + # Copy the unmutated file to install_data + (tmp_install / "explorer-rag-enhanced.md").write_bytes(real_dev_file.read_bytes()) + + tmp_install_skills = tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase" + tmp_install_skills.mkdir(parents=True) + (tmp_install_skills / "SKILL.md").write_bytes(real_skill_file.read_bytes()) + + tmp_install_cli_skills = ( + tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase-cli" + ) + tmp_install_cli_skills.mkdir(parents=True) + (tmp_install_cli_skills / "SKILL.md").write_text("# test") + + # Run the sync script from temp directory (so it sees the mutated file) + result = run_sync_script(check=True, cwd=tmp_path) + + # Should exit non-zero due to drift + assert result.returncode == 1, ( + f"Expected --check to exit non-zero on drift, but got {result.returncode}.\n" + f"stdout: {result.stdout}\n" + f"stderr: {result.stderr}" + ) + + # Should mention the file that differs + output = result.stdout + result.stderr + assert "explorer-rag-enhanced.md" in output or "out of sync" in output, ( + f"Expected script to report the drifted file or 'out of sync'.\n" + f"output: {output}" + ) + + +def test_sync_script_detects_extra_files(): + """Verify --check detects extra files in install_data that shouldn't exist.""" + with tempfile.TemporaryDirectory() as tmpdir: + tmp_path = Path(tmpdir) + + # Create dev source (agents + both skills — PR-JRAG-5 added CLI skill). 
+ _seed_dev_source(tmp_path) + + # Create install_data with an extra file + _seed_install_data( + tmp_path, + extra=[tmp_path / "java_codebase_rag" / "install_data" / "agents" / "extra_file.md"], + ) + + result = run_sync_script(check=True, cwd=tmp_path) + + assert result.returncode == 1, ( + f"Expected --check to exit non-zero on extra files, but got {result.returncode}.\n" + f"stdout: {result.stdout}\n" + f"stderr: {result.stderr}" + ) + + output = result.stdout + result.stderr + assert "extra_file.md" in output or "extra file" in output.lower(), ( + f"Expected script to report the extra file.\n" + f"output: {output}" + ) + + +def test_sync_script_detects_missing_files(): + """Verify --check detects missing files in install_data.""" + with tempfile.TemporaryDirectory() as tmpdir: + tmp_path = Path(tmpdir) + + # Create dev source (agents + both skills — PR-JRAG-5 added CLI skill). + _seed_dev_source(tmp_path) + + # Create empty install_data (missing the files) + tmp_install = tmp_path / "java_codebase_rag" / "install_data" / "agents" + tmp_install.mkdir(parents=True) + + tmp_install_skills = tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase" + tmp_install_skills.mkdir(parents=True) + + tmp_install_cli_skills = ( + tmp_path / "java_codebase_rag" / "install_data" / "skills" / "explore-codebase-cli" + ) + tmp_install_cli_skills.mkdir(parents=True) + + result = run_sync_script(check=True, cwd=tmp_path) + + assert result.returncode == 1, ( + f"Expected --check to exit non-zero on missing files, but got {result.returncode}.\n" + f"stdout: {result.stdout}\n" + f"stderr: {result.stderr}" + ) + + output = result.stdout + result.stderr + assert "explorer-rag-enhanced.md" in output or "missing" in output.lower(), ( + f"Expected script to report the missing file.\n" + f"output: {output}" + ) diff --git a/tests/test_installer.py b/tests/test_installer.py index de4296a0..00d00318 100644 --- a/tests/test_installer.py +++ b/tests/test_installer.py @@ 
-910,9 +910,12 @@ def test_detect_hosts_project_mcp_json(self, tmp_path): detected = detect_configured_hosts(tmp_path) assert len(detected) == 1 - host_config, scope = detected[0] - assert host_config.name == "claude-code" - assert scope == "project" + # PR-JRAG-5: detect_configured_hosts returns ConfiguredHost (3-field). + # The legacy MCP-entry fallback path always carries surface="mcp". + configured = detected[0] + assert configured.host.name == "claude-code" + assert configured.scope == "project" + assert configured.surface == "mcp" def test_detect_hosts_user_claude_json(self, tmp_path, monkeypatch): """~/.claude.json with entry → detects claude-code user scope""" @@ -940,9 +943,11 @@ def test_detect_hosts_user_claude_json(self, tmp_path, monkeypatch): detected = detect_configured_hosts(tmp_path) assert len(detected) == 1 - host_config, scope = detected[0] - assert host_config.name == "claude-code" - assert scope == "user" + # PR-JRAG-5: 3-field NamedTuple (legacy MCP-entry scan → surface="mcp"). + configured = detected[0] + assert configured.host.name == "claude-code" + assert configured.scope == "user" + assert configured.surface == "mcp" def test_detect_hosts_multiple_hosts(self, tmp_path, monkeypatch): """both .mcp.json and ~/.qwen/settings.json → returns both""" @@ -987,16 +992,18 @@ def test_detect_hosts_multiple_hosts(self, tmp_path, monkeypatch): detected = detect_configured_hosts(tmp_path) assert len(detected) == 2 - # Sort by scope for consistent ordering - detected_sorted = sorted(detected, key=lambda x: x[1]) + # Sort by scope for consistent ordering (PR-JRAG-5: NamedTuple fields). 
+ detected_sorted = sorted(detected, key=lambda ch: ch.scope) # First should be project scope claude-code - assert detected_sorted[0][0].name == "claude-code" - assert detected_sorted[0][1] == "project" + assert detected_sorted[0].host.name == "claude-code" + assert detected_sorted[0].scope == "project" + assert detected_sorted[0].surface == "mcp" # Second should be user scope qwen-code - assert detected_sorted[1][0].name == "qwen-code" - assert detected_sorted[1][1] == "user" + assert detected_sorted[1].host.name == "qwen-code" + assert detected_sorted[1].scope == "user" + assert detected_sorted[1].surface == "mcp" def test_detect_hosts_no_config_returns_empty(self, tmp_path): """no MCP configs → empty list""" diff --git a/tests/test_installer_surface.py b/tests/test_installer_surface.py new file mode 100644 index 00000000..b2a8b8db --- /dev/null +++ b/tests/test_installer_surface.py @@ -0,0 +1,467 @@ +"""PR-JRAG-5: ``--surface mcp|cli`` install branching. + +Validates the surface model end-to-end: + - ``Surface`` Literal + ``ConfiguredHost`` NamedTuple (3-field) + - ``ARTIFACT_MANIFEST`` single source iterated by ``deploy_artifacts`` and + ``refresh_artifacts`` (with ``surface="mcp"`` keyword-only default for + back-compat with the existing direct-call tests in ``test_installer.py``) + - ``.java-codebase-rag.hosts`` marker file round-trip (so a CLI-only install + is visible to ``update`` — no MCP entry to scan) + - ``detect_configured_hosts`` returns ``list[ConfiguredHost]`` (reads marker + first, falls back to the MCP-entry scan with ``surface="mcp"`` for + pre-marker installs) + - ``run_update`` unpacks surface and routes the refresh through it + - ``resolve_mcp_command`` surface-conditional: ``cli`` resolves the ``jrag`` + console script and skips the MCP-binary ``SystemExit(2)`` + - ``select_surface`` wizard + ``--surface`` flag + - ``handle_rerun`` prefill behavior +""" + +from __future__ import annotations + +import shutil + +import pytest + +from 
java_codebase_rag.installer import ( + ARTIFACT_MANIFEST, + ConfiguredHost, + HOSTS, + Surface, # noqa: F401 — assert the Literal is exported + _marker_path, + _read_hosts_marker, + _write_hosts_marker, + deploy_artifacts, + detect_configured_hosts, + refresh_artifacts, + resolve_mcp_command, + run_update, + select_surface, +) + + +# --------------------------------------------------------------------------- +# Test 1 + 2: deploy behavior per surface (parity) +# --------------------------------------------------------------------------- + + +def test_surface_cli_deploys_cli_skill_and_agent_no_mcp_entry(tmp_path, monkeypatch): + """surface="cli" deploys explore-codebase-cli skill + explorer-rag-cli agent. + + The CLI surface ships NO MCP entry — the manifest has only two rows + (skill + agent) and the dest paths use the CLI artifact names. + """ + # The CLI surface never reaches resolve_mcp_command in deploy_artifacts + # (no "mcp" manifest row), but the install wizard still resolves jrag and + # passes it. Stub shutil.which so any incidental call is harmless. + monkeypatch.setattr(shutil, "which", lambda name: "/fake/bin/jrag") + + results = deploy_artifacts( + [HOSTS["claude-code"]], + "project", + tmp_path, + non_interactive=True, + mcp_command="/fake/bin/jrag", + surface="cli", + ) + + # Exactly two artifacts (skill + agent); NO MCP entry. + assert len(results) == 2 + assert all(r.success for r in results), ( + [str((r.path, r.success, r.error)) for r in results] + ) + + skill_dest = tmp_path / ".claude" / "skills" / "explore-codebase-cli" / "SKILL.md" + agent_dest = tmp_path / ".claude" / "agents" / "explorer-rag-cli.md" + assert skill_dest.is_file(), f"CLI skill not deployed at {skill_dest}" + assert agent_dest.is_file(), f"CLI agent not deployed at {agent_dest}" + + # The MCP-surface artifacts must NOT have been written on the CLI surface. 
+ assert not (tmp_path / ".claude" / "skills" / "explore-codebase" / "SKILL.md").is_file() + assert not (tmp_path / ".claude" / "agents" / "explorer-rag-enhanced.md").is_file() + # And no MCP config registered. + assert not (tmp_path / ".mcp.json").is_file() + + +def test_surface_mcp_reproduces_today_behavior(tmp_path, monkeypatch): + """surface="mcp" (explicit) deploys MCP entry + MCP skill + MCP agent. + + Same artifact set as today's pre-surface install: 3 results per host. + """ + monkeypatch.setattr(shutil, "which", lambda name: "/fake/bin/java-codebase-rag-mcp") + + results = deploy_artifacts( + [HOSTS["claude-code"]], + "project", + tmp_path, + non_interactive=True, + mcp_command="/fake/bin/java-codebase-rag-mcp", + surface="mcp", + ) + + # Three artifacts (MCP + skill + agent), in manifest order. + assert len(results) == 3 + assert all(r.success for r in results) + + assert (tmp_path / ".mcp.json").is_file() + assert (tmp_path / ".claude" / "skills" / "explore-codebase" / "SKILL.md").is_file() + assert (tmp_path / ".claude" / "agents" / "explorer-rag-enhanced.md").is_file() + + +# --------------------------------------------------------------------------- +# Test 3: marker file round-trips host/scope/surface +# --------------------------------------------------------------------------- + + +def test_marker_file_round_trips_host_scope_surface(tmp_path): + """_write_hosts_marker → _read_hosts_marker round-trips ConfiguredHost set.""" + configured_in = [ + ConfiguredHost(HOSTS["claude-code"], "project", "mcp"), + ConfiguredHost(HOSTS["qwen-code"], "user", "cli"), + ] + + _write_hosts_marker(tmp_path, configured_in) + + # The marker file exists at the project root with the canonical name. + assert _marker_path(tmp_path).is_file() + + configured_out = _read_hosts_marker(tmp_path) + assert configured_out is not None, "marker file not parsed" + assert len(configured_out) == 2 + + # Round-trip preserves host/scope/surface in order. 
+ assert configured_out[0].host.name == "claude-code" + assert configured_out[0].scope == "project" + assert configured_out[0].surface == "mcp" + assert configured_out[1].host.name == "qwen-code" + assert configured_out[1].scope == "user" + assert configured_out[1].surface == "cli" + + +# --------------------------------------------------------------------------- +# Test 4: detect_configured_hosts returns ConfiguredHost (3-field NamedTuple) +# --------------------------------------------------------------------------- + + +def test_detect_configured_hosts_returns_configured_host_namedtuple(tmp_path): + """Marker-driven detection returns ConfiguredHost (3-field) instances. + + A CLI-only install writes a marker with surface="cli" and no MCP entry — + detect_configured_hosts must surface it via the marker (the legacy + MCP-entry scan would return [] here, leaving the install invisible to + ``update``). + """ + configured_in = [ + ConfiguredHost(HOSTS["claude-code"], "project", "cli"), + ] + _write_hosts_marker(tmp_path, configured_in) + + detected = detect_configured_hosts(tmp_path) + assert len(detected) == 1 + ch = detected[0] + # NamedTuple shape — 3 fields. + assert isinstance(ch, ConfiguredHost) + assert ch.host is HOSTS["claude-code"] + assert ch.scope == "project" + assert ch.surface == "cli" + + # Direct field access works (not tuple position only). + assert ch.host.name == "claude-code" + + +# --------------------------------------------------------------------------- +# Test 5 + 6: run_update routes through surface; CLI install visible +# --------------------------------------------------------------------------- + + +def test_update_after_cli_only_install_refreshes_cli_skill(tmp_path, monkeypatch): + """CLI-only install (no MCP entry) is visible to ``update`` via the marker. 
+ + Regression: before PR-JRAG-5, ``detect_configured_hosts`` only scanned MCP + entries; a CLI-only install left no MCP entry, so ``run_update`` exited + with the fatal "No configured agent hosts found." (exit 2). With the marker + file + surface routing, update refreshes the CLI skill+agent instead. + """ + # Stage a CLI-only install state. + _write_hosts_marker( + tmp_path, + [ConfiguredHost(HOSTS["claude-code"], "project", "cli")], + ) + # Pre-create the CLI skill/agent so refresh has something to compare. + skill_dir = tmp_path / ".claude" / "skills" / "explore-codebase-cli" + skill_dir.mkdir(parents=True) + (skill_dir / "SKILL.md").write_text("STALE CLI SKILL", encoding="utf-8") + agents_dir = tmp_path / ".claude" / "agents" + agents_dir.mkdir(parents=True) + (agents_dir / "explorer-rag-cli.md").write_text("STALE CLI AGENT", encoding="utf-8") + + # Stub the package-artifact read so refresh has deterministic new content. + monkeypatch.setattr( + "java_codebase_rag.installer._read_package_artifact", + lambda rel: "FRESH CLI ARTIFACT", + ) + # Stub the index-side config discovery so update returns before indexing. + monkeypatch.setattr( + "java_codebase_rag.config.discover_project_root", + lambda cwd: None, + ) + + rc = run_update(force=False, dry_run=False, cwd=tmp_path) + # No fatal exit 2 ("No configured agent hosts found."). + assert rc != 2, "CLI-only install must NOT be invisible to update (exit 2)" + # Refresh wrote the new CLI artifacts. + assert (skill_dir / "SKILL.md").read_text() == "FRESH CLI ARTIFACT" + assert (agents_dir / "explorer-rag-cli.md").read_text() == "FRESH CLI ARTIFACT" + + +def test_run_update_unpacks_surface_and_passes_to_refresh(tmp_path, monkeypatch): + """run_update unpacks (host, scope, surface) and passes surface= to refresh. + + Captures the surface kwarg each refresh_artifacts call receives; the marker + is the source of truth (so a marker carrying surface=cli routes through + the CLI manifest). 
+ """ + _write_hosts_marker( + tmp_path, + [ConfiguredHost(HOSTS["claude-code"], "project", "cli")], + ) + + seen_surfaces: list[str] = [] + real_refresh = refresh_artifacts + + def spy_refresh(host, scope, cwd, *, force, dry_run, surface="mcp"): + seen_surfaces.append(surface) + return real_refresh( + host, scope, cwd, force=force, dry_run=dry_run, surface=surface + ) + + monkeypatch.setattr("java_codebase_rag.installer.refresh_artifacts", spy_refresh) + monkeypatch.setattr( + "java_codebase_rag.installer._read_package_artifact", + lambda rel: "CONTENT", + ) + monkeypatch.setattr( + "java_codebase_rag.config.discover_project_root", + lambda cwd: None, + ) + + rc = run_update(force=False, dry_run=True, cwd=tmp_path) + assert rc in (0, 1) + assert seen_surfaces == ["cli"], ( + f"run_update must pass surface='cli' to refresh; got {seen_surfaces}" + ) + + +# --------------------------------------------------------------------------- +# Test 7: resolve_mcp_command surface-conditional +# --------------------------------------------------------------------------- + + +def test_resolve_mcp_command_resolves_jrag_on_cli_surface(monkeypatch): + """On surface='cli', resolve_mcp_command targets jrag (not the MCP binary). + + The CLI surface never raises SystemExit(2) for a missing MCP binary — the + MCP binary is irrelevant when no MCP entry is registered. + """ + seen_which_targets: list[str] = [] + + def fake_which(name): + seen_which_targets.append(name) + if name == "jrag": + return "/fake/bin/jrag" + return None # java-codebase-rag-mcp would NOT be found + + monkeypatch.setattr(shutil, "which", fake_which) + + resolved = resolve_mcp_command(non_interactive=True, surface="cli") + assert resolved == "/fake/bin/jrag" + assert "jrag" in seen_which_targets, "CLI surface must target jrag via which()" + # The MCP binary is never queried on the CLI surface. 
+ assert "java-codebase-rag-mcp" not in seen_which_targets, ( + "CLI surface must not query for the MCP binary" + ) + + +def test_resolve_mcp_command_cli_surface_missing_jrag_exits_cleanly(monkeypatch, capsys): + """Missing jrag on CLI surface + non-interactive → SystemExit(2) (clean). + + Surfaces the same exit code as the MCP path, but the message targets + ``jrag`` and the user-facing hint mentions the console script. + """ + monkeypatch.setattr(shutil, "which", lambda name: None) + with pytest.raises(SystemExit) as exc: + resolve_mcp_command(non_interactive=True, surface="cli") + assert exc.value.code == 2 + out = capsys.readouterr().out + assert "jrag" in out + assert "java-codebase-rag-mcp" not in out + + +def test_resolve_mcp_command_mcp_surface_keeps_today_behavior(monkeypatch): + """On surface='mcp', resolve_mcp_command reproduces today's behavior + (targets java-codebase-rag-mcp).""" + monkeypatch.setattr( + shutil, "which", lambda name: "/usr/local/bin/java-codebase-rag-mcp" + ) + resolved = resolve_mcp_command(non_interactive=True, surface="mcp") + assert resolved == "/usr/local/bin/java-codebase-rag-mcp" + + +# --------------------------------------------------------------------------- +# Test 8: deploy/refresh surface defaults to mcp for back-compat +# --------------------------------------------------------------------------- + + +def test_deploy_refresh_surface_defaults_to_mcp_back_compat(tmp_path, monkeypatch): + """Existing direct-call sites in test_installer.py pass NO surface kwarg. + + Both deploy_artifacts and refresh_artifacts default to surface="mcp" + (keyword-only) so those callers keep working unchanged. Asserts the + default produces the same MCP-surface artifact set as today. + """ + monkeypatch.setattr( + shutil, "which", lambda name: "/fake/bin/java-codebase-rag-mcp" + ) + + # deploy_artifacts with NO surface kwarg. 
+ deploy_results = deploy_artifacts( + [HOSTS["claude-code"]], + "project", + tmp_path, + non_interactive=True, + mcp_command="/fake/bin/java-codebase-rag-mcp", + ) + # MCP surface = 3 results (mcp + skill + agent). + assert len(deploy_results) == 3 + assert (tmp_path / ".mcp.json").is_file() + assert ( + tmp_path / ".claude" / "skills" / "explore-codebase" / "SKILL.md" + ).is_file() + assert ( + tmp_path / ".claude" / "agents" / "explorer-rag-enhanced.md" + ).is_file() + + # refresh_artifacts with NO surface kwarg. + monkeypatch.setattr( + "java_codebase_rag.installer._read_package_artifact", + lambda rel: "REFRESHED", + ) + refresh_results = refresh_artifacts( + HOSTS["claude-code"], + "project", + tmp_path, + force=True, + dry_run=False, + ) + # MCP surface = 3 results (mcp + skill + agent). + assert len(refresh_results) == 3 + + +# --------------------------------------------------------------------------- +# Test 9: handle_rerun pre-fills surface from marker +# --------------------------------------------------------------------------- + + +def test_handle_rerun_prefills_surface_from_marker(tmp_path, monkeypatch): + """select_surface(prefill=...) returns the prior surface on default input. + + The wizard's re-run path reads the marker, extracts the prior surface, and + passes it as ``prefill``. With non-interactive input (no --surface), the + prefill is preserved. + """ + _write_hosts_marker( + tmp_path, + [ConfiguredHost(HOSTS["qwen-code"], "user", "cli")], + ) + + # Read the prior surface exactly as run_install does. + from java_codebase_rag.installer import _prior_surface_from_marker + + prior = _prior_surface_from_marker(tmp_path) + assert prior == "cli" + + # select_surface with prefill + no CLI flag + non-interactive returns the + # default behavior — but interactive with default (TTY off) preserves the + # prefill as the default and returns it. 
+ selected = select_surface( + non_interactive=False, + cli_surface=None, + prefill=prior, + ) + # Non-TTY prompt returns the default; select_surface uses prefill as default. + assert selected == "cli" + + +# --------------------------------------------------------------------------- +# Test 10: ARTIFACT_MANIFEST single source for deploy and refresh +# --------------------------------------------------------------------------- + + +def test_artifact_manifest_single_source_for_deploy_and_refresh(): + """ARTIFACT_MANIFEST is iterated by BOTH deploy_artifacts and refresh_artifacts. + + The invariant: adding/removing an artifact is ONE manifest edit, not two. + Asserts the manifest carries the documented entries and that the deploy/ + refresh loops are wired to the same constant (no parallel hardcoded lists). + """ + # Documented shape. + assert set(ARTIFACT_MANIFEST.keys()) == {"mcp", "cli"} + + mcp_entries = ARTIFACT_MANIFEST["mcp"] + cli_entries = ARTIFACT_MANIFEST["cli"] + + # MCP surface = mcp entry + explore-codebase skill + explorer-rag-enhanced. + assert len(mcp_entries) == 3 + mcp_kinds = [kind for kind, _, _ in mcp_entries] + assert mcp_kinds == ["mcp", "skill", "agent"] + # Skill + agent paths point at the MCP-surface artifact names. + skill_pkg = next(pkg for kind, pkg, _ in mcp_entries if kind == "skill") + agent_pkg = next(pkg for kind, pkg, _ in mcp_entries if kind == "agent") + assert "explore-codebase/" in skill_pkg + assert "enhanced" in agent_pkg + # No CLI-surface artifact leaks into the MCP manifest. + assert not any("explore-codebase-cli" in pkg for _, pkg, _ in mcp_entries) + assert not any("explorer-rag-cli" in pkg for _, pkg, _ in mcp_entries) + + # CLI surface = explore-codebase-cli skill + explorer-rag-cli agent (NO mcp). 
+ assert len(cli_entries) == 2 + cli_kinds = [kind for kind, _, _ in cli_entries] + assert cli_kinds == ["skill", "agent"] + assert "mcp" not in cli_kinds, "CLI surface must NOT register an MCP entry" + # Skill + agent paths point at the CLI-surface artifact names. + cli_skill_pkg = next(pkg for kind, pkg, _ in cli_entries if kind == "skill") + cli_agent_pkg = next(pkg for kind, pkg, _ in cli_entries if kind == "agent") + assert "explore-codebase-cli/" in cli_skill_pkg + assert "cli" in cli_agent_pkg + + +# --------------------------------------------------------------------------- +# Bonus: --surface CLI flag registration (lightweight, parser-only) +# --------------------------------------------------------------------------- + + +def test_install_subparser_registers_surface_flag(): + """``--surface`` is registered on the install subparser. + + Default is ``None`` so the interactive ``select_surface`` wizard prompts + when the flag is omitted (the proposal's CLI-vs-MCP choice); non-interactive + installs fall back to ``'mcp'`` inside ``select_surface`` for back-compat. + """ + import argparse + + from java_codebase_rag.cli import build_parser # operator CLI + + parser = build_parser() + # Reach into argparse internals to find the install subparser's surface opt. 
+ install_action = next( + a + for a in parser._actions + if isinstance(a, argparse._SubParsersAction) + ) + install_parser = install_action.choices["install"] + surface_action = next( + a for a in install_parser._actions if "--surface" in (a.option_strings or []) + ) + assert surface_action.choices == ["mcp", "cli"] + assert surface_action.default is None + assert surface_action.dest == "surface" diff --git a/tests/test_java_codebase_rag_cli.py b/tests/test_java_codebase_rag_cli.py index 6064d281..85e641e5 100644 --- a/tests/test_java_codebase_rag_cli.py +++ b/tests/test_java_codebase_rag_cli.py @@ -1513,10 +1513,11 @@ def test_cmd_install_forwards_verbose_flag( captured: dict = {} def _fake_run_install(*, non_interactive, agents, scope, model, - source_root=None, quiet=False, verbose=False): + source_root=None, quiet=False, verbose=False, surface=None): captured["quiet"] = quiet captured["verbose"] = verbose captured["non_interactive"] = non_interactive + captured["surface"] = surface return 0 monkeypatch.setattr(_installer, "run_install", _fake_run_install) @@ -1527,6 +1528,10 @@ def _fake_run_install(*, non_interactive, agents, scope, model, ) assert rc == 0 assert captured["verbose"] is True + # Omitting --surface forwards None so the interactive select_surface wizard + # prompts (non-interactive falls back to "mcp" inside select_surface). The + # operator never picking a surface implicitly is the bug-#1 contract. + assert captured["surface"] is None # quiet still flows through too. rc2 = cli_mod.main( ["install", "--non-interactive", "--agent", "claude-code", "--quiet"] diff --git a/tests/test_jrag_envelope.py b/tests/test_jrag_envelope.py new file mode 100644 index 00000000..e6a6f570 --- /dev/null +++ b/tests/test_jrag_envelope.py @@ -0,0 +1,619 @@ +"""Tests for java_codebase_rag.jrag_envelope (PR-JRAG-1a). + +Pure unit tests for the envelope dataclass and the resolve-first mapper / +enum normalization / boundary helpers. 
The resolve_v2 path is mocked so these +tests do not require a real LadybugDB graph. +""" +from __future__ import annotations + +from unittest.mock import MagicMock + +import pytest + +from graph_types import NodeRef +from java_codebase_rag.jrag_envelope import ( + Envelope, + mark_truncated, + normalize_enum, + project_edge, + project_envelope, + project_node, + resolve_query, + to_envelope_rows, +) +from resolve_service import ResolveCandidate, ResolveOutput + + +# ----- Test 1: to_dict omits empty optionals ----- + + +def test_envelope_to_dict_omits_empty_optionals() -> None: + env = Envelope(status="ok") + out = env.to_dict() + # Only status remains; all optional fields omitted. + assert out == {"status": "ok"} + # The omitted fields: + for key in ( + "nodes", + "edges", + "root", + "candidates", + "agent_next_actions", + "warnings", + "truncated", + "file_location", + "message", + ): + assert key not in out + + +def test_envelope_to_dict_includes_present_optionals() -> None: + env = Envelope( + status="ok", + root="sym:1", + nodes={"sym:1": {"fqn": "com.foo.Bar"}}, + warnings=["partial"], + truncated=True, + file_location="Bar.java:10", + ) + out = env.to_dict() + assert out["root"] == "sym:1" + assert out["nodes"] == {"sym:1": {"fqn": "com.foo.Bar"}} + assert out["warnings"] == ["partial"] + assert out["truncated"] is True + assert out["file_location"] == "Bar.java:10" + + +def test_envelope_to_json_roundtrips_status_and_message() -> None: + import json + + env = Envelope(status="not_found", message="no match") + out = json.loads(env.to_json()) + assert out == {"status": "not_found", "message": "no match"} + + +# ----- Test 2: pydantic -> dict boundary via .model_dump() ----- + + +def test_pydantic_results_converted_via_model_dump() -> None: + # NodeRef is a pydantic v2 BaseModel; passing one through to_envelope_rows + # yields a plain dict (NOT a pydantic model instance). 
+ ref = NodeRef(id="sym:1", kind="symbol", fqn="com.foo.Bar", name="Bar") + rows = to_envelope_rows([ref]) + assert len(rows) == 1 + assert isinstance(rows[0], dict) + assert not hasattr(rows[0], "model_dump") + assert rows[0]["id"] == "sym:1" + assert rows[0]["fqn"] == "com.foo.Bar" + + +def test_to_envelope_rows_passes_dicts_through() -> None: + rows = to_envelope_rows([{"id": "x"}, {"id": "y"}]) + assert rows == [{"id": "x"}, {"id": "y"}] + + +# ----- Tests 3-6: resolve_query ----- + + +def _make_node( + *, + id: str = "sym:1", + kind: str = "symbol", + fqn: str = "com.foo.Bar.doStuff", + symbol_kind: str | None = "method", + role: str | None = "CONTROLLER", + microservice: str | None = "foo-service", + module: str | None = None, +) -> NodeRef: + return NodeRef( + id=id, + kind=kind, # type: ignore[arg-type] + fqn=fqn, + symbol_kind=symbol_kind, + role=role, + microservice=microservice, + module=module, + ) + + +def _graph_returning_file_location(filename: str, start_line: int) -> MagicMock: + """A mock graph whose `_rows` returns a filename/start_line row for any query.""" + g = MagicMock() + g._rows.return_value = [{"filename": filename, "start_line": start_line}] + return g + + +def test_resolve_query_one_proceeds_and_sets_file_location(monkeypatch: pytest.MonkeyPatch) -> None: + node = _make_node() + fake_output = ResolveOutput(success=True, status="one", node=node, resolved_identifier="doStuff") + + def fake_resolve_v2(identifier, hint_kind=None, graph=None): + assert identifier == "doStuff" + return fake_output + + monkeypatch.setattr("resolve_service.resolve_v2", fake_resolve_v2) + graph = _graph_returning_file_location("src/Foo.java", 42) + cfg = MagicMock() + cfg.ladybug_path = "/tmp/x/code_graph.lbug" + + result_node, env = resolve_query( + "doStuff", + hint_kind="symbol", + java_kind=None, + role=None, + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + + assert result_node is not None + assert result_node.id == "sym:1" + assert env.status == "ok" + 
assert env.root == "sym:1" + assert env.file_location == "src/Foo.java:42" + + +def test_resolve_query_one_blocked_by_post_filter_returns_not_found( + monkeypatch: pytest.MonkeyPatch, +) -> None: + node = _make_node(role="SERVICE") + fake_output = ResolveOutput(success=True, status="one", node=node) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + + graph = _graph_returning_file_location("src/Foo.java", 1) + cfg = MagicMock() + result_node, env = resolve_query( + "doStuff", + hint_kind="symbol", + java_kind=None, + role="CONTROLLER", # mismatch -> post-filter fails + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + assert result_node is None + assert env.status == "not_found" + assert env.message is not None + # The not_found message must surface the post-filter failure. + assert "filters" in env.message.lower() or "post-filter" in env.message.lower() + + +def test_resolve_query_many_returns_candidates_with_reason(monkeypatch: pytest.MonkeyPatch) -> None: + n1 = _make_node(id="sym:1", fqn="com.foo.Bar.doStuff", microservice="foo") + n2 = _make_node(id="sym:2", fqn="com.foo.Baz.doStuff", microservice="bar") + fake_output = ResolveOutput( + success=True, + status="many", + candidates=[ + ResolveCandidate(node=n1, score=0.9, reason="fqn_suffix"), + ResolveCandidate(node=n2, score=0.5, reason="short_name"), + ], + ) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + graph = MagicMock() + cfg = MagicMock() + + result_node, env = resolve_query( + "doStuff", + hint_kind="symbol", + java_kind=None, + role=None, + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + + assert result_node is None + assert env.status == "ambiguous" + assert len(env.candidates) == 2 + # Each candidate carries a reason; no file or score field. 
+ for cand in env.candidates: + assert "reason" in cand + assert "file" not in cand + assert "score" not in cand + reasons = {c["reason"] for c in env.candidates} + assert reasons == {"fqn_suffix", "short_name"} + + +def test_resolve_query_many_post_filter_collapses_to_one(monkeypatch: pytest.MonkeyPatch) -> None: + # Two candidates, one matching the post-filter, the other not. After + # post-filter collapse, exactly one survives -> proceed (status=ok). + n_match = _make_node(id="sym:1", fqn="com.foo.Bar.doStuff", microservice="foo", role="CONTROLLER") + n_other = _make_node(id="sym:2", fqn="com.foo.Baz.doStuff", microservice="bar", role="SERVICE") + fake_output = ResolveOutput( + success=True, + status="many", + candidates=[ + ResolveCandidate(node=n_match, score=0.9, reason="fqn_suffix"), + ResolveCandidate(node=n_other, score=0.5, reason="short_name"), + ], + ) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + graph = _graph_returning_file_location("Foo.java", 7) + cfg = MagicMock() + + result_node, env = resolve_query( + "doStuff", + hint_kind="symbol", + java_kind=None, + role="controller", # mixed-case; normalize_enum -> CONTROLLER + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + + assert result_node is not None + assert result_node.id == "sym:1" + assert env.status == "ok" + assert env.root == "sym:1" + assert env.file_location == "Foo.java:7" + + +def test_resolve_query_many_caps_candidates_at_ten(monkeypatch: pytest.MonkeyPatch) -> None: + # 12 candidates, no post-filter. All 12 survive -> ambiguous, capped at 10. 
+ cands = [ + ResolveCandidate( + node=_make_node(id=f"sym:{i}", fqn=f"com.foo.C{i}.doStuff", microservice="foo"), + score=1.0 - i * 0.05, + reason="short_name", + ) + for i in range(12) + ] + fake_output = ResolveOutput(success=True, status="many", candidates=cands) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + graph = MagicMock() + cfg = MagicMock() + + result_node, env = resolve_query( + "doStuff", hint_kind="symbol", java_kind=None, role=None, fqn_prefix=None, cfg=cfg, graph=graph + ) + assert result_node is None + assert env.status == "ambiguous" + assert len(env.candidates) == 10 # capped + + +def test_resolve_query_many_post_filter_rejects_all_is_not_found( + monkeypatch: pytest.MonkeyPatch, +) -> None: + """Regression (review finding A): when a post-filter rejects EVERY `many` + candidate, the result is not_found — NOT an empty ambiguous list. + + An empty ambiguous list would render as '0 ambiguous matches' with no + narrowing value; not_found with the filter-failure message is the honest, + actionable result (same message as the `one` post-filter-fail branch). 
+ """ + n1 = _make_node(id="sym:1", fqn="com.foo.Bar.doStuff", microservice="foo", role="SERVICE") + n2 = _make_node(id="sym:2", fqn="com.foo.Baz.doStuff", microservice="bar", role="SERVICE") + fake_output = ResolveOutput( + success=True, + status="many", + candidates=[ + ResolveCandidate(node=n1, score=0.9, reason="fqn_suffix"), + ResolveCandidate(node=n2, score=0.5, reason="short_name"), + ], + ) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + graph = MagicMock() + cfg = MagicMock() + + result_node, env = resolve_query( + "doStuff", + hint_kind="symbol", + java_kind=None, + role="CONTROLLER", # neither candidate is CONTROLLER -> all rejected + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + assert result_node is None + assert env.status == "not_found", ( + f"empty-many must be not_found (not {env.status!r}); candidates={env.candidates}" + ) + assert env.candidates == [] + assert env.message is not None + assert "filters" in env.message.lower() or "post-filter" in env.message.lower() + + +def test_resolve_query_none_is_not_found_with_search_hint(monkeypatch: pytest.MonkeyPatch) -> None: + fake_output = ResolveOutput( + success=True, + status="none", + message="No matches for identifier; use search(query=...) for ranked fuzzy lookup.", + ) + monkeypatch.setattr("resolve_service.resolve_v2", lambda *a, **kw: fake_output) + graph = MagicMock() + cfg = MagicMock() + + result_node, env = resolve_query( + "missing", + hint_kind="symbol", + java_kind=None, + role=None, + fqn_prefix=None, + cfg=cfg, + graph=graph, + ) + assert result_node is None + assert env.status == "not_found" + assert env.message is not None + # The CLI-specific hint must reference `jrag search` (not the MCP `search`). + assert "jrag search" in env.message + + +# ----- Tests 7-9: normalize_enum ----- + + +def test_normalize_enum_role_uppercase() -> None: + """role/capability: case + kebab -> UPPER_SNAKE (stored uppercase). 
+ + framework / java_kind are stored LOWERCASE (NodeFilter Literal values + + graph node fields), so they normalize to lowercase regardless of input case + — uppercasing them crashed `search --framework` (pydantic ValidationError) + and made `routes --framework` return 0 results. + """ + for input_val in ("controller", "Controller", "CONTROLLER"): + assert normalize_enum(input_val, kind="role") == "CONTROLLER" + # role kebab-case becomes UPPER_SNAKE. + assert normalize_enum("rest-controller", kind="role") == "REST_CONTROLLER" + + # framework -> lowercase snake (matches NodeFilter.Framework Literal). + assert normalize_enum("spring-mvc", kind="framework") == "spring_mvc" + assert normalize_enum("SPRING_MVC", kind="framework") == "spring_mvc" + assert normalize_enum("web-flux", kind="framework") == "web_flux" + assert normalize_enum("kafka", kind="framework") == "kafka" + # java_kind -> lowercase (matches DeclarationSymbolKind Literal). + assert normalize_enum("class", kind="java_kind") == "class" + assert normalize_enum("METHOD", kind="java_kind") == "method" + assert normalize_enum("interface", kind="java_kind") == "interface" + + +def test_normalize_enum_client_kind_lookup() -> None: + """client_kind: explicit lookup table -> feign_method / rest_template / web_client.""" + assert normalize_enum("feign", kind="client_kind") == "feign_method" + assert normalize_enum("rest-template", kind="client_kind") == "rest_template" + assert normalize_enum("rest_template", kind="client_kind") == "rest_template" + assert normalize_enum("RestTemplate", kind="client_kind") == "rest_template" + assert normalize_enum("web-client", kind="client_kind") == "web_client" + assert normalize_enum("webclient", kind="client_kind") == "web_client" + + +def test_normalize_enum_producer_kind_lookup() -> None: + """producer_kind: explicit lookup table -> kafka_send / stream_bridge_send.""" + assert normalize_enum("kafka", kind="producer_kind") == "kafka_send" + assert 
normalize_enum("stream-bridge", kind="producer_kind") == "stream_bridge_send" + assert normalize_enum("stream_bridge", kind="producer_kind") == "stream_bridge_send" + + +def test_normalize_enum_source_layer_lookup() -> None: + """source_layer: explicit lookup table -> builtin / layer_a_meta / layer_b_* / layer_c_source.""" + assert normalize_enum("builtin", kind="source_layer") == "builtin" + assert normalize_enum("layer-a", kind="source_layer") == "layer_a_meta" + assert normalize_enum("layer-b-ann", kind="source_layer") == "layer_b_ann" + assert normalize_enum("layer-b-fqn", kind="source_layer") == "layer_b_fqn" + assert normalize_enum("layer-c", kind="source_layer") == "layer_c_source" + + +def test_normalize_enum_empty_passthrough() -> None: + assert normalize_enum("", kind="role") == "" + assert normalize_enum(" ", kind="client_kind") == "" + + +# ----- Test 10: mark_truncated ----- + + +def test_mark_truncated_flags_and_clips() -> None: + rows = list(range(8)) + visible, truncated = mark_truncated(rows, limit=5) + assert truncated is True + assert visible == [0, 1, 2, 3, 4] + + +def test_mark_truncated_no_truncation_when_under_limit() -> None: + rows = list(range(3)) + visible, truncated = mark_truncated(rows, limit=5) + assert truncated is False + assert visible == [0, 1, 2] + + +def test_mark_truncated_boundary_equal_is_not_truncated() -> None: + # Exactly limit rows -> not truncated (the +1 row is what signals truncation). 
+ rows = list(range(5)) + visible, truncated = mark_truncated(rows, limit=5) + assert truncated is False + assert visible == [0, 1, 2, 3, 4] + + +def test_mark_truncated_zero_limit() -> None: + visible, truncated = mark_truncated([1, 2, 3], limit=0) + assert truncated is True + assert visible == [] + + +def test_mark_truncated_negative_limit_raises() -> None: + with pytest.raises(ValueError): + mark_truncated([1, 2], limit=-1) + + +# ----- Tests 11-18: detail projection (PR-JRAG-6) ----- +# +# `--detail brief|normal|full` is orthogonal to `--format text|json`. The +# projector is the single seam: the renderer applies it once, then both the +# JSON path and the text renderers consume the trimmed dict. These tests pin +# the field sets + the empty-field dropping + the file composition directly. + + +def _full_symbol_node() -> dict: + """A node carrying the full SymbolHit-derived field set.""" + return { + "id": "sym:1", + "kind": "symbol", + "fqn": "com.foo.Svc.find", + "name": "find", + "symbol_kind": "method", + "microservice": "chat", + "module": "core", + "role": "SERVICE", + "framework": "spring", + "filename": "src/Svc.java", + "start_line": 42, + "end_line": 60, + "signature": "find(Long)", + "annotations": ["@Override"], + "capabilities": ["TX"], + "modifiers": ["public"], + "package": "com.foo", + "parent_id": "sym:0", + "resolved": True, + "score": 0.91, + } + + +def test_project_node_brief_keeps_identity_drops_extras() -> None: + """brief == today's terse identity set; location/ranking/content dropped. + + Graph-id fields (``id`` / ``parent_id``) are stripped at every level — the + CLI is resolve-first, so no raw graph id reaches the agent. + """ + out = project_node(_full_symbol_node(), "brief") + # Identity keys survive (``id`` is NOT among them — stripped at the boundary). 
+ for key in ("kind", "fqn", "name", "microservice", "resolved"): + assert key in out, f"brief dropped identity key {key!r}" + # file/score (ranking/location), content fields, AND graph-id fields are dropped. + for key in ("module", "role", "symbol_kind", "file", "score", "signature", + "annotations", "capabilities", "package", + "id", "parent_id"): + assert key not in out, f"brief leaked {key!r}" + # Raw location columns are folded away (no filename/start_line at any level). + assert "filename" not in out and "start_line" not in out + + +def test_project_node_normal_adds_location_and_ranking() -> None: + """normal adds module/role/symbol_kind/framework/file/score over brief. + + This is the fix for the 'text too terse' complaint: file + score become + visible. Content fields (signature/annotations/...) still dropped. + """ + out = project_node(_full_symbol_node(), "normal") + for key in ("kind", "fqn", "name", "microservice", + "module", "role", "symbol_kind", "framework", "score", "resolved"): + assert key in out, f"normal dropped {key!r}" + # file is composed from filename+start_line. + assert out["file"] == "src/Svc.java:42" + # Content + graph-id fields suppressed at normal. + for key in ("signature", "annotations", "capabilities", "modifiers", "package", + "id", "parent_id"): + assert key not in out, f"normal leaked content/id {key!r}" + + +def test_project_node_full_keeps_everything() -> None: + """full keeps every present key (still composes file + drops empties). + + The ONLY keys dropped at full are raw graph-id fields (``id`` / ``parent_id``) + and the raw location columns folded into ``file`` — everything else the + SymbolHit carries is kept. 
+ """ + out = project_node(_full_symbol_node(), "full") + for key in ("signature", "annotations", "capabilities", "modifiers", + "package", "score", "file", "role", "module"): + assert key in out, f"full dropped {key!r}" + assert out["file"] == "src/Svc.java:42" + # Raw location columns folded into `file`; graph-id fields stripped at full. + for key in ("filename", "start_line", "end_line", "id", "parent_id"): + assert key not in out, f"full leaked {key!r}" + + +def test_project_node_drops_empty_fields_at_all_levels() -> None: + """None / '' / [] / {} vanish at every level (the '10 empty fields' fix). + + A SearchHit dump used to serialize ``symbol_id: null, role: null, module: null``. + The projector drops them. ``False`` and ``0.0`` are NOT empty (meaningful). + """ + node = { + "id": "chunk:1", + "kind": "search_hit", + "fqn": "com.foo.Bar", + "name": "Bar", + "microservice": "chat", + "score": 0.0, # NOT empty + "snippet": "body", # only at full + "module": None, # empty + "role": "", # empty + "symbol_id": None, # empty + "capabilities": [], # empty + "resolved": False, # NOT empty (meaningful) + } + for detail in ("brief", "normal", "full"): + out = project_node(node, detail) + # Empty values dropped at every level. + assert "module" not in out and "role" not in out, f"{detail}: empty kept" + assert "symbol_id" not in out and "capabilities" not in out, f"{detail}: empty kept" + # 0.0 / False are NOT empty (meaningful) — survive when in the level's set. + # `resolved` is identity (in brief); `score` is normal/full only. 
+ assert out.get("resolved") is False, f"{detail}: False resolved wrongly dropped" + if detail in ("normal", "full"): + assert out.get("score") == 0.0, f"{detail}: 0.0 score wrongly dropped" + else: + assert "score" not in out, f"{detail}: score is not a brief field" + + +def test_compose_file_from_filename_and_start_line() -> None: + """file = 'filename:start_line'; bare filename when no line; absent when no filename.""" + assert project_node({"id": "1", "kind": "symbol", "fqn": "x", "name": "x", + "filename": "A.java", "start_line": 7}, "normal")["file"] == "A.java:7" + assert project_node({"id": "1", "kind": "symbol", "fqn": "x", "name": "x", + "filename": "A.java"}, "normal")["file"] == "A.java" + out = project_node({"id": "1", "kind": "symbol", "fqn": "x", "name": "x"}, "normal") + assert "file" not in out + + +def test_project_envelope_passes_through_envelope_level_fields() -> None: + """status/root/warnings/truncated/file_location/message/agent_next_actions + are envelope-level — projected through unchanged (no detail axis on them).""" + env = Envelope( + status="ok", + nodes={"sym:1": _full_symbol_node()}, + root="sym:1", + warnings=["w1"], + truncated=True, + file_location="src/Svc.java:42", + message=None, + ) + env.agent_next_actions = ["jrag inspect Svc"] + p = project_envelope(env, "brief") + assert p.status == "ok" + assert p.root == "sym:1" + assert p.warnings == ["w1"] + assert p.truncated is True + assert p.file_location == "src/Svc.java:42" + assert p.agent_next_actions == ["jrag inspect Svc"] + # Nodes ARE projected (brief drops the content). 
+ assert "signature" not in p.nodes["sym:1"] + + +def test_project_edge_brief_normal_full_attr_sets() -> None: + edge = { + "other_id": "sym:2", + "edge_type": "INJECTS", + "confidence": 0.5, + "mechanism": "field", + "annotation": "@Inject", + "field_or_param": "repo", + "from_fqn": "com.foo.Svc", + "role": "REPOSITORY", + } + brief = project_edge(edge, "brief") + assert "other_id" in brief and "edge_type" in brief + assert "mechanism" not in brief and "annotation" not in brief + normal = project_edge(edge, "normal") + assert normal.get("mechanism") == "field" + assert "annotation" not in normal and "field_or_param" not in normal + full = project_edge(edge, "full") + for key in ("mechanism", "annotation", "field_or_param", "from_fqn", "role"): + assert key in full, f"full edge dropped {key!r}" + + +def test_project_envelope_bad_detail_raises() -> None: + """A typo must raise, not silently behave like full.""" + env = Envelope(status="ok", nodes={"sym:1": {"id": "1", "kind": "symbol", "fqn": "x"}}) + with pytest.raises(ValueError): + project_envelope(env, "bogus") diff --git a/tests/test_jrag_listing.py b/tests/test_jrag_listing.py new file mode 100644 index 00000000..614b8a8e --- /dev/null +++ b/tests/test_jrag_listing.py @@ -0,0 +1,458 @@ +"""Tests for `jrag` listing commands (PR-JRAG-2). + +Tests: +1. test_routes_returns_route_kind - routes command returns route nodes +2. test_clients_filters_by_calls_service - clients --calls-service filters +3. test_producers_filter_by_topic_prefix - producers --topic-prefix filters +4. test_topics_groups_producers_by_topic - topics groups producers by topic name +5. test_topics_consumer_in_resolves_consumers_via_exposes - topics --consumer-in resolves consumers via EXPOSES +6. test_jobs_lists_scheduled_task - jobs lists SCHEDULED_TASK symbols +7. test_listeners_lists_message_listener - listeners lists MESSAGE_LISTENER symbols +8. test_entities_lists_entity_role - entities lists ENTITY role symbols +9.
test_listing_service_scope_pushes_down - --service pushes down to backend +10. test_listing_truncated_fires_at_limit - +1-fetch truncation detection +11. test_listing_client_kind_enum_lookup - --client-kind feign → feign_method +12. test_listing_rejects_offset - --offset not registered on listings + +Note: --offset is NOT supported on any listing command (test 12 confirms). +""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + input=stdin, + check=False, + ) + + +# ----- Test 1: routes returns route kind ----- + + +def test_routes_returns_route_kind(corpus_root: Path, ladybug_db_path: Path) -> None: + """routes command returns route nodes with correct kind.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["routes", "--format", "json"], env=env) + assert proc.returncode == 0, f"routes failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # At least some routes should exist + assert len(nodes) >= 1, "expected at least one route node" + # Each route carries its kind plus at least one identifying field. 
Resolved + # HTTP routes carry `path`, Kafka topic routes (kind=kafka_topic) carry + # `topic`, and unresolved/phantom HTTP endpoints may carry only `method` + + # `file`. The prior `or "id"` fallback was an always-true tautology (every + # node has an id) and masked a missing defining field. + for node_id, node in nodes.items(): + assert "kind" in node, f"route {node_id} missing kind: {node}" + assert any(k in node for k in ("path", "topic", "method", "file")), ( + f"route {node_id} has no identifying field: {node}" + ) + + +# ----- Test 2: clients filters by calls-service ----- + + +def test_clients_filters_by_calls_service(corpus_root: Path, ladybug_db_path: Path) -> None: + """clients --calls-service filters by target service.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # First get all clients + proc_all = _run_jrag(["clients", "--format", "json"], env=env) + assert proc_all.returncode == 0 + payload_all = json.loads(proc_all.stdout) + all_clients = payload_all.get("nodes", {}) + + # Now filter by a specific service (if any exist in the corpus) + if len(all_clients) > 0: + # Pick the first client's target_service to filter by + first_client = next(iter(all_clients.values())) + target_service = first_client.get("target_service") + if target_service: + proc_filtered = _run_jrag(["clients", "--calls-service", target_service, "--format", "json"], env=env) + assert proc_filtered.returncode == 0 + payload_filtered = json.loads(proc_filtered.stdout) + filtered_clients = payload_filtered.get("nodes", {}) + # All filtered clients should have the target_service + for node_id, node in filtered_clients.items(): + assert node.get("target_service") == target_service, f"client {node_id} has wrong target_service" + + +# ----- Test 3: producers filter by topic-prefix ----- + + +def test_producers_filter_by_topic_prefix(corpus_root: Path, ladybug_db_path: Path) -> 
None: + """producers --topic-prefix filters by topic prefix.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # First get all producers + proc_all = _run_jrag(["producers", "--format", "json"], env=env) + assert proc_all.returncode == 0 + payload_all = json.loads(proc_all.stdout) + all_producers = payload_all.get("nodes", {}) + + # Now filter by topic prefix (if any producers exist) + if len(all_producers) > 0: + # Pick the first producer's topic to use as prefix + first_producer = next(iter(all_producers.values())) + topic = first_producer.get("topic") + if topic: + # Use first character as prefix + prefix = topic[0] + proc_filtered = _run_jrag(["producers", "--topic-prefix", prefix, "--format", "json"], env=env) + assert proc_filtered.returncode == 0 + payload_filtered = json.loads(proc_filtered.stdout) + filtered_producers = payload_filtered.get("nodes", {}) + # All filtered producers should have topics starting with the prefix + for node_id, node in filtered_producers.items(): + assert node.get("topic", "").startswith(prefix), f"producer {node_id} topic doesn't start with {prefix}" + + +# ----- Test 4: topics groups producers by topic ----- + + +def test_topics_groups_producers_by_topic(corpus_root: Path, ladybug_db_path: Path) -> None: + """topics command groups producers by topic name.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["topics", "--format", "json"], env=env) + assert proc.returncode == 0, f"topics failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # Topics should be grouped with producers lists + for node_id, node in nodes.items(): + # Each topic node should have a topic field and 
producers list + assert "topic" in node, f"topic node {node_id} missing topic field" + assert "producers" in node, f"topic node {node_id} missing producers list" + assert isinstance(node["producers"], list), "producers should be a list" + + +# ----- Test 5: topics --consumer-in resolves consumers via EXPOSES ----- + + +def test_topics_consumer_in_resolves_consumers_via_exposes(ladybug_graph) -> None: + """topics --consumer-in resolves listener consumers via EXPOSES on Route. + + The original PR-JRAG-2 implementation traversed ASYNC_CALLS inbound to + Producer nodes, which is the wrong edge model (ASYNC_CALLS run + Producer -> Route per java_ontology.py:415-416). This test exercises the + corrected resolver directly. + + Fixture reality: no producer topic literal overlaps a listener topic + literal (producers carry unresolved constants like 'ChatTopics.*' or + resolved 'banking.chat.audit'; listeners carry different forms). So + `topics --consumer-in` will not attach consumers to producer-grouped topics + on THIS fixture — but the EXPOSES-based resolver does resolve a known + listener for a known resolved topic. We assert the resolver returns that + listener for the exact topic 'banking.chat.compliance.review' consumed by + ComplianceReviewListener in microservice 'chat-core'. + """ + from java_codebase_rag.jrag import _resolve_topic_consumers + + consumers = _resolve_topic_consumers( + ladybug_graph, + topic="banking.chat.compliance.review", + microservice="chat-core", + prefix=False, + ) + assert len(consumers) >= 1, ( + f"expected ComplianceReviewListener resolved for " + f"'banking.chat.compliance.review' in 'chat-core'; got {consumers}" + ) + found = any("ComplianceReviewListener" in c.get("fqn", "") for c in consumers) + assert found, ( + f"ComplianceReviewListener not in resolver result; got {[c.get('fqn') for c in consumers]}" + ) + + # Prefix match should also find it under 'banking.chat'. 
+ consumers_prefix = _resolve_topic_consumers( + ladybug_graph, + topic="banking.chat", + prefix=True, + ) + assert any("ComplianceReviewListener" in c.get("fqn", "") for c in consumers_prefix), ( + f"ComplianceReviewListener not in prefix resolver result; " + f"got {[c.get('fqn') for c in consumers_prefix]}" + ) + + +# ----- Test 6: jobs lists scheduled-task ----- + + +def test_jobs_lists_scheduled_task(corpus_root: Path, ladybug_db_path: Path) -> None: + """jobs command lists symbols with SCHEDULED_TASK capability.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["jobs", "--format", "json"], env=env) + assert proc.returncode == 0, f"jobs failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # All nodes should be symbols with the scheduled task capability + for node_id, node in nodes.items(): + assert node.get("kind") == "symbol", f"jobs returned non-symbol: {node.get('kind')}" + + +# ----- Test 7: listeners lists message-listener ----- + + +def test_listeners_lists_message_listener(corpus_root: Path, ladybug_db_path: Path) -> None: + """listeners command lists symbols with MESSAGE_LISTENER capability.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["listeners", "--format", "json"], env=env) + assert proc.returncode == 0, f"listeners failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # All nodes should be symbols + for node_id, node in nodes.items(): + assert node.get("kind") == "symbol", f"listeners returned non-symbol: {node.get('kind')}" + + +# ----- 
Test 7a: listeners --topic-prefix narrows (real filter) ----- + + +def test_listeners_topic_prefix_narrows(corpus_root: Path, ladybug_db_path: Path) -> None: + """listeners --topic-prefix filters via listener_method -EXPOSES-> Route(topic). + + The bank-chat fixture has 3 MESSAGE_LISTENER symbols: + - ComplianceReviewListener (topic=banking.chat.compliance.review) + - ChatKafkaListener (topic=ChatTopics.INCOMING — unresolved constant) + - DistributionTriggerListener(topic=${assign.kafka.distribution-topic} — placeholder) + Filtering by 'banking.chat' must narrow to the proper subset containing + only ComplianceReviewListener, proving --topic-prefix is a real filter + (not the previous include-all stub). + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # All listeners (no filter) + proc_all = _run_jrag(["listeners", "--format", "json"], env=env) + assert proc_all.returncode == 0 + payload_all = json.loads(proc_all.stdout) + all_nodes = payload_all.get("nodes", {}) + all_count = len(all_nodes) + assert all_count >= 1, "expected at least one listener in fixture" + + # Filtered by 'banking.chat' — known resolved prefix on this fixture + proc_filtered = _run_jrag(["listeners", "--topic-prefix", "banking.chat", "--format", "json"], env=env) + assert proc_filtered.returncode == 0, ( + f"listeners --topic-prefix failed: rc={proc_filtered.returncode}\n" + f"stdout={proc_filtered.stdout}\nstderr={proc_filtered.stderr}" + ) + payload_filtered = json.loads(proc_filtered.stdout) + filtered_nodes = payload_filtered.get("nodes", {}) + + # Proper subset: strictly fewer than the unfiltered set. + assert len(filtered_nodes) < all_count, ( + f"--topic-prefix did not narrow: all={all_count}, filtered={len(filtered_nodes)}" + ) + + # The known listener-topic pair on this fixture: ComplianceReviewListener + # consumes 'banking.chat.compliance.review' (resolved topic literal). 
+ found_compliance = False + for node_id, node in filtered_nodes.items(): + fqn = node.get("fqn", "") + if "ComplianceReviewListener" in fqn: + found_compliance = True + break + assert found_compliance, ( + f"ComplianceReviewListener not in filtered set; got: " + f"{[n.get('fqn') for n in filtered_nodes.values()]}" + ) + + +# ----- Test 8: entities lists entity role ----- + + +def test_entities_lists_entity_role(corpus_root: Path, ladybug_db_path: Path) -> None: + """entities command lists symbols with ENTITY role.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["entities", "--format", "json"], env=env) + assert proc.returncode == 0, f"entities failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # All nodes should be symbols with ENTITY role + for node_id, node in nodes.items(): + assert node.get("kind") == "symbol", f"entities returned non-symbol: {node.get('kind')}" + # Role should be ENTITY (normalized from backend) + assert (node.get("role") or "").upper() == "ENTITY", f"entity has wrong role: {node.get('role')}" + + +# ----- Test 9: listing service scope pushes down ----- + + +def test_listing_service_scope_pushes_down(corpus_root: Path, ladybug_db_path: Path) -> None: + """--service flag pushes down to backend list_* methods.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Test with routes command + proc = _run_jrag(["routes", "--service", "chatassign", "--format", "json"], env=env) + # May return empty results if service doesn't exist, but should not error + assert proc.returncode == 0, f"routes --service failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = 
json.loads(proc.stdout) + assert payload["status"] == "ok" + # If results exist, they should all be from the specified service + nodes = payload.get("nodes", {}) + for node_id, node in nodes.items(): + # All nodes should be from the specified microservice + assert node.get("microservice") == "chatassign", f"node {node_id} has wrong microservice: {node.get('microservice')}" + + +# ----- Test 10: listing truncated fires at limit ----- + + +def test_listing_truncated_fires_at_limit(corpus_root: Path, ladybug_db_path: Path) -> None: + """+1-fetch trick: truncated=True when the corpus has more routes than `limit`. + + The prior assertion only checked that the field exists when exactly 2 rows + were returned; it never verified ``truncated is True``. This learns the true + route count first, then asserts truncation actually fires when limit < total + (and that exactly `limit` rows are returned). + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Learn the true route count (high limit -> no truncation expected). + proc_all = _run_jrag(["routes", "--limit", "499", "--format", "json"], env=env) + assert proc_all.returncode == 0, f"routes --limit 499 failed: {proc_all.stderr}" + total = len(json.loads(proc_all.stdout).get("nodes", {})) + + proc = _run_jrag(["routes", "--limit", "2", "--format", "json"], env=env) + assert proc.returncode == 0, f"routes --limit failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + if total > 2: + assert len(nodes) == 2, f"expected exactly 2 rows (limit), got {len(nodes)} of {total}" + assert payload.get("truncated") is True, ( + f"expected truncated=True when total={total} > limit=2; got truncated={payload.get('truncated')!r}" + ) + else: + # Corpus has ≤2 routes total: no truncation; all returned.
+ assert not payload.get("truncated", False), ( + f"should not be truncated with total={total} ≤ limit=2" + ) + + +# ----- Test 11: listing client-kind enum lookup ----- + + +def test_listing_client_kind_enum_lookup(corpus_root: Path, ladybug_db_path: Path) -> None: + """--client-kind feign normalizes to feign_method via lookup table.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Test with --client-kind feign (should map to feign_method) + proc = _run_jrag(["clients", "--client-kind", "feign", "--format", "json"], env=env) + assert proc.returncode == 0, f"clients --client-kind feign failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + # Results must carry the NORMALIZED client_kind. The backend stores + # `feign_method`, never the raw `feign` the user typed — the prior assertion + # accepted the un-normalized "feign", which would mask a normalization + # regression (the lookup table is the whole point of this test). 
+ nodes = payload.get("nodes", {}) + for node_id, node in nodes.items(): + client_kind = node.get("client_kind", "") + assert client_kind in ("feign_method", ""), ( + f"unexpected/un-normalized client_kind: {client_kind!r} (raw 'feign' must normalize to 'feign_method')" + ) + + +# ----- Test 12: listing rejects offset ----- + + +def test_listing_rejects_offset() -> None: + """--offset is NOT registered on listing commands (unrecognized argument error).""" + env = os.environ.copy() + + # Test that --offset is rejected on routes + proc = _run_jrag(["routes", "--offset", "5"], env=env) + # argparse should reject this with exit code 2 (usage error) + assert proc.returncode != 0, "routes --offset should be rejected" + assert "unrecognized arguments: --offset" in proc.stderr or "usage:" in proc.stderr, \ + f"expected usage error, got: {proc.stderr}" + + # Same for clients + proc = _run_jrag(["clients", "--offset", "5"], env=env) + assert proc.returncode != 0, "clients --offset should be rejected" + + # Same for producers + proc = _run_jrag(["producers", "--offset", "5"], env=env) + assert proc.returncode != 0, "producers --offset should be rejected" + + # Same for topics + proc = _run_jrag(["topics", "--offset", "5"], env=env) + assert proc.returncode != 0, "topics --offset should be rejected" + + # Same for jobs + proc = _run_jrag(["jobs", "--offset", "5"], env=env) + assert proc.returncode != 0, "jobs --offset should be rejected" + + # Same for listeners + proc = _run_jrag(["listeners", "--offset", "5"], env=env) + assert proc.returncode != 0, "listeners --offset should be rejected" + + # Same for entities + proc = _run_jrag(["entities", "--offset", "5"], env=env) + assert proc.returncode != 0, "entities --offset should be rejected" diff --git a/tests/test_jrag_locate.py b/tests/test_jrag_locate.py new file mode 100644 index 00000000..9dfd33f4 --- /dev/null +++ b/tests/test_jrag_locate.py @@ -0,0 +1,457 @@ +"""Tests for `jrag find` + `inspect` (PR-JRAG-1b). 
+ +Tests: +1. test_find_by_fqn_exact - query mode, exact FQN match +2. test_find_filter_mode_by_role - filter mode, --role controller +3. test_find_by_capability - --capability scheduled-task, symbol inferred +4. test_find_kind_inference_from_http_method - route inferred +5. test_find_kind_contradiction_is_error - --kind symbol --http-method GET +6. test_find_query_mode_with_non_symbol_kind_returns_error - query mode + route/client/producer errors +7. test_find_annotation_flag_filters - --annotation post-filter +8. test_find_exclude_role_flag_filters - --exclude-role post-filter +9. test_find_offset_paginates - --offset works on find +10. test_find_limit_capped_under_500 - --limit 600 behaves as ≤499 +11. test_find_query_mode_framework_and_source_layer_warn - dropped filters warn +12. test_inspect_returns_edge_summary_with_composed_keys - OVERRIDDEN_BY virtual key +13. test_inspect_ambiguous_returns_candidates - resolve returns many +14. test_inspect_populates_file_location - file_location set by resolve + +Note: --fuzzy was deferred (backend find_by_name_or_fqn is exact-only; see +plans/active/PLAN-JRAG-CLI.md Out of scope). 
+""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + input=stdin, + check=False, + ) + + +# ----- Test 1: find by FQN exact (query mode) ----- + + +def test_find_by_fqn_exact(corpus_root: Path, ladybug_db_path: Path) -> None: + """Query mode: exact FQN match returns the node.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Find a known class from the bank-chat fixture + proc = _run_jrag(["find", "com.bank.chat.assign.ChatAssignApplication", "--format", "json"], env=env) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + assert len(nodes) >= 1, f"expected at least one node, got {len(nodes)}" + # The exact match should be in the results (substring check on each node's FQN). + fqns = [node.get("fqn", "") for node in nodes.values()] + assert any("ChatAssignApplication" in fqn for fqn in fqns), ( + "ChatAssignApplication class not found in results; " + f"got fqns={fqns}" + ) + + +# ----- Test 2: find filter mode by role ----- + + +def test_find_filter_mode_by_role(corpus_root: Path, ladybug_db_path: Path) -> None: + """Filter mode: --role controller 
returns only controllers.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["find", "--role", "controller", "--format", "json"], env=env) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # All nodes should have role=CONTROLLER (normalized) + for node_id, node in nodes.items(): + role = node.get("role", "").upper() + assert role == "CONTROLLER", f"expected CONTROLLER role, got {role}" + + +# ----- Test 3: find by capability (symbol inferred) ----- + + +def test_find_by_capability(corpus_root: Path, ladybug_db_path: Path) -> None: + """--capability scheduled-task narrows vs. unfiltered (a real filter, not a no-op). + + The prior test asserted only ``status == 'ok'`` — it would pass even if + ``--capability`` were silently ignored. Now: the filtered set must be a + subset of all symbols, and a STRICT subset when the fixture has any + scheduled-task symbols. + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc_all = _run_jrag(["find", "--limit", "499", "--format", "json"], env=env) + assert proc_all.returncode == 0, f"find (all) failed: {proc_all.stderr}" + all_count = len(json.loads(proc_all.stdout).get("nodes", {})) + + proc = _run_jrag( + ["find", "--capability", "scheduled-task", "--limit", "499", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # Filtered set must not exceed the unfiltered set. 
+ assert len(nodes) <= all_count, ( + f"--capability did not narrow: filtered={len(nodes)} > all={all_count}" + ) + for node in nodes.values(): + assert node.get("kind") == "symbol", f"--capability returned non-symbol: {node.get('kind')}" + # If the fixture has any scheduled-task symbols, the filter is a STRICT subset + # (proving the capability predicate was applied, not ignored). + if len(nodes) > 0 and all_count > 0: + assert len(nodes) < all_count, ( + f"--capability returned the full set ({len(nodes)} == {all_count}); filter is a no-op" + ) + + +# ----- Test 4: find kind inference from http_method ----- + + +def test_find_kind_inference_from_http_method(corpus_root: Path, ladybug_db_path: Path) -> None: + """--http-method GET implies kind=route.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag( + ["find", "--http-method", "GET", "--format", "json", "--limit", "5"], env=env + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # All nodes should be routes + for node_id, node in nodes.items(): + kind = node.get("kind", "") + assert kind == "route", f"expected kind=route, got {kind}" + + +# ----- Test 5: find kind contradiction is error ----- + + +def test_find_kind_contradiction_is_error(corpus_root: Path, ladybug_db_path: Path) -> None: + """--kind symbol --http-method GET returns error envelope (contradiction).""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["find", "--kind", "symbol", "--http-method", "GET", "--format", "json"], env=env) + assert proc.returncode == 2, f"expected error exit code, got {proc.returncode}" + + payload = 
json.loads(proc.stdout) + assert payload["status"] == "error" + assert "contradiction" in payload.get("message", "").lower() or "conflict" in payload.get("message", "").lower() + + +# ----- Test 6: find query mode + non-symbol kind errors ----- + + +def test_find_query_mode_with_non_symbol_kind_returns_error( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """Query mode (positional ) + non-symbol kind -> status: error. + + find_by_name_or_fqn is Symbol-only (exact name/FQN match). A positional + with explicit --kind route OR a domain flag that infers a non-symbol + kind (--http-method -> route) must error (NOT silently return empty), + telling the user to drop the positional and use filter mode. + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Explicit --kind route + positional query + proc = _run_jrag(["find", "--kind", "route", "SomeQuery", "--format", "json"], env=env) + assert proc.returncode == 2, f"explicit route: expected exit 2, got {proc.returncode}" + payload = json.loads(proc.stdout) + assert payload["status"] == "error", f"explicit route: {payload}" + msg = payload.get("message", "") + assert "Symbol" in msg, f"explicit route msg should mention Symbol: {msg!r}" + assert "filter mode" in msg, f"explicit route msg should mention filter mode: {msg!r}" + + # Inferred route (--http-method) + positional query + proc = _run_jrag( + ["find", "--http-method", "GET", "SomeName", "--format", "json"], env=env + ) + assert proc.returncode == 2, f"inferred route: expected exit 2, got {proc.returncode}" + payload = json.loads(proc.stdout) + assert payload["status"] == "error", f"inferred route: {payload}" + assert "Symbol" in payload.get("message", ""), "inferred route should mention Symbol" + + +# ----- Test 7: find annotation flag filters ----- + + +def test_find_annotation_flag_filters(corpus_root: Path, ladybug_db_path: Path) -> None: + 
"""--annotation narrows vs. unfiltered (a real filter, not a no-op). + + The prior test asserted only ``status == 'ok'``. Now: the annotated set must + be a subset, and a strict subset when the fixture has any @RestController. + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc_all = _run_jrag(["find", "--limit", "499", "--format", "json"], env=env) + assert proc_all.returncode == 0, f"find (all) failed: {proc_all.stderr}" + all_count = len(json.loads(proc_all.stdout).get("nodes", {})) + + # Find symbols with @RestController annotation + proc = _run_jrag( + ["find", "--annotation", "RestController", "--limit", "499", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + assert len(nodes) <= all_count, ( + f"--annotation did not narrow: filtered={len(nodes)} > all={all_count}" + ) + if len(nodes) > 0 and all_count > 0: + assert len(nodes) < all_count, ( + f"--annotation returned the full set ({len(nodes)} == {all_count}); filter is a no-op" + ) + + +# ----- Test 8: find exclude_role flag filters ----- + + +def test_find_exclude_role_flag_filters(corpus_root: Path, ladybug_db_path: Path) -> None: + """--exclude-role post-filters out nodes with that role.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Find symbols but exclude controllers + proc = _run_jrag( + ["find", "--exclude-role", "controller", "--format", "json", "--limit", "10"], env=env + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = 
payload.get("nodes", {}) + # No node should have role=CONTROLLER + for node_id, node in nodes.items(): + role = node.get("role", "").upper() + assert role != "CONTROLLER", f"found excluded CONTROLLER role in {node}" + + +# ----- Test 9: find offset paginates ----- + + +def test_find_offset_paginates(corpus_root: Path, ladybug_db_path: Path) -> None: + """--offset works in filter mode (page 2 differs from page 1).""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Get first page + proc1 = _run_jrag(["find", "--format", "json", "--limit", "3", "--offset", "0"], env=env) + assert proc1.returncode == 0 + page1 = json.loads(proc1.stdout) + nodes1 = set(page1.get("nodes", {}).keys()) + + # Get second page + proc2 = _run_jrag(["find", "--format", "json", "--limit", "3", "--offset", "3"], env=env) + assert proc2.returncode == 0 + page2 = json.loads(proc2.stdout) + nodes2 = set(page2.get("nodes", {}).keys()) + + # Pages should have different nodes (or page2 should be empty/shorter) + if nodes1 and nodes2: + assert nodes1 != nodes2, "pages should have different nodes" + + +# ----- Test 10: find limit capped under 500 ----- + + +def test_find_limit_capped_under_500(corpus_root: Path, ladybug_db_path: Path) -> None: + """--limit 600 behaves as ≤499 (backend clamp).""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag(["find", "--limit", "600", "--format", "json"], env=env) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + # Should return at most 500 results (capped at 499 limit + 1 for truncation check) + # The backend clamp is at 500, so we should see ≤500 results + assert 
len(nodes) <= 500, f"expected ≤500 results, got {len(nodes)}" + + +# ----- Test 11: find query mode framework/source-layer warn when dropped ----- + + +def test_find_query_mode_framework_and_source_layer_warn( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--framework/--source-layer in query mode are dropped (SymbolHit lacks those + fields) but surface a warnings[] entry so the user knows their filter had no effect. + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + proc = _run_jrag( + ["find", "com.bank.chat.assign.ChatAssignApplication", "--framework", "spring-mvc", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + warnings = payload.get("warnings", []) + assert any("--framework" in w and "ignored" in w for w in warnings), ( + f"expected --framework ignored warning, got warnings={warnings}" + ) + + # --source-layer in query mode + proc = _run_jrag( + ["find", "com.bank.chat.assign.ChatAssignApplication", "--source-layer", "layer-a", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, f"find failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + warnings = payload.get("warnings", []) + assert any("--source-layer" in w and "ignored" in w for w in warnings), ( + f"expected --source-layer ignored warning, got warnings={warnings}" + ) + + +# ----- Test 12: inspect returns edge_summary with composed keys ----- + + +def test_inspect_returns_edge_summary_with_composed_keys( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """Inspect returns edge_summary with the virtual OVERRIDDEN_BY composed key. 
+ + The abstract port method ``ChatAssignmentPort#requestAssignment`` has one + implementor in the bank-chat fixture (verified via test_mcp_v2_compose. + test_describe_abstract_port_emits_overridden_by_rollup), so its + edge_summary must carry ``OVERRIDDEN_BY = {"in": 0, "out": 1}``. + """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + method_fqn = "com.bank.chat.engine.assign.ChatAssignmentPort#requestAssignment(AssignmentRequest)" + proc = _run_jrag(["inspect", method_fqn, "--format", "json"], env=env) + assert proc.returncode == 0, f"inspect failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + nodes = payload.get("nodes", {}) + assert len(nodes) == 1, f"expected 1 node, got {len(nodes)}" + node = next(iter(nodes.values())) + edge_summary = node.get("edge_summary") + assert isinstance(edge_summary, dict), f"edge_summary should be a dict, got {type(edge_summary)}" + # The OVERRIDDEN_BY virtual composed key must be present with out > 0 + # (override_axis_rollup_for feeds this; see describe_v2 / mcp_v2_compose test). + assert "OVERRIDDEN_BY" in edge_summary, ( + f"expected OVERRIDDEN_BY composed key, got keys={list(edge_summary.keys())}" + ) + ob = edge_summary["OVERRIDDEN_BY"] + assert int(ob.get("out", 0)) > 0, f"expected OVERRIDDEN_BY out > 0, got {ob}" + + +# ----- Test 13: inspect ambiguous returns candidates ----- + + +def test_inspect_ambiguous_returns_candidates(corpus_root: Path, ladybug_db_path: Path) -> None: + """Inspect on a query that may match multiple nodes returns `ambiguous` with + candidates (no auto-pick). Each outcome asserts something real — the prior + ``elif status == 'ok': pass`` made the test vacuous for the most common + outcome (a clean resolve). 
+ """ + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Use a generic query that might match multiple nodes + proc = _run_jrag(["inspect", "Account", "--format", "json"], env=env) + # Should return ok (exactly one match), ambiguous (multiple), or not_found (none) + assert proc.returncode in (0, 2), f"unexpected exit code: {proc.returncode}" + + payload = json.loads(proc.stdout) + status = payload["status"] + if status == "ambiguous": + candidates = payload.get("candidates", []) + assert len(candidates) > 0, "ambiguous must carry candidates" + for cand in candidates: + assert "reason" in cand, "candidate must carry reason" + elif status == "ok": + # A clean resolve must yield exactly ONE node (not a silent multi-return). + assert len(payload.get("nodes", {})) == 1, ( + f"ok must mean a single resolved node, got {len(payload.get('nodes', {}))}: {payload}" + ) + else: + assert status == "not_found", f"unexpected status: {status}" + assert payload.get("message"), "not_found must carry a message" + + +# ----- Test 14: inspect populates file_location ----- + + +def test_inspect_populates_file_location(corpus_root: Path, ladybug_db_path: Path) -> None: + """Inspect populates file_location from resolve_query (filename:start_line).""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + # Inspect a specific known symbol that resolves cleanly and has a file location. 
+ proc = _run_jrag(["inspect", "com.bank.chat.assign.ChatAssignApplication", "--format", "json"], env=env) + assert proc.returncode == 0, f"inspect failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + # file_location is populated by resolve_query from the resolved node's + # filename + start_line (jrag_envelope._node_file_location). + file_location = payload.get("file_location") + assert file_location is not None, "expected file_location to be populated for a real symbol" + # Should be in format "filename:start_line" (start_line present for symbols). + assert "ChatAssignApplication.java" in file_location, f"unexpected file_location: {file_location}" diff --git a/tests/test_jrag_orientation.py b/tests/test_jrag_orientation.py new file mode 100644 index 00000000..3f06e8e8 --- /dev/null +++ b/tests/test_jrag_orientation.py @@ -0,0 +1,574 @@ +"""Tests for `jrag` orientation commands + semantic search + agent_next_actions (PR-JRAG-4). + +Tests: +1. test_microservices_lists_counts +2. test_map_returns_non_empty_counts_per_service +3. test_conventions_reports_dominant_roles +4. test_overview_microservice_bundle +5. test_overview_route_uses_flow +6. test_overview_topic_lists_producers_and_consumers +7. test_overview_as_overrides_polymorphic_inference +8. test_search_returns_ranked_hits +9. test_search_hybrid_calls_hybrid_path +10. test_search_table_all_runs_three_tables +11. test_search_offset_paginates +12. test_search_fuzzy_rejected_in_handler_as_status_error +13. test_next_actions_valid_runnable_commands_capped_at_5 +14. test_next_actions_zero_direction_suppressed +15. test_next_actions_covers_composed_dot_keys +16. test_next_actions_falls_back_to_result_edges_when_no_edge_summary +17. test_next_actions_omitted_when_empty +18. 
test_build_parser_imports_no_backend_modules +""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + input=stdin, + check=False, + ) + + +def _env_for(corpus_root: Path, ladybug_db_path: Path) -> dict[str, str]: + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + return env + + +# ===== Tests 1–3: orientation counts ===== + + +def test_microservices_lists_counts(corpus_root: Path, ladybug_db_path: Path) -> None: + """microservices command returns non-empty microservice → count map.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["microservices", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"microservices failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + counts = payload["nodes"]["microservices"]["counts"] + assert counts, f"counts dict empty: {payload}" + assert any(int(v or 0) > 0 for v in counts.values()), f"all counts zero: {counts}" + + +def test_map_returns_non_empty_counts_per_service(corpus_root: Path, ladybug_db_path: Path) -> None: + """map command returns non-empty counts grouped by microservice.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = 
_run_jrag(["map", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"map failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + counts = payload["nodes"]["map"]["counts"] + assert counts, f"map counts empty: {payload}" + # At least one scope should have at least one kind with positive count. + found_positive = any( + int(v or 0) > 0 + for scope_counts in counts.values() + for v in scope_counts.values() + ) + assert found_positive, f"all map counts zero: {counts}" + + +def test_conventions_reports_dominant_roles(corpus_root: Path, ladybug_db_path: Path) -> None: + """conventions command reports role distribution.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["conventions", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"conventions failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + roles = payload["nodes"]["conventions"]["roles"] + assert roles, f"roles dict empty: {payload}" + assert any(int(v or 0) > 0 for v in roles.values()), f"all role counts zero: {roles}" + + +# ===== Tests 4–7: overview dispatch ===== + + +def test_overview_microservice_bundle(corpus_root: Path, ladybug_db_path: Path) -> None: + """overview returns a bundle (routes + clients + producers counts).""" + env = _env_for(corpus_root, ladybug_db_path) + # First get microservices to find a valid one. + proc_ms = _run_jrag(["microservices", "--format", "json"], env=env) + assert proc_ms.returncode == 0 + ms_counts = json.loads(proc_ms.stdout)["nodes"]["microservices"]["counts"] + # Pick the first microservice with a non-zero count. 
+ ms_name = next((k for k, v in ms_counts.items() if int(v or 0) > 0 and k), None) + assert ms_name, f"no valid microservice in fixture: {ms_counts}" + + proc = _run_jrag(["overview", ms_name, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"overview failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + # The bundle should be present on the microservice node. + ms_node = next(iter(payload["nodes"].values())) + assert "bundle" in ms_node, f"missing bundle in overview node: {ms_node}" + assert ms_node["bundle"]["microservice"] == ms_name + + +def test_overview_route_uses_flow(corpus_root: Path, ladybug_db_path: Path) -> None: + """overview /chat/assign dispatches as route and returns flow data.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["overview", "/chat/assign", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"overview /chat/assign failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + # Route overview uses traversal shape (root + edges). + assert payload.get("root"), "expected root set for route overview" + assert payload.get("root") is not None + + +def test_overview_topic_lists_producers_and_consumers(corpus_root: Path, ladybug_db_path: Path) -> None: + """overview returns producers + consumers for the topic.""" + env = _env_for(corpus_root, ladybug_db_path) + # Use a topic that exists on the fixture. banking.chat.compliance.review + # is consumed by ComplianceReviewListener (verified in test_jrag_listing). 
+ proc = _run_jrag( + ["overview", "banking.chat.compliance.review", "--format", "json"], env=env + ) + assert proc.returncode == 0, ( + f"overview failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + topic_node = next(iter(payload["nodes"].values())) + assert topic_node.get("kind") == "topic" + # The node should carry producers and consumers arrays. + assert "producers" in topic_node, f"missing producers in topic overview: {topic_node}" + assert "consumers" in topic_node, f"missing consumers in topic overview: {topic_node}" + + +def test_overview_as_overrides_polymorphic_inference(corpus_root: Path, ladybug_db_path: Path) -> None: + """overview --as microservice forces the microservice dispatch path even for + a subject that auto-detects as a route (/chat/assign). + + The node shape is a bundle (inspect), NOT a traversal (root + edges). The + prior `if payload["status"] == "ok":` guard made this test vacuously pass on + any non-ok status — now the dispatch is asserted unconditionally. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["overview", "/chat/assign", "--as", "microservice", "--format", "json"], env=env + ) + assert proc.returncode == 0, ( + f"overview --as microservice failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got: {payload}" + node = next(iter(payload["nodes"].values()), {}) + # --as microservice dispatches to the microservice path (bundle node), NOT + # the route path (traversal with root + edges). 
+ assert "bundle" in node or node.get("kind") == "microservice", ( + f"--as microservice should dispatch to microservice path, got: {node}" + ) + assert not payload.get("root"), ( + f"--as microservice must NOT produce traversal shape (root+edges): {payload}" + ) + + +# ===== Tests 8–12: semantic search ===== + + +def test_search_returns_ranked_hits( + monkeypatch, capsys, corpus_root: Path, ladybug_db_path: Path +) -> None: + """search returns ranked hits from search_v2 (mocked to avoid Lance dependency).""" + import mcp_v2 + from java_codebase_rag.jrag import main + + env_index = str(ladybug_db_path.parent) + monkeypatch.setenv("JAVA_CODEBASE_RAG_INDEX_DIR", env_index) + monkeypatch.setenv("JAVA_CODEBASE_RAG_SOURCE_ROOT", str(corpus_root)) + + # Mock search_v2 to return a controlled hit. + fake_hit = mcp_v2.SearchHit( + chunk_id="c1", symbol_id="sym1", fqn="com.example.Hit", + score=0.95, snippet="fake snippet", microservice="chat-core", + module="m", role="SERVICE", + ) + def mock_search_v2(query, **kwargs): + return mcp_v2.SearchOutput( + success=True, results=[fake_hit], + limit=kwargs.get("limit", 5), offset=kwargs.get("offset", 0), + advisories=[], + ) + monkeypatch.setattr(mcp_v2, "search_v2", mock_search_v2) + + rc = main(["search", "--index-dir", env_index, "assign chat", "--format", "json"]) + captured = capsys.readouterr() + assert rc == 0, f"search failed: rc={rc}\nstdout={captured.out}\nstderr={captured.err}" + payload = json.loads(captured.out) + assert payload["status"] == "ok" + nodes = payload.get("nodes", {}) + assert len(nodes) >= 1, f"expected at least one hit, got {nodes}" + + +def test_search_hybrid_calls_hybrid_path( + monkeypatch, capsys, corpus_root: Path, ladybug_db_path: Path +) -> None: + """--hybrid flag passes hybrid=True to search_v2.""" + import mcp_v2 + from java_codebase_rag.jrag import main + + env_index = str(ladybug_db_path.parent) + monkeypatch.setenv("JAVA_CODEBASE_RAG_INDEX_DIR", env_index) + 
monkeypatch.setenv("JAVA_CODEBASE_RAG_SOURCE_ROOT", str(corpus_root)) + + captured_kwargs: dict = {} + def mock_search_v2(query, **kwargs): + captured_kwargs.update(kwargs) + captured_kwargs["query"] = query + return mcp_v2.SearchOutput( + success=True, results=[], limit=kwargs.get("limit", 5), + offset=kwargs.get("offset", 0), advisories=[], + ) + monkeypatch.setattr(mcp_v2, "search_v2", mock_search_v2) + + rc = main(["search", "--index-dir", env_index, "audit", "--hybrid", "--format", "json"]) + assert rc == 0 + assert captured_kwargs.get("hybrid") is True, ( + f"expected hybrid=True, got hybrid={captured_kwargs.get('hybrid')}" + ) + + +def test_search_table_all_runs_three_tables( + monkeypatch, capsys, corpus_root: Path, ladybug_db_path: Path +) -> None: + """--table all passes table='all' to search_v2 (java+sql+yaml).""" + import mcp_v2 + from java_codebase_rag.jrag import main + + env_index = str(ladybug_db_path.parent) + monkeypatch.setenv("JAVA_CODEBASE_RAG_INDEX_DIR", env_index) + monkeypatch.setenv("JAVA_CODEBASE_RAG_SOURCE_ROOT", str(corpus_root)) + + captured_kwargs: dict = {} + def mock_search_v2(query, **kwargs): + captured_kwargs.update(kwargs) + return mcp_v2.SearchOutput( + success=True, results=[], limit=kwargs.get("limit", 5), + offset=kwargs.get("offset", 0), advisories=[], + ) + monkeypatch.setattr(mcp_v2, "search_v2", mock_search_v2) + + rc = main(["search", "--index-dir", env_index, "schema", "--table", "all", "--format", "json"]) + assert rc == 0 + assert captured_kwargs.get("table") == "all", ( + f"expected table='all', got table={captured_kwargs.get('table')!r}" + ) + + +def test_search_offset_paginates( + monkeypatch, capsys, corpus_root: Path, ladybug_db_path: Path +) -> None: + """--offset paginates: passes offset to search_v2 and renders next_offset hint.""" + import mcp_v2 + from java_codebase_rag.jrag import main + + env_index = str(ladybug_db_path.parent) + monkeypatch.setenv("JAVA_CODEBASE_RAG_INDEX_DIR", env_index) + 
monkeypatch.setenv("JAVA_CODEBASE_RAG_SOURCE_ROOT", str(corpus_root)) + + # Return limit+1 hits so truncation fires and next_offset renders. + fake_hits = [ + mcp_v2.SearchHit( + chunk_id=f"c{i}", symbol_id=f"s{i}", fqn=f"com.example.Hit{i}", + score=0.9 - i * 0.01, snippet="snip", microservice="ms", module="m", role="X", + ) + for i in range(6) # limit default 5 + 1 → truncated + ] + captured_kwargs: dict = {} + def mock_search_v2(query, **kwargs): + captured_kwargs.update(kwargs) + return mcp_v2.SearchOutput( + success=True, results=fake_hits[:kwargs.get("limit", 5) + 1], + limit=kwargs.get("limit", 5), offset=kwargs.get("offset", 0), + advisories=[], + ) + monkeypatch.setattr(mcp_v2, "search_v2", mock_search_v2) + + rc = main([ + "search", "--index-dir", env_index, "test", "--offset", "0", + "--limit", "5", "--format", "text", + ]) + captured = capsys.readouterr() + assert rc == 0 + # Offset must be passed through to search_v2. + assert captured_kwargs.get("offset") == 0 + # Text mode should carry the offset hint (truncated → next page suggestion). + assert "truncated" in captured.out.lower() or "--offset" in captured.out, ( + f"expected truncation/offset hint in output: {captured.out}" + ) + + +def test_search_fuzzy_rejected_in_handler_as_status_error( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--fuzzy is rejected IN-HANDLER with status: error (not argparse exit 2). + + The flag is registered on the parser so argparse doesn't exit 2 before the + handler runs. The handler checks args.fuzzy and produces a canonical error + envelope with the message "search is semantic; --fuzzy is implicit". 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["search", "test", "--fuzzy", "--format", "json"], env=env) + payload = json.loads(proc.stdout) + assert payload["status"] == "error" + msg = payload.get("message") or "" + assert "fuzzy" in msg.lower(), f"expected fuzzy in error message: {msg!r}" + assert "semantic" in msg.lower(), f"expected 'semantic' in message: {msg!r}" + + +# ===== Tests 13–17: jrag_hints.next_actions ===== + + +def test_next_actions_valid_runnable_commands_capped_at_5() -> None: + """next_actions emits valid `jrag ` strings, ≤5.""" + from java_codebase_rag.jrag_hints import next_actions + + fqn = "com.example.Foo" + edge_summary = { + "CALLS": {"in": 3, "out": 2}, + "IMPLEMENTS": {"in": 0, "out": 1}, + "EXTENDS": {"in": 0, "out": 1}, + "INJECTS": {"in": 5, "out": 2}, + "OVERRIDES": {"in": 0, "out": 1}, + "OVERRIDDEN_BY": {"in": 2, "out": 0}, + } + hints = next_actions(root_fqn=fqn, edge_summary=edge_summary, result_edges=[]) + assert len(hints) <= 5, f"expected ≤5 hints, got {len(hints)}: {hints}" + # Every hint must be `jrag `. + for h in hints: + assert h.startswith("jrag "), f"bad hint prefix: {h!r}" + parts = h.split() + assert len(parts) >= 3, f"hint too short: {h!r}" + assert parts[-1] == fqn, f"fqn mismatch in {h!r}: expected {fqn}" + # All hints must be unique (de-duped). + assert len(hints) == len(set(hints)), f"duplicate hints: {hints}" + + +def test_next_actions_zero_direction_suppressed() -> None: + """A leaf with INJECTS in:0, out:3 → no `jrag dependents`; `jrag dependencies` present.""" + from java_codebase_rag.jrag_hints import next_actions + + fqn = "com.example.Leaf" + hints = next_actions( + root_fqn=fqn, + edge_summary={"INJECTS": {"in": 0, "out": 3}}, + result_edges=[], + ) + # in:0 → no `jrag dependents ` suggestion. + assert f"jrag dependents {fqn}" not in hints, ( + f"zero-direction not suppressed: {hints}" + ) + # out:3 → `jrag dependencies ` should be suggested. 
+ assert f"jrag dependencies {fqn}" in hints, ( + f"non-zero direction missing: {hints}" + ) + + +def test_next_actions_covers_composed_dot_keys() -> None: + """Composed dot-keys like OVERRIDDEN_BY.DECLARES_CLIENT map to overridden-by.""" + from java_codebase_rag.jrag_hints import next_actions + + fqn = "com.example.Method" + hints = next_actions( + root_fqn=fqn, + edge_summary={"OVERRIDDEN_BY.DECLARES_CLIENT": {"in": 2, "out": 0}}, + result_edges=[], + ) + assert f"jrag overridden-by {fqn}" in hints, ( + f"composed dot-key OVERRIDDEN_BY.* not covered: {hints}" + ) + + +def test_next_actions_falls_back_to_result_edges_when_no_edge_summary() -> None: + """When edge_summary is None, labels from result_edges drive the hints.""" + from java_codebase_rag.jrag_hints import next_actions + + fqn = "com.example.Foo" + result_edges = [ + {"other_id": "a", "edge_type": "CALLS"}, + {"other_id": "b", "edge_type": "INJECTS"}, + ] + hints = next_actions(root_fqn=fqn, edge_summary=None, result_edges=result_edges) + # CALLS → callers + callees; INJECTS → dependents + dependencies. + assert f"jrag callers {fqn}" in hints, f"CALLS in missing from fallback: {hints}" + assert f"jrag callees {fqn}" in hints, f"CALLS out missing from fallback: {hints}" + assert f"jrag dependents {fqn}" in hints, f"INJECTS in missing from fallback: {hints}" + assert f"jrag dependencies {fqn}" in hints, f"INJECTS out missing from fallback: {hints}" + + +def test_next_actions_skips_self_command_in_fallback() -> None: + """Regression (review finding E): the result_edges fallback must not emit a + self-hint (the command just run). + + After `callers`, the CALLS edges present would yield ``jrag callers`` (self) + + ``jrag callees`` (inverse). Only the inverse is useful — the self-hint is + dropped when ``current_command`` is supplied (the inverse remains). 
+ """ + from java_codebase_rag.jrag_hints import next_actions + + fqn = "com.example.Foo" + result_edges = [{"other_id": "a", "edge_type": "CALLS"}] + # Without current_command: both directions (back-compat with test 16). + both = next_actions(root_fqn=fqn, edge_summary=None, result_edges=result_edges) + assert f"jrag callers {fqn}" in both, f"CALLS in missing: {both}" + assert f"jrag callees {fqn}" in both, f"CALLS out missing: {both}" + # With current_command="callers": self dropped, inverse kept. + skipped = next_actions( + root_fqn=fqn, + edge_summary=None, + result_edges=result_edges, + current_command="callers", + ) + assert f"jrag callers {fqn}" not in skipped, f"self-hint not skipped: {skipped}" + assert f"jrag callees {fqn}" in skipped, f"inverse hint dropped: {skipped}" + + +def test_next_actions_omitted_when_empty() -> None: + """next_actions returns [] when no recognized edges are present.""" + from java_codebase_rag.jrag_hints import next_actions + + hints = next_actions( + root_fqn="com.example.Foo", + edge_summary={"UNKNOWN_EDGE": {"in": 5, "out": 5}}, + result_edges=[], + ) + assert hints == [], f"expected empty hints for unrecognized label, got {hints}" + + # Also empty when edge_summary is None and result_edges is empty. + hints2 = next_actions(root_fqn="com.example.Foo", result_edges=[]) + assert hints2 == [], f"expected empty hints for no edges, got {hints2}" + + +# ===== Test 17a/17b: e2e hook wiring on real inspect ===== + +# Seed FQN verified against the bank-chat fixture: resolves to "one" and carries +# INJECTS edges (ChatManagementService injects repositories and is injected by +# controllers). See test_jrag_traversal_direct.py for resolve verification. +_SEED_CLASS_FQN = "com.bank.chat.assign.service.ChatManagementService" + + +def test_inspect_populates_agent_next_actions_json( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """e2e: `jrag inspect --format json` populates agent_next_actions.
+ + Tests the full hook wiring: resolve → describe_v2 → edge_summary → hook → + jrag_hints.next_actions → envelope.agent_next_actions. The unit tests + (13–17) test the mapper directly; this verifies the fqn extraction from + envelope.nodes[root] and the synthetic-kind guard in the hook. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["inspect", _SEED_CLASS_FQN, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"inspect failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + actions = payload.get("agent_next_actions", []) + assert actions, ( + f"agent_next_actions empty — hook wiring broken: {payload}" + ) + # At least one hint must be `jrag `. + fqn = _SEED_CLASS_FQN + found_runnable = any( + a.startswith("jrag ") and a.endswith(fqn) for a in actions + ) + assert found_runnable, f"no `jrag {fqn}` in actions: {actions}" + + +def test_inspect_renders_next_actions_in_text( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """e2e: `jrag inspect ` (text mode) renders `next:` hint lines. + + After Fix 1, the inspect text renderer appends up to 2 `next: ` lines + when agent_next_actions is non-empty. This test verifies the text rendering + path (the JSON path is covered by the test above). + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["inspect", _SEED_CLASS_FQN], env=env) + assert proc.returncode == 0, ( + f"inspect (text) failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + assert "next:" in proc.stdout, ( + f"expected `next:` hint line in text output, got:\n{proc.stdout}" + ) + + +# ===== Test 18: build_parser lazy-import sentinel ===== + + +def test_build_parser_imports_no_backend_modules() -> None: + """build_parser() imports NO backend modules (torch / sentence_transformers / mcp_v2). 
+ + Pins the lazy-import invariant: `jrag --help` stays fast and free of heavy + deps. Uses a snapshot-diff approach: snapshot sys.modules keys before and + after build_parser(), then assert the delta contains no heavy modules. This + is robust under same-session pre-load pollution (other tests may have + already imported heavy deps; we only care that build_parser doesn't ADD + them). + """ + heavy = {"torch", "sentence_transformers", "mcp_v2", "ladybug_queries", "resolve_service"} + + # Snapshot module keys before build_parser(). + before = set(sys.modules.keys()) + + from java_codebase_rag.jrag import build_parser + build_parser() + + # Delta = modules added by build_parser(). + after = set(sys.modules.keys()) + added = after - before + leaked = added & heavy + assert not leaked, ( + f"build_parser() imported backend module(s): {sorted(leaked)} — " + "lazy-import invariant broken" + ) + + # Verify the parser lists the new commands. + parser = build_parser() + # The subparsers' actions include the subcommand dest. + sub_actions = [a for a in parser._actions if hasattr(a, "choices") and isinstance(a.choices, dict)] # noqa: SLF001 + if sub_actions: + commands = set(sub_actions[0].choices.keys()) + for expected in ("microservices", "map", "conventions", "overview", "search"): + assert expected in commands, f"missing {expected} in parser subcommands: {commands}" diff --git a/tests/test_jrag_render.py b/tests/test_jrag_render.py new file mode 100644 index 00000000..42a4db08 --- /dev/null +++ b/tests/test_jrag_render.py @@ -0,0 +1,662 @@ +"""Tests for java_codebase_rag.jrag_render (PR-JRAG-1a). + +Pure unit tests for the text renderer. Constructs envelopes directly (no graph +fixtures) so the render shapes are pinned independently of resolve / traversal +backends. 
+""" +from __future__ import annotations + +import json + +from java_codebase_rag.jrag_envelope import Envelope, simple_name +from java_codebase_rag.jrag_render import render, tiered_name + + +# ----- Test 11: listing omits FQN ----- + + +def test_render_listing_omits_fqn() -> None: + """Listing output is `name @service` only; FQN is never rendered.""" + env = Envelope( + status="ok", + nodes={ + "sym:1": {"fqn": "com.foo.Bar.doStuff", "microservice": "foo-svc"}, + "sym:2": {"fqn": "com.foo.Baz.handle", "microservice": "bar-svc"}, + }, + ) + out = render(env, fmt="text", noun="matches") + assert "com.foo.Bar.doStuff" not in out, f"FQN leaked into listing: {out!r}" + assert "com.foo.Baz.handle" not in out, f"FQN leaked into listing: {out!r}" + lines = out.splitlines() + assert "doStuff @foo-svc" in lines + assert "handle @bar-svc" in lines + + +def test_render_listing_zero_nodes_emits_zero_line() -> None: + env = Envelope(status="ok", nodes={}) + out = render(env, fmt="text", noun="matches") + assert out.strip() == "0 matches" + + +def test_display_name_handles_routes_clients_producers() -> None: + """display_name picks the identifying field per node kind (not FQN-only). + + Regression for routes rendering blank: routes have ``path``/``method``, not + ``fqn``; the old ``simple_name`` returned ``''`` and listings showed a bare + ``@service`` with no name. The same gap affected clients/producers. + """ + from java_codebase_rag.jrag_render import display_name + + # Route: METHOD path (no FQN at all). + route = {"kind": "http_endpoint", "method": "POST", "path": "/api/chat/send"} + assert display_name(route) == "POST /api/chat/send" + # Route with no method: bare path. + assert display_name({"kind": "http_endpoint", "path": "/health"}) == "/health" + # Client: member simple-name -> target service. 
+ client = { + "client_kind": "feign_method", + "target_service": "chat-assign", + "member_fqn": "com.bchat.Proc.send", + } + assert display_name(client) == "send → chat-assign" + # Producer: member simple-name -> topic. + producer = { + "producer_kind": "kafka_send", + "topic": "chat-messages", + "member_fqn": "com.bchat.Prod.send", + } + assert display_name(producer) == "send → chat-messages" + # Symbol fallback unchanged. + assert display_name({"fqn": "com.foo.Bar"}) == "Bar" + # Topic-only node (topics command grouping). + assert display_name({"topic": "chat-messages"}) == "chat-messages" + + +def test_render_listing_routes_shows_method_path_not_blank() -> None: + """A route listing row renders `METHOD path @service`, never a bare `@service`. + + Regression: routes carry no FQN; before ``display_name`` the listing emitted + `` @chat-core`` with a blank name (confusing — the user couldn't tell + routes apart across services). + """ + env = Envelope( + status="ok", + nodes={ + "r:1": { + "kind": "http_endpoint", + "method": "POST", + "path": "/api/chat/send", + "microservice": "chat-core", + }, + "r:2": { + "kind": "http_endpoint", + "method": "GET", + "path": "/api/chat/history", + "microservice": "chat-assign", + }, + }, + ) + out = render(env, fmt="text", noun="route") + lines = out.splitlines() + # Route rows are prefixed with a [http]/[kafka] type tag so an agent can tell + # HTTP endpoints apart from Kafka topics in a mixed listing. + assert "[http] POST /api/chat/send @chat-core" in lines, f"route row missing: {out!r}" + assert "[http] GET /api/chat/history @chat-assign" in lines, f"route row missing: {out!r}" + # No bare `@service` line (the bug signature: blank name + service suffix). 
+ assert not any(line.strip().startswith("@") for line in lines), ( + f"blank-name listing line leaked: {out!r}" + ) + + +# ----- Test 12: traversal conf: only on CALLS-family ----- + + +def test_render_traversal_conf_only_on_calls() -> None: + """conf=N.NN is rendered only for CALLS / HTTP_CALLS / ASYNC_CALLS edges.""" + env = Envelope( + status="ok", + root="sym:0", + nodes={ + "sym:0": {"fqn": "com.foo.Caller.call", "microservice": "svc"}, + "sym:1": {"fqn": "com.foo.Callee.a", "microservice": "svc"}, + "sym:2": {"fqn": "com.foo.Parent.b", "microservice": "svc"}, + }, + edges=[ + { + "edge_type": "CALLS", + "other_id": "sym:1", + "confidence": 0.92, + }, + { + "edge_type": "OVERRIDES", + "other_id": "sym:2", + "confidence": 0.8, # MUST NOT be rendered for OVERRIDES + }, + ], + ) + out = render(env, fmt="text", noun="callees") + # The CALLS edge row carries conf=0.92. + assert "conf=0.92" in out, f"missing conf on CALLS edge: {out!r}" + # The OVERRIDES edge row has no conf=, despite carrying a confidence value. + overrides_line = next(line for line in out.splitlines() if "Parent" in line or "b @" in line) + assert "conf=" not in overrides_line, f"conf leaked onto OVERRIDES edge: {overrides_line!r}" + + +def test_render_traversal_root_line_present() -> None: + env = Envelope( + status="ok", + root="sym:0", + nodes={"sym:0": {"fqn": "com.foo.Caller.call", "microservice": "svc"}}, + edges=[], + ) + out = render(env, fmt="text", noun="callees") + assert out.splitlines()[0].startswith("root: ") + + +def test_render_overrides_does_not_mislabel_as_supertypes() -> None: + """Regression (review finding D): overrides/overridden-by edges must NOT + render under `↑ supertypes:`/`↓ subtypes:` hierarchy headers. + + The producers used to set ``direction='up'/'down'`` on the edge rows, which + tripped the renderer's ``has_direction`` guard and routed them into the + hierarchy branch. The fix dropped the direction key; overrides is a flat + list. 
Tests previously asserted JSON only, so the mis-label was invisible. + """ + env = Envelope( + status="ok", + root="sym:0", + nodes={ + "sym:0": {"fqn": "com.foo.Impl", "microservice": "svc"}, + "sym:1": {"fqn": "com.foo.Base", "microservice": "svc"}, + }, + edges=[{"other_id": "sym:1", "edge_type": "OVERRIDES"}], + ) + out = render(env, fmt="text", noun="overrides") + assert "supertype" not in out.lower(), f"overrides mislabeled as supertypes: {out!r}" + assert "subtype" not in out.lower(), f"overrides mislabeled as subtypes: {out!r}" + # The overridden declaration IS rendered (flat row), not swallowed. + assert "Base" in out, f"overrides target not rendered: {out!r}" + + +def test_render_warnings_visible_in_text() -> None: + """Regression (review finding F): warnings[] render as `warning:` lines in + text mode. + + Previously warnings were JSON-only — the listing/inspect/traversal shapes + never emitted them, so the 'inapplicable flags never silently ignored' spec + was effectively unenforced for text consumers. The renderer now appends one + ``warning:`` line per warning after the body. + """ + env = Envelope( + status="ok", + nodes={"sym:1": {"fqn": "com.foo.Bar", "microservice": "svc"}}, + warnings=["--service is not applied on this command", "--limit is not applied on this command"], + ) + out = render(env, fmt="text", noun="matches") + assert "warning: --service is not applied on this command" in out, ( + f"warning not rendered in text mode: {out!r}" + ) + assert "warning: --limit is not applied on this command" in out, ( + f"second warning missing: {out!r}" + ) + + +# ----- Test 13: inspect edge_summary alphabetical ----- + + +def test_render_inspect_edge_summary_alphabetical() -> None: + """Inspect renders ALL dict keys alphabetically; edge_summary is indented + sorted. + + Inspect is now declared via the explicit ``shape="inspect"`` hint (no + longer inferred from node contents - a listing node with dict-valued + fields must NOT route to inspect). 
Callers like ``jrag status`` and the + future ``jrag inspect`` declare their shape; the renderer does not guess. + """ + env = Envelope( + status="ok", + nodes={ + "sym:1": { + # Top-level keys intentionally unsorted. + "fqn": "com.foo.Bar", + "kind": "class", + "name": "Bar", + "role": "SERVICE", + "edge_summary": { + # Edge summary keys intentionally unsorted. + "OVERRIDES": {"in": 0, "out": 3}, + "CALLS": {"in": 5, "out": 2}, + "EXTENDS": {"in": 0, "out": 1}, + }, + } + }, + ) + out = render(env, fmt="text", noun="inspect", shape="inspect", detail="full") + lines = out.splitlines() + # Top-level keys appear in alphabetical order. + keys_in_output = [ln.split(":", 1)[0] for ln in lines if ":" in ln and not ln.startswith(" ")] + # Filter out only the known top-level keys. + expected_top = ["edge_summary", "fqn", "kind", "name", "role"] + assert keys_in_output == expected_top, f"top keys not alphabetical: {keys_in_output}" + # edge_summary recurses (dict-of-dicts): each edge type is an indent-2 header + # (alphabetical) with its in/out counts indented one more level beneath. + summary_idx = next(i for i, ln in enumerate(lines) if ln.startswith("edge_summary:")) + header_lines = [ + ln for ln in lines[summary_idx + 1:] + if ln.startswith(" ") and not ln.startswith(" ") + ] + header_keys = [ln.split(":", 1)[0].strip() for ln in header_lines] + assert header_keys == ["CALLS", "EXTENDS", "OVERRIDES"], f"summary not sorted: {header_keys}" + # Each edge-type header is followed by its nested in/out counts. + assert " in: 5" in lines and " out: 2" in lines, f"nested counts missing: {out!r}" + + +def test_render_listing_with_dict_valued_node_does_not_route_to_inspect() -> None: + """A listing node carrying dict-valued fields (typical after .model_dump()) + must NOT silently route to inspect - dispatch is explicit via shape hint. + Regression for the structural-dispatch foot-gun flagged in re-review. 
+ """ + env = Envelope( + status="ok", + nodes={ + "sym:1": { + "fqn": "com.foo.Bar.doStuff", + "microservice": "svc", + # Symbol nodes typically carry dict-valued fields after + # .model_dump(): source_range, annotations, capabilities, etc. + "annotations": {"@Override": True}, + "source_range": {"start": 1, "end": 10}, + } + }, + ) + out = render(env, fmt="text", noun="matches") + # Listing shape: FQN is omitted (test 11 contract); only name + @service. + assert "com.foo.Bar.doStuff" not in out, ( + f"listing leaked FQN - routed to inspect by mistake: {out!r}" + ) + assert "doStuff @svc" in out.splitlines() + + +# ----- Test 14: ambiguous lists reason, no file/score ----- + + +def test_render_ambiguous_lists_reason_no_file() -> None: + """Ambiguous candidates carry `reason`; NO file or score columns.""" + env = Envelope( + status="ambiguous", + candidates=[ + { + "id": "sym:1", + "fqn": "com.foo.Bar.doStuff", + "name": "doStuff", + "microservice": "foo", + "reason": "fqn_suffix", + }, + { + "id": "sym:2", + "fqn": "com.foo.Baz.doStuff", + "name": "doStuff", + "microservice": "bar", + "reason": "short_name", + }, + ], + ) + out = render(env, fmt="text", noun="doStuff") + assert "ambiguous" in out + assert "fqn_suffix" in out + assert "short_name" in out + # No file path or score leaks into ambiguous output. + assert ".java" not in out + assert "score" not in out.lower() + + +# ----- Test 15: zero results vs not_found distinct ----- + + +def test_render_zero_results_vs_not_found_distinct() -> None: + """Zero-result ok envelope -> '0 '; not_found envelope -> 'not found: '.""" + zero_env = Envelope(status="ok", nodes={}, root="sym:1") + not_found_env = Envelope(status="not_found", message="No matches for 'foo'.") + + zero_out = render(zero_env, fmt="text", noun="callees") + nf_out = render(not_found_env, fmt="text", noun="callees") + + # Zero results line starts with "0 ". 
+ assert "0 callees" in zero_out, f"zero-results missing '0 ': {zero_out!r}" + assert "not found" not in zero_out, f"zero-results looks like not_found: {zero_out!r}" + + # not_found line is "not found: ". + assert nf_out.startswith("not found:"), f"not_found shape wrong: {nf_out!r}" + assert "No matches for 'foo'." in nf_out + assert "0 callees" not in nf_out, f"not_found looks like zero-results: {nf_out!r}" + + +# ----- Tests 16 / 17: truncated hint ----- + + +def test_render_truncated_narrow_query_for_non_offset_commands() -> None: + """Non-offset commands (traversal/listing) emit 'narrow your query'.""" + env = Envelope(status="ok", truncated=True, nodes={"sym:1": {"fqn": "com.foo.Bar"}}) + out = render(env, fmt="text", noun="callers", next_offset=None) + assert "truncated: more results — narrow your query" in out + assert "--offset" not in out, f"offset hint leaked on non-offset command: {out!r}" + + +def test_render_truncated_offset_hint_for_offset_commands() -> None: + """Offset commands (find/search) emit 'use --offset '.""" + env = Envelope(status="ok", truncated=True, nodes={"sym:1": {"fqn": "com.foo.Bar"}}) + out = render(env, fmt="text", noun="find", next_offset=40) + assert "truncated: more results — use --offset 40" in out + + +# ----- Test 18: json path (now via projection — PR-JRAG-6) ----- + + +def test_render_json_full_is_idfree_envelope_for_projection_invariant_data() -> None: + """``render(fmt="json")`` now projects the envelope to the requested detail + level (orthogonal to text). For projection-invariant data (only identity + fields) and ``detail="full"``, the output still equals ``env.to_json()`` — + pinning that the json path is a plain ``json.dumps`` of the projected dict + with no extra decoration. Field-set trimming itself is pinned by the + orthogonality test below. 
+ """ + env = Envelope( + status="ok", + root="sym:1", + nodes={"sym:1": {"id": "sym:1", "fqn": "com.foo.Bar"}}, + warnings=["partial"], + ) + out = render(env, fmt="json", detail="full") + assert out == env.to_json() + parsed = json.loads(out) + assert parsed["status"] == "ok" + # root + node key are the FQN (the node's natural key), NOT the graph id; + # the internal ``id`` field is stripped at the boundary. + assert parsed["root"] == "com.foo.Bar" + assert parsed["nodes"] == {"com.foo.Bar": {"fqn": "com.foo.Bar"}} + assert parsed["warnings"] == ["partial"] + + +# ----- Test 19: simple_name derived from FQN (NodeRef has no `name`) ----- + + +def test_simple_name_derived_from_fqn() -> None: + """NodeRef carries no `name` field; simple_name derives a short label from FQN. + + A pydantic NodeRef crosses the model_dump boundary as a dict, then + simple_name extracts the simple name from the FQN. + """ + from graph_types import NodeRef + + ref = NodeRef(id="sym:1", kind="symbol", fqn="com.example.MyClass.handle") + row = ref.model_dump() + assert "name" not in row or row.get("name") is None + assert simple_name(row) == "handle" + assert simple_name({"fqn": "com.foo.Bar"}) == "Bar" + assert simple_name({"fqn": ""}) == "" + assert simple_name({}) == "" + + +# ----- Bonus: tiered_name tiers ----- + + +def test_tiered_name_prefers_name_at_service() -> None: + nodes = {"sym:1": {"fqn": "com.foo.Bar.doStuff", "microservice": "foo-svc"}} + assert tiered_name("sym:1", nodes) == "doStuff @foo-svc" + + +def test_tiered_name_falls_back_to_fqn_when_no_service() -> None: + nodes = {"sym:1": {"fqn": "com.foo.Bar.doStuff"}} + # No service: just the simple name (still derived from FQN). 
+ assert tiered_name("sym:1", nodes) == "doStuff" + + +def test_tiered_name_unknown_id_returns_id() -> None: + assert tiered_name("sym:unknown", {}) == "sym:unknown" + + +# ----- PR-JRAG-6: --detail orthogonality (text & json share the field set) ----- + + +def _search_listing_env() -> Envelope: + """A search-results envelope carrying score + snippet + empty fields.""" + return Envelope( + status="ok", + nodes={ + "chunk:1": { + "id": "chunk:1", + "kind": "search_hit", + "fqn": "com.foo.Bar", + "name": "Bar", + "microservice": "chat", + "module": "core", + "role": "SERVICE", + "score": 0.91, + "snippet": "public class Bar {\n void x();\n}", + "symbol_id": None, # empty — must vanish in json + } + }, + ) + + +def test_json_and_text_share_field_set_at_each_detail() -> None: + """Core orthogonality: at a given detail level, the SAME node keys appear + behind both ``--format json`` and ``--format text`` (the projector is the + single seam). The text line shows identity; the json dict shows the exact + projected key set. This is the whole point of PR-JRAG-6. + """ + env = _search_listing_env() + for detail, expected_keys in ( + # ``id`` is stripped at every level (graph ids are not agent-facing); + # nodes are keyed by their natural key (FQN), so look up "com.foo.Bar". + ("brief", {"kind", "fqn", "name", "microservice"}), + ("normal", {"kind", "fqn", "name", "microservice", + "module", "role", "score"}), # +file only if filename present + ("full", {"kind", "fqn", "name", "microservice", + "module", "role", "score", "snippet"}), + ): + parsed = json.loads(render(env, fmt="json", detail=detail)) + assert set(parsed["nodes"]["com.foo.Bar"].keys()) == expected_keys, ( + f"{detail}: json key set {set(parsed['nodes']['com.foo.Bar'].keys())} != {expected_keys}" + ) + # The text output at the same level shows the same identity label, and + # does NOT show keys the projector dropped (snippet at brief/normal). 
+ text = render(env, fmt="text", noun="search", detail=detail) + assert "Bar @chat" in text, f"{detail}: identity label missing in text" + if detail == "full": + assert "void x();" in text, f"{detail}: snippet should render in full text" + else: + assert "void x();" not in text, f"{detail}: snippet leaked into {detail} text" + + +def test_listing_normal_appends_file_role_score_inline() -> None: + """normal text appends module/role/score/file inline on the SAME line. + + Direct fix for the 'text too terse (no file/score)' complaint. + """ + env = Envelope( + status="ok", + nodes={ + "sym:1": { + "id": "sym:1", "kind": "symbol", "fqn": "com.foo.Svc.find", "name": "find", + "microservice": "chat", "module": "core", "role": "SERVICE", "score": 0.77, + "filename": "src/Svc.java", "start_line": 12, + } + }, + ) + line = render(env, fmt="text", noun="symbol", detail="normal").splitlines()[0] + assert line.startswith("find @chat") + assert "module=core" in line and "role=SERVICE" in line and "score=0.77" in line + assert "file=src/Svc.java:12" in line + + +def test_listing_full_appends_indented_block() -> None: + """full text appends a per-row indented kv-block of the content fields.""" + env = Envelope( + status="ok", + nodes={ + "sym:1": { + "id": "sym:1", "kind": "symbol", "fqn": "com.foo.Svc.find", "name": "find", + "microservice": "chat", "module": "core", "role": "SERVICE", + "signature": "find(Long)", "annotations": ["@Override"], + "filename": "src/Svc.java", "start_line": 12, + } + }, + ) + out = render(env, fmt="text", noun="symbol", detail="full") + lines = out.splitlines() + assert lines[0].startswith("find @chat") + # Content fields render as an indented block under the row. 
+ assert " signature: find(Long)" in lines, f"full block missing signature: {out!r}" + assert " annotations:" in out, f"full block missing annotations: {out!r}" + + +def test_edge_line_normal_appends_mechanism() -> None: + """normal edge line appends mechanism over the brief conf-only form.""" + env = Envelope( + status="ok", + root="sym:0", + nodes={ + "sym:0": {"fqn": "com.foo.Svc", "microservice": "svc"}, + "sym:1": {"fqn": "com.foo.Repo", "microservice": "svc"}, + }, + edges=[{"other_id": "sym:1", "edge_type": "INJECTS", "mechanism": "field"}], + ) + normal = render(env, fmt="text", noun="dependencies", detail="normal") + brief = render(env, fmt="text", noun="dependencies", detail="brief") + assert "mechanism=field" in normal, f"normal edge missing mechanism: {normal!r}" + assert "mechanism=" not in brief, f"brief edge leaked mechanism: {brief!r}" + + +def test_search_text_normal_shows_score_not_snippet() -> None: + """Regression for the complaint: text used to drop BOTH score and snippet. + + At normal, score is now visible; the snippet stays opt-in (full only). + """ + out = render(_search_listing_env(), fmt="text", noun="search", detail="normal") + assert "score=0.91" in out, f"normal search text missing score: {out!r}" + assert "void x();" not in out, f"normal search text leaked snippet: {out!r}" + + +def test_search_json_normal_omits_snippet_drops_empty_fields() -> None: + """Regression for the complaint: json used to dump the full snippet + every + None field. 
At normal, snippet is gone AND symbol_id (None) is dropped.""" + parsed = json.loads(render(_search_listing_env(), fmt="json", detail="normal")) + node = parsed["nodes"]["com.foo.Bar"] # keyed by natural key (FQN), not chunk_id + assert "snippet" not in node, f"normal json leaked snippet: {node!r}" + assert "symbol_id" not in node, f"normal json kept empty symbol_id: {node!r}" + assert node["score"] == 0.91 + + +# ----- Traversal label disambiguation + text/json detail parity ----- +# +# Regression for the `jrag callees/callers "SlaService"` complaint: the text +# rows were bare method names (getId x4, process x5, create x2) with no +# declaring class, and text/json diverged at the same --detail level (json +# carried module/role/file; text showed only `name @service conf`). Two fixes: +# 1. display_name renders method symbols as `Class#method`. +# 2. _format_edge_rows honors --detail symmetrically with _render_listing. + + +def test_display_name_method_includes_declaring_class() -> None: + """A method symbol (``pkg.Class#method(args)``) renders as ``Class#method``. + + Bare method names collide across classes (getId / process / create); the + declaring class is identity-level disambiguation and folds into the label. + """ + from java_codebase_rag.jrag_render import display_name + + # Method FQN with carried name -> Class#name (args stripped, name preferred). + assert display_name({ + "fqn": "com.bank.chat.contracts.InternalEvent#create(String,EventType)", + "name": "create", + }) == "InternalEvent#create" + # name absent -> method name derived from the FQN tail (args stripped). + assert display_name({"fqn": "com.foo.Repo#findById(Long)"}) == "Repo#findById" + # Class FQN (no '#') is unchanged -> simple name. 
+ assert display_name({"fqn": "com.foo.SlaService", "name": "SlaService"}) == "SlaService" + assert display_name({"fqn": "com.foo.SlaService"}) == "SlaService" + + +def _traversal_env() -> Envelope: + """root class Symbol -> one CALLS edge to a method Symbol callee. + + Carries module/role/file (normal-tier) AND signature/annotations/modifiers + (full-tier) so the text/json parity at each level is assertable. + """ + return Envelope( + status="ok", + root="sym:0", + nodes={ + "sym:0": { + "kind": "symbol", "fqn": "com.foo.Svc", "name": "Svc", + "microservice": "chat", "module": "core", "role": "SERVICE", + "symbol_kind": "class", + }, + "sym:1": { + "kind": "symbol", + "fqn": "com.foo.Repo#findById(Long)", + "name": "findById", + "microservice": "chat", "module": "domain", "role": "REPOSITORY", + "symbol_kind": "method", + "signature": "findById(Long)", + "annotations": ["@Override"], + "modifiers": ["public"], + "package": "com.foo", + "filename": "src/Repo.java", "start_line": 42, + }, + }, + edges=[{"edge_type": "CALLS", "other_id": "sym:1", "confidence": 0.88}], + ) + + +def test_traversal_normal_text_carries_same_fields_as_json() -> None: + """At normal, a callees/callers text line shows the SAME node fields JSON + shows (module/role/file), and the label is the disambiguated Class#method. + + Pre-fix the text line was ``findById @chat conf=0.88`` only — no declaring + class, no module, no file — while JSON carried all three. + """ + env = _traversal_env() + text = render(env, fmt="text", noun="callees", detail="normal") + # Declaring class disambiguates the method target. + assert "Repo#findById @chat" in text, f"Class#method label missing: {text!r}" + # Inline extras match the listing normal-tier set (and JSON normal node keys). 
+ assert "module=domain" in text, f"module missing: {text!r}" + assert "role=REPOSITORY" in text, f"role missing: {text!r}" + assert "file=src/Repo.java:42" in text, f"file missing: {text!r}" + assert "conf=0.88" in text + # Root line is enriched the same way. + assert "root: Svc @chat" in text and "module=core" in text, f"root not enriched: {text!r}" + # signature/annotations stay out of normal (they are full-tier) — JSON parity. + assert "@Override" not in text, f"annotation leaked into normal: {text!r}" + # JSON at the same level carries exactly these keys on the callee node. + parsed = json.loads(render(env, fmt="json", detail="normal")) + callee = parsed["nodes"]["com.foo.Repo#findById(Long)"] + assert {"module", "role", "file"} <= set(callee.keys()), callee + assert "signature" not in callee, f"json normal leaked signature: {callee!r}" + + +def test_traversal_full_text_renders_per_edge_content_block() -> None: + """At full, a per-edge indented block renders the callee's content fields + (signature/annotations/modifiers/package), matching listing full and JSON + full. + + Pre-fix ``--detail full`` was byte-identical to brief for traversals: the + full branch walked edge attrs only (confidence, already shown), never node + attrs, so the promised signature/annotations never appeared in text. + """ + env = _traversal_env() + text = render(env, fmt="text", noun="callees", detail="full") + lines = text.splitlines() + # Header line: disambiguated label + conf (no inline extras at full). + assert any(ln.startswith(" Repo#findById @chat") and "conf=0.88" in ln for ln in lines), text + # Content fields render as a block NESTED under the edge row (4-space indent). + assert " signature: findById(Long)" in lines, f"full block missing signature: {text!r}" + assert " annotations: @Override" in lines, f"full block missing annotations: {text!r}" + assert " modifiers: public" in lines, f"full block missing modifiers: {text!r}" + # Root also gets a nested block at full. 
+ assert " role: SERVICE" in lines, f"root block missing: {text!r}" + # JSON full carries the same content fields on the callee node. + parsed = json.loads(render(env, fmt="json", detail="full")) + callee = parsed["nodes"]["com.foo.Repo#findById(Long)"] + assert {"signature", "annotations", "modifiers", "package"} <= set(callee.keys()), callee diff --git a/tests/test_jrag_status.py b/tests/test_jrag_status.py new file mode 100644 index 00000000..7ae24b47 --- /dev/null +++ b/tests/test_jrag_status.py @@ -0,0 +1,174 @@ +"""Tests for `jrag status` and the PR-JRAG-1a CLI shell (PR-JRAG-1a). + +Tests: +20. ``test_status_reports_ontology_version_and_counts`` - real index -> exit 0, + output mentions ontology 17 and counts. +21. ``test_missing_index_returns_actionable_error`` - empty index dir -> + ``status: error`` envelope mentioning ``java-codebase-rag init``, exit 2 + (NOT a traceback crash). +22. ``test_offset_is_not_a_global_flag`` - ``jrag callers --offset 5`` is a + usage error (offset is never registered globally; traversal commands don't + take it in 1a). + +Plus a subprocess smoke test for ``jrag --help``. 
+""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + input=stdin, + check=False, + ) + + +# ----- Test 20: status reports ontology version + counts ----- + + +def test_status_reports_ontology_version_and_counts( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """`jrag status` against a real index reports ontology 17 + non-empty counts.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + # The entry point runs in a fresh subprocess; no in-process LadybugGraph + # singleton state leaks across. + proc = _run_jrag(["status", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"status failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + + payload = json.loads(proc.stdout) + assert payload["status"] == "ok" + index = payload["nodes"]["index"] + assert index["ontology_version"] == 17 + # Counts is a top-level nested dict on the index node (the generic + # nested-sections dispatch signal - any dict-typed value renders as an + # indented alphabetical section; edge_summary is NOT used as the dispatch + # key, it is reserved for real edge data in PR-JRAG-3 inspect). 
+ counts = index["counts"] + # Counts is non-empty and has at least one positive counter (the fixture + # has real Symbols / EXTENDS / INJECTS — see conftest ladybug_db_path). + assert counts, f"counts dict empty: {payload}" + assert any(int(v or 0) > 0 for v in counts.values()), f"all counts zero: {counts}" + # edge_summary is NOT populated by status (reserved for real inspect edge + # data in PR-JRAG-3). + assert "edge_summary" not in index + + +# ----- Test 21: missing index -> actionable error envelope ----- + + +def test_missing_index_returns_actionable_error(tmp_path: Path) -> None: + """Pointing `jrag status` at an empty dir -> status: error envelope, NOT a crash.""" + empty_idx = tmp_path / "does-not-exist" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(tmp_path) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(empty_idx) + proc = _run_jrag(["status", "--format", "json"], env=env) + + assert proc.returncode == 2, f"expected exit 2, got {proc.returncode}\nstdout={proc.stdout}" + payload = json.loads(proc.stdout) + assert payload["status"] == "error" + msg = payload.get("message") or "" + # Actionable message must reference the operator init command. + assert "java-codebase-rag init" in msg, f"missing init hint: {msg!r}" + # The hint must include the literal `--source-root` flag form. + assert "--source-root" in msg + # No traceback leaked to stdout (would break JSON parse); stderr may carry + # nothing because we route errors through the envelope, not tracebacks. 
+ assert "Traceback" not in proc.stdout + + +def test_missing_index_text_format_emits_actionable_envelope(tmp_path: Path) -> None: + """Same path, default text format - error envelope must still be parseable.""" + empty_idx = tmp_path / "missing" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(tmp_path) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(empty_idx) + proc = _run_jrag(["status"], env=env) + assert proc.returncode == 2 + assert proc.stdout.startswith("error:") + assert "java-codebase-rag init" in proc.stdout + + +# ----- Test 22: --offset is NOT a global flag ----- +# +# The original brief listed `test_offset_is_not_a_global_flag` as +# `jrag callers --offset 5`, but `callers` is not a registered subcommand in +# 1a, so argparse rejects the *subcommand* (invalid choice) before it ever +# sees `--offset`. That test would pass for the WRONG reason and would not +# catch a regression that added `--offset` to the `common` parent parser. +# +# The contract ("offset is not global") is honestly covered by the three +# siblings below plus `test_jrag_help_lists_status_subcommand` (which asserts +# `--offset` is absent from the rendered `--help`). + + +def test_offset_not_accepted_on_status_subparser() -> None: + """`jrag status --offset 5` is a usage error: status has no --offset. + + `status` IS a registered subcommand in 1a, so this is the honest test that + `--offset` is not on the per-command common parser. + """ + env = os.environ.copy() + proc = _run_jrag(["status", "--offset", "5"], env=env) + assert proc.returncode != 0 + assert "Traceback" not in proc.stderr + assert "--offset" in proc.stderr or "unrecognized arguments" in proc.stderr.lower() + + +def test_offset_not_accepted_before_subcommand() -> None: + """`jrag --offset 5 status` is a usage error: --offset is not a top-level flag. + + This is the key "not on the parent parser" test. 
argparse sees the unknown + ``--offset`` and then treats ``5`` as the subcommand choice, which is + invalid - either way the command is rejected with a clean message (no + traceback) and non-zero exit. + """ + env = os.environ.copy() + proc = _run_jrag(["--offset", "5", "status"], env=env) + assert proc.returncode != 0 + assert "Traceback" not in proc.stderr + # Some helpful rejection text appears (specific message varies by parse path). + assert proc.stderr.strip() != "" + + +# ----- Smoke: jrag --help ----- + + +def test_jrag_help_lists_status_subcommand() -> None: + """`jrag --help` exits 0 and lists `status` under subcommands.""" + env = os.environ.copy() + proc = _run_jrag(["--help"], env=env) + assert proc.returncode == 0 + assert "status" in proc.stdout + # The --offset flag must NOT appear in the top-level help. + assert "--offset" not in proc.stdout diff --git a/tests/test_jrag_token_budget.py b/tests/test_jrag_token_budget.py new file mode 100644 index 00000000..d8f1648d --- /dev/null +++ b/tests/test_jrag_token_budget.py @@ -0,0 +1,127 @@ +"""Token-budget guard for `jrag` default (text) output (PR-JRAG-4, §14). + +Test 19: test_no_default_output_exceeds_token_ceiling + +Asserts that no jrag command's default text output exceeds a token ceiling on +the bank-chat fixture. This prevents output bloat from blowing the agent's +context window as fields accrete over time. + +Token estimation: chars / 4 (a common heuristic for English/code text; the +actual ratio for this CLI is closer to 3.5–4). The ceiling is generous (4000 +tokens ≈ 16000 chars) to allow room for legitimately large traversals (e.g. +``decompose`` with multi-stage flows) while still catching runaway growth. + +Commands that need a ```` use seed identifiers verified against the +bank-chat fixture (see test_jrag_traversal_direct.py). ``search`` is excluded +from this guard because it requires the Lance index (heavy); it has its own +truncation via +1-fetch and ``--limit``. 
+""" +from __future__ import annotations + +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag_text( + args: list[str], + *, + env: dict[str, str] | None = None, +) -> subprocess.CompletedProcess: + """Run jrag in text mode (default) and return the completed process.""" + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + check=False, + ) + + +# Token ceiling: ~4000 tokens (≈16000 chars). Generous enough for multi-stage +# decompose flows, tight enough to catch bloat. +_TOKEN_CEILING = 4000 +_CHARS_PER_TOKEN = 4 + + +def _estimate_tokens(text: str) -> int: + """Rough token estimate: chars / 4.""" + return len(text) // _CHARS_PER_TOKEN + + +# Commands and their args. Queries use seed identifiers from the bank-chat +# fixture (verified in test_jrag_traversal_direct.py). Each tuple is +# (label, args-list). Commands that take use a known-good seed. 
+_SEED_METHOD = "com.bank.chat.assign.service.ChatManagementService#assign(AssignmentRequest)" +_SEED_TYPE = "com.bank.chat.engine.notification.AbstractNotificationSender" +_SEED_FILE = "chat-assign/src/main/java/com/bank/chat/assign/service/ChatManagementService.java" + + +def test_no_default_output_exceeds_token_ceiling( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """No jrag command's default text output exceeds the token ceiling.""" + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + + commands: list[tuple[str, list[str]]] = [ + # Orientation + ("status", ["status"]), + ("microservices", ["microservices"]), + ("map", ["map"]), + ("conventions", ["conventions"]), + ("overview-microservice", ["overview", "chat-assign"]), + ("overview-route", ["overview", "/chat/assign"]), + ("overview-topic", ["overview", "banking.chat.compliance.review"]), + # Locate + ("find-query", ["find", "ChatManagementService"]), + ("find-filter", ["find", "--role", "CONTROLLER"]), + ("inspect", ["inspect", "ChatManagementService"]), + ("outline", ["outline", _SEED_FILE]), + ("imports", ["imports", _SEED_FILE]), + # Listings + ("routes", ["routes"]), + ("clients", ["clients"]), + ("producers", ["producers"]), + ("topics", ["topics"]), + ("jobs", ["jobs"]), + ("listeners", ["listeners"]), + ("entities", ["entities"]), + # Traversals + ("callers", ["callers", _SEED_METHOD]), + ("callees", ["callees", _SEED_METHOD]), + ("hierarchy", ["hierarchy", _SEED_TYPE]), + ("dependents", ["dependents", _SEED_TYPE]), + ("dependencies", ["dependencies", _SEED_TYPE]), + ("impact", ["impact", _SEED_TYPE, "--depth", "1"]), + ("connection", ["connection", "chat-assign"]), + ("flow", ["flow", "/chat/assign"]), + ] + + violations: list[str] = [] + for label, args in commands: + proc = _run_jrag_text(args, env=env) + output = proc.stdout + tokens = _estimate_tokens(output) + if tokens > 
_TOKEN_CEILING: + violations.append( + f"{label}: {tokens} tokens ({len(output)} chars) > {_TOKEN_CEILING} ceiling" + ) + + assert not violations, ( + f"token-budget violations on {len(violations)} command(s):\n " + + "\n ".join(violations) + ) diff --git a/tests/test_jrag_traversal_compose.py b/tests/test_jrag_traversal_compose.py new file mode 100644 index 00000000..ae48b043 --- /dev/null +++ b/tests/test_jrag_traversal_compose.py @@ -0,0 +1,723 @@ +"""Tests for `jrag` compose traversals + connection + outline/imports (PR-JRAG-3b). + +Five new commands sit on top of the PR-JRAG-3a foundation: + * ``callees`` Client/Producer variant (Symbol path is unchanged from 3a). + Client root -> neighbors_v2([id], "out", ["HTTP_CALLS"]) reaching :Route. + Producer root -> neighbors_v2([id], "out", ["ASYNC_CALLS"]) reaching :Route + (the kafka_topic Route this producer publishes to, NOT :Producer). + * ``dependencies`` -> neighbors_v2([id], "out", ["INJECTS"]) (Symbol -> Symbol). + * ``connection `` -- multi-section inbound:/outbound: view. + RESOLVE-FIRST EXCEPTION: the first positional is a microservice NAME. + * ``outline `` -> find_symbols_in_file_range(start_line=1, end_line=2**31-1). + * ``imports `` -> ast_java.parse_java + resolve_v2 per imported FQN. + +Tests (bank-chat fixture): + 1. test_callees_client_reaches_route_via_http_calls + 2. test_callees_producer_reaches_route_topic_via_async_calls + 3. test_dependencies_composes_neighbors_out_injects + 4. test_connection_inbound_lists_external_callers + 5. test_connection_outbound_lists_this_service_clients + 6. test_connection_both_default + 7. test_connection_http_method_filter + 8. test_connection_first_positional_is_microservice_not_query + 9. test_outline_lists_file_symbols +10. test_outline_empty_for_missing_file +11. test_imports_resolves_graph_nodes +12. test_outline_and_import_reject_offset_or_document_unbounded +13. test_connection_calls_service_outbound_excludes_unresolved_clients (review Fix 2) +14. 
test_imports_text_mode_marks_unresolved (review Fix 3) + +Backend signatures verified against source at PR-JRAG-3b time: + * neighbors_v2 (mcp_v2.py:1284) returns NeighborsOutput.results: list[Edge] + where Edge.other: NodeRef, Edge.edge_type, Edge.attrs. + * find_symbols_in_file_range (ladybug_queries.py:302) requires start_line>=1 + (returns [] otherwise); returns list[SymbolHit]. + * list_clients / list_producers return list[dict] (plain dicts). + * find_route_callers returns list[RouteCaller] (caller_node_kind: client|producer). + * parse_java (ast_java.py:2612) -> JavaFileAst.explicit_imports: dict[str,str]. + * Edge directions confirmed in java_ontology.py: + - HTTP_CALLS: Client -> Route (line 352) + - ASYNC_CALLS: Producer -> Route (line 386) + - INJECTS: Symbol -> Symbol (line 216) +""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + env=env, + input=stdin, + check=False, + ) + + +def _env_for(corpus_root: Path, ladybug_db_path: Path) -> dict[str, str]: + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + return env + + +# Seed identifiers verified against the bank-chat fixture (PR-JRAG-3b probe). 
+# Client resolve: resolve_v2 accepts " " (the +# `client_target_path` reason); "chat-core /api/v1/chat/sessions" resolves +# cleanly to ONE FeignClient (the getSession method), avoiding the +# /chat/joinOperator ambiguity (Feign + RestTemplate both target that path). +_CLIENT_GETSESSION = "chat-core /api/v1/chat/sessions" +# Producer resolve: a unique topic literal resolves to one Producer node. +_PRODUCER_AUDIT_DLQ = "banking.chat.audit.dlq" +# Type with injections (ClientMessageProcessor injects ChatAssignmentPort, +# ComplianceScanner, FollowUpKafkaPublisher, RejectionPublisher, etc). +_INJECTOR_TYPE = "com.bank.chat.engine.processors.ClientMessageProcessor" +# File path stored in the graph (POSIX-relative to source root; build_ast_graph +# line 534: rel_path = abs_path_resolved.relative_to(source_root).as_posix()). +_OUTLINE_FILE = ( + "chat-assign/src/main/java/com/bank/chat/assign/integration/ChatCoreFeignClient.java" +) + + +# ----- Test 1: callees (Client) reaches :Route via HTTP_CALLS out ----- + + +def test_callees_client_reaches_route_via_http_calls( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """Client root -> neighbors_v2([id], 'out', ['HTTP_CALLS']) reaching :Route. + + resolve_v2('chat-core /api/v1/chat/sessions', hint_kind='client') gives a + single FeignClient (the getSession method). HTTP_CALLS is Client -> Route + (java_ontology.py:352), so 'out' dispatches to the chat-core :Route the + client targets. The endpoint MUST be a :Route (not another :Client). 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["callees", _CLIENT_GETSESSION, "--kind", "client", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"callees client failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id (the Client)" + edges = payload.get("edges", []) + assert len(edges) >= 1, f"expected >=1 HTTP_CALLS edge, got {edges}" + # Every edge MUST be HTTP_CALLS (the Client root variant). + for e in edges: + assert e.get("edge_type") == "HTTP_CALLS", ( + f"expected HTTP_CALLS edge, got {e.get('edge_type')}" + ) + # Every edge endpoint MUST be a :Route (the kafka_topic analog for HTTP). + nodes = payload.get("nodes", {}) + for e in edges: + ep = nodes.get(e.get("target"), {}) + assert ep.get("kind") == "route", ( + f"expected edge endpoint kind=route, got {ep.get('kind')!r} on {ep}" + ) + + +# ----- Test 2: callees (Producer) reaches :Route (kafka_topic) via ASYNC_CALLS out ----- + + +def test_callees_producer_reaches_route_topic_via_async_calls( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """Producer root -> neighbors_v2([id], 'out', ['ASYNC_CALLS']) reaching :Route. + + resolve_v2('banking.chat.audit.dlq', hint_kind='producer') resolves to one + Producer node (EventStreamBridge#sendToAudit producing to .dlq). ASYNC_CALLS + is Producer -> Route (java_ontology.py:386), so 'out' dispatches to the + kafka_topic :Route this producer publishes to (NOT a :Producer node). 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["callees", _PRODUCER_AUDIT_DLQ, "--kind", "producer", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"callees producer failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id (the Producer)" + edges = payload.get("edges", []) + assert len(edges) >= 1, f"expected >=1 ASYNC_CALLS edge, got {edges}" + for e in edges: + assert e.get("edge_type") == "ASYNC_CALLS", ( + f"expected ASYNC_CALLS edge, got {e.get('edge_type')}" + ) + # The endpoint MUST be a :Route (the kafka_topic), NOT a :Producer. + nodes = payload.get("nodes", {}) + for e in edges: + ep = nodes.get(e.get("target"), {}) + assert ep.get("kind") == "route", ( + f"expected edge endpoint kind=route (kafka_topic), got {ep.get('kind')!r}" + ) + + +# ----- Test 3: dependencies composes neighbors(out, INJECTS) ----- + + +def test_dependencies_composes_neighbors_out_injects( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """dependencies on a type returns the types it injects (INJECTS out). + + ClientMessageProcessor injects ChatAssignmentPort (verified in the fixture; + also ComplianceScanner, ClientMessageRateLimiter, etc). INJECTS is Symbol -> + Symbol (java_ontology.py:216), so 'out' dispatches to the injected types. + The endpoint MUST be a Symbol. 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["dependencies", _INJECTOR_TYPE, "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"dependencies failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + assert len(edges) >= 1, f"expected >=1 INJECTS edge, got {edges}" + for e in edges: + assert e.get("edge_type") == "INJECTS", ( + f"expected INJECTS edge, got {e.get('edge_type')}" + ) + # The endpoint MUST be a Symbol (INJECTS is Symbol -> Symbol). + nodes = payload.get("nodes", {}) + injected_fqns = [] + for e in edges: + ep = nodes.get(e.get("target"), {}) + assert ep.get("kind") == "symbol", ( + f"expected edge endpoint kind=symbol, got {ep.get('kind')!r}" + ) + injected_fqns.append(ep.get("fqn", "")) + # ClientMessageProcessor injects ChatAssignmentPort. + assert any("ChatAssignmentPort" in fqn for fqn in injected_fqns), ( + f"expected ChatAssignmentPort in injected types, got {injected_fqns}" + ) + + +# ----- Test 4: connection --inbound lists external callers ----- + + +def test_connection_inbound_lists_external_callers( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """connection chat-core --inbound lists chat-assign clients targeting chat-core. + + chat-assign has ChatCoreFeignClient + ChatCoreJoinClient targeting chat-core + (verified via list_clients(target_service='chat-core')). The inbound section + MUST surface at least one of them with edge_type=HTTP_CALLS, section=inbound. 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["connection", "chat-core", "--inbound", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"connection inbound failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id (synthetic microservice)" + edges = payload.get("edges", []) + inbound = [e for e in edges if e.get("section") == "inbound"] + assert len(inbound) >= 1, ( + f"expected >=1 inbound edge from chat-assign, got {inbound}" + ) + # All inbound edges are HTTP_CALLS or ASYNC_CALLS. + for e in inbound: + assert e.get("edge_type") in ("HTTP_CALLS", "ASYNC_CALLS"), ( + f"expected HTTP/ASYNC_CALLS, got {e.get('edge_type')}" + ) + # The synthetic microservice root node must be present and labeled. + nodes = payload.get("nodes", {}) + root_node = nodes.get(payload["root"], {}) + assert root_node.get("kind") == "microservice", ( + f"expected synthetic microservice root, got {root_node}" + ) + assert root_node.get("name") == "chat-core", ( + f"expected root name 'chat-core', got {root_node.get('name')}" + ) + # At least one chat-assign caller MUST be present (the test's main invariant). + caller_services = { + nodes.get(e.get("target"), {}).get("microservice", "") + for e in inbound + } + assert "chat-assign" in caller_services, ( + f"expected chat-assign in inbound caller services, got {caller_services}" + ) + + +# ----- Test 5: connection --outbound lists this service's clients/producers ----- + + +def test_connection_outbound_lists_this_service_clients( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """connection chat-assign --outbound lists chat-assign's clients + producers. + + chat-assign has ChatCoreFeignClient + ChatCoreJoinClient (HTTP) and + DistributionTriggerPublisher (Kafka). 
The outbound section MUST surface at + least one HTTP_CALLS and (when indexed) at least one ASYNC_CALLS, all + carrying section=outbound. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["connection", "chat-assign", "--outbound", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"connection outbound failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + edges = payload.get("edges", []) + outbound = [e for e in edges if e.get("section") == "outbound"] + assert len(outbound) >= 1, ( + f"expected >=1 outbound edge from chat-assign, got {outbound}" + ) + # No inbound edges when --outbound only. + assert all(e.get("section") == "outbound" for e in edges), ( + f"expected only outbound edges, got sections={ {e.get('section') for e in edges} }" + ) + # HTTP outbound MUST be present (chat-assign's two clients target chat-core). + http_out = [e for e in outbound if e.get("edge_type") == "HTTP_CALLS"] + assert len(http_out) >= 1, f"expected >=1 outbound HTTP_CALLS, got {http_out}" + + +# ----- Test 6: connection default direction is --both ----- + + +def test_connection_both_default(corpus_root: Path, ladybug_db_path: Path) -> None: + """connection with no direction flag defaults to --both (full picture). + + The default is --both so `connection ` shows inbound + outbound: an + inbound-only default hid a service's outbound HTTP clients unless the agent + remembered `--both`, making services look connectionless. `--inbound` / + `--outbound` remain explicit opt-ins for a single direction. + """ + env = _env_for(corpus_root, ladybug_db_path) + # Default (no flag) MUST equal explicit --both. 
+ proc_default = _run_jrag( + ["connection", "chat-core", "--format", "json"], + env=env, + ) + assert proc_default.returncode == 0, ( + f"connection default failed: {proc_default.stderr}" + ) + payload_default = json.loads(proc_default.stdout) + assert payload_default["status"] == "ok" + sections_default = {e.get("section") for e in payload_default.get("edges", [])} + + proc_both = _run_jrag( + ["connection", "chat-core", "--both", "--format", "json"], + env=env, + ) + assert proc_both.returncode == 0 + payload_both = json.loads(proc_both.stdout) + sections_both = {e.get("section") for e in payload_both.get("edges", [])} + + assert sections_default == sections_both, ( + f"default {sections_default} != explicit --both {sections_both}" + ) + # Default MUST include outbound (the whole point: show the full picture). + assert "outbound" in sections_default, ( + f"default direction should be --both (include outbound), got {sections_default}" + ) + + # --inbound is the explicit opt-in for inbound-only (no outbound leakage). + proc_inbound = _run_jrag( + ["connection", "chat-core", "--inbound", "--format", "json"], + env=env, + ) + assert proc_inbound.returncode == 0 + payload_inbound = json.loads(proc_inbound.stdout) + sections_inbound = {e.get("section") for e in payload_inbound.get("edges", [])} + assert "outbound" not in sections_inbound, ( + f"--inbound should be inbound-only, got {sections_inbound}" + ) + + +# ----- Test 7: --http-method filters HTTP callers ----- + + +def test_connection_http_method_filter( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--http-method POST narrows inbound HTTP callers to POST only. + + Without the filter, chat-core inbound has at least one POST (joinOperator) + and one GET (api/v1/chat/sessions). With --http-method POST, GET callers + MUST be excluded. The filtered result must be no larger than the unfiltered inbound set. 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc_all = _run_jrag( + ["connection", "chat-core", "--inbound", "--format", "json"], + env=env, + ) + assert proc_all.returncode == 0 + payload_all = json.loads(proc_all.stdout) + inbound_all = [e for e in payload_all.get("edges", []) if e.get("section") == "inbound"] + + proc_post = _run_jrag( + ["connection", "chat-core", "--inbound", "--http-method", "POST", "--format", "json"], + env=env, + ) + assert proc_post.returncode == 0, f"--http-method POST failed: {proc_post.stderr}" + payload_post = json.loads(proc_post.stdout) + inbound_post = [e for e in payload_post.get("edges", []) if e.get("section") == "inbound"] + + # All surviving HTTP edges MUST have method=POST. + nodes_post = payload_post.get("nodes", {}) + for e in inbound_post: + if e.get("edge_type") == "HTTP_CALLS": + ep = nodes_post.get(e.get("target"), {}) + assert (ep.get("method") or "").upper() == "POST", ( + f"expected POST after --http-method POST, got {ep.get('method')!r} on {ep}" + ) + # The POST set must not exceed the unfiltered inbound set. + assert len(inbound_post) <= len(inbound_all), ( + f"--http-method POST should not grow inbound: post={len(inbound_post)} all={len(inbound_all)}" + ) + + +# ----- Test 8: first positional is microservice NAME (not run through resolve_v2) ----- + + +def test_connection_first_positional_is_microservice_not_query( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """The first positional is a microservice NAME, NOT a query. + + If we ran resolve_v2('chat-core'), it would NOT match (chat-core is not a + Symbol/Route/Client/Producer FQN) and the envelope would be status=not_found. + The command returns status=ok with a synthetic microservice root, proving + resolve_v2 was skipped (the resolve-first exception). 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + # 'chat-core' would resolve to a `many` of Clients (target_service match) + # if it WERE run through resolve_v2 with hint_kind=client; the result here + # is status=ok with a synthetic root, NOT ambiguous and NOT not_found. + proc = _run_jrag( + ["connection", "chat-core", "--inbound", "--format", "json"], + env=env, + ) + assert proc.returncode == 0 + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", ( + f"expected ok (resolve_v2 was NOT run on the positional), got {payload.get('status')}" + ) + # The synthetic microservice root is keyed by the microservice NAME (its + # natural key after the id-free boundary strip), not a `microservice:`- + # prefixed synthetic id and not a resolved symbol id. + assert payload.get("root") == "chat-core", ( + f"expected root == 'chat-core' (microservice natural key), got {payload.get('root')}" + ) + root_node = payload["nodes"]["chat-core"] + assert root_node.get("kind") == "microservice", ( + f"expected synthetic microservice root kind, got {root_node.get('kind')}" + ) + + # Sanity: a clearly-not-real-microservice still returns status=ok (empty + # but not not_found). This proves resolve_v2 was not invoked. + proc_unknown = _run_jrag( + ["connection", "definitely-not-a-real-microservice", "--format", "json"], + env=env, + ) + assert proc_unknown.returncode == 0, ( + f"unknown microservice failed: {proc_unknown.stderr}" + ) + payload_unknown = json.loads(proc_unknown.stdout) + assert payload_unknown["status"] == "ok", ( + f"expected ok for unknown microservice (no resolve), got {payload_unknown.get('status')}" + ) + assert payload_unknown.get("edges", []) == [], ( + f"expected 0 edges for unknown microservice, got {payload_unknown.get('edges')}" + ) + + +# ----- Test 9: outline lists file symbols ----- + + +def test_outline_lists_file_symbols( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """outline returns every Symbol declared in . 
+ + find_symbols_in_file_range(start_line=1, end_line=2**31-1) returns ALL + symbols in the file (1-based; start_line<1 returns []). ChatCoreFeignClient + has 3 symbols (interface + 2 methods). + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["outline", _OUTLINE_FILE, "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"outline failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + nodes = payload.get("nodes", {}) + assert len(nodes) >= 1, f"expected >=1 symbol in {_OUTLINE_FILE}, got {nodes}" + # Every node is a symbol. + for nid, node in nodes.items(): + assert node.get("kind") == "symbol", ( + f"expected kind=symbol, got {node.get('kind')!r} on {node}" + ) + # The interface itself MUST be present (FQN ends with the type name). + fqns = [n.get("fqn", "") for n in nodes.values()] + assert any("ChatCoreFeignClient" in fqn for fqn in fqns), ( + f"expected ChatCoreFeignClient in outline, got {fqns}" + ) + + +# ----- Test 10: outline is graceful on missing files (no crash) ----- + + +def test_outline_empty_for_missing_file( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """outline on a non-existent filename returns status=ok with 0 nodes. + + find_symbols_in_file_range matches s.filename = $fn exactly. A filename + that doesn't exist in the graph returns [] (the underlying query has no + matches). The command MUST return status=ok with empty nodes, not crash. 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["outline", "does/not/exist/Nope.java", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"outline missing file failed: rc={proc.returncode}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("nodes", {}) == {}, ( + f"expected 0 nodes for missing file, got {payload.get('nodes')}" + ) + + +# ----- Test 11: imports resolves graph nodes ----- + + +def test_imports_resolves_graph_nodes( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """imports tree-sitter-parses + resolves each FQN via resolve_v2. + + ChatCoreFeignClient imports com.bank.chat.app.web.JoinOperatorRequest + (a contracts DTO that IS in the graph). That import MUST resolve to a graph + Symbol node (resolved=True). External Spring imports (org.springframework.*) + are NOT in the graph and MUST come back as unresolved (resolved=False). + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["imports", _OUTLINE_FILE, "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"imports failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + nodes = payload.get("nodes", {}) + edges = payload.get("edges", []) + assert len(edges) >= 1, f"expected >=1 import edge, got {edges}" + + # Split edges by resolved flag. + resolved_edges = [e for e in edges if e.get("resolved") is True] + unresolved_edges = [e for e in edges if e.get("resolved") is False] + assert len(resolved_edges) >= 1, ( + f"expected >=1 resolved import (JoinOperatorRequest), got edges={edges}" + ) + assert len(unresolved_edges) >= 1, ( + f"expected >=1 unresolved import (org.springframework.*), got edges={edges}" + ) + + # The resolved import MUST be the JoinOperatorRequest graph Symbol. 
+ resolved_fqns = [] + for e in resolved_edges: + node = nodes.get(e.get("target"), {}) + resolved_fqns.append(node.get("fqn", "")) + assert any("JoinOperatorRequest" in fqn for fqn in resolved_fqns), ( + f"expected JoinOperatorRequest resolved, got {resolved_fqns}" + ) + + # Unresolved imports carry the raw FQN and kind=unresolved_import. + for e in unresolved_edges: + node = nodes.get(e.get("target"), {}) + assert node.get("kind") == "unresolved_import", ( + f"expected kind=unresolved_import, got {node.get('kind')!r} on {node}" + ) + assert node.get("fqn"), f"expected fqn on unresolved import, got {node}" + + +# ----- Test 12: outline/imports reject --offset; document unbounded ----- + + +def test_outline_and_import_reject_offset_or_document_unbounded() -> None: + """--offset is rejected on outline and imports (neither takes offset). + + Per the global plan: --offset is supported only on find/search; traversal + and listing commands (including outline/imports, which take no offset) + reject it via argparse. outline's --limit (a common flag inherited from the + parent) is still accepted, since inherited common flags cannot be removed + per-command; it is documented as not capping results, and this test does + not exercise it. + """ + env = os.environ.copy() + for cmd in ("outline", "imports"): + proc = _run_jrag([cmd, "somefile.java", "--offset", "5"], env=env) + assert proc.returncode != 0, ( + f"{cmd} --offset should be rejected (rc!=0), got rc={proc.returncode}" + ) + assert ( + "unrecognized arguments: --offset" in proc.stderr or "usage:" in proc.stderr + ), f"{cmd}: expected usage error, got stderr={proc.stderr!r}" + + +# ----- Test 13 (review Fix 2): --calls-service outbound excludes unresolved clients ----- + + +def test_connection_calls_service_outbound_excludes_unresolved_clients( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--calls-service on outbound uses STRICT target_service matching for clients.
+ + PR-JRAG-3b review Fix 2: the initial predicate + `(target_service == calls_service) or not target_service` was a loophole — + the `or not target_service` escape was meant for producers (genuinely no + service target) but ALSO matched unresolved clients (empty target_service, + e.g. AuditLogClient#logAssignment in the fixture). The tightened predicate + keeps producers (with a warning) and EXCLUDES unresolved clients. + + Fixture pair (verified by reviewer): + * chat-assign's ChatCoreFeignClient#joinOperator — target_service=chat-core (KEEP) + * chat-assign's AuditLogClient#logAssignment — target_service='' (EXCLUDE) + * chat-assign's producers (e.g. DistributionTriggerPublisher) — KEPT w/ warning + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["connection", "chat-assign", "--outbound", "--calls-service", "chat-core", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, ( + f"connection --calls-service failed: rc={proc.returncode}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + nodes = payload.get("nodes", {}) + edges = payload.get("edges", []) + outbound = [e for e in edges if e.get("section") == "outbound"] + + # (a) Clients with target_service == chat-core MUST be present. + client_edges = [e for e in outbound if e.get("edge_type") == "HTTP_CALLS"] + assert len(client_edges) >= 1, f"expected >=1 chat-core client edge, got {client_edges}" + for e in client_edges: + node = nodes.get(e.get("target"), {}) + assert (node.get("target_service") or "") == "chat-core", ( + f"strict --calls-service leak: client target_service={node.get('target_service')!r}" + ) + + # (b) The UNRESOLVED client (AuditLogClient#logAssignment, empty target_service) + # MUST NOT appear anywhere in the result. The fixture's AuditLogClient has + # no @CodebaseHttpClient annotation, so its target_service is empty. 
+ for nid, node in nodes.items(): + if node.get("kind") == "client": + fqn = node.get("fqn", "") or "" + assert "AuditLogClient" not in fqn, ( + f"AuditLogClient MUST be excluded under --calls-service chat-core " + f"(empty target_service); got node={node}" + ) + + # (c) Producers are KEPT (async channel stays visible) AND a warning fires + # explaining producers bypass --calls-service. + producer_edges = [e for e in outbound if e.get("edge_type") == "ASYNC_CALLS"] + warnings = payload.get("warnings", []) + if producer_edges: + # Warning MUST mention producers bypass the filter. + assert any("--calls-service" in w and "producer" in w.lower() for w in warnings), ( + f"expected --calls-service producer-bypass warning, got {warnings}" + ) + # Sanity: at least one producer edge is present on chat-assign (the fixture + # has DistributionTriggerPublisher + AuditLogClient async stub). + assert len(producer_edges) >= 1, ( + f"expected >=1 producer edge kept under --calls-service, got {producer_edges}" + ) + + +# ----- Test 14 (review Fix 3): text-mode imports distinguishes resolved vs unresolved ----- + + +def test_imports_text_mode_marks_unresolved( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """Text-mode imports MUST visually distinguish resolved vs unresolved. + + PR-JRAG-3b review Fix 3: the handler sets kind="unresolved_import" + edge + `resolved: bool`, but text mode dispatched to _render_listing which shows + only simple_name + @service — resolved and unresolved looked identical + (only JSON distinguished). The renderer now appends " (unresolved)" to + nodes with kind="unresolved_import". 
+ + ChatCoreFeignClient has mixed resolution: + * com.bank.chat.app.web.JoinOperatorRequest — IN GRAPH (resolved Symbol) + * org.springframework.* — NOT IN GRAPH (unresolved) + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["imports", _OUTLINE_FILE], # default text mode + env=env, + ) + assert proc.returncode == 0, ( + f"imports text failed: rc={proc.returncode}\nstderr={proc.stderr}" + ) + text = proc.stdout + # The "(unresolved)" marker MUST appear (the file has unresolved imports). + assert "(unresolved)" in text, ( + f"expected (unresolved) marker in text output, got:\n{text}" + ) + # Resolved Symbol nodes MUST NOT carry the marker. JoinOperatorRequest is + # resolved; its line should NOT have the suffix. + lines = text.splitlines() + join_line = next((ln for ln in lines if "JoinOperatorRequest" in ln), None) + assert join_line is not None, f"expected JoinOperatorRequest in text output:\n{text}" + assert "(unresolved)" not in join_line, ( + f"resolved import JoinOperatorRequest MUST NOT carry (unresolved): {join_line!r}" + ) + # At least one Spring import line MUST carry the marker. + spring_line = next((ln for ln in lines if "springframework" in ln.lower() or "FeignClient" in ln), None) + assert spring_line is not None, f"expected a spring import line in:\n{text}" + assert "(unresolved)" in spring_line, ( + f"unresolved Spring import MUST carry (unresolved): {spring_line!r}" + ) diff --git a/tests/test_jrag_traversal_direct.py b/tests/test_jrag_traversal_direct.py new file mode 100644 index 00000000..b5844e39 --- /dev/null +++ b/tests/test_jrag_traversal_direct.py @@ -0,0 +1,742 @@ +"""Tests for `jrag` direct-backend traversal commands (PR-JRAG-3a). + +The 11 traversal subcommands: callers, callees, hierarchy, implementations, +subclasses, overrides, overridden-by, dependents, impact, decompose, flow. 
+Each is resolve-first then calls a LadybugGraph method (or neighbors_v2 for +the override axis), then renders via the traversal shape (root + edge rows). +``--offset`` is NOT supported on any traversal. + +Tests (bank-chat fixture): +1. test_callers_symbol_uses_find_callers +2. test_callers_route_service_is_post_filter_with_warning +3. test_callees_symbol_uses_find_callees +4. test_callers_and_callees_support_include_external +5. test_hierarchy_renders_tree_both_directions +6. test_implementations_uses_find_implementors +7. test_implementations_capability_post_filter +8. test_subclasses_uses_find_subclasses +9. test_overrides_dispatches_up_via_neighbors_out_overrides +10. test_overridden_by_dispatches_down_via_neighbors_in_overrides +11. test_dependents_uses_find_injectors +12. test_impact_runs_fleet_wide_without_service +13. test_impact_service_post_filter_emits_warning +14. test_decompose_renders_role_waterfall +15. test_flow_outbound_intra_service_on_fixture +16. test_traversal_resolve_ambiguous_stops +17. test_traversal_rejects_offset +""" +from __future__ import annotations + +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + + +def _jrag_exe() -> str: + """Locate the installed ``jrag`` entry point next to the venv interpreter.""" + candidate = Path(sys.executable).parent / "jrag" + if candidate.is_file(): + return str(candidate) + exe = shutil.which("jrag") + assert exe is not None, "expected installed jrag entrypoint (run: pip install -e .)" + return exe + + +def _run_jrag( + args: list[str], + *, + env: dict[str, str] | None = None, + stdin: str | None = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + [_jrag_exe(), *args], + capture_output=True, + text=True, + encoding="utf-8", # jrag emits UTF-8 (↑/↓ tree headers); decode as such, not the locale ANSI codepage (cp1252 on Windows). 
+ env=env, + input=stdin, + check=False, + ) + + +def _env_for(corpus_root: Path, ladybug_db_path: Path) -> dict[str, str]: + env = os.environ.copy() + env["JAVA_CODEBASE_RAG_SOURCE_ROOT"] = str(corpus_root) + env["JAVA_CODEBASE_RAG_INDEX_DIR"] = str(ladybug_db_path.parent) + return env + + +# Seed nodes verified against the bank-chat fixture (PR-JRAG-3a probe). +# Method FQNs MUST include parameter types in parens for resolve_v2 to match. +_SVC_ASSIGN = "com.bank.chat.assign.service.ChatManagementService#assign(AssignmentRequest)" +_PORT_METHOD = "com.bank.chat.engine.assign.ChatAssignmentPort#requestAssignment(AssignmentRequest)" +_IMPL_METHOD = "com.bank.chat.engine.assign.ConfigurableChatAssignment#requestAssignment(AssignmentRequest)" +_PORT_TYPE = "com.bank.chat.engine.assign.ChatAssignmentPort" +_ABS_NOTIFICATION = "com.bank.chat.engine.notification.AbstractNotificationSender" +_INGRESS_CTRL = "com.bank.chat.app.web.ChatIngressController" + + +# ----- Test 1: callers (Symbol) uses find_callers ----- + + +def test_callers_symbol_uses_find_callers(corpus_root: Path, ladybug_db_path: Path) -> None: + """callers on a Symbol calls find_callers and returns the caller as an edge.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["callers", _SVC_ASSIGN, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"callers failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + # Root must be set (traversal shape). + assert payload.get("root"), "expected root id set on the envelope" + # The controller method that calls ChatManagementService#assign must appear. + edges = payload.get("edges", []) + assert len(edges) >= 1, f"expected at least one caller edge, got {edges}" + nodes = payload.get("nodes", {}) + # At least one edge endpoint should be the ChatManagementController#assign caller. 
+ caller_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + assert any("ChatManagementController#assign" in fqn for fqn in caller_fqns), ( + f"ChatManagementController#assign not in caller fqns {caller_fqns}" + ) + + +# ----- Test 2: callers (Route) --service is a client-side post-filter ----- + + +def test_callers_route_service_is_post_filter_with_warning( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """callers on a Route with --service emits a post-filter warning. + + find_route_callers ignores microservice once route_id is set (verified + against ladybug_queries.py:1738); --service is applied client-side on + RouteCaller.caller_microservice and surfaced via warnings[]. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["callers", "/chat/assign", "--service", "chat-assign", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"callers route failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id (the Route)" + # The warning MUST fire (even with zero callers, the --service-as-post-filter + # signal is unconditional on the route-caller path). 
+ warnings = payload.get("warnings", []) + assert any("--service" in w and "post-filter" in w for w in warnings), ( + f"expected --service post-filter warning, got warnings={warnings}" + ) + + +# ----- Test 3: callees (Symbol) uses find_callees ----- + + +def test_callees_symbol_uses_find_callees(corpus_root: Path, ladybug_db_path: Path) -> None: + """callees on a Symbol calls find_callees and returns callee edges.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["callees", _SVC_ASSIGN, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"callees failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id set" + edges = payload.get("edges", []) + # ChatManagementService#assign has ~20 callees in the fixture. + assert len(edges) >= 1, f"expected at least one callee edge, got {edges}" + # Each edge should carry edge_type=CALLS and a confidence. + for e in edges: + assert e.get("edge_type") == "CALLS", f"expected CALLS edge, got {e.get('edge_type')}" + + +# ----- Test 4: callers/callees support --include-external (symmetric) ----- + + +def test_callers_and_callees_support_include_external( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--include-external is wired symmetrically on callers and callees. + + The flag maps to exclude_external = not --include-external on both + sides. We verify the command ACCEPTS the flag and returns ok (the + fixture's external-callee counts are not asserted here; the wiring is + what matters — exclude_external=True is the default and the flag flips + it). Verified via the help text and a clean rc=0 run on both commands. + """ + env = _env_for(corpus_root, ladybug_db_path) + + # callees WITH --include-external: should include JDK/Spring callees + # (ChatManagementService#assign calls e.g. 
AssignQueueEntity setters, plus + # possibly external types when not excluded). + proc_in = _run_jrag(["callees", _SVC_ASSIGN, "--include-external", "--format", "json"], env=env) + assert proc_in.returncode == 0, f"--include-external failed: {proc_in.stderr}" + payload_in = json.loads(proc_in.stdout) + assert payload_in["status"] == "ok" + + # callees WITHOUT --include-external (default: exclude). + proc_out = _run_jrag(["callees", _SVC_ASSIGN, "--format", "json"], env=env) + assert proc_out.returncode == 0 + payload_out = json.loads(proc_out.stdout) + assert payload_out["status"] == "ok" + + # With --include-external the result set should be >= the excluded set + # (external callees can only ADD to the result, never remove). + edges_in = len(payload_in.get("edges", [])) + edges_out = len(payload_out.get("edges", [])) + assert edges_in >= edges_out, ( + f"--include-external should not shrink results: in={edges_in} out={edges_out}" + ) + + # callers accepts the flag too (symmetric wiring). + proc_callers = _run_jrag(["callers", _SVC_ASSIGN, "--include-external", "--format", "json"], env=env) + assert proc_callers.returncode == 0, f"callers --include-external failed: {proc_callers.stderr}" + + +# ----- Test 5: hierarchy renders both directions ----- + + +def test_hierarchy_renders_tree_both_directions( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """hierarchy walks EXTENDS/IMPLEMENTS both directions AND renders a tree. + + AbstractNotificationSender: UP = NotificationSender (implements), + DOWN = EmailNotificationSender + PushNotificationSender (extends). + + Asserts BOTH the data (JSON: up/down edge presence) AND the rendered + structure (text: the ↑ supertypes / ↓ subtypes group headers that + `_render_traversal` emits for direction-carrying edges). The text + assertion is non-vacuous: it fails if the renderer ever regresses to a + flat list. 
+ """ + env = _env_for(corpus_root, ladybug_db_path) + + # --- data (JSON): up/down edges carry the expected FQNs --- + proc = _run_jrag(["hierarchy", _ABS_NOTIFICATION, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"hierarchy failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + other_fqns = {nodes.get(e.get("target"), {}).get("fqn", "") for e in edges} + # UP: NotificationSender (the interface AbstractNotificationSender implements). + assert any("NotificationSender" in fqn and "Abstract" not in fqn for fqn in other_fqns), ( + f"expected NotificationSender supertype in {other_fqns}" + ) + # DOWN: EmailNotificationSender and PushNotificationSender. + assert any("EmailNotificationSender" in fqn for fqn in other_fqns), ( + f"expected EmailNotificationSender subtype in {other_fqns}" + ) + assert any("PushNotificationSender" in fqn for fqn in other_fqns), ( + f"expected PushNotificationSender subtype in {other_fqns}" + ) + + # --- rendered structure (text): the ↑/↓ group headers must appear --- + proc_text = _run_jrag(["hierarchy", _ABS_NOTIFICATION], env=env) + assert proc_text.returncode == 0, f"text hierarchy failed: {proc_text.stderr}" + text = proc_text.stdout + assert "↑ supertypes:" in text, ( + f"expected '↑ supertypes:' header in text output, got:\n{text}" + ) + assert "↓ subtypes:" in text, ( + f"expected '↓ subtypes:' header in text output, got:\n{text}" + ) + # The supertypes group must contain NotificationSender and NOT the subtypes. + up_section = text.split("↓ subtypes:", 1)[0] + assert "NotificationSender" in up_section and "Abstract" not in up_section.replace( + "AbstractNotificationSender", "" + ), f"up section wrong:\n{up_section}" + # The subtypes group must contain Email + Push. 
+ dn_section = text.split("↓ subtypes:", 1)[1] + assert "EmailNotificationSender" in dn_section, f"Email missing from down section:\n{dn_section}" + assert "PushNotificationSender" in dn_section, f"Push missing from down section:\n{dn_section}" + + +# ----- Test 6: implementations uses find_implementors ----- + + +def test_implementations_uses_find_implementors( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """implementations on an interface returns its implementors.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["implementations", _PORT_TYPE, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"implementations failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + impl_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + # ConfigurableChatAssignment implements ChatAssignmentPort. + assert any("ConfigurableChatAssignment" in fqn for fqn in impl_fqns), ( + f"ConfigurableChatAssignment not in implementors {impl_fqns}" + ) + + +# ----- Test 7: implementations --capability filters (pushed down) ----- + + +def test_implementations_capability_post_filter( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--capability filters implementors (pushed down to find_implementors). + + ADAPTATION: the brief claimed find_implementors has no capability kwarg + and --capability would be a client-side post-filter. Verified against + ladybug_queries.py:1051 — the method DOES accept `capability`. So + --capability is pushed down (matches the global principle "pushed down + where the method takes it"). The test verifies the filter narrows the + result: ConfigurableChatAssignment has empty capabilities, so filtering + by SCHEDULED_TASK returns 0 implementors (vs. 
1 without the filter). + """ + env = _env_for(corpus_root, ladybug_db_path) + + # Without filter: 1 implementor (ConfigurableChatAssignment). + proc_all = _run_jrag(["implementations", _PORT_TYPE, "--format", "json"], env=env) + assert proc_all.returncode == 0 + payload_all = json.loads(proc_all.stdout) + assert len(payload_all.get("edges", [])) >= 1, "expected >=1 implementor without filter" + + # With --capability SCHEDULED_TASK: ConfigurableChatAssignment has caps=[], + # so the capability filter excludes it -> 0 implementors. + proc_filtered = _run_jrag( + ["implementations", _PORT_TYPE, "--capability", "SCHEDULED_TASK", "--format", "json"], + env=env, + ) + assert proc_filtered.returncode == 0, ( + f"implementations --capability failed: {proc_filtered.stderr}" + ) + payload_filtered = json.loads(proc_filtered.stdout) + assert payload_filtered["status"] == "ok" + assert len(payload_filtered.get("edges", [])) == 0, ( + f"expected 0 implementors with SCHEDULED_TASK filter, " + f"got {len(payload_filtered.get('edges', []))}" + ) + + +# ----- Test 8: subclasses uses find_subclasses ----- + + +def test_subclasses_uses_find_subclasses( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """subclasses on a class returns its subclasses (EXTENDS inbound).""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["subclasses", _ABS_NOTIFICATION, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"subclasses failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + sub_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + # Both Email and Push extend AbstractNotificationSender. 
+ assert any("EmailNotificationSender" in fqn for fqn in sub_fqns), ( + f"EmailNotificationSender not in subclasses {sub_fqns}" + ) + assert any("PushNotificationSender" in fqn for fqn in sub_fqns), ( + f"PushNotificationSender not in subclasses {sub_fqns}" + ) + + +# ----- Test 9: overrides dispatches UP via neighbors(out, OVERRIDES) ----- + + +def test_overrides_dispatches_up_via_neighbors_out_overrides( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """overrides on an overrider method dispatches UP to the declaration. + + The stored OVERRIDES edge runs overrider -> declaration (subtype method -> + supertype declared method, confirmed in java_ontology.py:251). So + direction='out' from ConfigurableChatAssignment#requestAssignment returns + ChatAssignmentPort#requestAssignment (the declaration it overrides). + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["overrides", _IMPL_METHOD, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"overrides failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + target_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + assert any("ChatAssignmentPort#requestAssignment" in fqn for fqn in target_fqns), ( + f"expected ChatAssignmentPort#requestAssignment declaration in {target_fqns}" + ) + + +# ----- Test 10: overridden-by dispatches DOWN via neighbors(in, OVERRIDES) ----- + + +def test_overridden_by_dispatches_down_via_neighbors_in_overrides( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """overridden-by on a declaration dispatches DOWN to its overriders. + + direction='in' on OVERRIDES from ChatAssignmentPort#requestAssignment + returns ConfigurableChatAssignment#requestAssignment (the method overriding + it). 
This is the virtual OVERRIDDEN_BY out direction. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["overridden-by", _PORT_METHOD, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"overridden-by failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + target_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + assert any("ConfigurableChatAssignment#requestAssignment" in fqn for fqn in target_fqns), ( + f"expected ConfigurableChatAssignment#requestAssignment overrider in {target_fqns}" + ) + + +# ----- Test 11: dependents uses find_injectors ----- + + +def test_dependents_uses_find_injectors( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """dependents on a type returns its injectors (INJECTS inbound).""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["dependents", _PORT_TYPE, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"dependents failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + inj_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + # Three processors inject ChatAssignmentPort in the fixture. 
+ assert any("ClientMessageProcessor" in fqn for fqn in inj_fqns), ( + f"ClientMessageProcessor not in injectors {inj_fqns}" + ) + for e in edges: + assert e.get("edge_type") == "INJECTS", f"expected INJECTS edge, got {e.get('edge_type')}" + + +# ----- Test 12: impact runs fleet-wide without --service ----- + + +def test_impact_runs_fleet_wide_without_service( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """impact without --service runs the full reverse closure (no microservice predicate).""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["impact", _PORT_TYPE, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"impact failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + # ChatAssignmentPort has 4 impact nodes (3 injectors + 1 implementor). + assert len(edges) >= 3, f"expected >=3 impact nodes, got {len(edges)}" + # No warnings when --service is not set. 
+ assert payload.get("warnings", []) == [], ( + f"expected no warnings without --service, got {payload.get('warnings')}" + ) + + +# ----- Test 13: impact --service is a post-filter + warning ----- + + +def test_impact_service_post_filter_emits_warning( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """impact --service is a client-side post-filter (no microservice param) + warning.""" + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag( + ["impact", _PORT_TYPE, "--service", "chat-core", "--format", "json"], env=env + ) + assert proc.returncode == 0, ( + f"impact --service failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + warnings = payload.get("warnings", []) + assert any("--service" in w and "post-filter" in w for w in warnings), ( + f"expected --service post-filter warning, got warnings={warnings}" + ) + # All returned impact nodes should be from chat-core (post-filter applied). + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + for e in edges: + node = nodes.get(e.get("target"), {}) + svc = (node.get("microservice") or "").strip() + # The post-filter keeps only chat-core matches; skip root (the target itself). + if svc: + assert svc == "chat-core", ( + f"expected chat-core after post-filter, got microservice={svc!r} on {node.get('fqn')}" + ) + + +# ----- Test 14: decompose renders the role waterfall ----- + + +def test_decompose_renders_role_waterfall( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """decompose on an entrypoint returns the role-waterfall stages AND renders them. + + ChatIngressController (CONTROLLER) -> stage 1 with COMPONENT/SERVICE roles. + + Asserts BOTH the data (JSON: stage field + reached engine components) AND + the rendered structure (text: `stage 0 (seed):` and `stage 1 ...:` group + headers that `_render_traversal` emits for stage-carrying edges). 
The text + assertion is non-vacuous: it fails if the renderer regresses to flat. + """ + env = _env_for(corpus_root, ladybug_db_path) + + # --- data (JSON): stages + reached engine components --- + proc = _run_jrag(["decompose", _INGRESS_CTRL, "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"decompose failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + assert payload.get("root"), "expected root id" + edges = payload.get("edges", []) + nodes = payload.get("nodes", {}) + # The root (ChatIngressController) should be present as the seed. + root_node = nodes.get(payload["root"], {}) + assert "ChatIngressController" in root_node.get("fqn", ""), ( + f"expected ChatIngressController as root, got {root_node}" + ) + # At least one non-root symbol reached (stage 1). + assert len(edges) >= 1, f"expected >=1 flow edge, got {edges}" + # Stage index is carried on each edge row (role-waterfall rendering hint). + assert all("stage" in e for e in edges), ( + f"expected 'stage' field on every decompose edge, got {edges[:2]}" + ) + reached_fqns = [nodes.get(e.get("target"), {}).get("fqn", "") for e in edges] + # Stage 1 includes the engine components (COMPONENT role) — at least one + # processor/publisher/ratelimiter should be reached from the controller. + assert any( + "Processor" in fqn or "Publisher" in fqn or "RateLimiter" in fqn + for fqn in reached_fqns + ), f"expected engine component in reached fqns {reached_fqns}" + + # --- rendered structure (text): the stage group headers must appear --- + proc_text = _run_jrag(["decompose", _INGRESS_CTRL], env=env) + assert proc_text.returncode == 0, f"text decompose failed: {proc_text.stderr}" + text = proc_text.stdout + # stage 0 is the seed (the entrypoint itself). 
+ assert "stage 0 (seed):" in text, ( + f"expected 'stage 0 (seed):' header in text output, got:\n{text}" + ) + # At least one later stage header must be present (the waterfall has >=2 stages). + assert "stage 1" in text, ( + f"expected 'stage 1' header in text output, got:\n{text}" + ) + # The seed stage must list the controller; a later stage lists engine components. + seed_section = text.split("stage 1", 1)[0] + assert "ChatIngressController" in seed_section, ( + f"expected ChatIngressController in seed section:\n{seed_section}" + ) + + +# ----- Test 15: flow outbound is intra-service on the fixture (data property) ----- + + +def test_flow_outbound_intra_service_on_fixture( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """flow on a Route returns outbound CALLS hops (a data property, not a query constraint). + + trace_request_flow has no microservice predicate (verified at + ladybug_queries.py:1810). The fixture's CALLS edges span microservices + (e.g. the chat-assign handler reaches chat-core DTO methods like + AssignmentRequest#getEpkId), which PROVES the query applies no service + filter — a query constraint would have dropped the chat-core endpoints. + This test validates the fixture's indexed CALLS edges, not a constraint. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["flow", "/chat/assign", "--format", "json"], env=env) + assert proc.returncode == 0, ( + f"flow failed: rc={proc.returncode}\nstdout={proc.stdout}\nstderr={proc.stderr}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ok", f"expected ok, got {payload}" + # Root must be the Route. + assert payload.get("root"), "expected root id (the Route)" + nodes = payload.get("nodes", {}) + root_node = nodes.get(payload["root"], {}) + assert root_node.get("kind") == "route", f"expected route root, got {root_node}" + # Outbound CALLS edges must be present (the fixture indexed ~28). 
+ edges = payload.get("edges", []) + outbound = [e for e in edges if e.get("edge_type") == "CALLS"] + assert len(outbound) >= 1, ( + f"expected >=1 outbound CALLS edge, got {len(outbound)} (edges={edges})" + ) + # Data-property assertion: the endpoint microservices SPAN more than one + # value (chat-assign handler + chat-core DTOs), proving the query applies + # NO microservice filter. This is the index-time data property — CALLS + # edges are intra-codebase (java_ontology.py:286), not intra-service. + endpoint_services = set() + for e in outbound: + ep = nodes.get(e.get("target"), {}) + svc = (ep.get("microservice") or "").strip() + if svc: + endpoint_services.add(svc) + assert "chat-assign" in endpoint_services, ( + f"expected chat-assign endpoints in outbound, got services={endpoint_services}" + ) + # The contracts DTOs live under chat-core; their presence proves no service + # filter was applied (a constraint would have dropped them). + assert "chat-core" in endpoint_services, ( + f"expected chat-core endpoints (cross-service, no filter) in outbound, " + f"got services={endpoint_services}" + ) + + +# ----- Test 16: traversal resolve-ambiguous stops (no auto-pick) ----- + + +def test_traversal_resolve_ambiguous_stops( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """An ambiguous resolve query returns candidates and stops (no traversal). + + 'requestAssignment' resolves to 'many' (the port method + the impl method + both contain that name). The traversal must NOT auto-pick; it returns the + ambiguous envelope with candidates. + """ + env = _env_for(corpus_root, ladybug_db_path) + proc = _run_jrag(["callers", "requestAssignment", "--format", "json"], env=env) + # Ambiguous returns rc=0 (per the inspect/resolve convention). 
+ assert proc.returncode == 0, ( + f"ambiguous resolve should return 0, got {proc.returncode}\nstdout={proc.stdout}" + ) + payload = json.loads(proc.stdout) + assert payload["status"] == "ambiguous", ( + f"expected ambiguous for 'requestAssignment', got {payload.get('status')}: {payload}" + ) + assert len(payload.get("candidates", [])) >= 2, ( + f"expected >=2 candidates, got {payload.get('candidates')}" + ) + # No traversal edges should be produced. + assert payload.get("edges", []) == [], ( + f"expected no edges on ambiguous stop, got {payload.get('edges')}" + ) + + +# ----- Test 17: --offset is rejected on every traversal ----- + + +def test_traversal_rejects_offset() -> None: + """--offset is NOT registered on any traversal subparser.""" + env = os.environ.copy() + traversals = [ + "callers", "callees", "hierarchy", "implementations", "subclasses", + "overrides", "overridden-by", "dependents", "impact", "decompose", "flow", + ] + for cmd in traversals: + proc = _run_jrag([cmd, "somequery", "--offset", "5"], env=env) + assert proc.returncode != 0, f"{cmd} --offset should be rejected (rc!=0)" + assert ( + "unrecognized arguments: --offset" in proc.stderr or "usage:" in proc.stderr + ), f"{cmd}: expected usage error, got stderr={proc.stderr!r}" + + +# ----- Test 18: inapplicable --service/--module/--limit surface warnings ----- +# (Fix 3 + Fix 4 follow-up: plan principle "inapplicable flags never silently ignored".) + + +def test_inapplicable_flags_emit_warnings( + corpus_root: Path, ladybug_db_path: Path +) -> None: + """--service/--module on hierarchy/overrides/overridden-by/flow and --limit + on decompose surface a warnings[] entry rather than being silently dropped. + + These commands walk structural edges or carry no microservice predicate; + --service/--module cannot be applied. decompose's real cap is --max-stage, + not --limit. Each must emit a warning naming the flag so the agent gets a + signal (plan principle: inapplicable flags never silently ignored).
+ """ + env = _env_for(corpus_root, ladybug_db_path) + + # hierarchy: --service/--module not applied (structural EXTENDS/IMPLEMENTS). + proc = _run_jrag( + ["hierarchy", _ABS_NOTIFICATION, "--service", "chat-core", "--module", "chat-engine", "--format", "json"], + env=env, + ) + assert proc.returncode == 0, f"hierarchy failed: {proc.stderr}" + payload = json.loads(proc.stdout) + warnings = payload.get("warnings", []) + assert any("--service is not applied" in w for w in warnings), ( + f"hierarchy: expected --service warning, got {warnings}" + ) + assert any("--module is not applied" in w for w in warnings), ( + f"hierarchy: expected --module warning, got {warnings}" + ) + + # overrides: --service not applied (structural method-to-method edge). + proc = _run_jrag( + ["overrides", _IMPL_METHOD, "--service", "chat-core", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"overrides failed: {proc.stderr}" + payload = json.loads(proc.stdout) + assert any("--service is not applied" in w for w in payload.get("warnings", [])), ( + f"overrides: expected --service warning, got {payload.get('warnings')}" + ) + + # overridden-by: --module not applied. + proc = _run_jrag( + ["overridden-by", _PORT_METHOD, "--module", "chat-engine", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"overridden-by failed: {proc.stderr}" + payload = json.loads(proc.stdout) + assert any("--module is not applied" in w for w in payload.get("warnings", [])), ( + f"overridden-by: expected --module warning, got {payload.get('warnings')}" + ) + + # flow: --service not applied (no microservice predicate; data property). 
+ proc = _run_jrag( + ["flow", "/chat/assign", "--service", "chat-assign", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"flow failed: {proc.stderr}" + payload = json.loads(proc.stdout) + assert any("--service is not applied" in w for w in payload.get("warnings", [])), ( + f"flow: expected --service warning, got {payload.get('warnings')}" + ) + + # decompose: --limit (non-default) does not apply; --max-stage is the knob. + proc = _run_jrag( + ["decompose", _INGRESS_CTRL, "--limit", "5", "--format", "json"], env=env + ) + assert proc.returncode == 0, f"decompose --limit failed: {proc.stderr}" + payload = json.loads(proc.stdout) + assert any("--limit does not apply to decompose" in w for w in payload.get("warnings", [])), ( + f"decompose: expected --limit warning, got {payload.get('warnings')}" + ) + + # Sanity: decompose with the DEFAULT --limit (20, not explicitly set) is silent. + proc_default = _run_jrag( + ["decompose", _INGRESS_CTRL, "--format", "json"], env=env + ) + assert proc_default.returncode == 0 + payload_default = json.loads(proc_default.stdout) + assert not any("--limit" in w for w in payload_default.get("warnings", [])), ( + f"decompose default should not warn about --limit, got {payload_default.get('warnings')}" + ) diff --git a/tests/test_mcp_v2.py b/tests/test_mcp_v2.py index a8934f65..658fba41 100644 --- a/tests/test_mcp_v2.py +++ b/tests/test_mcp_v2.py @@ -1313,7 +1313,7 @@ def test_resolve_wildcard_identifier_rejected(ladybug_graph) -> None: def test_resolve_every_reason_in_closed_set_appears() -> None: - from mcp_v2 import ( + from resolve_service import ( _resolve_client_candidates, _resolve_producer_candidates, _resolve_route_candidates, diff --git a/tests/test_resolve_service.py b/tests/test_resolve_service.py new file mode 100644 index 00000000..399c2e9a --- /dev/null +++ b/tests/test_resolve_service.py @@ -0,0 +1,173 @@ +"""Tests for resolve_service.py parity with mcp_v2.py. 
+ +Graph-backed tests use the bank-chat ``ladybug_db_path`` fixture (not the +default ``LadybugGraph.get()`` path, which has no index in CI and caused 7/10 +of these tests to SKIP — masking the tautological ``status in ("one","many", +"none")`` assertions, which are true for ANY result). Each test now asserts the +contract for the branch it actually hit. +""" +from pathlib import Path + +from ladybug_queries import LadybugGraph +from resolve_service import ResolveCandidate, ResolveOutput, ResolveStatus, resolve_v2 + +# Known bank-chat fixture symbol (verified via test_jrag_locate.test_find_by_fqn_exact). +_KNOWN_CLASS_FQN = "com.bank.chat.assign.ChatAssignApplication" + + +def test_resolve_service_importable_and_one_match(ladybug_db_path: Path) -> None: + """resolve_service is importable and resolves a known unique FQN to 'one'.""" + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2(_KNOWN_CLASS_FQN, hint_kind="symbol", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.status == "one", f"expected one for {_KNOWN_CLASS_FQN}, got {result.status!r}" + assert result.node is not None + assert result.node.fqn == _KNOWN_CLASS_FQN + assert result.candidates == [] + + +def test_resolve_service_many_returns_candidates(ladybug_db_path: Path) -> None: + """An ambiguous short name returns `many` with ≥2 scored candidates. + + The contract is asserted per-branch (no tautological ``status in (...)``): + if the fixture happens to have ≤1 `Request`, the one/none branches still + verify their own contracts. 
+ """ + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2("Request", hint_kind="symbol", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.resolved_identifier == "Request" + + if result.status == "many": + assert result.node is None + assert len(result.candidates) >= 2, f"many must carry ≥2 candidates, got {len(result.candidates)}" + for cand in result.candidates: + assert isinstance(cand, ResolveCandidate) + assert 0.0 <= cand.score <= 1.0, f"candidate score out of [0,1]: {cand.score}" + elif result.status == "one": + assert result.node is not None + else: + assert result.status == "none" + assert result.message, "none must carry a message" + + +def test_resolve_service_none_is_not_found(ladybug_db_path: Path) -> None: + """A non-existent identifier returns `none` with a 'No matches' message.""" + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2("com.TotallyFakeClassName.xyz123", hint_kind="symbol", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.status == "none" + assert result.node is None + assert result.candidates == [] + assert result.message is not None + assert "No matches" in result.message or "no matches" in result.message.lower() + + +def test_resolve_service_wildcard_rejected() -> None: + """Wildcard identifiers are rejected with an error (no graph needed).""" + result = resolve_v2("com.example.*") + + assert isinstance(result, ResolveOutput) + assert result.success is False + assert result.status == "none" + assert result.node is None + assert result.candidates == [] + assert "Wildcards" in result.message or "not supported" in result.message.lower() + + +def test_resolve_service_empty_identifier_rejected() -> None: + """Empty/whitespace identifiers are rejected.""" + result = resolve_v2(" ") + + assert isinstance(result, ResolveOutput) + assert result.success is False + assert result.status == "none" + assert 
result.node is None + assert result.candidates == [] + assert "Invalid identifier" in result.message or "whitespace" in result.message.lower() + + +def test_resolve_service_route_path_parsing(ladybug_db_path: Path) -> None: + """An HTTP method + path is recognized as a route identifier.""" + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2("GET /chat/assign", hint_kind="route", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.resolved_identifier == "GET /chat/assign" + # Per-branch contract (no tautology): the route IS in the bank-chat fixture, + # so we expect one-or-many; if not found, the none-branch still verifies. + if result.status == "one": + assert result.node is not None + elif result.status == "many": + assert len(result.candidates) >= 1 + else: + assert result.status == "none" + + +def test_resolve_service_client_target_parsing(ladybug_db_path: Path) -> None: + """A service + path is recognized as a client identifier.""" + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2("chat-assign /chat/assign", hint_kind="client", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.resolved_identifier == "chat-assign /chat/assign" + # Per-branch contract — not `status in (one,many,none)`. 
+ if result.status == "one": + assert result.node is not None + elif result.status == "many": + assert len(result.candidates) >= 2 + else: + assert result.status == "none" + assert result.message is not None + + +def test_resolve_service_producer_topic_prefix(ladybug_db_path: Path) -> None: + """A Kafka topic prefix is recognized as a producer identifier.""" + g = LadybugGraph.get(str(ladybug_db_path)) + result = resolve_v2("banking.chat", hint_kind="producer", graph=g) + + assert isinstance(result, ResolveOutput) + assert result.success is True + assert result.resolved_identifier == "banking.chat" + if result.status == "one": + assert result.node is not None + elif result.status == "many": + assert len(result.candidates) >= 2 + else: + assert result.status == "none" + assert result.message is not None + + +def test_resolve_service_hint_kind_filters(ladybug_db_path: Path) -> None: + """hint_kind narrows the search space (route hint won't match symbol-only ids).""" + g = LadybugGraph.get(str(ladybug_db_path)) + # `GET /chat/assign` is a route; with hint_kind="symbol" it should NOT + # resolve to a Route node (a "one" match, if any, must be a non-route symbol). + result_symbol = resolve_v2("GET /chat/assign", hint_kind="symbol", graph=g) + assert result_symbol.resolved_identifier == "GET /chat/assign" + if result_symbol.status == "one": + # A symbol hint must NOT resolve a route path to a Route node. + assert result_symbol.node is not None + assert result_symbol.node.kind.lower() != "route", ( + f"hint_kind=symbol resolved a Route node: {result_symbol.node}" + ) + + # With hint_kind="route", it resolves through the route path.
+ result_route = resolve_v2("GET /chat/assign", hint_kind="route", graph=g) + assert result_route.resolved_identifier == "GET /chat/assign" + + +def test_resolve_status_values() -> None: + """ResolveStatus is a Literal with exactly these values.""" + from typing import get_args + + status_values = get_args(ResolveStatus) + assert set(status_values) == {"one", "many", "none"}
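
Reviewer note: the per-branch contract that `tests/test_resolve_service.py` describes can be sketched in isolation. The block below is a minimal illustration, not the project's code: `FakeResult` and `assert_branch_contract` are hypothetical names, and `ResolveStatus` is redefined locally on the assumption that the real alias is a `Literal` of the same three values (as `test_resolve_status_values` asserts).

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional, get_args

# Assumption: mirrors resolve_service.ResolveStatus; redefined so the sketch is self-contained.
ResolveStatus = Literal["one", "many", "none"]


@dataclass
class FakeResult:
    """Hypothetical stand-in for ResolveOutput, reduced to the fields the tests inspect."""

    status: str
    node: Optional[str] = None
    candidates: List[str] = field(default_factory=list)
    message: Optional[str] = None


def assert_branch_contract(result: FakeResult) -> None:
    """Check the contract of the branch that was actually hit.

    A bare `status in ("one", "many", "none")` check passes for any result,
    so each branch asserts its own invariants instead.
    """
    assert result.status in get_args(ResolveStatus), f"unknown status {result.status!r}"
    if result.status == "one":
        # A unique match carries a node and no candidate list.
        assert result.node is not None
        assert result.candidates == []
    elif result.status == "many":
        # An ambiguous match carries no node and at least two candidates.
        assert result.node is None
        assert len(result.candidates) >= 2
    else:
        # No match: no node, and a message explaining why.
        assert result.node is None
        assert result.message


# Usage: each call exercises one branch.
assert_branch_contract(FakeResult(status="one", node="com.example.Foo"))
assert_branch_contract(FakeResult(status="many", candidates=["a.Foo", "b.Foo"]))
assert_branch_contract(FakeResult(status="none", message="No matches"))
```

A helper of this shape would let each graph-backed test call one function after `resolve_v2(...)`, so a `many` result with a single candidate fails rather than passing silently.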