diff --git a/AGENTS.md b/AGENTS.md index 1c5379b9..77384229 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -39,7 +39,7 @@ when needed. operator guide for the `java-codebase-rag` CLI (`init` / `increment` / `reprocess` / `erase`, `meta`, `tables`, `diagnose-ignore`, `analyze-pr`; hidden `refresh` alias → `reprocess` — see that doc). -- `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of +- `docs/CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of what to edit when a target tree doesn't match defaults. - `tests/README.md` — testing philosophy. - **`propose/`** — design proposes. **In-flight** proposes are **`*.md` @@ -112,7 +112,7 @@ For any non-trivial change, read the relevant doc first instead of inferring from code: - Behaviour / public surface → `README.md`. -- Brownfield assumptions, role/capability tuning → `CODEBASE_REQUIREMENTS.md`. +- Brownfield assumptions, role/capability tuning → `docs/CODEBASE_REQUIREMENTS.md`. - In-flight design proposes → **`propose/*.md` at the root of `propose/`** (not under `propose/completed/`). **List or search** for current names. - Why current design exists → `propose/completed/` and `plans/completed/`. 
diff --git a/README.md b/README.md index 0eb1174c..57c333c8 100644 --- a/README.md +++ b/README.md @@ -128,7 +128,7 @@ The operator-facing surface is small: pick an index dir, pick an embedding model | Understand the graph (nodes, edges, capabilities, ranking) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §3 | | Steer a brownfield Java tree (custom stereotypes, non-Spring stacks) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §4 | | Control which files the indexer walks | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §5 | -| Check whether your repo fits this tool's assumptions | [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) | +| Check whether your repo fits this tool's assumptions | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | --- @@ -158,9 +158,9 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. | | [`docs/skills/java-codebase-explore.md`](./docs/skills/java-codebase-explore.md) | Agent exploration skill (strategy, missions, fallbacks); packaged zip [`docs/skills/java-codebase-explore.zip`](./docs/skills/java-codebase-explore.zip) for Perplexity-style hosts. | | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. | -| [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | +| [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | | [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). 
| -| [`propose/PRODUCT-VISION.md`](./propose/PRODUCT-VISION.md) | Long-term product direction. | +| [`docs/PRODUCT-VISION.md`](./docs/PRODUCT-VISION.md) | Long-term product direction. | --- diff --git a/CODEBASE_REQUIREMENTS.md b/docs/CODEBASE_REQUIREMENTS.md similarity index 99% rename from CODEBASE_REQUIREMENTS.md rename to docs/CODEBASE_REQUIREMENTS.md index 01737872..8442ba20 100644 --- a/CODEBASE_REQUIREMENTS.md +++ b/docs/CODEBASE_REQUIREMENTS.md @@ -173,7 +173,7 @@ and `capabilities`, register inbound routes, and register outbound clients/producers for a given repo via `.java-codebase-rag.yml` at the project root (`role_overrides:`, `route_overrides:`, `http_client_overrides:`, `async_producer_overrides:`) and/or by copying the in-source stubs from -[`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) into your sources: +[`docs/CONFIGURATION.md`](./CONFIGURATION.md) into your sources: - `@CodebaseRole` / `@CodebaseCapability` / `@CodebaseCapabilities` (class-level role + capabilities) — see `docs/CONFIGURATION.md` §4.3. diff --git a/docs/JAVA-CODEBASE-RAG-CLI.md b/docs/JAVA-CODEBASE-RAG-CLI.md index 9c5655d1..80a971be 100644 --- a/docs/JAVA-CODEBASE-RAG-CLI.md +++ b/docs/JAVA-CODEBASE-RAG-CLI.md @@ -226,5 +226,5 @@ Prefer **`java-codebase-rag reprocess --graph-only`** when you only need Kuzu re ## See also - [README.md](../README.md) — env vars, MCP tool table, ignore layout. -- [CODEBASE_REQUIREMENTS.md](../CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild. +- [CODEBASE_REQUIREMENTS.md](./CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild. - [MANUAL-VERIFICATION-CHECKLIST.md](./MANUAL-VERIFICATION-CHECKLIST.md) — phased checks that mix CLI + MCP. 
diff --git a/propose/PRODUCT-VISION.md b/docs/PRODUCT-VISION.md similarity index 100% rename from propose/PRODUCT-VISION.md rename to docs/PRODUCT-VISION.md diff --git a/reports/call-graph-review.md b/docs/reports/call-graph-review.md similarity index 100% rename from reports/call-graph-review.md rename to docs/reports/call-graph-review.md diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md similarity index 100% rename from reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md rename to docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md diff --git a/reports/what-to-borrow-from-cmm.md b/docs/reports/what-to-borrow-from-cmm.md similarity index 100% rename from reports/what-to-borrow-from-cmm.md rename to docs/reports/what-to-borrow-from-cmm.md diff --git a/plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md similarity index 100% rename from plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md rename to plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md diff --git a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md index bf079cee..2dc4e914 100644 --- a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md +++ b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md @@ -279,7 +279,7 @@ Headline items: 2. `README.md`: delete v1 tool reference; promote "v2 navigation tools (preview)" to primary `### Tool reference`. Keep ops tools listed as "operational — moving to `user-rag` CLI in next release". -3. `propose/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per +3. `docs/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per propose §11 mapping). 4. Delete `tests/test_mcp_v2_equivalence.py` entirely — v1 no longer exists. 5. 
Update `tests/test_server.py` (or add if missing) tool-count assertion to: @@ -322,9 +322,9 @@ grep -nE "^### Tool reference" README.md - [ ] `tests/test_mcp_v2_equivalence.py` does not exist. - [ ] README §"Tool reference" lists exactly the 4 v2 tools as primary; ops tools noted as transitional. -- [ ] `propose/PRODUCT-VISION.md` example invocations updated to v2. +- [ ] `docs/PRODUCT-VISION.md` example invocations updated to v2. - [ ] Diff is confined to deliverables in this prompt (`server.py`, `README.md`, - `propose/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`, + `docs/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`, `tests/test_server.py` or equivalent surface-assertion test), plus narrowly-related test harness/import updates required to make those changes pass. @@ -337,7 +337,7 @@ grep -nE "^### Tool reference" README.md - [`plans/PLAN-MCP-API-V2.md` § PR-V2-3](./PLAN-MCP-API-V2.md#pr-v2-3--delete-v1-navigation-tools) — list of the 18 tools to delete. - [`propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md`](../../propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md) - §11 mapping table — for rewriting `propose/PRODUCT-VISION.md` examples. + §11 mapping table — for rewriting `docs/PRODUCT-VISION.md` examples. - `server.py` history (git log) — to identify each tool's helper-function graveyard. diff --git a/plans/completed/AGENT-PROMPTS-TIER1B.md b/plans/completed/AGENT-PROMPTS-TIER1B.md index 23e2458c..b8e420c3 100644 --- a/plans/completed/AGENT-PROMPTS-TIER1B.md +++ b/plans/completed/AGENT-PROMPTS-TIER1B.md @@ -575,7 +575,7 @@ Concretely: plan §5.3) — two services + a third "ambiguous" controller. - Create `tests/test_call_edge_matching.py` with cases 32–40. - Extend `tests/test_mcp_tools.py` with cases 41–48. -- Flip `propose/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows +- Flip `docs/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows from *planned* to *shipped*. 
## Out of scope (do NOT touch) @@ -620,7 +620,7 @@ don't ship it. 11. New fixture `tests/fixtures/cross_service_smoke/`. 12. New test file `tests/test_call_edge_matching.py` with cases 32–40. 13. Cases 41–48 added to `tests/test_mcp_tools.py`. -14. `propose/PRODUCT-VISION.md` flipped (planned → shipped). +14. `docs/PRODUCT-VISION.md` flipped (planned → shipped). 15. `README.md` MCP tools section updated. ## Tests @@ -681,7 +681,7 @@ Expected: at least one caller from `chat-assign` with `match='cross_service'`. - [ ] Sentinel greps return expected results. - [ ] No file outside `build_ast_graph.py`, `kuzu_queries.py`, `server.py`, `pr_analysis.py`, `README.md`, - `propose/PRODUCT-VISION.md`, and the new `tests/` paths is + `docs/PRODUCT-VISION.md`, and the new `tests/` paths is modified. - [ ] PR description includes the scope statement, the manual evidence output (pass6 log + meta() snippet + find_route_callers output), diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md index 13ec61f3..e87ee7e4 100644 --- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md +++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md @@ -6,7 +6,7 @@ Status: **completed** — applied. Companion document to ## Why this file exists The brownfield plan grew through two review rounds; the second review -(`reports/review/active/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`) +(`docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`) flagged design issues that were folded back into the plan in-place. 
Once they're inlined, they stop standing out — but they are exactly the parts an implementer is most likely to skim past or get wrong, diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md index b6ef32f6..96bc081c 100644 --- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md +++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md @@ -805,7 +805,7 @@ order in this list is the only correct interleaving; do not reorder. - `README.md` — new section "Brownfield overrides" walking through Layer B (config), with a complete example block. Mention Layer C as the last resort, with the four interface declarations to copy-paste. -- `CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note +- `docs/CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note the override layers exist. - MCP server `instructions` string in `server.py` — one extra sentence noting that "role and capability inference can be customised per-project diff --git a/plans/completed/PLAN-CALL-GRAPH.md b/plans/completed/PLAN-CALL-GRAPH.md index bcde568e..3880d15e 100644 --- a/plans/completed/PLAN-CALL-GRAPH.md +++ b/plans/completed/PLAN-CALL-GRAPH.md @@ -460,7 +460,7 @@ Additions (~80 lines, no removals): - How to filter by `min_confidence`. - Why phantoms aren't dropped at index time. -### 8. `CODEBASE_REQUIREMENTS.md` +### 8. `docs/CODEBASE_REQUIREMENTS.md` Add a "Call graph" note listing the tree-sitter node types the extractor depends on: @@ -585,7 +585,7 @@ Single PR. Breaking changes: | 8 | Augment `_graph_expand_merge` to also call `expand_methods`. | `search_lancedb.py` | Graph-expand results include method-reachable chunks on the smoke corpus. | | 9 | Add MCP tools (`find_callers`, `find_callees`), `follow_calls` param on `trace_flow`, update `_INSTRUCTIONS`. | `server.py` | `test_mcp_tools.py` additions pass. 
| | 10 | Update tests: new files + extend `test_ast_graph_build.py` / `test_kuzu_queries.py` / `test_mcp_tools.py`. | `tests/` | `pytest` green. | -| 11 | Update `README.md` + `CODEBASE_REQUIREMENTS.md`. | docs | Manual review. | +| 11 | Update `README.md` + `docs/CODEBASE_REQUIREMENTS.md`. | docs | Manual review. | | 12 | Confirm `propose/completed/CALL-GRAPH-PROPOSE.md` is the only active call-graph proposal (old deferred draft already removed; git history retains it). | `propose/` | Directory listing shows a single call-graph proposal. | ## Out of scope (for this plan, tracked elsewhere) diff --git a/plans/completed/PLAN-CAPABILITIES-MODEL.md b/plans/completed/PLAN-CAPABILITIES-MODEL.md index 7c8d9e65..6707a17e 100644 --- a/plans/completed/PLAN-CAPABILITIES-MODEL.md +++ b/plans/completed/PLAN-CAPABILITIES-MODEL.md @@ -453,7 +453,7 @@ callers see them in results. - `README.md` — add a section "Capabilities" describing the multi-tag axis, the initial capability set, and `list_by_capability`. Keep the existing "Roles" section intact. -- `CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice +- `docs/CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice and the deferred per-method storage (link to this plan). 
- MCP server `instructions` string in `server.py` — one extra sentence pointing at `list_by_capability` for behavioural questions about diff --git a/plans/completed/PLAN-CLI-SCENARIOS.md b/plans/completed/PLAN-CLI-SCENARIOS.md index c07e3011..d1156d8e 100644 --- a/plans/completed/PLAN-CLI-SCENARIOS.md +++ b/plans/completed/PLAN-CLI-SCENARIOS.md @@ -73,7 +73,7 @@ before PR-CLI-2 so contributors exercising new subcommands do not pay multi-seco | --- | --- | --- | --- | --- | --- | | **PR-CLI-1** | Land / freeze propose (doc-only merge of `CLI-SCENARIOS-PROPOSE.md` if not already on `master`) | none | `propose/completed/CLI-SCENARIOS-PROPOSE.md` (status bump); `plans/completed/PLAN-CLI-SCENARIOS.md` (tracking) | n/a | none | | **PR-CLI-2** | Full implementation: lifecycle handlers, env + YAML + index layout, package rename, `server.py` / indexer / path helpers, **`mcp_v2.py`**, **`path_filtering.py`** (`.lancedb-mcp/ignore` → `.java-codebase-rag/ignore`), help redesign, tracking issue constant, user-visible stderr hints; **`mcp.json.example`** env keys = source of truth | none | `pyproject.toml`, package dir rename, `server.py`, `mcp_v2.py`, `java_codebase_rag/cli.py`, `java_index_flow_lancedb.py`, `graph_enrich.py`, `path_filtering.py`, `search_lancedb.py`, `kuzu_queries.py`, `build_ast_graph.py`, tests, `mcp.json.example`, `.gitignore`, any other `user_rag` / env / path references in Python | unit + integration + help-structure test (see below) | PR-CLI-1 merged | -| **PR-CLI-3** | Doc and example sweep + **`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged | +| **PR-CLI-3** | Doc and example sweep + 
**`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged | Landing order: **PR-CLI-1 → PR-CLI-2 → PR-CLI-3**. @@ -284,7 +284,7 @@ Follow the **explicit file list** in propose §6 (`README.md`, `paper.pdf`, `AGENTS.md`, **`.cursor/rules/*.mdc`** (agent rules audit), `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (comments only — keys from PR-CLI-2), `propose/INDEX-AUTO-MODE-PROPOSE.md`, -`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `propose/PRODUCT-VISION.md`, +`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `docs/PRODUCT-VISION.md`, `.gitignore`). Add **Migration from legacy names** sections with explicit `mv` commands @@ -314,7 +314,7 @@ Add **Migration from legacy names** sections with explicit `mv` commands | # | Step | File(s) | Done when | | --- | --- | --- | --- | | 1 | README + CLI operator guide | `README.md`, `docs/JAVA-CODEBASE-RAG-CLI.md` | New subcommand table + 5 env vars + migration | -| 2 | Agent + checklist + requirements | `docs/*`, `CODEBASE_REQUIREMENTS.md` | No stale operator paths | +| 2 | Agent + checklist + requirements | `docs/*`, `docs/CODEBASE_REQUIREMENTS.md` | No stale operator paths | | 3 | Paper + proposes + example MCP JSON | `docs/paper/`, `propose/*`, `mcp.json.example` | PDF rebuilt; examples updated | | 4 | Acceptance grep | repo root | Reviewer sign-off | diff --git a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md index bf2f2001..4edbcd08 100644 --- a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md +++ b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md @@ -188,7 +188,7 @@ Six references at `server.py:49, 689, 1141, 1338, 1342, 1418`: - Line 1342: 
docstring `"...FEIGN_CLIENT/REPOSITORY/MAPPER..."` → `"...CLIENT/REPOSITORY/MAPPER..."` - Line 1418: `entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]` → `[..., "CLIENT"]` -#### Change 6: Update `README.md` and `CODEBASE_REQUIREMENTS.md` +#### Change 6: Update `README.md` and `docs/CODEBASE_REQUIREMENTS.md` `README.md`: - Line 137: `trace_flow` description's stage chain `FEIGN_CLIENT/REPOSITORY/MAPPER` → `CLIENT/REPOSITORY/MAPPER` @@ -338,7 +338,7 @@ description. - [ ] `_ROLE_SCORE_WEIGHTS["CLIENT"] = 0.06` (was `FEIGN_CLIENT`) (`search_lancedb.py:188`) - [ ] Six `server.py` literal references updated (lines 49, 689, 1141, 1338, 1342, 1418) - [ ] `README.md` updated (3 lines + brownfield note) -- [ ] `CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347) +- [ ] `docs/CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347) - [ ] `tests/test_lancedb_e2e.py:342` allow-list updated - [ ] `ONTOLOGY_VERSION` bumped 8 → 9 with phase-comment update - [ ] All 9 new tests in `tests/test_client_role_rename.py` pass diff --git a/plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md similarity index 100% rename from plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md rename to plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md diff --git a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md index 266edf78..a966c4d9 100644 --- a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md +++ b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md @@ -31,7 +31,7 @@ Depends on: **none** (lands on current `master`). 
| PR | Scope | Ontology bump | Files touched (approx) | Test buckets | Independent of | | --- | --- | --- | --- | --- | --- | | **PR-1** | `CodebaseHttpMethod.java` stub under route fixtures; **parameterized** structured stderr emitter (new small module and/or `build_ast_graph.py`) — **no production call sites** | **none** | `tests/fixtures/brownfield_route_stubs/...`, emitter module, **1** new test | Unit test exercises emitter directly (INFO + WARN shapes as needed) | none | -| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged | +| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `docs/CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `docs/CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged | | **PR-3** 
| Agent docs + v2 addendum only | **none** | `docs/AGENT-GUIDE.md`, `docs/skills/java-codebase-explore.md` (if needed), `propose/completed/BROWNFIELD-ANNOTATIONS-V2-ADDENDUM-HTTP-METHOD-ENUM.md` | Docs-only CI | PR-2 merged | Landing order: **PR-1 → PR-2 → PR-3**. @@ -136,7 +136,7 @@ Landing order: **PR-1 → PR-2 → PR-3**. - `tests/test_assign_endpoint_client_extraction.py` — Feign + JAX-RS mirror; `method = HttpMethod.POST` may need alignment with **`CodebaseHttpMethod.POST`** on brownfield annotation per surface rules. - `tests/test_cross_service_resolution_flag.py` — generated Java strings. -### 8. `README.md` and `CODEBASE_REQUIREMENTS.md` +### 8. `README.md` and `docs/CODEBASE_REQUIREMENTS.md` - Replace annotation names and examples with `@CodebaseHttpClient` / enum `method`; add **Re-index required** callout for PR-2 + ontology **12**. @@ -185,7 +185,7 @@ Landing order: **PR-1 → PR-2 → PR-3**. | 3 | Merge HTTP replace (behaviour only) | `graph_enrich.py` | Replace pattern; **zero** shadowing logs from this file | | 4 | Verbose plumbing if needed | `build_ast_graph.py` → parse entry | Shadowing INFO respects volume gate | | 5 | `meta_chain` + log strings | `graph_enrich.py` | Grep clean for old simple names | -| 6 | Tests + docs | `tests/*`, `README.md`, `CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs | +| 6 | Tests + docs | `tests/*`, `README.md`, `docs/CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs | | 7 | Ontology + meta test | `ast_java.py`, `tests/test_call_edges_e2e.py` | `meta()` reports 12 | --- diff --git a/plans/completed/PLAN-MCP-API-V2.md b/plans/completed/PLAN-MCP-API-V2.md index 01c2ba9f..eaa3be60 100644 --- a/plans/completed/PLAN-MCP-API-V2.md +++ b/plans/completed/PLAN-MCP-API-V2.md @@ -356,7 +356,7 @@ For each, also delete: "operational — moving to `user-rag` CLI in next release". This is a one-PR transition state. -### 3. `propose/PRODUCT-VISION.md` — agent-recipe examples +### 3. 
`docs/PRODUCT-VISION.md` — agent-recipe examples - Update any example invocations from v1 to v2. Search for `find_callers`, `list_routes`, `list_clients`, `find_route_*`, `trace_*`, `impact_*` and @@ -551,7 +551,7 @@ DoD is the delta + suite-green, not an absolute total. | v2 handlers diverge in behaviour from v1 | V2-1 | Equivalence tests (14 of them) compare returned id sets directly. Drift is caught at PR review. | | `direction`/`edge_types` required-field change breaks existing clients | V2-1 | No existing clients — confirmed by Dmitry ("nobody uses this MCP bundle yet"). Tests assert `ValidationError` is raised, which is the contract. | | `describe.edge_summary` adds N round-trips per call | V2-2 | Single grouped count query, not 9 round-trips. Test asserts call count via Kuzu connection mock. | -| Removing v1 tools breaks the agent system prompt | V2-3 | `propose/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). | +| Removing v1 tools breaks the agent system prompt | V2-3 | `docs/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). | | CLI subprocess tests are slow / flaky | V2-4 | Each subprocess invocation hits a pre-built fixture under `/tmp`; no rebuilds inside tests. Targeted at < 5s total. | | `pyproject.toml` package layout breaks the existing flat-script bundle | V2-4 | Today's `packages = []` is intentional; we promote it to `packages = ["user_rag"]` only — root scripts (`server.py`, `build_ast_graph.py`, etc.) stay outside the package. Tested by `pip install .` succeeding. | | `pr-review` skill under `.cursor/skills/pr-review/` still calls `analyze_pr` MCP after V2-4 | V2-4 | PR description includes a manual TODO to update that skill. CLI version of the call is documented in README's "Migration from v1" subsection. 
| diff --git a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md index 3ed41801..b36e4ee0 100644 --- a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md +++ b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md @@ -163,7 +163,7 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this: | `server.py` | `_cocoindex_subprocess_env(root)` + pass `env=` to CocoIndex subprocess | | `README.md` | Env table + `refresh_code_index` subprocess note | | `mcp.json.example` | Example `LANCEDB_MCP_PROJECT_ROOT` path | -| `CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency | +| `docs/CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency | | `tests/test_mcp_tools.py` | `test_cocoindex_subprocess_env_sets_project_root` | --- @@ -172,5 +172,5 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this: - [x] Step 1: Modify `java_index_flow_lancedb.py` - [x] Step 2: Modify `server.py` (including `_cocoindex_subprocess_env` helper + unit test) -- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `CODEBASE_REQUIREMENTS.md`, flow docstring) +- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `docs/CODEBASE_REQUIREMENTS.md`, flow docstring) - [x] Testing: `test_cocoindex_subprocess_env_sets_project_root`; heavy e2e unchanged (`cwd` + no env → `.` root) diff --git a/plans/completed/PLAN-TIER1B-COMPLETION.md b/plans/completed/PLAN-TIER1B-COMPLETION.md index aa32e46a..09b9acd1 100644 --- a/plans/completed/PLAN-TIER1B-COMPLETION.md +++ b/plans/completed/PLAN-TIER1B-COMPLETION.md @@ -938,7 +938,7 @@ path in both services. Used by tests 32, 34, 37. 
| 10 | Extend `analyze_pr`: `cross_service_callers_count` per changed symbol | `server.py`, `pr_analysis.py` | test 47 passes | | 11 | Risk-score weight bump in `pr_analysis.py` | `pr_analysis.py` | docstring + unit test | | 12 | Create `tests/fixtures/cross_service_smoke/` | `tests/fixtures/...` | fixture files in place | -| 13 | Update `README.md` MCP tools section + `propose/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `propose/PRODUCT-VISION.md` | manual review | +| 13 | Update `README.md` MCP tools section + `docs/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `docs/PRODUCT-VISION.md` | manual review | --- @@ -988,7 +988,7 @@ path in both services. Used by tests 32, 34, 37. 6. Existing MCP tools extended (`impact_analysis`, `trace_flow`, `analyze_pr`). 7. `README.md` updated for caller-side edges, brownfield clients, - match outcomes; `propose/PRODUCT-VISION.md` flips + match outcomes; `docs/PRODUCT-VISION.md` flips `HTTP_CALLS` / `ASYNC_CALLS` from *planned* to *shipped*. 8. Each PR's description quotes the relevant stats from a manual run on bank-chat-system as evidence. diff --git a/propose/completed/CALL-GRAPH-PROPOSE.md b/propose/completed/CALL-GRAPH-PROPOSE.md index 9f2a6028..a37284af 100644 --- a/propose/completed/CALL-GRAPH-PROPOSE.md +++ b/propose/completed/CALL-GRAPH-PROPOSE.md @@ -4,7 +4,7 @@ Status: **completed** — shipped (static intra-JVM `CALLS` + `DECLARES`; plan: [`plans/completed/PLAN-CALL-GRAPH.md`](../../plans/completed/PLAN-CALL-GRAPH.md) for the step-by-step implementation. -This proposal realises **point 4 of `PRODUCT-VISION.md`** ("Adding a Call +This proposal realises **point 4 of `docs/PRODUCT-VISION.md`** ("Adding a Call Graph Layer") with a deliberately narrow scope: **static, intra-JVM method-to-method edges**. Cross-service HTTP/async, AOP-proxy resolution, and runtime-trace ingestion are explicit non-goals of this phase. @@ -535,7 +535,7 @@ round-trip test. 
0.5 day: server surface + search-side expansion). - Validation on `bank-chat-system` + micro-fixture: **1 day** (unit + integration + regression run; manual trace_flow spot-checks). -- Documentation update (`README.md`, `CODEBASE_REQUIREMENTS.md`, MCP +- Documentation update (`README.md`, `docs/CODEBASE_REQUIREMENTS.md`, MCP instructions): **2 hours**. Total: **3–4 working days** including tests and docs. diff --git a/propose/completed/CLI-SCENARIOS-PROPOSE.md b/propose/completed/CLI-SCENARIOS-PROPOSE.md index 42e58875..f68ca868 100644 --- a/propose/completed/CLI-SCENARIOS-PROPOSE.md +++ b/propose/completed/CLI-SCENARIOS-PROPOSE.md @@ -312,11 +312,11 @@ Open a GitHub issue titled **"AST graph (Kuzu) incremental rebuild"** referencin - `docs/paper/paper.tex` — architecture paper updated for new CLI verbs / env vars / file paths; rebuild `paper.pdf` (Russian translation `paper_ru.tex` is a standalone artifact outside the repo and is not in scope). - `AGENTS.md` — CLI doc reference + any `refresh` mention. - `.cursor/rules/*.mdc` — agent workflow / env / CLI contract; see **Agent rules audit** below (must match post-rename surface). -- `CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated. +- `docs/CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated. - `mcp.json.example` — **PR-CLI-3 is a second pass only:** PR-CLI-2 updates this file so **env keys match the live server**; PR-CLI-3 reconciles comments, examples, and any doc drift — **no conflicting edits**; if both PRs touch it, **PR-CLI-2 wins** for structure, PR-CLI-3 for prose polish. - `propose/INDEX-AUTO-MODE-PROPOSE.md` — one-line note that `refresh` is being renamed to `reprocess`. - `propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md` — one-line note that the new tracking issue (created in PR-CLI-2) is the user-facing handle. 
-- `propose/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference. +- `docs/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference. - `.gitignore` — add `.java-codebase-rag/`, keep `lancedb_data/` for grace-period cleanup, or remove if PR-CLI-2 drops that grace-period entry. **Agent rules audit (PR-CLI-3, manual checklist — use together with acceptance grep below):** @@ -342,7 +342,7 @@ Expected output after PR-CLI-3 (docs + rules): The startup-slowness fix (deferred imports in `cli.py`) is a **separate, prior PR** outside this migration; it does not change the surface and should land before PR-CLI-2 so contributors testing the new subcommands aren't taxed by the multi-second startup. -**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section. +**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section. --- diff --git a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md index 67ef8d3b..806aa3e5 100644 --- a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md +++ b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md @@ -239,7 +239,7 @@ Verified count: **5 production files, ~12 references**: | `server.py` | 49, 687, 1138, 1335, 1339, 1415 | MCP tool docstrings, `role` enum strings, entry-role filter | | `tests/test_lancedb_e2e.py` | 342 | One assertion | -Plus docs: `README.md`, `CODEBASE_REQUIREMENTS.md`. 
Doc sweep is +Plus docs: `README.md`, `docs/CODEBASE_REQUIREMENTS.md`. Doc sweep is straightforward. ### 4.4 Brownfield input today @@ -398,7 +398,7 @@ story as every other graph-shape change). - README: rename role table; mention `CLIENT` + `HTTP_CLIENT` capability; document the `MESSAGE_PRODUCER` capability that already exists for symmetry. -- `CODEBASE_REQUIREMENTS.md`: rename references. +- `docs/CODEBASE_REQUIREMENTS.md`: rename references. - `propose/DEFERRED-REST-CLIENT-MIGRATION-PROPOSE.md`: **delete** (this proposal supersedes it; the rename-vs-capability decision is reversed by current architecture). diff --git a/propose/HINTS-STRUCTURED-LABEL.md b/propose/completed/HINTS-STRUCTURED-LABEL.md similarity index 100% rename from propose/HINTS-STRUCTURED-LABEL.md rename to propose/completed/HINTS-STRUCTURED-LABEL.md diff --git a/propose/completed/TIER1-COMPLETION-PROPOSE.md b/propose/completed/TIER1-COMPLETION-PROPOSE.md index c3b1a541..25585df8 100644 --- a/propose/completed/TIER1-COMPLETION-PROPOSE.md +++ b/propose/completed/TIER1-COMPLETION-PROPOSE.md @@ -1,7 +1,7 @@ # Tier 1 completion — proposal (shipped) Status: **completed — shipped via PR-A1 → PR-C** (merged 2026-04 → 2026-05). Moved to `propose/completed/` after PR-D3 (Tier 1B) landed. Pairs with the borrow guide -[`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) +[`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) and follows on from the completed [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md). @@ -859,10 +859,10 @@ Independent PRs, but a sensible review order: section with `route_overrides` examples and `@CodebaseRoute` source stub** — same shape as the existing `role_overrides` / `@CodebaseRole` material. -- `CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var +- `docs/CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var table (`.lancedb-mcp-ignore` mention). 
Document the route resolver five-layer composition table from §4.6.4. -- `propose/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note +- `docs/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note B2b + B6 as the next proposal. --- @@ -902,7 +902,7 @@ Open questions to settle during implementation, not now: - [ ] `ONTOLOGY_VERSION` bumped 4 → 5; stale-graph guard test added. - [ ] README brownfield section extended with `route_overrides` and `@CodebaseRoute` examples. -- [ ] `CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer +- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer composition table. - [ ] No regressions in existing role / capability resolution (run the existing brownfield test suite). @@ -918,7 +918,7 @@ Open questions to settle during implementation, not now: - [ ] Old `compile_excluded_glob_patterns` call sites replaced (3 of them). - [ ] `graph_meta` exposes `ignore_layers`. -- [ ] `CODEBASE_REQUIREMENTS.md` documents the layer order. +- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the layer order. --- @@ -947,9 +947,9 @@ follow-ups, in order of leverage: ## 11. References - [`TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md`](TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md) - B2b + B6 propose -- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5). +- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5). - [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md) — completed call-graph proposal; same shape & style. -- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants. +- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants. 
- [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — **mandatory reading** for the implementer of §4.6 (brownfield route resolver mirrors this design). - `graph_enrich.py` §"brownfield role / capability overrides" — the existing implementation B2a extends. diff --git a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md index 0a4f0150..b78e3e1a 100644 --- a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md +++ b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md @@ -21,9 +21,9 @@ Before working on this proposal, read in order: 1. [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) §4 (B2a `Route` + `EXPOSES`) — defines every join key used here. -2. [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) +2. [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) §B2 (Route shape) and §B6 (cross-service edges). -3. [`reports/call-graph-review.md`](../../reports/call-graph-review.md) +3. [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — same correctness invariants apply (microservice scoping, confidence semantics, phantom-id collisions). 4. [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) @@ -466,7 +466,7 @@ but conservative. ## 12. References - [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) — B2a, B4, B5 (active). -- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) §B2, §B6. -- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — invariants this proposal must not regress. +- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) §B2, §B6. 
+- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — invariants this proposal must not regress. - [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — mandatory reading for §6. -- [`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*. +- [`docs/PRODUCT-VISION.md`](../docs/PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*. diff --git a/propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md b/propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md similarity index 100% rename from propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md rename to propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md diff --git a/propose/INDEX-AUTO-MODE-PROPOSE.md b/propose/stale/INDEX-AUTO-MODE-PROPOSE.md similarity index 100% rename from propose/INDEX-AUTO-MODE-PROPOSE.md rename to propose/stale/INDEX-AUTO-MODE-PROPOSE.md diff --git a/propose/RANKING-MICROSERVICE-PROPOSE.md b/propose/stale/RANKING-MICROSERVICE-PROPOSE.md similarity index 100% rename from propose/RANKING-MICROSERVICE-PROPOSE.md rename to propose/stale/RANKING-MICROSERVICE-PROPOSE.md diff --git a/propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md similarity index 99% rename from propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md rename to propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md index 81e8b092..73101047 100644 --- a/propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md +++ b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md @@ -5,7 +5,7 @@ User-facing tracking for graph-side incremental work: GitHub issue **#73** (link Pairs with the focused MCP-tool proposal [`propose/INDEX-AUTO-MODE-PROPOSE.md`](INDEX-AUTO-MODE-PROPOSE.md) (decision engine for `refresh_code_index`) and supersedes its -"future Kuzu work" footnote in 
[`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §99. +"future Kuzu work" footnote in [`docs/PRODUCT-VISION.md`](../docs/PRODUCT-VISION.md) §99. This is a **proposal**, not an implementable plan. After review and scoping decisions (the §11 [TBD] list), an implementable diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md b/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md deleted file mode 100644 index b7035db8..00000000 --- a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md +++ /dev/null @@ -1,115 +0,0 @@ -# Implementation issues: PLAN-BROWNFIELD-ROLE-OVERRIDES - -**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md` -**Review date:** 2026-04-26 -**Scope:** Gaps, mistakes, and risks in the *implementation* as compared to the plan’s stated behaviour and test matrix. Code and tests are assumed to live under `mcp_lancedb_bundle/`. - -**Tests run:** `pytest tests/test_brownfield_overrides.py` — 20 passed (at time of review). - ---- - -## 1. Pre-flight LanceDB + search regression is incomplete vs. plan - -The plan’s pre-flight test (item 9) requires, in order: build a **fresh** Lance index using the real pipeline with FQN `role`/`capabilities`, assert rows on **direct table** read, then **`codebase_search(..., capability=...)`** and assert the type is returned. - -**Implemented approximations:** - -- **`enrich_chunk` + YAML** — checks resolver + chunk path; does not exercise `process_java_file` / `JavaLanceChunk` materialisation end-to-end. -- **Raw LanceDB / PyArrow** — proves `list(string)` round-trip, not the CocoIndex / `JavaLanceChunk` row shape. -- **Dataclass introspection** — confirms `JavaLanceChunk` has a `capabilities` field only. - -**Risk:** A regression that removes or mis-wires the CocoIndex write path could slip past the suite; `codebase_search` capability filtering is not covered by the brownfield tests. 
-
-**Severity:** Medium (guards are weaker than specified, not a known production bug in the reviewed snapshot).
-
----
-
-## 2. “Malformed YAML” test does not use malformed YAML
-
-The plan (Phase 1, test 8) calls for malformed YAML to yield empty overrides without crashing.
-
-**Current behaviour:** a test exercises loading from a **non-existent** path, which is closer to “missing file” than “invalid YAML.” Invalid YAML in an existing file is only covered implicitly by the loader’s `except` branch, not by a named test.
-
-**Follow-up:** add a `tmp_path` file with content that is not valid YAML, or rename the test to match “missing config / empty” semantics.
-
-**Severity:** Low (behaviour likely correct; test is mis-specified or misnamed).
-
----
-
-## 3. Phase 2 test matrix: gaps
-
-The plan’s Phase 2 test list includes scenarios **not** present in `tests/test_brownfield_overrides.py` (at review time):
-
-- **Cyclic** meta-annotation graph (A ↔ B): no crash, role remains `OTHER`.
-- **Long chain** (e.g. six wrappers): after depth cap, role `OTHER` (or whatever the spec fixes).
-- **FQN + meta + Layer B together:** FQN should still win; explicit per-class config overrides automatic meta and annotation maps.
-
-**Covered and notable:** B-beats-A regression, two-hop to `SERVICE`, method-level meta to capability, basic `@Service` on custom `@interface`.
-
-**Severity:** Medium for **cycle** and **depth** (guard against stack bugs and cap drift); **low** for the FQN interaction if hand-tested or covered elsewhere (not verified here).
-
----
-
-## 4. Phase 3 test matrix: minor gaps
-
-The plan asks for:
-
-- **Additive capability** — `@CodebaseCapability` in addition to AST-inferred capabilities (e.g. alongside a Spring stereotype).
-- **Two separate** `@CodebaseCapability` annotations on the same class, as well as the **container** form.
-
-**Current coverage** focuses on `CodebaseRole` variants, invalid role warnings, and **`@CodebaseCapabilities({...})` container** with two inner values. The **stacked** `@CodebaseCapability` / `@CodebaseCapability` case is not clearly duplicated as a dedicated test; additive-on-AST is not isolated.
-
-**Severity:** Low (behaviour is straightforward from code structure; risk is **regression** in parser or resolver order, not a known bug).
-
----
-
-## 5. Possible Lance vs. Kuzu disagreement on meta maps
-
-**Implementation detail:** the graph writer derives annotation declarations from **in-memory graph tables**; **`enrich_chunk`** builds meta from a **separate** full-disk walk (`_collect_annotation_decls_from_disk` + cache).
-
-If the two ever differ (excludes, parse errors, or partial scans), the **same** Java type could get **different** Layer A results in Kuzu than on Lance chunks. The plan’s intent is consistency across stores; this is an **integration consistency** risk, not a single-file bug.
-
-**Severity:** Low until observed in a real project; worth monitoring or converging the two inputs.
-
----
-
-## 6. Depth cap semantics (implementation) vs. plan’s sketch
-
-The resolver’s recursive walk uses a **path set** and stops when `len(path) > 4`. The plan’s pseudocode used a slightly different shape (`seen` and `len(seen) > 4`).
-
-**Risk:** off-by-one vs. the plan’s “depth 4 / six links `OTHER`” without an automated test (see §3), so behaviour could drift in a refactor.
-
-**Severity:** Low–medium, mitigated if Phase 2 depth test is added.
-
----
-
-## 7. Kuzu member nodes and capabilities
-
-`Symbol` rows for **methods** use `_node_row` defaults (`capabilities: []`, `role: "OTHER"`) and do not run the brownfield resolver per method. The plan is **type-centric**; this is not a plan violation, but any future expectation of “method symbol capabilities in the graph” would be unmet.
-
-**Severity:** N/A for current plan; documentation only if users assume otherwise.
-
----
-
-## Summary
-
-| ID | Topic | Severity |
-|----|--------------------------------------|------------|
-| 1 | Pre-flight E2E (index + search) | Medium |
-| 2 | Malformed YAML test naming / body | Low |
-| 3 | Phase 2: cycle, depth, FQN+meta tests| Medium (partial) |
-| 4 | Phase 3: stacked caps + AST additive | Low |
-| 5 | Meta map source: graph vs. disk | Low (consistency) |
-| 6 | Depth cap without test | Low–medium |
-| 7 | Method `Symbol` rows / capabilities | N/A |
-
----
-
-## What was in good shape (for balance)
-
-- `BrownfieldOverrides` loader, validation against shared ontology, stderr warnings for unknowns.
-- `resolve_role_and_capabilities` execution order and **B-before-A** semantics with **OTHER** guards; FQN and `@CodebaseRole` ordering relative to C.
-- `AnnotationRef.arguments` and `CodebaseCapabilities` value extraction in `ast_java.py`.
-- Wiring: `build_ast_graph` type nodes, `enrich_chunk`, `JavaLanceChunk` + `process_java_file` for `capabilities`.
-- README, CODEBASE_REQUIREMENTS, and MCP `instructions` mention customisation.
-- B-beats-A regression test is present (critical for the plan’s execution-order invariant).
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md
deleted file mode 100644
index eb808f52..00000000
--- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md
+++ /dev/null
@@ -1,431 +0,0 @@
-# PLAN-CAPABILITIES-MODEL — implementation fixes
-
-**Inputs:** `reports/review/PLAN-CAPABILITIES-MODEL-implement-report.md` +
-designer review of that report.
-**Plan file (now amended):** `plans/PLAN-CAPABILITIES-MODEL.md`. Re-read
-the plan first — it has two new sections (**Filter strategy** and the
-expanded **`trace_flow` seeding** subsection) that change how some of
-the original instructions should be implemented.
-**Goal of this pass:** close 7 issues from the review **plus** correct a
-design-level flaw the review surfaced but did not call out — the
-existing four `capability` filters use naive post-filter, which silently
-under-delivers results against the `limit` contract.
-
-Apply the fixes in priority order. Run the full test suite after each
-group; do not bundle group D into A/B/C.
-
----
-
-## Group A — `codebase_search` response & filter (Issues 1, 2 + design correction)
-
-### A.1 Add `capabilities` to `CodeChunkHit`
-
-**File:** `server.py`
-
-In the `CodeChunkHit` Pydantic model (currently around line 65), add the
-field next to `annotations_on_type` / `symbols`:
-
-```python
-capabilities: list[str] = Field(
-    default_factory=list,
-    description=(
-        "Multi-tag capabilities derived from method/type annotations "
-        "and injected types (MESSAGE_LISTENER, MESSAGE_PRODUCER, "
-        "SCHEDULED_TASK, EXCEPTION_HANDLER). A class can carry several."
-    ),
-)
-```
-
-In `_rows_to_hits` (around line 402), populate it alongside the other
-list fields:
-
-```python
-capabilities=_clean_str_list(r.get("capabilities")),
-```
-
-`_clean_str_list` already handles the legacy-string / native-list dual
-shape — no new helper needed.
-
-`JAVA_ENRICHED_COLUMNS` already includes `"capabilities"`
-(`search_lancedb.py` line 37), so the column is fetched when present.
-The schema-presence guard on line 459 means stale indexes without the
-column degrade gracefully.
-
-### A.2 Add `capability` filter to `codebase_search` (storage-pushdown)
-
-**Files:** `search_lancedb.py`, `server.py`
-
-This is **not** a post-filter. The plan's amended **Filter strategy**
-section is explicit: post-filter without over-fetch widening violates
-the `limit` contract and is rejected.
-
-#### Step A.2.1 — extend `_build_extra_predicates`
-
-In `search_lancedb.py` (around line 65), accept a new keyword:
-
-```python
-def _build_extra_predicates(
-    *,
-    columns: set[str],
-    role: str | None,
-    module: str | None,
-    microservice: str | None,
-    package_prefix: str | None,
-    fqn_in: list[str] | None,
-    role_in: list[str] | None = None,
-    exclude_roles: list[str] | None = None,
-    capability: str | None = None,  # NEW
-    capability_in: list[str] | None = None,  # NEW — used by trace_flow seeding
-) -> list[str]:
-    ...
-```
-
-Emit a list-contains predicate when the column exists. **Verify the
-exact LanceDB SQL syntax for the project's installed version** before
-wiring — likely candidates, in order of compatibility:
-
-```python
-# Preferred (Lance >=0.10):
-preds.append(f"array_has(capabilities, '{_escape_sql_str(capability)}')")
-# Fallback if array_has unavailable:
-preds.append(f"array_position(capabilities, '{_escape_sql_str(capability)}') >= 0")
-# Last resort (some Lance builds):
-preds.append(f"'{_escape_sql_str(capability)}' = ANY(capabilities)")
-```
-
-Run a tiny ad-hoc query against the local index to confirm which form
-parses. Pick one and use it consistently.
-
-For the multi-value variant (`capability_in`, used only by `trace_flow`
-seeding — see Group B), build a disjunction:
-
-```python
-if capability_in and "capabilities" in columns:
-    parts = [
-        f"array_has(capabilities, '{_escape_sql_str(c)}')"
-        for c in capability_in
-    ]
-    preds.append("(" + " OR ".join(parts) + ")")
-```
-
-Both predicates must be conditioned on `"capabilities" in columns` so
-older indexes lacking the column still answer queries (filter ignored).
-
-#### Step A.2.2 — surface in `run_search`
-
-`run_search` (around line 722) gains a `capability: str | None = None`
-parameter and forwards it to `_build_extra_predicates`. Same for
-`capability_in: list[str] | None = None`. No other ranking change.
-
-#### Step A.2.3 — surface in `codebase_search` MCP tool
-
-In `server.py::codebase_search` (around line 488), add the parameter
-next to `role`:
-
-```python
-capability: str | None = Field(
-    default=None,
-    description=(
-        "Java only: AND-filter to chunks whose enclosing type carries "
-        "this capability (MESSAGE_LISTENER|MESSAGE_PRODUCER|"
-        "SCHEDULED_TASK|EXCEPTION_HANDLER). Use `list_by_capability` "
-        "for graph-only queries."
-    ),
-),
-```
-
-Forward to `run_search(..., capability=capability, ...)`.
-
-### A.3 Update unit + integration tests
-
-- Extend `tests/test_lancedb_e2e.py` with the **`limit` contract**
-  assertion (plan test #3): a fixture with 50 `@Service` classes of
-  which 5 are also `MESSAGE_PRODUCER`; `list_by_role("SERVICE",
-  capability="MESSAGE_PRODUCER", limit=50)` must return exactly the 5.
-  Same shape for `codebase_search(..., capability=...)` (plan test #6).
-
----
-
-## Group B — `trace_flow` capability seeding coordination (Issue 4 + design fix)
-
-This is the design gap the review surfaced. The implementer faithfully
-wrote the Kuzu OR predicate the plan asked for, but the LanceDB
-pre-filter in `server.py::trace_flow` discards capability-only
-entrypoints (role=OTHER, capability=SCHEDULED_TASK) before the Kuzu
-seed query ever sees their FQNs. **Both sides must learn about
-capabilities together.**
-
-The plan's amended **`trace_flow` seeding** subsection is now explicit
-about this. The Kuzu side is already implemented; only the LanceDB side
-needs work.
-
-### B.1 Widen the LanceDB seed pre-filter
-
-**File:** `server.py`
-
-In `trace_flow` (around line 880), the existing seed helper is:
-
-```python
-entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]
-
-def _seed(role_allowlist: list[str] | None) -> list[dict[str, Any]]:
-    return run_search(
-        ...
-        role_in=role_allowlist,
-        exclude_roles=None if role_allowlist else sorted(baseline_excludes),
-    )
-```
-
-Extend it to also pass capability allowlist. Match the Kuzu side
-exactly — `["MESSAGE_LISTENER", "SCHEDULED_TASK"]`:
-
-```python
-entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]
-entry_capabilities = ["MESSAGE_LISTENER", "SCHEDULED_TASK"]
-
-def _seed(role_allowlist: list[str] | None,
-          capability_allowlist: list[str] | None) -> list[dict[str, Any]]:
-    return run_search(
-        ...
-        role_in=role_allowlist,
-        capability_in=capability_allowlist,
-        exclude_roles=(
-            None if (role_allowlist or capability_allowlist)
-            else sorted(baseline_excludes)
-        ),
-    )
-```
-
-Then in the calling code:
-
-```python
-# First pass: restricted to entrypoint-like role OR entrypoint capability.
-seed_rows = await asyncio.to_thread(_seed, entry_roles, entry_capabilities)
-if not seed_rows:
-    seed_rows = await asyncio.to_thread(_seed, None, None)
-```
-
-The `OR` semantics between `role_in` and `capability_in` are produced
-by `_build_extra_predicates`: each predicate is a separate string,
-joined with `AND` at the top level. To get the right semantics
-(role-OR-capability rather than role-AND-capability), emit a *combined
-disjunction* when both are set:
-
-```python
-# In _build_extra_predicates:
-role_pred = None
-if role_in and "role" in columns:
-    vals = ", ".join(f"'{_escape_sql_str(v)}'" for v in role_in)
-    role_pred = f"role IN ({vals})"
-
-cap_pred = None
-if capability_in and "capabilities" in columns:
-    parts = [
-        f"array_has(capabilities, '{_escape_sql_str(c)}')"
-        for c in capability_in
-    ]
-    cap_pred = "(" + " OR ".join(parts) + ")"
-
-if role_pred and cap_pred:
-    preds.append(f"({role_pred} OR {cap_pred})")
-elif role_pred:
-    preds.append(role_pred)
-elif cap_pred:
-    preds.append(cap_pred)
-```
-
-The standalone `role_in` / single `capability` cases keep their existing
-behaviour (each emitted independently as before). Only the *paired
-seeding case* triggers the OR composition.
-
-### B.2 Verify with a fixture
-
-Add to `tests/test_lancedb_e2e.py` (plan test #5): a fixture class
-implementing `org.quartz.Job` with **no** Spring stereotype. Confirm
-that `trace_flow("scheduled order cleanup", ...)` returns this class as
-a stage-0 seed. Without B.1 it will not — that is the regression guard.
-
----
-
-## Group C — `find_*` and `list_by_*` storage pushdown (Issue 3 + design fix)
-
-The four already-landed `capability` filters
-(`find_implementors`, `find_subclasses`, `list_by_role`,
-`list_by_annotation`) use naive post-filter. `find_injectors` is
-missing the parameter entirely. Both flaws fix together by switching to
-storage pushdown in Kuzu.
-
-### C.1 Push the `capability` filter into `KuzuGraph` methods
-
-**File:** `kuzu_queries.py`
-
-For each of the five graph methods consumed by these tools
-(`find_implementors`, `find_subclasses`, `find_injectors`,
-`list_by_role`, `list_by_annotation`), add an optional `capability`
-parameter:
-
-```python
-def list_by_role(
-    self, role: str, *,
-    module: str | None = None,
-    microservice: str | None = None,
-    capability: str | None = None,  # NEW
-    limit: int = 100,
-) -> list[SymbolHit]:
-    filters = ["s.role = $role"]
-    params: dict[str, Any] = {"role": role}
-    if capability:
-        filters.append("$capability IN s.capabilities")
-        params["capability"] = capability
-    filters.extend(_scope_filters("s", module=module, microservice=microservice, params=params))
-    where = " AND ".join(filters)
-    query = f"MATCH (s:Symbol) WHERE {where} RETURN {_SYMBOL_RETURN} LIMIT {int(limit)}"
-    return [_row_to_symbol(r) for r in self._rows(query, params)]
-```
-
-Same shape for `list_by_annotation`, `find_implementors`,
-`find_subclasses`. Apply the predicate against the result-node alias
-(`s` for the `list_by_*` queries; whatever alias is used in the
-implementor / subclass query). The `LIMIT` clause **must** come after
-the capability filter — Kuzu's planner handles this automatically once
-it's part of `WHERE`.
-
-For `find_injectors`, the result is an *edge* between two `Symbol`
-nodes (`src` injects `dst`). The user-relevant capability is on the
-**consumer** (`src`):
-
-```python
-def find_injectors(
-    self, name: str, *,
-    module: str | None = None,
-    microservice: str | None = None,
-    capability: str | None = None,  # NEW
-    limit: int = 100,
-) -> list[EdgeHit]:
-    # ... existing query that binds (src)-[:INJECTS]->(dst) ...
-    if capability:
-        filters.append("$capability IN src.capabilities")
-        params["capability"] = capability
-    ...
-```
-
-### C.2 Replace post-filter with parameter pass-through in `server.py`
-
-For each of the five tools (`find_implementors`, `find_subclasses`,
-`find_injectors`, `list_by_role`, `list_by_annotation`):
-
-- Remove the post-filter line `rows = [r for r in rows if capability in r.capabilities]`.
-- Pass `capability=capability` to the corresponding `KuzuGraph` method.
-- For `find_injectors` (Issue 3): add the `capability` parameter to
-  the tool signature in the first place. Reuse the same
-  `Field(default=None, description=...)` shape as the other four. Pass
-  through to `graph.find_injectors(..., capability=capability)`.
-
-`list_by_capability` is unaffected — it already pushes down via Cypher.
-
-### C.3 Tests
-
-Convert the existing `capability` post-filter tests to assert
-pushdown semantics: build a fixture with N=50 services of which only 5
-have the requested capability, request `limit=50`, expect exactly 5
-results. The previous post-filter implementation would also pass this
-specific shape, but a stronger fixture (50 services, capability=Y on 5
-services that are *not* in the first 50 vector hits or graph rows)
-will distinguish the two implementations. Pick the stronger fixture.
-
----
-
-## Group D — Documentation (Issues 5, 6)
-
-### D.1 `README.md`
-
-Add a new section **after** the existing "Roles" section, before the
-search-tools section. Suggested skeleton:
-
-```markdown
-## Capabilities
-
-In addition to the single primary `role` per Java type, the indexer
-extracts a multi-tag `capabilities: list[str]` field from method-level
-annotations, type-level annotations, injected types, and supertypes.
-A type can carry zero or many capabilities. Capabilities never
-*replace* the role; they augment it.
-
-| Capability | Trigger |
-|---|---|
-| `MESSAGE_LISTENER` | `@KafkaListener`, `@RabbitListener`, `@JmsListener`, `@SqsListener`, `@EventListener`, `@StreamListener` on any method |
-| `MESSAGE_PRODUCER` | type injects `KafkaTemplate`, `RabbitTemplate`, `JmsTemplate`, `StreamBridge`, or `ApplicationEventPublisher` |
-| `SCHEDULED_TASK` | `@Scheduled` on any method, or class implements `org.quartz.Job` |
-| `EXCEPTION_HANDLER`| `@ControllerAdvice`, `@RestControllerAdvice`, or any method with `@ExceptionHandler` |
-
-Use `list_by_capability` to enumerate types carrying a capability, or
-pass `capability=...` to `codebase_search` / `list_by_role` /
-`list_by_annotation` / `find_*` to AND-filter results.
-```
-
-### D.2 `CODEBASE_REQUIREMENTS.md`
-
-Add a short note under the role-inference section:
-
-```markdown
-Capabilities are derived at the **type level**: method-level annotation
-evidence is aggregated up to the enclosing type. Per-method capability
-storage is intentionally out of scope for the current ontology
-(version 3) — see `plans/PLAN-CAPABILITIES-MODEL.md`. The deferred
-call-graph layer (`propose/DEFERRED-CALL-GRAPH-PROPOSE.md`) is the
-designated place to revisit method-granularity if the need arises.
-```
-
----
-
-## Group E — Style nit (Issue 7)
-
-**File:** `ast_java.py`, around line 113.
-
-Insert a single blank line between `_SUPERTYPE_TO_CAPABILITY` and
-`_TYPE_KINDS`. No other change. Verify by running the existing
-formatter / linter the project uses.
-
----
-
-## Acceptance checklist
-
-Run before declaring done:
-
-- [ ] **Group A:** `codebase_search` returns `capabilities` per hit;
-      `capability` filter present and pushed down; `limit` contract
-      test passes (50 services / 5 producers / `limit=50` → exactly 5).
-- [ ] **Group B:** `trace_flow` returns a Quartz `Job` implementor
-      (role=OTHER, capability=SCHEDULED_TASK) as a stage-0 seed.
-- [ ] **Group C:** all five graph-backed tools push the `capability`
-      filter into Cypher; `find_injectors` has the parameter; no Python
-      post-filter on `r.capabilities` remains in `server.py` for these
-      tools (verify with `rg "for r in rows if capability in" server.py`
-      → no matches).
-- [ ] **Group D:** `README.md` has a "Capabilities" section;
-      `CODEBASE_REQUIREMENTS.md` notes the type-level granularity.
-- [ ] **Group E:** blank line restored.
-- [ ] All existing tests still pass.
-- [ ] New tests cover (a) `limit` contract, (b) capability-only
-      `trace_flow` seeding, (c) `codebase_search` capability filter.
-- [ ] No new ontology bump (still `3`); no unrelated API changes.
-
-## Notes for the implementer
-
-- The plan was updated alongside this fix list. **Re-read
-  `plans/PLAN-CAPABILITIES-MODEL.md`** — the **Filter strategy** and
-  **`trace_flow` seeding** sections are new and binding. Anything in
-  this file that conflicts with the plan, the plan wins.
-- The reviewer attributed Issue 4 (`trace_flow` dead code) to
-  implementation. It's actually a plan gap — the plan asked for a
-  Kuzu change without specifying the LanceDB coordination. Group B
-  closes that gap. You did not do anything wrong on that one; you
-  faithfully implemented what the plan said. The plan is now
-  complete.
-- Verify LanceDB array-predicate syntax against the project's
-  installed Lance version *before* writing the predicate. If the
-  preferred form (`array_has`) is unavailable, document the chosen
-  fallback in a comment on `_build_extra_predicates`.
-- `find_injectors`' `capability` semantic (consumer side, not target)
-  is a deliberate API decision; surface it in the Pydantic
-  description string so callers don't guess wrong.
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md
deleted file mode 100644
index d6454873..00000000
--- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md
+++ /dev/null
@@ -1,140 +0,0 @@
-# Implementation Review: PLAN-CAPABILITIES-MODEL
-
-**Plan file:** `plans/PLAN-CAPABILITIES-MODEL.md`
-**Review date:** 2026-04-26
-**Status:** Partially implemented: 3 hard misses, 1 design gap, 2 doc gaps, 1 style nit
-
----
-
-## Summary
-
-The core capability machinery is correctly implemented:
-- `ONTOLOGY_VERSION` bumped 2 → 3 in `ast_java.py`
-- All four detector tables (`_METHOD_ANN_TO_CAPABILITY`, `_TYPE_ANN_TO_CAPABILITY`, `_INJECTED_TYPES_TO_CAPABILITY`, `_SUPERTYPE_TO_CAPABILITY`) are present with the right entries
-- `TypeDecl.capabilities` field added; populated by `infer_capabilities_for_type` after construction in `_parse_type`
-- `infer_capabilities_for_type` and all tables exported in `__all__`
-- `ChunkEnrichment.capabilities` plumbed from `encl.capabilities` in `graph_enrich.py`
-- `Symbol` schema extended with `capabilities STRING[]`; `_node_row` defaults and `_CREATE_SYMBOL` Cypher updated; type nodes write `list(d.capabilities)`; phantoms carry `"capabilities": []`
-- `SymbolHit.capabilities` field added; `_symbol_return_for` and `_row_to_symbol` updated
-- `list_by_capability` added to `KuzuGraph` with correct `list_contains` Cypher
-- `list_by_capability` MCP tool added to `server.py`
-- `capability` post-filter parameter added to `find_implementors`, `find_subclasses`, `list_by_role`, `list_by_annotation`
-- `capabilities: list[str]` added to `SymbolDto`
-- `_INSTRUCTIONS` and `trace_flow` tool description updated to mention capabilities
-- `"capabilities"` added to `JAVA_ENRICHED_COLUMNS` in `search_lancedb.py`
-- Version guard in `KuzuGraph.get` raises on ontology mismatch
-- Unit tests in `tests/test_ast_java_capabilities.py` cover all 9 plan scenarios
-- `test_symbol_has_capabilities_column` regression guard added to `test_ast_graph_build.py`
-
----
-
-## Issues
-
-### Issue 1: `CodeChunkHit` missing `capabilities` field (Hard miss)
-
-**File:** `server.py`
-
-`JAVA_ENRICHED_COLUMNS` in `search_lancedb.py` includes `"capabilities"`, so the value is fetched from LanceDB, but `CodeChunkHit` has no `capabilities` field and `_rows_to_hits` never maps it. The plan explicitly requires:
-
-> Plumb `capabilities` through whatever Pydantic / dataclass models the search path uses to surface Java hits, so callers see them in results.
-
-**Fix needed:** Add `capabilities: list[str] = Field(default_factory=list)` to `CodeChunkHit`, and map it in `_rows_to_hits` via `_clean_str_list(r.get("capabilities"))`.
-
----
-
-### Issue 2: `codebase_search` missing `capability` filter parameter (Hard miss)
-
-**File:** `server.py`
-
-The plan says:
-
-> In `codebase_search`, `find_*`, `list_by_role`, add an optional parameter `capability: str | None` that, when set, AND-filters results to those carrying that capability. (Implementation: post-filter on the returned `SymbolHit.capabilities` list; no Cypher change needed.)
-
-`list_by_role`, `find_implementors`, `find_subclasses`, and `list_by_annotation` all received the parameter. `codebase_search` did not.
-
-Note: for `codebase_search` the post-filter would operate on `CodeChunkHit.capabilities`, so it depends on Issue 1 being fixed first.
-
-**Fix needed:** Add `capability: str | None = Field(default=None, description="...")` to `codebase_search`; post-filter `hits` to `[h for h in hits if capability in h.capabilities]` when `capability` is set.
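The fixes for Issues 1 and 2 can be sketched together as below. This is a minimal illustration, not the code in `server.py`: it uses a stdlib dataclass in place of the real Pydantic `CodeChunkHit`, the `fqn` row key is an assumption, and `clean_str_list` is a hypothetical stand-in for `_clean_str_list`, whose actual behaviour is not shown in this report.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


def clean_str_list(value: Any) -> list[str]:
    """Hypothetical stand-in for `_clean_str_list`: coerce a raw column value to list[str]."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value else []
    return [str(v) for v in value if v is not None and str(v) != ""]


@dataclass
class CodeChunkHit:
    """Simplified stand-in; the real model is a Pydantic class."""
    fqn: str  # assumed key name, for illustration only
    capabilities: list[str] = field(default_factory=list)


def rows_to_hits(rows: list[dict]) -> list[CodeChunkHit]:
    # Issue 1: map the fetched `capabilities` column onto each hit.
    return [
        CodeChunkHit(fqn=r["fqn"], capabilities=clean_str_list(r.get("capabilities")))
        for r in rows
    ]


def filter_by_capability(hits: list[CodeChunkHit], capability: Optional[str]) -> list[CodeChunkHit]:
    # Issue 2: post-filter only when a capability was requested.
    if capability is None:
        return hits
    return [h for h in hits if capability in h.capabilities]


rows = [
    {"fqn": "com.example.OrderService", "capabilities": ["MESSAGE_LISTENER"]},
    {"fqn": "com.example.ReportJob", "capabilities": ["SCHEDULED_TASK"]},
    {"fqn": "com.example.Plain", "capabilities": None},
]
hits = rows_to_hits(rows)
listeners = filter_by_capability(hits, "MESSAGE_LISTENER")
```

Defaulting a missing column to an empty list keeps the filter a plain membership test, so rows without capabilities are simply excluded when a filter is set.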
-
----
-
-### Issue 3: `find_injectors` missing `capability` parameter (Hard miss)
-
-**File:** `server.py`
-
-The plan says "In `codebase_search`, `find_*`, …". `find_injectors` is a `find_*` tool and did not receive the parameter. The other two `find_*` tools (`find_implementors`, `find_subclasses`) did.
-
-For `find_injectors` the natural semantic is to filter on the injecting symbol (consumer): keep edges where `edge.src.capabilities` contains the requested capability.
-
-**Fix needed:** Add `capability: str | None = Field(default=None, …)` to `find_injectors`; post-filter `edges` to those where `capability in e.src.capabilities`.
-
----
-
-### Issue 4: Kuzu capability-OR in `_run_seed_query` is effectively dead code (Design gap)
-
-**File:** `kuzu_queries.py` + `server.py`
-
-`_run_seed_query` (`kuzu_queries.py`) correctly adds:
-
-```python
-f"(s.role IN $entry_roles OR {cap_predicates})"
-```
-
-However, in `server.py`'s `trace_flow`, the first pass already filters LanceDB results with `role_in=["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]`. Every FQN that arrives at Kuzu's seed query therefore already has a role in `_ENTRYPOINT_ROLES`, making the `OR cap_predicates` branch unreachable for any class with role `OTHER`.
-
-Concretely: a plain `Job` implementor (role `OTHER`, capability `SCHEDULED_TASK`) is excluded by the LanceDB role filter before the Kuzu check ever sees it. The plan's stated test case #4 ("returns the `MESSAGE_LISTENER` class as a stage-0 seed even when its primary role is `SERVICE`") does work because `SERVICE` is in `entry_roles`. But the broader intent, expanding seeding beyond role boundaries via capabilities, is not achieved.
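The reachability problem can be reproduced in a few lines. The sketch below is a plain-Python model of the two stages, not the project's LanceDB or Kuzu code: the candidate rows and class names are invented for illustration, and the two filter functions only mimic the `role_in` restriction and the `role IN ... OR capability` predicate described above.

```python
# Roles quoted from the `role_in` filter in the report.
ENTRYPOINT_ROLES = {"CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"}
# Entry capabilities named in the report.
ENTRY_CAPABILITIES = {"MESSAGE_LISTENER", "SCHEDULED_TASK"}

# Invented candidates: one SERVICE listener, one OTHER-role Quartz-style job.
candidates = [
    {"fqn": "com.example.OrderListener", "role": "SERVICE", "capabilities": ["MESSAGE_LISTENER"]},
    {"fqn": "com.example.ReportJob", "role": "OTHER", "capabilities": ["SCHEDULED_TASK"]},
]


def lancedb_pass(rows):
    """Stage 1: models the role-restricted LanceDB pass."""
    return [r for r in rows if r["role"] in ENTRYPOINT_ROLES]


def kuzu_seed_predicate(row):
    """Stage 2: models `(s.role IN $entry_roles OR <capability predicates>)`."""
    role_ok = row["role"] in ENTRYPOINT_ROLES
    cap_ok = any(c in ENTRY_CAPABILITIES for c in row["capabilities"])
    return role_ok or cap_ok


survivors = lancedb_pass(candidates)
seeds = [r["fqn"] for r in survivors if kuzu_seed_predicate(r)]

# Rows that would need the capability branch never reach stage 2.
reached_only_via_capability = [
    r["fqn"] for r in survivors if r["role"] not in ENTRYPOINT_ROLES
]
```

Stage 2 would accept `ReportJob` if it ever saw it; the row is dropped one stage earlier, which is why the capability branch never decides anything.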
-
-**Fix needed:** In `server.py`'s `trace_flow`, add a third LanceDB seed pass that searches without role restriction but filters on known entry-capability values (`MESSAGE_LISTENER`, `SCHEDULED_TASK`) using a LanceDB predicate on the `capabilities` column, then merges unique FQNs into the seed set before calling `graph.trace_flow`.
-
----
-
-### Issue 5: `README.md` not updated (Plan requirement skipped)
-
-**File:** `README.md`
-
-The plan requires:
-
-> `README.md`: add a section "Capabilities" describing the multi-tag axis, the initial capability set, and `list_by_capability`. Keep the existing "Roles" section intact.
-
-No change was made to `README.md`.
-
----
-
-### Issue 6: `CODEBASE_REQUIREMENTS.md` not updated (Plan requirement skipped)
-
-**File:** `CODEBASE_REQUIREMENTS.md`
-
-The plan requires:
-
-> `CODEBASE_REQUIREMENTS.md`: note the type-level granularity choice and the deferred per-method storage (link to this plan).
-
-No change was made to `CODEBASE_REQUIREMENTS.md`.
-
----
-
-### Issue 7: Missing blank line between `_SUPERTYPE_TO_CAPABILITY` and `_TYPE_KINDS` (Style nit)
-
-**File:** `ast_java.py`, line ~113
-
-```python
-_SUPERTYPE_TO_CAPABILITY: dict[str, str] = {
-    "Job": "SCHEDULED_TASK",
-}
-_TYPE_KINDS = {  # <-- no blank line before this
-```
-
-Every other pair of top-level variables in the file is separated by a blank line. The missing line here was likely a merge artefact.
-
----
-
-## Priority Order for Fixes
-
-| # | Severity | File | Description |
-|---|----------|------|-------------|
-| 1 | High | `server.py` | `CodeChunkHit` missing `capabilities` field |
-| 2 | High | `server.py` | `codebase_search` missing `capability` filter |
-| 3 | High | `server.py` | `find_injectors` missing `capability` filter |
-| 4 | Medium | `server.py` + `kuzu_queries.py` | `trace_flow` capability seeding is dead code for role=OTHER classes |
-| 5 | Low | `README.md` | "Capabilities" section not written |
-| 6 | Low | `CODEBASE_REQUIREMENTS.md` | Granularity note not added |
-| 7 | Nit | `ast_java.py` | Missing blank line between two dict constants |
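For item 4 in the table, the "merge unique FQNs into the seed set" step from Issue 4 might look like the sketch below. It assumes each seed pass yields an ordered list of FQN strings; `merge_seed_fqns` and the sample FQNs are hypothetical names, and the LanceDB queries that would produce each pass are deliberately left out.

```python
from typing import Iterable


def merge_seed_fqns(*passes: Iterable[str]) -> list[str]:
    """Merge seed passes into one list, keeping first-seen order and dropping duplicates."""
    seen: set[str] = set()
    merged: list[str] = []
    for fqns in passes:
        for fqn in fqns:
            if fqn not in seen:
                seen.add(fqn)
                merged.append(fqn)
    return merged


# Invented results: a role-restricted pass and a capability-filtered pass.
role_pass = ["com.example.OrderController", "com.example.OrderListener"]
capability_pass = ["com.example.OrderListener", "com.example.ReportJob"]

seeds = merge_seed_fqns(role_pass, capability_pass)
```

Keeping first-seen order means the role-restricted results keep their ranking, and capability-only classes are appended after them rather than displacing them.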