From 170ebea56c575c443b80e2d8a78dc4d5aa24c1f3 Mon Sep 17 00:00:00 2001
From: Dmitry Teryaev <doudmitry@gmail.com>
Date: Sun, 24 May 2026 12:57:01 +0300
Subject: [PATCH 1/3] chore: cleanup proposals, plans, and reports folder
 structure

- Move completed proposals (HINTS-STRUCTURED-LABEL) to propose/completed/
- Move stale proposals to propose/stale/ (ENHANCED-ROLE-RECOGNITION, INDEX-AUTO-MODE, TIER2-INCREMENTAL-REBUILD, RANKING-MICROSERVICE)
- Move completed plans to plans/completed/ (PLAN-DESCRIBE-HINTS-STRUCTURAL, AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL)
- Create propose/active/ and plans/active/ for future active items
- Move PRODUCT-VISION.md to docs/
- Move CODEBASE_REQUIREMENTS.md to docs/
- Remove reports/ folder (call-graph-review.md, what-to-borrow-from-cmm.md, review/)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .../CODEBASE_REQUIREMENTS.md                  |   0
 {propose => docs}/PRODUCT-VISION.md           |   0
 ...AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md |   0
 .../PLAN-DESCRIBE-HINTS-STRUCTURAL.md         |   0
 .../{ => completed}/HINTS-STRUCTURED-LABEL.md |   0
 .../ENHANCED-ROLE-RECOGNITION-PROPOSE.md      |   0
 .../{ => stale}/INDEX-AUTO-MODE-PROPOSE.md    |   0
 .../RANKING-MICROSERVICE-PROPOSE.md           |   0
 .../TIER2-INCREMENTAL-REBUILD-PROPOSE.md      |   0
 reports/call-graph-review.md                  | 364 ---------------
 ...BROWNFIELD-ROLE-OVERRIDES-design-issues.md |  62 ---
 ...LD-ROLE-OVERRIDES-implementation-issues.md | 115 -----
 ...PLAN-CAPABILITIES-MODEL-implement-fixes.md | 431 ------------------
 ...LAN-CAPABILITIES-MODEL-implement-report.md | 140 ------
 reports/what-to-borrow-from-cmm.md            | 247 ----------
 15 files changed, 1359 deletions(-)
 rename CODEBASE_REQUIREMENTS.md => docs/CODEBASE_REQUIREMENTS.md (100%)
 rename {propose => docs}/PRODUCT-VISION.md (100%)
 rename plans/{ => completed}/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md (100%)
 rename plans/{ => completed}/PLAN-DESCRIBE-HINTS-STRUCTURAL.md (100%)
 rename propose/{ => completed}/HINTS-STRUCTURED-LABEL.md (100%)
 rename propose/{ => stale}/ENHANCED-ROLE-RECOGNITION-PROPOSE.md (100%)
 rename propose/{ => stale}/INDEX-AUTO-MODE-PROPOSE.md (100%)
 rename propose/{ => stale}/RANKING-MICROSERVICE-PROPOSE.md (100%)
 rename propose/{ => stale}/TIER2-INCREMENTAL-REBUILD-PROPOSE.md (100%)
 delete mode 100644 reports/call-graph-review.md
 delete mode 100644 reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
 delete mode 100644 reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md
 delete mode 100644 reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md
 delete mode 100644 reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md
 delete mode 100644 reports/what-to-borrow-from-cmm.md

diff --git a/CODEBASE_REQUIREMENTS.md b/docs/CODEBASE_REQUIREMENTS.md
similarity index 100%
rename from CODEBASE_REQUIREMENTS.md
rename to docs/CODEBASE_REQUIREMENTS.md
diff --git a/propose/PRODUCT-VISION.md b/docs/PRODUCT-VISION.md
similarity index 100%
rename from propose/PRODUCT-VISION.md
rename to docs/PRODUCT-VISION.md
diff --git a/plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md
similarity index 100%
rename from plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md
rename to plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md
diff --git a/plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md
similarity index 100%
rename from plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md
rename to plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md
diff --git a/propose/HINTS-STRUCTURED-LABEL.md b/propose/completed/HINTS-STRUCTURED-LABEL.md
similarity index 100%
rename from propose/HINTS-STRUCTURED-LABEL.md
rename to propose/completed/HINTS-STRUCTURED-LABEL.md
diff --git a/propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md b/propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md
similarity index 100%
rename from propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md
rename to propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md
diff --git a/propose/INDEX-AUTO-MODE-PROPOSE.md b/propose/stale/INDEX-AUTO-MODE-PROPOSE.md
similarity index 100%
rename from propose/INDEX-AUTO-MODE-PROPOSE.md
rename to propose/stale/INDEX-AUTO-MODE-PROPOSE.md
diff --git a/propose/RANKING-MICROSERVICE-PROPOSE.md b/propose/stale/RANKING-MICROSERVICE-PROPOSE.md
similarity index 100%
rename from propose/RANKING-MICROSERVICE-PROPOSE.md
rename to propose/stale/RANKING-MICROSERVICE-PROPOSE.md
diff --git a/propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
similarity index 100%
rename from propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
rename to propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
diff --git a/reports/call-graph-review.md b/reports/call-graph-review.md
deleted file mode 100644
index e6ed718e..00000000
--- a/reports/call-graph-review.md
+++ /dev/null
@@ -1,364 +0,0 @@
-# Call Graph Layer — Code Review
-
-**Repository:** [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag)
-**Commits reviewed:**
-- `b3a15d8` — *call graph layer propose*
-- `fb5473f` — *call graph layer implementation*
-
-**Reference docs:**
-- [`propose/completed/CALL-GRAPH-PROPOSE.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/propose/completed/CALL-GRAPH-PROPOSE.md)
-- [`plans/completed/PLAN-CALL-GRAPH.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/plans/completed/PLAN-CALL-GRAPH.md)
-
-**Test status:** all 24 new call-graph tests pass locally
-(`tests/test_ast_java_calls.py`, `tests/test_call_graph_smoke_roundtrip.py`,
-`tests/test_call_graph_receiver_resolution.py`).
-
----
-
-## Overall verdict
-
-**Strong, faithfully-scoped implementation.** The proposal is realised as
-written, the receiver-type resolver is well-structured, the schema and edge
-metadata match the design (confidence + strategy + source), and the test
-coverage targets concrete proposal section numbers. Scope discipline is
-visible — no creep into HTTP / async / AOP / traces.
-
-There are **three correctness bugs** that should land as a quick follow-up
-before Phase 3 is closed, plus a handful of design issues worth pushing back
-on. All three bugs share one root cause: **resolution strategy and
-confidence are silently downgraded at edge-emit time when the receiver was
-already resolved successfully.**
-
----
-
-## What's done well
-
-- **Confidence + strategy tagging is faithful to the design.** Every edge
-  carries (`confidence`, `strategy`, `source='static'`) — clean migration
-  path for trace ingestion later.
-- **Multigraph dedup at write time** (`(src_id, dst_id, arg_count, line)`)
-  is correctly shaped: prevents accidental duplication while preserving
-  overload-ambiguous fan-out at distinct call sites.
-- **Receiver-type resolver** is clear and matches the proposal: scope table
-  built once per method, supertype-bounded lookup, explicit
-  `chained_receiver` phantom path, deterministic phantom IDs.
-- **Receiver-disambiguation discipline.** `_unique_type_simple_resolve`
-  deliberately uses the *type* registry (not a per-method simple-name
-  index). The dedicated test
-  `test_receiver_disambiguation_uses_type_index_not_method_unique` is
-  exactly the right kind of negative test — this is the precise trap
-  CMM-style cascades fall into and the implementation avoids it.
-- **`_method_ids_for_call_graph_needle`** elegantly accepts type FQN,
-  method FQN, or simple method name; fan-out through `DECLARES` from a
-  type needle is the right move and matches §6.1.
-- **`exclude_external` is filter-on-result, not filter-on-store.** Phantoms
-  stay in the graph (so impact analysis can see JDK-adjacent signals), but
-  query consumers get clean lists by default. Matches risk #2 mitigation
-  in the proposal.
-- **Tests target proposal section numbers.** 24 tests, all passing,
-  including a Kuzu round-trip on a real fixture project. The shadowing
-  test (`test_local_shadows_field_same_name_resolves_receiver`) is the
-  kind of edge case that bites in real codebases.
-- **Diagnostics are baked in** — `pass3_calls` prints the chained-phantom
-  percentage as the proposal mandates.
-
----
-
-## Bugs (must fix)
-
-### B1. Constructor calls always become phantoms when the class has no explicit constructor
-
-**Severity: high — most common Java call site is broken.**
-
-`new Svc()` in `ScopeReceivers.byLocal()` resolves the receiver type to
-`smoke.Svc` correctly. But `Svc` has no explicit constructor in source, so
-`_parse_method` is never invoked for an `<init>`, and no constructor
-`MemberEntry` is created. `_lookup_method_candidates(type='smoke.Svc',
-callee='<init>', argc=0)` finds nothing → fallthrough to phantom at
-`confidence=0.0`.
-
-Confirmed empirically against the smoke fixture:
-
-```
-['smoke.ScopeReceivers#byLocal()',           'smoke.Svc#<init>(0)', 'phantom', False, 0.0]
-['smoke.ScopeReceivers#shadowLocalOverField()', 'smoke.Svc#<init>(0)', 'phantom', False, 0.0]
-```
-
-In a real Spring codebase, **every** `new MyDto()`, `new HashMap<>()`,
-`new ArrayList<>()` on a project type without a hand-written constructor
-lands as a phantom.
-
-**Fix.** When parsing a `TypeDecl` and discovering no constructor
-declaration, synthesize a default
-`MethodDecl(name="<init>", signature="<init>()", is_constructor=True, ...)`
-with `start_line` / `start_byte` from the type declaration and
-`parameters=[]`. Make sure it gets a `MemberEntry`.
-
-Two corollary checks:
-
-- `_emit_call_edge` for `new Svc()` should then resolve to the synthesized
-  member with `strategy='constructor'` (not `phantom`), `confidence`
-  inherited from the receiver-resolution tier.
-- Confirm existing `INJECTS` / `DECLARES` accounting doesn't double-count
-  the synthesized node.
-
-**Suggested test** — add to `tests/test_call_graph_smoke_roundtrip.py`
-(`test_implicit_default_ctor_is_resolved`):
-
-```java
-public class HasNoCtor {}
-public class Caller { void m() { new HasNoCtor(); } }
-```
-
-Assert: `(Caller#m)-[CALLS {strategy:'constructor', resolved:true}]->(HasNoCtor#<init>())`.
-
----
-
-### B2. Implicit `super()` for a class that doesn't extend anything is mis-tagged as `phantom`
-
-**Severity: medium — diagnostic regression, not a wrong answer.**
-
-`WildUtils` has an explicit `private WildUtils() {}` constructor with no
-`super(...)` body, so the AST extractor synthesizes the implicit-super
-call site. `_first_supertype_fqn` returns `None` (no `EXTENDS` row →
-there is no `Object` node in the index), so `_resolve_receiver_type`
-returns `(None, "phantom", 0.0)`. Result:
-
-```
-['smoke.WildUtils#WildUtils()', '?super#<init>(0)', 'phantom', False, 0.0]
-```
-
-The proposal §4.2 promises strategy `implicit_super (0.90)` for this case.
-Right now the agent cannot distinguish "implicit super to `Object`" from
-"I have no idea what this call resolved to" — real signal loss.
-
-**Fix.** In `_resolve_receiver_type`, when `expr == 'super'` and
-`_first_supertype_fqn(...) is None`, return
-`("java.lang.Object", "implicit_super", 0.90)`. In `_emit_call_edge`,
-allow phantom callee (no member resolved on `Object`) but **preserve
-`strategy='implicit_super'` and `confidence=0.90`** instead of overriding
-to `phantom` / `0.0`. This is the same fix-shape as B3 below.
-
----
-
-### B3. Resolution strategy and confidence are silently overridden to `phantom` / `0.0` when the callee can't be located on a resolved external receiver
-
-**Severity: high — collapses static-import precision when callees are JDK / Spring.**
-
-In `_resolve_and_emit_call`:
-
-```python
-if not candidates:
-    pid = _phantom_method_id(...)
-    _emit_call_edge(..., confidence=0.0, strategy="phantom", resolved=False)
-    return
-```
-
-This branch fires whenever the receiver type *did* resolve (e.g.
-`java.util.Objects` via `static_import`, confidence 0.95) but the callee
-method isn't on a type we indexed. The static-import smoke test confirms it:
-
-```
-requireNonNull edges: 1
-  phantom 0.0 False java.util.Objects#requireNonNull(1)
-```
-
-The README and the MCP instructions both tell agents to use
-`min_confidence=0.9` to filter noise. Under that filter, **every JDK
-static-import call disappears from the graph**, even though the resolver
-*knew* the call's target type with 0.95 confidence.
-
-**Fix.** Decouple the *receiver-resolution strategy/confidence* from the
-*callee-found* boolean. When `candidates` is empty:
-
-- Keep the phantom callee (creating it on the resolved receiver type —
-  already done).
-- Keep `resolved=False` on the edge (the *callee node* is a phantom).
-- **Preserve the receiver-resolution `strat` and `conf`** unless they're
-  `'chained_receiver'`. Specifically: `strategy` stays `'static_import'` /
-  `'static_import_wildcard'` / `'import_map'` / `'same_module'` etc.;
-  `confidence` stays the receiver-tier value.
-
-The only case where `confidence=0.0, strategy='phantom'` is honest is when
-the receiver itself was unresolvable. Distinguishing those two failure
-modes is the whole point of the cascade.
-
-Optional: add a small property `callee_found BOOLEAN` on the edge so a
-query like *"high-confidence edges with phantom callees"* (= calls into
-well-known external libraries) becomes one Cypher predicate.
-
-**Suggested tests:**
-
-- `test_static_import_to_jdk_keeps_high_confidence` — `requireNonNull`
-  edge has `confidence>=0.95` and `strategy='static_import'`, with
-  `resolved=False` on the edge.
-- `test_min_confidence_filter_keeps_high_confidence_static_import_callers`
-  — `find_callers('java.util.Objects#requireNonNull(1)', min_confidence=0.9)`
-  returns the in-repo caller.
-
----
-
-## Design issues (push back on the proposal here, not just the implementation)
-
-### D1. Phantom-ID `arg_count` semantics are inconsistent across method-references and regular calls
-
-`_phantom_method_id` builds the FQN as `{receiver}#{callee}({arg_count})`.
-For method references the `arg_count` is `-1`. So the same external method
-can exist as both `Foo#bar(2)` and `Foo#bar(-1)` phantom nodes — distinct
-nodes for the same logical target. The dedup key
-`(src_id, dst_id, arg_count, line)` then keeps both edges, doubling the
-graph for code that mixes calls and method references on the same target.
-
-**Recommendation.** Either normalize phantom IDs without `arg_count` for
-method references (`?{recv}#{callee}(?)`) or drop `arg_count` from the
-dedup key and use `(src_id, dst_id, line, byte)` (line+byte already pin a
-unique call site).
-
----
-
-### D2. Method-reference precision is leaving free wins on the table
-
-Method references that *are* unambiguous on name (single method, no
-overloads) currently still emit with `arg_count=-1`. Cheap precision win,
-no extra resolver complexity: when the receiver type is known and exactly
-one method with `name == callee_simple` exists on the receiver type, pick
-that single-arity match and emit a fully-resolved edge with the receiver's
-real arity instead of `-1`.
-
----
-
-### D3. Anonymous-inner-class call attribution does the proposal-correct thing, but the design is questionable
-
-Right now `pingFromAnon()` (called from inside
-`new Runnable() { run() { pingFromAnon(); } }`) is attributed to
-**`NestedCalls#m()`**, the enclosing named method, with
-`strategy='this_super'`. That matches §4.1's wording.
-
-But: the anonymous `Runnable` *does* get parsed as a nested type in
-`_parse_type` (kind `class`). It produces a `MemberEntry` for its
-`run()` method. So the graph has two contradictory facts: the call edge
-goes from `NestedCalls#m`, and the structural fact "there exists a
-`run()` method here" lives on a separate, disconnected anonymous type
-node.
-
-**Recommendation.** Re-attribute calls inside an anonymous-class body to
-the anonymous-class member. The named-enclosing fallback is only needed
-for **lambdas** (which don't synthesize a member) and static / instance
-initializers. For anonymous classes, the call-site naturally belongs to
-the anonymous member. This makes
-`find_callers('OperatorAssignedProcessor.onOperatorAssigned')` find the
-anonymous handler that actually contains the call, instead of the outer
-service method.
-
----
-
-### D4. `expand_methods` discards confidence on the way out
-
-The output is `list[str]` of type FQNs. There's no way for the search-side
-fusion in `_graph_expand_merge` to weight a CALLS-derived hit lower than
-a structural one. The proposal §6.2 says "merged via existing RRF, no new
-caller-visible parameters" — so RRF treats every reach equally regardless
-of whether it came from a 0.95 import-map edge or a 0.55 suffix edge.
-
-**Recommendation (small).** Have `expand_methods` return
-`list[tuple[str, float]]` (type FQN + max confidence on the discovery
-path), and let `_graph_expand_merge` pass that as the RRF rank weight.
-Internal-only signature change; no MCP surface change.
-
----
-
-### D5. `trace_flow`'s default change quietly rebudgets stage capacity across two qualitatively different edge sources
-
-`follow_calls=True` is the new default. Existing agent prompts that
-expected type-only stages now get extra entries with
-`via.edge_type='CALLS'`. That's good — agents can infer it. But the
-per-stage cap (`stage_limit`) now budgets across both edge classes, so a
-high-fan-out service can starve INJECTS results in favor of CALLS results.
-
-**Recommendation.** Either:
-
-1. Keep separate budgets (`stage_limit_structural`, `stage_limit_calls`,
-   default to `stage_limit` each), or
-2. Order ingestion to prefer INJECTS / EXTENDS / IMPLEMENTS first, then
-   top up with CALLS until `stage_limit`. The current code already runs
-   the structural query first — just keep the CALLS top-up bounded by
-   `stage_limit - len(stage_results)` instead of a separate
-   `stage_limit * 4` LIMIT.
-
----
-
-### D6. `_resolve_this_super_field_chain` lacks fixture coverage
-
-The resolver line
-`chain = _resolve_this_super_field_chain(expr, member=member, ast=ast, tables=tables)`
-is a real bonus over what CMM does — if it walks
-`this.fieldA.fieldB.fieldC.method()` correctly. Add a smoke fixture that
-exercises it; none of the existing files do.
-
----
-
-## Smaller nits
-
-- **N1 — Per-call rebuild of `_scope_table`.** `_resolve_and_emit_call`
-  calls `_scope_table(member, ast, tables)` on every call site.
-  Field / parameter scope is identical for every call inside a single
-  method body — locals only grow as you step through the body. Build it
-  once per `member` in `_resolve_method_calls` and pass it in. On a
-  5-microservice corpus this is the kind of constant-factor that doubles
-  `pass3_calls` runtime.
-- **N2 — `_lookup_method_candidates`'s `name_only` fallback rule is good,
-  but the strategy logic in `_resolve_and_emit_call` is intricate.**
-  The branch
-  `elif name_only_fb and len(candidates) == 1: edge_strat = strat` is
-  correct but easy to misread — the inline comment is good; consider
-  promoting it to a docstring section.
-- **N3 — `is_static_call` heuristic.** `_infer_static_method_invocation`
-  returns `True` when the receiver starts with an uppercase identifier.
-  For `var Foo = supplier.get();` followed by `Foo.bar()` this
-  misclassifies. Rare in practice, but worth a TODO; conservative fix is
-  to consult the scope table (if `Foo` is in scope as a variable, it's
-  not a static call).
-- **N4 — Ontology guard.** `ONTOLOGY_VERSION` 3 → 4 is set, but confirm
-  `KuzuGraph.get` actually raises on `GraphMeta.ontology_version`
-  mismatch at read time so a stale graph fails loudly (proposal §5.3).
-- **N5 — `pass3_calls` diagnostics.** The log line reports
-  chained-phantom % only. Add the `phantom_other` ratio (the bigger one
-  in real codebases) so you can spot B1 / B3 regressions in the log
-  immediately.
-- **N6 — Method reference inside lambda.** `visit` sets
-  `lam=lam or chained` for method references with a chained qualifier.
-  That conflates "I'm in a lambda" with "this method ref is itself
-  chained." `chained` should propagate as a separate flag, not as
-  `in_lambda`.
-- **N7 — Empty `expr` and `is_static_call=False` branch.** The condition
-  `expr in ("", "this") or (not expr and call.is_static_call is False
-  and not call.receiver_expr)` is redundant: if `expr == ""` the second
-  clause is also true. Simplify to `expr in ("", "this")`.
-
----
-
-## Suggested fix order
-
-1. **B1, B2, B3 as one PR** titled
-   *"call graph: faithful confidence preservation across the resolver→writer boundary"*
-   — the three bugs share one architectural fix (don't downgrade
-   strategy / confidence at edge-emit time when the receiver was
-   resolved). Add the suggested tests in the same PR.
-2. **D5 as a separate PR** — `trace_flow` budget split with a regression
-   test that seeds a service whose CALLS fan-out exceeds the structural
-   one.
-3. **D3 (anon-class re-attribution), D4 (`expand_methods` confidence),
-   N1 (scope-table caching) as a small follow-up** before opening the
-   next phase.
-
----
-
-## Closing note
-
-This is solid Phase-3 work. Land the three bug fixes and the codebase is
-in an excellent spot to start on the next phase — either cross-service
-`HTTP_CALLS` (B6 / B7 in
-[`what-to-borrow-from-cmm.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/tmp/what-to-borrow-from-cmm.md))
-or runtime-trace ingestion (B3 from the same doc). Both will lean on the
-resolver and confidence machinery just built; the bug fixes above make
-that lean trustworthy.
diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
deleted file mode 100644
index 83a99cae..00000000
--- a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
+++ /dev/null
@@ -1,62 +0,0 @@
-# Design issues: PLAN-BROWNFIELD-ROLE-OVERRIDES (plan / specification)
-
-**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md`  
-**Review date:** 2026-04-26  
-**Scope:** Problems, ambiguities, or gaps in the *written plan* (not the codebase).
-
----
-
-## 1. Dual pipeline for meta-annotation data (spec gap)
-
-The plan describes building Layer A (meta-annotation reachability) from a two-pass process anchored in `build_ast_graph.py` and `GraphTables`. The chunk-enrichment / Lance path must also apply the same resolution rules, but the plan does **not** require a single shared primitive for “which `@interface` definitions exist in the project.”
-
-A careful reader can infer that graph build and index enrichment should agree, but two independent implementations (graph tables vs. a separate tree walk) are **not** ruled out. If file coverage, exclude patterns, or parse-failure handling differ, Lance and Kuzu can **disagree** on `meta_chain` for the same type. The plan would be stronger with an explicit constraint: e.g. “meta maps MUST be derived from the same file set and exclusion rules as `build_ast_graph` pass1,” or “Lance and Kuzu MUST share one builder function.”
-
----
-
-## 2. Depth cap for meta-annotation resolution is under-specified
-
-The plan gives a sketch of `_resolve_meta_chain` with `len(seen) > 4` and cycle handling. As written, the `seen` set is used both for **cycle** detection and as a stand-in for **path depth**. On a *linear* chain of meta-annotations, set size tracks depth. On **branching** shapes, set cardinality and “steps from root” diverge, so the sketch does not define a single clear semantics (strict path depth vs. global visit count).
-
-The follow-up test (“six wrappers → `OTHER`”) depends on a precise cap. The plan should name the exact metric (e.g. maximum path length from the start simple name) and the integer bound, so implementers and tests are aligned.
-
----
-
-## 3. Pre-flight test 9 mixes “unit” and “integration” scope
-
-The pre-flight item asks for a “unit-style” regression but specifies: build a **fresh** Lance index with FQN overrides, **query the table directly**, and then run **`codebase_search(..., capability=...)`** end to end. That is a **multi-layer** test (indexer + storage + search API) and is expensive to run and to keep stable in CI.
-
-A tiered requirement would match intent better: (1) schema / `JavaLanceChunk` field, (2) `process_java_file` row, (3) optional full search. As written, teams may either skip the heavy part or over-invest in flaky integration for what is mainly a **write-path** contract.
-
----
-
-## 4. “Precedence” vs. “execution order” is correct but error-prone to skim
-
-The plan is internally consistent: execution order is the *reverse* of listed priority, and guards use the **current** `role` after each step. Still, a reader who only scans the “Precedence summary (final)” table may implement **C before FQN** in the wrong direction or mis-order **B vs. A** without reading the “Execution order in code (REQUIRED)” block.
-
-This is a **documentation hazard** in the spec, not a logic error. A short, single bullet at the top (“Apply steps in *only* the order: …; do not reorder”) or a Mermaid sequence diagram would reduce mis-implementation.
-
----
-
-## 5. Layer A duplicate `@interface` simple names
-
-The plan correctly specifies first-seen-wins and a stderr warning. The **implication** (colliding simple names in different packages map to one `meta_chain` entry) is only obvious if you already know Java’s annotation resolution limits in this indexer. A one-line “Limitation:” callout in the plan would set expectations for monorepos with same-named annotations.
-
----
-
-## 6. Rollout vs. single document
-
-The plan says three independent PRs (Phase 1 → 2 → 3) while also presenting all phases in one file. That is fine for a complete picture, but the **merge strategy** (squashed single PR vs. three) is a process choice the plan does not need to fix—only note that “shippable phases” and “one landing” can conflict in review scope unless branches are cut accordingly.
-
----
-
-## Summary
-
-| ID | Topic                         | Severity (spec) |
-|----|------------------------------|-----------------|
-| 1  | Single source of truth for meta map inputs | High (consistency) |
-| 2  | Depth / cycle semantics       | Medium          |
-| 3  | Pre-flight test cost / tiers   | Low–medium      |
-| 4  | Precedence skimming hazard    | Low             |
-| 5  | Duplicate simple-name limits  | Low             |
-| 6  | Multi-PR vs one doc            | Process only    |
diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md b/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md
deleted file mode 100644
index b7035db8..00000000
--- a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md
+++ /dev/null
@@ -1,115 +0,0 @@
-# Implementation issues: PLAN-BROWNFIELD-ROLE-OVERRIDES
-
-**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md`  
-**Review date:** 2026-04-26  
-**Scope:** Gaps, mistakes, and risks in the *implementation* as compared to the plan’s stated behaviour and test matrix. Code and tests are assumed to live under `mcp_lancedb_bundle/`.
-
-**Tests run:** `pytest tests/test_brownfield_overrides.py` — 20 passed (at time of review).
-
----
-
-## 1. Pre-flight LanceDB + search regression is incomplete vs. plan
-
-The plan’s pre-flight test (item 9) requires, in order: build a **fresh** Lance index using the real pipeline with FQN `role`/`capabilities`, assert rows on **direct table** read, then **`codebase_search(..., capability=...)`** and assert the type is returned.
-
-**Implemented approximations:**
-
-- **`enrich_chunk` + YAML** — checks resolver + chunk path; does not exercise `process_java_file` / `JavaLanceChunk` materialisation end-to-end.
-- **Raw LanceDB / PyArrow** — proves `list(string)` round-trip, not the CocoIndex / `JavaLanceChunk` row shape.
-- **Dataclass introspection** — confirms `JavaLanceChunk` has a `capabilities` field only.
-
-**Risk:** A regression that removes or mis-wires the CocoIndex write path could slip past the suite; `codebase_search` capability filtering is not covered by the brownfield tests.
-
-**Severity:** Medium (guards are weaker than specified, not a known production bug in the reviewed snapshot).
-
----
-
-## 2. “Malformed YAML” test does not use malformed YAML
-
-The plan (Phase 1, test 8) calls for malformed YAML to yield empty overrides without crashing.
-
-**Current behaviour:** a test exercises loading from a **non-existent** path, which is closer to “missing file” than “invalid YAML.” Invalid YAML in an existing file is only covered implicitly by the loader’s `except` branch, not by a named test.
-
-**Follow-up:** add a `tmp_path` file with content that is not valid YAML, or rename the test to match “missing config / empty” semantics.
-
-**Severity:** Low (behaviour likely correct; test is mis-specified or misnamed).
-
----
-
-## 3. Phase 2 test matrix: gaps
-
-The plan’s Phase 2 test list includes scenarios **not** present in `tests/test_brownfield_overrides.py` (at review time):
-
-- **Cyclic** meta-annotation graph (A ↔ B): no crash, role remains `OTHER`.
-- **Long chain** (e.g. six wrappers): after depth cap, role `OTHER` (or whatever the spec fixes).
-- **FQN + meta + Layer B together:** FQN should still win; explicit per-class config overrides automatic meta and annotation maps.
-
-**Covered and notable:** B-beats-A regression, two-hop to `SERVICE`, method-level meta to capability, basic `@Service` on custom `@interface`.
-
-**Severity:** Medium for **cycle** and **depth** (guard against stack bugs and cap drift); **low** for the FQN interaction if hand-tested or covered elsewhere (not verified here).
-
----
-
-## 4. Phase 3 test matrix: minor gaps
-
-The plan asks for:
-
-- **Additive capability** — `@CodebaseCapability` in addition to AST-inferred capabilities (e.g. alongside a Spring stereotype).
-- **Two separate** `@CodebaseCapability` annotations on the same class, as well as the **container** form.
-
-**Current coverage** focuses on `CodebaseRole` variants, invalid role warnings, and **`@CodebaseCapabilities({...})` container** with two inner values. The **stacked** `@CodebaseCapability` / `@CodebaseCapability` case is not clearly duplicated as a dedicated test; additive-on-AST is not isolated.
-
-**Severity:** Low (behaviour is straightforward from code structure; risk is **regression** in parser or resolver order, not a known bug).
-
----
-
-## 5. Possible Lance vs. Kuzu disagreement on meta maps
-
-**Implementation detail:** the graph writer derives annotation declarations from **in-memory graph tables**; **`enrich_chunk`** builds meta from a **separate** full-disk walk (`_collect_annotation_decls_from_disk` + cache).
-
-If the two ever differ (excludes, parse errors, or partial scans), the **same** Java type could get **different** Layer A results in Kuzu than on Lance chunks. The plan’s intent is consistency across stores; this is an **integration consistency** risk, not a single-file bug.
-
-**Severity:** Low until observed in a real project; worth monitoring or converging the two inputs.
-
----
-
-## 6. Depth cap semantics (implementation) vs. plan’s sketch
-
-The resolver’s recursive walk uses a **path set** and stops when `len(path) > 4`. The plan’s pseudocode used a slightly different shape (`seen` and `len(seen) > 4`).
-
-**Risk:** off-by-one vs. the plan’s “depth 4 / six links `OTHER`” without an automated test (see §3), so behaviour could drift in a refactor.
-
-**Severity:** Low–medium, mitigated if Phase 2 depth test is added.
-
----
-
-## 7. Kuzu member nodes and capabilities
-
-`Symbol` rows for **methods** use `_node_row` defaults (`capabilities: []`, `role: "OTHER"`) and do not run the brownfield resolver per method. The plan is **type-centric**; this is not a plan violation, but any future expectation of “method symbol capabilities in the graph” would be unmet.
-
-**Severity:** N/A for current plan; documentation only if users assume otherwise.
-
----
-
-## Summary
-
-| ID | Topic                                | Severity   |
-|----|--------------------------------------|------------|
-| 1  | Pre-flight E2E (index + search)      | Medium     |
-| 2  | Malformed YAML test naming / body    | Low        |
-| 3  | Phase 2: cycle, depth, FQN+meta tests| Medium (partial) |
-| 4  | Phase 3: stacked caps + AST additive | Low        |
-| 5  | Meta map source: graph vs. disk     | Low (consistency) |
-| 6  | Depth cap without test               | Low–medium |
-| 7  | Method `Symbol` rows / capabilities  | N/A        |
-
----
-
-## What was in good shape (for balance)
-
-- `BrownfieldOverrides` loader, validation against shared ontology, stderr warnings for unknowns.
-- `resolve_role_and_capabilities` execution order and **B-before-A** semantics with **OTHER** guards; FQN and `@CodebaseRole` ordering relative to C.
-- `AnnotationRef.arguments` and `CodebaseCapabilities` value extraction in `ast_java.py`.
-- Wiring: `build_ast_graph` type nodes, `enrich_chunk`, `JavaLanceChunk` + `process_java_file` for `capabilities`.
-- README, CODEBASE_REQUIREMENTS, and MCP `instructions` mention customisation.
-- B-beats-A regression test is present (critical for the plan’s execution-order invariant).
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md
deleted file mode 100644
index eb808f52..00000000
--- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md
+++ /dev/null
@@ -1,431 +0,0 @@
-# PLAN-CAPABILITIES-MODEL — implementation fixes
-
-**Inputs:** `reports/review/PLAN-CAPABILITIES-MODEL-implement-report.md` +
-designer review of that report.
-**Plan file (now amended):** `plans/PLAN-CAPABILITIES-MODEL.md`. Re-read
-the plan first — it has two new sections (**Filter strategy** and the
-expanded **`trace_flow` seeding** subsection) that change how some of
-the original instructions should be implemented.
-**Goal of this pass:** close 7 issues from the review **plus** correct a
-design-level flaw the review surfaced but did not call out — the
-existing four `capability` filters use naive post-filter, which silently
-under-delivers results against the `limit` contract.
-
-Apply the fixes in priority order. Run the full test suite after each
-group; do not bundle group D into A/B/C.
-
----
-
-## Group A — `codebase_search` response & filter (Issues 1, 2 + design correction)
-
-### A.1 Add `capabilities` to `CodeChunkHit`
-
-**File:** `server.py`
-
-In the `CodeChunkHit` Pydantic model (currently around line 65), add the
-field next to `annotations_on_type` / `symbols`:
-
-```python
-capabilities: list[str] = Field(
-    default_factory=list,
-    description=(
-        "Multi-tag capabilities derived from method/type annotations "
-        "and injected types (MESSAGE_LISTENER, MESSAGE_PRODUCER, "
-        "SCHEDULED_TASK, EXCEPTION_HANDLER). A class can carry several."
-    ),
-)
-```
-
-In `_rows_to_hits` (around line 402), populate it alongside the other
-list fields:
-
-```python
-capabilities=_clean_str_list(r.get("capabilities")),
-```
-
-`_clean_str_list` already handles the legacy-string / native-list dual
-shape — no new helper needed.
-
-`JAVA_ENRICHED_COLUMNS` already includes `"capabilities"`
-(`search_lancedb.py` line 37), so the column is fetched when present.
-The schema-presence guard on line 459 means stale indexes without the
-column degrade gracefully.
-
-### A.2 Add `capability` filter to `codebase_search` (storage-pushdown)
-
-**Files:** `search_lancedb.py`, `server.py`
-
-This is **not** a post-filter. The plan's amended **Filter strategy**
-section is explicit: post-filter without over-fetch widening violates
-the `limit` contract and is rejected.
-
-#### Step A.2.1 — extend `_build_extra_predicates`
-
-In `search_lancedb.py` (around line 65), accept a new keyword:
-
-```python
-def _build_extra_predicates(
-    *,
-    columns: set[str],
-    role: str | None,
-    module: str | None,
-    microservice: str | None,
-    package_prefix: str | None,
-    fqn_in: list[str] | None,
-    role_in: list[str] | None = None,
-    exclude_roles: list[str] | None = None,
-    capability: str | None = None,        # NEW
-    capability_in: list[str] | None = None,  # NEW — used by trace_flow seeding
-) -> list[str]:
-    ...
-```
-
-Emit a list-contains predicate when the column exists. **Verify the
-exact LanceDB SQL syntax for the project's installed version** before
-wiring — likely candidates, in order of compatibility:
-
-```python
-# Preferred (Lance >=0.10):
-preds.append(f"array_has(capabilities, '{_escape_sql_str(capability)}')")
-# Fallback if array_has unavailable:
-preds.append(f"array_position(capabilities, '{_escape_sql_str(capability)}') >= 0")
-# Last resort (some Lance builds):
-preds.append(f"'{_escape_sql_str(capability)}' = ANY(capabilities)")
-```
-
-Run a tiny ad-hoc query against the local index to confirm which form
-parses. Pick one and use it consistently.
-
-For the multi-value variant (`capability_in`, used only by `trace_flow`
-seeding — see Group B), build a disjunction:
-
-```python
-if capability_in and "capabilities" in columns:
-    parts = [
-        f"array_has(capabilities, '{_escape_sql_str(c)}')"
-        for c in capability_in
-    ]
-    preds.append("(" + " OR ".join(parts) + ")")
-```
-
-Both predicates must be conditioned on `"capabilities" in columns` so
-older indexes lacking the column still answer queries (filter ignored).
-
-#### Step A.2.2 — surface in `run_search`
-
-`run_search` (around line 722) gains a `capability: str | None = None`
-parameter and forwards it to `_build_extra_predicates`. Same for
-`capability_in: list[str] | None = None`. No other ranking change.
-
-#### Step A.2.3 — surface in `codebase_search` MCP tool
-
-In `server.py::codebase_search` (around line 488), add the parameter
-next to `role`:
-
-```python
-capability: str | None = Field(
-    default=None,
-    description=(
-        "Java only: AND-filter to chunks whose enclosing type carries "
-        "this capability (MESSAGE_LISTENER|MESSAGE_PRODUCER|"
-        "SCHEDULED_TASK|EXCEPTION_HANDLER). Use `list_by_capability` "
-        "for graph-only queries."
-    ),
-),
-```
-
-Forward to `run_search(..., capability=capability, ...)`.
-
-### A.3 Update unit + integration tests
-
-- Extend `tests/test_lancedb_e2e.py` with the **`limit` contract**
-  assertion (plan test #3): a fixture with 50 `@Service` classes of
-  which 5 are also `MESSAGE_PRODUCER`; `list_by_role("SERVICE",
-  capability="MESSAGE_PRODUCER", limit=50)` must return exactly the 5.
-  Same shape for `codebase_search(..., capability=...)` (plan test #6).
-
----
-
-## Group B — `trace_flow` capability seeding coordination (Issue 4 + design fix)
-
-This is the design gap the review surfaced. The implementer faithfully
-wrote the Kuzu OR predicate the plan asked for, but the LanceDB
-pre-filter in `server.py::trace_flow` discards capability-only
-entrypoints (role=OTHER, capability=SCHEDULED_TASK) before the Kuzu
-seed query ever sees their FQNs. **Both sides must learn about
-capabilities together.**
-
-The plan's amended **`trace_flow` seeding** subsection is now explicit
-about this. The Kuzu side is already implemented; only the LanceDB side
-needs work.
-
-### B.1 Widen the LanceDB seed pre-filter
-
-**File:** `server.py`
-
-In `trace_flow` (around line 880), the existing seed helper is:
-
-```python
-entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]
-
-def _seed(role_allowlist: list[str] | None) -> list[dict[str, Any]]:
-    return run_search(
-        ...
-        role_in=role_allowlist,
-        exclude_roles=None if role_allowlist else sorted(baseline_excludes),
-    )
-```
-
-Extend it to also pass a capability allowlist. Match the Kuzu side
-exactly — `["MESSAGE_LISTENER", "SCHEDULED_TASK"]`:
-
-```python
-entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]
-entry_capabilities = ["MESSAGE_LISTENER", "SCHEDULED_TASK"]
-
-def _seed(role_allowlist: list[str] | None,
-          capability_allowlist: list[str] | None) -> list[dict[str, Any]]:
-    return run_search(
-        ...
-        role_in=role_allowlist,
-        capability_in=capability_allowlist,
-        exclude_roles=(
-            None if (role_allowlist or capability_allowlist)
-            else sorted(baseline_excludes)
-        ),
-    )
-```
-
-Then in the calling code:
-
-```python
-# First pass: restricted to entrypoint-like role OR entrypoint capability.
-seed_rows = await asyncio.to_thread(_seed, entry_roles, entry_capabilities)
-if not seed_rows:
-    seed_rows = await asyncio.to_thread(_seed, None, None)
-```
-
-The `OR` semantics between `role_in` and `capability_in` must be produced
-inside `_build_extra_predicates`, because each predicate it returns is a
-separate string and the strings are joined with `AND` at the top level.
-To get role-OR-capability rather than role-AND-capability, emit a
-*combined disjunction* when both are set:
-
-```python
-# In _build_extra_predicates:
-role_pred = None
-if role_in and "role" in columns:
-    vals = ", ".join(f"'{_escape_sql_str(v)}'" for v in role_in)
-    role_pred = f"role IN ({vals})"
-
-cap_pred = None
-if capability_in and "capabilities" in columns:
-    parts = [
-        f"array_has(capabilities, '{_escape_sql_str(c)}')"
-        for c in capability_in
-    ]
-    cap_pred = "(" + " OR ".join(parts) + ")"
-
-if role_pred and cap_pred:
-    preds.append(f"({role_pred} OR {cap_pred})")
-elif role_pred:
-    preds.append(role_pred)
-elif cap_pred:
-    preds.append(cap_pred)
-```
-
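-The standalone `role_in` / single `capability` cases keep their existing
-behaviour (each emitted independently as before). Only the *paired
-seeding case* triggers the OR composition. This block replaces the separate `role_in` and `capability_in` emissions (including the one in Step A.2.1), so neither predicate is appended twice.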
-
-### B.2 Verify with a fixture
-
-Add to `tests/test_lancedb_e2e.py` (plan test #5): a fixture class
-implementing `org.quartz.Job` with **no** Spring stereotype. Confirm
-that `trace_flow("scheduled order cleanup", ...)` returns this class as
-a stage-0 seed. Without B.1 it will not — that is the regression guard.
-
----
-
-## Group C — `find_*` and `list_by_*` storage pushdown (Issue 3 + design fix)
-
-The four already-landed `capability` filters
-(`find_implementors`, `find_subclasses`, `list_by_role`,
-`list_by_annotation`) use a naive post-filter. `find_injectors` is
-missing the parameter entirely. Both flaws are fixed together by
-switching to storage pushdown in Kuzu.
-
-### C.1 Push the `capability` filter into `KuzuGraph` methods
-
-**File:** `kuzu_queries.py`
-
-For each of the five graph methods consumed by these tools
-(`find_implementors`, `find_subclasses`, `find_injectors`,
-`list_by_role`, `list_by_annotation`), add an optional `capability`
-parameter:
-
-```python
-def list_by_role(
-    self, role: str, *,
-    module: str | None = None,
-    microservice: str | None = None,
-    capability: str | None = None,    # NEW
-    limit: int = 100,
-) -> list[SymbolHit]:
-    filters = ["s.role = $role"]
-    params: dict[str, Any] = {"role": role}
-    if capability:
-        filters.append("$capability IN s.capabilities")
-        params["capability"] = capability
-    filters.extend(_scope_filters("s", module=module, microservice=microservice, params=params))
-    where = " AND ".join(filters)
-    query = f"MATCH (s:Symbol) WHERE {where} RETURN {_SYMBOL_RETURN} LIMIT {int(limit)}"
-    return [_row_to_symbol(r) for r in self._rows(query, params)]
-```
-
-Same shape for `list_by_annotation`, `find_implementors`,
-`find_subclasses`. Apply the predicate against the result-node alias
-(`s` for the `list_by_*` queries; whatever alias is used in the
-implementor / subclass query). The `LIMIT` clause **must** come after
-the capability filter — Kuzu's planner handles this automatically once
-it's part of `WHERE`.
-
-For `find_injectors`, the result is an *edge* between two `Symbol`
-nodes (`src` injects `dst`). The user-relevant capability is on the
-**consumer** (`src`):
-
-```python
-def find_injectors(
-    self, name: str, *,
-    module: str | None = None,
-    microservice: str | None = None,
-    capability: str | None = None,    # NEW
-    limit: int = 100,
-) -> list[EdgeHit]:
-    # ... existing query that binds (src)-[:INJECTS]->(dst) ...
-    if capability:
-        filters.append("$capability IN src.capabilities")
-        params["capability"] = capability
-    ...
-```
-
-### C.2 Replace post-filter with parameter pass-through in `server.py`
-
-For each of the five tools (`find_implementors`, `find_subclasses`,
-`find_injectors`, `list_by_role`, `list_by_annotation`):
-
-- Remove the post-filter line `rows = [r for r in rows if capability in r.capabilities]`.
-- Pass `capability=capability` to the corresponding `KuzuGraph` method.
-- For `find_injectors` (Issue 3): add the `capability` parameter to
-  the tool signature in the first place. Reuse the same
-  `Field(default=None, description=...)` shape as the other four. Pass
-  through to `graph.find_injectors(..., capability=capability)`.
-
-`list_by_capability` is unaffected — it already pushes down via Cypher.
-
-### C.3 Tests
-
-Convert the existing `capability` post-filter tests to assert
-pushdown semantics: build a fixture with N=50 services of which only 5
-have the requested capability, request `limit=50`, expect exactly 5
-results. The previous post-filter implementation would also pass this
-specific shape, but a stronger fixture (more services than `limit`, with capability=Y on 5
-services that are *not* among the first `limit` vector hits or graph rows)
-will distinguish the two implementations. Pick the stronger fixture.
-
----
-
-## Group D — Documentation (Issues 5, 6)
-
-### D.1 `README.md`
-
-Add a new section **after** the existing "Roles" section, before the
-search-tools section. Suggested skeleton:
-
-```markdown
-## Capabilities
-
-In addition to the single primary `role` per Java type, the indexer
-extracts a multi-tag `capabilities: list[str]` field from method-level
-annotations, type-level annotations, injected types, and supertypes.
-A type can carry zero or many capabilities. Capabilities never
-*replace* the role; they augment it.
-
-| Capability | Trigger |
-|---|---|
-| `MESSAGE_LISTENER` | `@KafkaListener`, `@RabbitListener`, `@JmsListener`, `@SqsListener`, `@EventListener`, `@StreamListener` on any method |
-| `MESSAGE_PRODUCER` | type injects `KafkaTemplate`, `RabbitTemplate`, `JmsTemplate`, `StreamBridge`, or `ApplicationEventPublisher` |
-| `SCHEDULED_TASK`   | `@Scheduled` on any method, or class implements `org.quartz.Job` |
-| `EXCEPTION_HANDLER`| `@ControllerAdvice`, `@RestControllerAdvice`, or any method with `@ExceptionHandler` |
-
-Use `list_by_capability` to enumerate types carrying a capability, or
-pass `capability=...` to `codebase_search` / `list_by_role` /
-`list_by_annotation` / `find_*` to AND-filter results.
-```
-
-### D.2 `CODEBASE_REQUIREMENTS.md`
-
-Add a short note under the role-inference section:
-
-```markdown
-Capabilities are derived at the **type level**: method-level annotation
-evidence is aggregated up to the enclosing type. Per-method capability
-storage is intentionally out of scope for the current ontology
-(version 3) — see `plans/PLAN-CAPABILITIES-MODEL.md`. The deferred
-call-graph layer (`propose/DEFERRED-CALL-GRAPH-PROPOSE.md`) is the
-designated place to revisit method-granularity if the need arises.
-```
-
----
-
-## Group E — Style nit (Issue 7)
-
-**File:** `ast_java.py`, around line 113.
-
-Insert a single blank line between `_SUPERTYPE_TO_CAPABILITY` and
-`_TYPE_KINDS`. No other change. Verify by running the existing
-formatter / linter the project uses.
-
----
-
-## Acceptance checklist
-
-Run before declaring done:
-
-- [ ] **Group A:** `codebase_search` returns `capabilities` per hit;
-  `capability` filter present and pushed down; `limit` contract
-  test passes (50 services / 5 producers / `limit=50` → exactly 5).
-- [ ] **Group B:** `trace_flow` returns a Quartz `Job` implementor
-  (role=OTHER, capability=SCHEDULED_TASK) as a stage-0 seed.
-- [ ] **Group C:** all five graph-backed tools push the `capability`
-  filter into Cypher; `find_injectors` has the parameter; no Python
-  post-filter on `r.capabilities` remains in `server.py` for these
-  tools (verify with `rg "for r in rows if capability in" server.py`
-  → no matches).
-- [ ] **Group D:** `README.md` has a "Capabilities" section;
-  `CODEBASE_REQUIREMENTS.md` notes the type-level granularity.
-- [ ] **Group E:** blank line restored.
-- [ ] All existing tests still pass.
-- [ ] New tests cover (a) `limit` contract, (b) capability-only
-  `trace_flow` seeding, (c) `codebase_search` capability filter.
-- [ ] No new ontology bump (still `3`); no unrelated API changes.
-
-## Notes for the implementer
-
-- The plan was updated alongside this fix list. **Re-read
-  `plans/PLAN-CAPABILITIES-MODEL.md`** — the **Filter strategy** and
-  **`trace_flow` seeding** sections are new and binding. If anything in
-  this file conflicts with the plan, the plan wins.
-- The reviewer attributed Issue 4 (`trace_flow` dead code) to
-  implementation. It's actually a plan gap — the plan asked for a
-  Kuzu change without specifying the LanceDB coordination. Group B
-  closes that gap. You did not do anything wrong on that one; you
-  faithfully implemented what the plan said. The plan is now
-  complete.
-- Verify LanceDB array-predicate syntax against the project's
-  installed Lance version *before* writing the predicate. If the
-  preferred form (`array_has`) is unavailable, document the chosen
-  fallback in a comment on `_build_extra_predicates`.
-- `find_injectors`' `capability` semantic (consumer side, not target)
-  is a deliberate API decision; surface it in the Pydantic
-  description string so callers don't guess wrong.
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md
deleted file mode 100644
index d6454873..00000000
--- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md
+++ /dev/null
@@ -1,140 +0,0 @@
-# Implementation Review: PLAN-CAPABILITIES-MODEL
-
-**Plan file:** `plans/PLAN-CAPABILITIES-MODEL.md`
-**Review date:** 2026-04-26
-**Status:** Partially implemented: 3 hard misses, 1 design gap, 2 doc gaps, 1 style nit
-
----
-
-## Summary
-
-The core capability machinery is correctly implemented:
-- `ONTOLOGY_VERSION` bumped 2 → 3 in `ast_java.py`
-- All four detector tables (`_METHOD_ANN_TO_CAPABILITY`, `_TYPE_ANN_TO_CAPABILITY`, `_INJECTED_TYPES_TO_CAPABILITY`, `_SUPERTYPE_TO_CAPABILITY`) are present with the right entries
-- `TypeDecl.capabilities` field added; populated by `infer_capabilities_for_type` after construction in `_parse_type`
-- `infer_capabilities_for_type` and all tables exported in `__all__`
-- `ChunkEnrichment.capabilities` plumbed from `encl.capabilities` in `graph_enrich.py`
-- `Symbol` schema extended with `capabilities STRING[]`; `_node_row` defaults and `_CREATE_SYMBOL` Cypher updated; type nodes write `list(d.capabilities)`; phantoms carry `"capabilities": []`
-- `SymbolHit.capabilities` field added; `_symbol_return_for` and `_row_to_symbol` updated
-- `list_by_capability` added to `KuzuGraph` with correct `list_contains` Cypher
-- `list_by_capability` MCP tool added to `server.py`
-- `capability` post-filter parameter added to `find_implementors`, `find_subclasses`, `list_by_role`, `list_by_annotation`
-- `capabilities: list[str]` added to `SymbolDto`
-- `_INSTRUCTIONS` and `trace_flow` tool description updated to mention capabilities
-- `"capabilities"` added to `JAVA_ENRICHED_COLUMNS` in `search_lancedb.py`
-- Version guard in `KuzuGraph.get` raises on ontology mismatch
-- Unit tests in `tests/test_ast_java_capabilities.py` cover all 9 plan scenarios
-- `test_symbol_has_capabilities_column` regression guard added to `test_ast_graph_build.py`
-
----
-
-## Issues
-
-### Issue 1 — `CodeChunkHit` missing `capabilities` field (Hard miss)
-
-**File:** `server.py`
-
-`JAVA_ENRICHED_COLUMNS` in `search_lancedb.py` includes `"capabilities"` so the value is fetched from LanceDB, but `CodeChunkHit` has no `capabilities` field and `_rows_to_hits` never maps it. The plan explicitly requires:
-
-> Plumb `capabilities` through whatever Pydantic / dataclass models the search path uses to surface Java hits, so callers see them in results.
-
-**Fix needed:** Add `capabilities: list[str] = Field(default_factory=list)` to `CodeChunkHit`, and map it in `_rows_to_hits` via `_clean_str_list(r.get("capabilities"))`.
-
----
-
-### Issue 2 — `codebase_search` missing `capability` filter parameter (Hard miss)
-
-**File:** `server.py`
-
-The plan says:
-
-> In `codebase_search`, `find_*`, `list_by_role`, add an optional parameter `capability: str | None` that, when set, AND-filters results to those carrying that capability. (Implementation: post-filter on the returned `SymbolHit.capabilities` list — no Cypher change needed.)
-
-`list_by_role`, `find_implementors`, `find_subclasses`, and `list_by_annotation` all received the parameter. `codebase_search` did not.
-
-Note: for `codebase_search` the post-filter would operate on `CodeChunkHit.capabilities` (which also depends on Issue 1 being fixed first).
-
-**Fix needed:** Add `capability: str | None = Field(default=None, description="...")` to `codebase_search`; post-filter `hits` to `[h for h in hits if capability in h.capabilities]` when `capability` is set.
-
----
-
-### Issue 3 — `find_injectors` missing `capability` parameter (Hard miss)
-
-**File:** `server.py`
-
-The plan says "In `codebase_search`, `find_*`, …". `find_injectors` is a `find_*` tool and did not receive the parameter. The other two `find_*` tools (`find_implementors`, `find_subclasses`) did.
-
-For `find_injectors` the natural semantic is to filter on the injecting symbol (consumer): keep edges where `edge.src.capabilities` contains the requested capability.
-
-**Fix needed:** Add `capability: str | None = Field(default=None, …)` to `find_injectors`; post-filter `edges` to those where `capability in e.src.capabilities`.
-
----
-
-### Issue 4 — Kuzu capability-OR in `_run_seed_query` is effectively dead code (Design gap)
-
-**File:** `kuzu_queries.py` + `server.py`
-
-`_run_seed_query` (kuzu_queries.py) correctly adds:
-
-```python
-f"(s.role IN $entry_roles OR {cap_predicates})"
-```
-
-However, in `server.py`'s `trace_flow`, the first pass already filters LanceDB results with `role_in=["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]`. Every FQN that arrives at Kuzu's seed query therefore already has a role in `_ENTRYPOINT_ROLES`, making the `OR cap_predicates` branch unreachable for any class with role `OTHER`.
-
-Concretely: a plain `Job` implementor (role `OTHER`, capability `SCHEDULED_TASK`) is excluded by the LanceDB role filter before the Kuzu check ever sees it. The plan's stated test case #4 ("returns the `MESSAGE_LISTENER` class as a stage-0 seed even when its primary role is `SERVICE`") does work because `SERVICE` is in `entry_roles`. But the broader intent — expanding seeding beyond role boundaries via capabilities — is not achieved.
-
-**Fix needed:** In `server.py`'s `trace_flow`, add a third LanceDB seed pass that searches without role restriction but filters on known entry-capability values (`MESSAGE_LISTENER`, `SCHEDULED_TASK`) using a LanceDB predicate on the `capabilities` column, then merges unique FQNs into the seed set before calling `graph.trace_flow`.
-
----
-
-### Issue 5 — `README.md` not updated (Plan requirement skipped)
-
-**File:** `README.md`
-
-The plan requires:
-
-> `README.md` — add a section "Capabilities" describing the multi-tag axis, the initial capability set, and `list_by_capability`. Keep the existing "Roles" section intact.
-
-No change was made to `README.md`.
-
----
-
-### Issue 6 — `CODEBASE_REQUIREMENTS.md` not updated (Plan requirement skipped)
-
-**File:** `CODEBASE_REQUIREMENTS.md`
-
-The plan requires:
-
-> `CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice and the deferred per-method storage (link to this plan).
-
-No change was made to `CODEBASE_REQUIREMENTS.md`.
-
----
-
-### Issue 7 — Missing blank line between `_SUPERTYPE_TO_CAPABILITY` and `_TYPE_KINDS` (Style nit)
-
-**File:** `ast_java.py`, line ~113
-
-```python
-_SUPERTYPE_TO_CAPABILITY: dict[str, str] = {
-    "Job": "SCHEDULED_TASK",
-}
-_TYPE_KINDS = {   # <-- no blank line before this
-```
-
-Every other pair of top-level variables in the file is separated by a blank line. The missing line here was likely a merge artefact.
-
----
-
-## Priority Order for Fixes
-
-| # | Severity | File | Description |
-|---|----------|------|-------------|
-| 1 | High | `server.py` | `CodeChunkHit` missing `capabilities` field |
-| 2 | High | `server.py` | `codebase_search` missing `capability` filter |
-| 3 | High | `server.py` | `find_injectors` missing `capability` filter |
-| 4 | Medium | `server.py` + `kuzu_queries.py` | `trace_flow` capability seeding is dead code for role=OTHER classes |
-| 5 | Low | `README.md` | "Capabilities" section not written |
-| 6 | Low | `CODEBASE_REQUIREMENTS.md` | Granularity note not added |
-| 7 | Nit | `ast_java.py` | Missing blank line between two dict constants |
diff --git a/reports/what-to-borrow-from-cmm.md b/reports/what-to-borrow-from-cmm.md
deleted file mode 100644
index e2258de3..00000000
--- a/reports/what-to-borrow-from-cmm.md
+++ /dev/null
@@ -1,247 +0,0 @@
-# What to Borrow from Codebase-Memory MCP
-
-A focused, prioritized guide for evolving `java-codebase-rag` (AMA agent) by adopting proven patterns from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) (paper: [arXiv:2603.27277](https://arxiv.org/abs/2603.27277)) — without giving up your Spring-aware, hybrid (vector + graph) edge.
-
-> **Guiding principle.** CMM optimizes for *token efficiency at acceptable quality* across 66 languages. Your AMA agent optimizes for *answer quality on Spring/Java microservices* via hybrid retrieval. Borrow CMM's structural mechanics; keep your semantic / role-aware layer as the differentiator.
-
----
-
-## Snapshot — where each tool wins
-
-| Layer | Your AMA agent | Codebase-Memory MCP | Action |
-|---|---|---|---|
-| Java/Spring DI semantics | Strong (`@Autowired`, `@Inject`, Lombok, `@FeignClient`) | None | Keep yours |
-| Vector / hybrid retrieval (LanceDB + RRF + `graph_expand`) | Yes | None | Keep yours |
-| Role / capability ontology (`CONTROLLER`, `MESSAGE_LISTENER`, ...) | Yes | None | Keep yours |
-| Microservice topology + brownfield overrides | Yes | Generic `Project` only | Keep yours |
-| `CALLS` / `HTTP_CALLS` / `ASYNC_CALLS` resolution | Roadmap (Phase 3) | Shipped, mature | **Borrow** |
-| `Route` as first-class node | Roadmap | Shipped | **Borrow** |
-| Cross-repo / cross-service edges | Roadmap | Shipped (`pass_cross_repo`) | **Borrow** |
-| Runtime trace ingestion | None | Shipped (`ingest_traces`) | **Borrow** |
-| Git-diff impact + risk classification | Partial (`impact_analysis`) | Shipped (`detect_changes`) | **Borrow** |
-| Layered ignore (`.gitignore` + project ignore) | Constant list | Layered (`.cbmignore`) | **Borrow** |
-| Louvain community detection | None | Shipped | **Borrow (Phase 4)** |
-| Dead-code detection | None | Shipped | **Borrow (Phase 4)** |
-| 66-language tree-sitter grammars | Java only | Yes | Skip (off-strategy) |
-| Single static binary distribution | Python venv | Yes | Skip until Phase 5+ |
-| 3D graph UI | None | Yes | Skip |
-| `get_architecture` mega-tool | Split into small tools | One bundled tool | Skip — keep yours |
-
----
-
-## Tier 1 — Borrow now (cheap, high impact)
-
-### B1. Confidence-scored CALLS resolution cascade
-
-CMM's [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) and [`extract_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_calls.c) resolve calls via a deterministic cascade. Adopt the **shape**, not the C code.
-
-**What to lift:**
-
-- A 4-strategy cascade with explicit confidence values:
-  1. Import-map resolved (`0.95`)
-  2. Same-module / same-package (`0.90`)
-  3. Globally unique simple name (`0.75`)
-  4. Suffix / fuzzy match (`0.55`)
-- A `confidence` property on every `CALLS` edge so downstream tools (and the MCP agent) can filter (`WHERE c.confidence >= 0.8`).
-- A `source` property: `"static"` vs `"trace"` vs `"di_proxy"`.
-
-**Why now:** Add the property when you create the Kuzu schema for Phase 3 — retrofitting columns later is painful.
-
-**Suggested Kuzu DDL:**
-
-```sql
-CREATE REL TABLE CALLS (
-    FROM Method TO Method,
-    confidence DOUBLE,         -- 0.55 .. 1.0
-    source     STRING,         -- 'static' | 'trace' | 'di_proxy'
-    strategy   STRING,         -- 'import_map' | 'same_module' | 'unique_name' | 'suffix'
-    call_site  STRING          -- file:line
-);
-```
-
----
-
-### B2. `Route` as a first-class node
-
-CMM models REST endpoints and message channels as a single `Route` label so that *any* call site can attach to *any* endpoint via `HTTP_CALLS` / `ASYNC_CALLS`. See [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c).
-
-**What to lift:**
-
-- Adopt the **`Route`** label (instead of `RestEndpoint` from your current PRODUCT-VISION) — keeps you semantically interoperable if anyone runs both MCPs in parallel.
-- Properties: `path`, `method`, `framework` (`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`), `broker` (for async), `service` (microservice name).
-- Edges:
-  - `(Method)-[:EXPOSES]->(Route)` for `@RequestMapping`/`@KafkaListener`
-  - `(Method)-[:HTTP_CALLS]->(Route)` for `RestTemplate`/`WebClient`/`@FeignClient`
-  - `(Method)-[:ASYNC_CALLS]->(Route)` for `KafkaTemplate.send`/`StreamBridge.send`
-- A normalization rule: `/api/users/{id}` and `/api/users/123` collapse to the same `Route` (path-template canonicalization).
-
----
-
-### B3. Runtime trace ingestion (`ingest_traces`)
-
-This is the single biggest quality lever you don't have yet. Static analysis misses Spring AOP proxies, polymorphic dispatch, reflection, and event-driven flows — runtime traces capture all of them.
-
-**What to lift:**
-
-- A new MCP tool `ingest_traces(spans: List[Span], source: str)`.
-- Accept OpenTelemetry / Sleuth / Micrometer JSON natively.
-- For each `(parent_span, child_span)` pair, emit `(caller:Method)-[:CALLS {source:"trace", confidence:1.0}]->(callee:Method)`.
-- For HTTP client spans, emit `(caller)-[:HTTP_CALLS]->(Route)` using `http.url` + `http.method` to match an existing `Route` node.
-- Deduplicate via `(source_id, target_id, source)` so re-ingesting traces is idempotent.
-
-**Why this matters:** Lifts Phase 3 from "static approximation" to "ground-truth where traces exist, static elsewhere" — and the agent can prefer `confidence:1.0` edges automatically.
-
----
-
-### B4. Git-diff impact mapping with risk score
-
-CMM's [`detect_changes`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) maps a diff to affected symbols and a blast radius. You already have `impact_analysis` — make it diff-driven and add risk classification.
-
-**What to lift:**
-
-- New MCP tool `analyze_pr(diff: str | git_ref: str)`:
-  1. Parse `git diff` line ranges per file
-  2. Map line ranges → chunks → graph nodes (functions/methods)
-  3. Run your existing reverse closure
-  4. Return `{ changed_nodes, blast_radius, risk_score, risk_level }`
-- Risk formula (start simple, tune later):
-
-```
-risk = log10(1 + downstream_consumers) * role_weight * cross_service_factor
-
-role_weight        = { CONTROLLER:1.5, SERVICE:1.2, REPOSITORY:1.0, CONFIG:1.8, ENTITY:1.3, ... }
-cross_service_factor = 1.0 if changes only touch one microservice, 2.0 otherwise
-risk_level         = "low" (<1.0), "medium" (1.0..2.5), "high" (>2.5)
-```
-
-- Output usable directly in PR review or CI gating.
-
----
-
-### B5. Layered ignore patterns
-
-CMM uses **hardcoded patterns → `.gitignore` hierarchy → `.cbmignore`** ([`discover/`](https://github.com/DeusData/codebase-memory-mcp/tree/master/src/discover)). Cleaner than your current `COMMON_EXCLUDED_PATH_PATTERNS` constant.
-
-**What to lift:**
-
-- Layer order:
-  1. Hardcoded must-skip (`.git`, `node_modules`, `target`, `build`, `out`, `.idea`, `.gradle`, `bin`)
-  2. Walk up `.gitignore` files from each indexed directory
-  3. Project-level `.lancedb-mcp.yml`'s `ignore:` list
-  4. NEW: optional `.lancedb-mcp-ignore` file with gitignore syntax
-- Always skip symlinks (cycle protection).
-- Reuse `pathspec` (Python) — it's the gitignore-spec-compliant matcher.
-
----
-
-## Tier 2 — Borrow during Phase 2 / 3
-
-### B6. Cross-repo / cross-service edges
-
-CMM's [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) matches an `HTTP_CALLS` edge in service A to a `Route` node in service B and creates a `CROSS_HTTP_CALLS` edge. This is the killer feature for a multi-microservice AMA.
-
-**What to lift:**
-
-- After per-service indexing, run a global pass:
-  - For each `HTTP_CALLS` edge with `path` + `method`, find the matching `Route` node in any other indexed service.
-  - Emit `(callerMethod)-[:CALLS_HTTP]->(Route)<-[:EXPOSES]-(calleeMethod)` so traversal in either direction works.
-- Same for async: match `topic`/`queue` strings in `KafkaTemplate.send` calls to `@KafkaListener` `Route` nodes.
-- Path template matching: `GET /api/orders/{id}` matches a call to `GET /api/orders/123` — use a `path_pattern` regex stored on the `Route`.
-
-**Killer query unlocked:** *"What breaks if I rename `POST /api/orders` in `order-service`?"* → traverse `Route` → cross-service `HTTP_CALLS` → caller methods → reverse closure → affected controllers in `checkout-service`.
-
----
-
-### B7. Louvain community detection
-
-CMM runs Louvain over `CALLS` to discover functional modules. Useful for onboarding and architecture pitches.
-
-**What to lift:**
-
-- After Phase 3 `CALLS` lands, run Louvain on the call subgraph (use `python-igraph` or `networkx-community`).
-- Store `cluster_id` and `cluster_size` as `Method` properties.
-- New MCP tool `find_module_clusters(min_size: int)` returning ranked clusters with their dominant role mix and entry methods.
-- Bonus: weight edges by call frequency from traces (B3) for higher-quality partitions.
-
----
-
-### B8. Dead-code detection
-
-Trivial once `CALLS` exists, but valuable for cleanup and consulting deliverables.
-
-**What to lift:**
-
-- New MCP tool `find_dead_code(exclude_entry_points: bool = true)`.
-- Definition: `Method` with zero incoming `CALLS` and zero incoming `EXPOSES`.
-- Entry-point predicates to exclude:
-  - Spring stereotypes that auto-invoke: `@Scheduled`, `@PostConstruct`, `@EventListener`, `@KafkaListener`, `@RabbitListener`, `@JmsListener`
-  - HTTP entry points: any method with an `EXPOSES` edge
-  - Test methods: `@Test`, `@ParameterizedTest`, lifecycle annotations
-  - `public static void main(String[])`
-- Cypher (one query):
-
-```cypher
-MATCH (m:Method)
-WHERE NOT (m)<-[:CALLS]-()
-  AND NOT (m)-[:EXPOSES]->()
-  AND NOT m.is_entry_point
-RETURN m.qualified_name, m.role, m.file, m.line
-ORDER BY m.role, m.qualified_name
-```
-
----
-
-## Tier 3 — Borrow later or skip
-
-### Borrow only if you go poly-language (Phase 5+)
-
-- **B9. Multi-grammar indexing.** CMM ships 66 grammars vendored. Adopt only if you sell to non-Java SMBs.
-- **B10. Static binary distribution.** Compelling for SMB clients ("download → run"). Not relevant while you're a Python venv.
-
-### Skip (don't fit your strategy)
-
-- **`get_architecture` mega-tool.** Your split tools (`graph_meta`, `list_by_role`, `list_by_capability`) are more agent-friendly because each is named and small. The agent picks better when tool intent is narrow.
-- **3D graph UI.** Not the differentiator. If you need visualization, render Kuzu subgraphs to Mermaid or Graphviz on demand from a tool — far less code, embeds in chat.
-- **Their ADR module.** Markdown folder + your existing search is enough. Adding ADR CRUD is scope creep.
-- **CMM's mini-Cypher executor.** You already have Kuzu — strictly more capable.
-
----
-
-## Suggested roadmap reorder
-
-A revised ordering that front-loads borrowed pieces with the highest ROI:
-
-| Phase | Goal | Borrowed items |
-|---|---|---|
-| **2** (now) | `Route` nodes + `HTTP_CALLS` / `ASYNC_CALLS` from Spring/Feign/Kafka, with `confidence` columns | B2 |
-| **2.5** | `ingest_traces` MCP tool (cheap, huge quality lift) | B3 |
-| **3** | Static `CALLS` with 4-strategy cascade; `find_callers` / `find_callees`; dead code | B1, B8 |
-| **3.5** | `pass_cross_repo`-style cross-service edges | B6 |
-| **4** | `analyze_pr` (diff → impact + risk); Louvain clusters | B4, B7 |
-| **5** | Eval harness; head-to-head benchmark vs. CMM on Java repos | — |
-| **5+** | Optional poly-language grammars; static-binary packaging | B9, B10 |
-
-Layered ignores (B5) can land anywhere — drop it in alongside the next indexer change.
-
----
-
-## Strategic notes
-
-- **Run both MCPs in parallel as a zero-integration option.** `.mcp.json` supports many servers. Let your tool answer Java/architectural queries; CMM handles non-Java or generic structural queries when you eventually touch poly-glot codebases. Zero integration cost, maximum optionality.
-- **Use the comparison itself as a portfolio asset.** When you start pitching SMB clients on AI automation, "I built a Spring-aware hybrid retrieval system that beats the published Codebase-Memory baseline on Java microservice questions" — with numbers from your Phase 5 eval harness — is a credible artifact. Few consultants can show that.
-- **Don't fork CMM.** It's MIT-licensed C with vendored grammars; maintenance cost is high and the code style diverges from your Python stack. Read it as documentation, port the patterns.
-
----
-
-## References
-
-- Codebase-Memory MCP source — [github.com/DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp)
-- Paper — [Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP (arXiv:2603.27277)](https://arxiv.org/abs/2603.27277)
-- Your repo — [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag)
-- Key CMM files referenced above:
-  - [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) — call resolution
-  - [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c) — route nodes
-  - [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) — cross-service edges
-  - [`pass_gitdiff.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) — git diff impact
-  - [`extract_channels.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_channels.c) — async patterns
-  - [`service_patterns.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/service_patterns.c) — framework markers

From 63df99ca6a988ddbcf45190f3937dbf4ce52ceb7 Mon Sep 17 00:00:00 2001
From: Dmitry Teryaev <doudmitry@gmail.com>
Date: Sun, 24 May 2026 13:05:20 +0300
Subject: [PATCH 2/3] fix: update references and restore deleted reports
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move reports/ to docs/reports/ to preserve technical content
- Update all references to CODEBASE_REQUIREMENTS.md → docs/CODEBASE_REQUIREMENTS.md
- Update all references to PRODUCT-VISION.md → docs/PRODUCT-VISION.md
- Update reports/ references to docs/reports/
- Fix broken links in README.md, AGENTS.md, and 21 proposal/plan files

The deleted reports contained valuable technical content:
- what-to-borrow-from-cmm.md: borrowing patterns guide from Codebase-Memory MCP
- call-graph-review.md: code review with bugs and design invariants
- PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md: design issues and gaps

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 AGENTS.md                                     |   4 +-
 README.md                                     |   6 +-
 docs/JAVA-CODEBASE-RAG-CLI.md                 |   2 +-
 docs/reports/call-graph-review.md             | 364 ++++++++++++++++++
 ...BROWNFIELD-ROLE-OVERRIDES-design-issues.md |  62 +++
 docs/reports/what-to-borrow-from-cmm.md       | 247 ++++++++++++
 plans/completed/AGENT-PROMPTS-MCP-API-V2.md   |   8 +-
 plans/completed/AGENT-PROMPTS-TIER1B.md       |   6 +-
 ...-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md |   2 +-
 .../PLAN-BROWNFIELD-ROLE-OVERRIDES.md         |   2 +-
 plans/completed/PLAN-CALL-GRAPH.md            |   4 +-
 plans/completed/PLAN-CAPABILITIES-MODEL.md    |   2 +-
 plans/completed/PLAN-CLI-SCENARIOS.md         |   6 +-
 plans/completed/PLAN-CLIENT-ROLE-RENAME.md    |   4 +-
 .../completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md  |   6 +-
 plans/completed/PLAN-MCP-API-V2.md            |   4 +-
 .../completed/PLAN-REMOTE-PROJECT-INDEXING.md |   4 +-
 plans/completed/PLAN-TIER1B-COMPLETION.md     |   4 +-
 propose/completed/CALL-GRAPH-PROPOSE.md       |   4 +-
 propose/completed/CLI-SCENARIOS-PROPOSE.md    |   6 +-
 .../completed/CLIENT-ROLE-RENAME-PROPOSE.md   |   4 +-
 propose/completed/TIER1-COMPLETION-PROPOSE.md |  14 +-
 .../TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md        |  10 +-
 .../TIER2-INCREMENTAL-REBUILD-PROPOSE.md      |   2 +-
 24 files changed, 725 insertions(+), 52 deletions(-)
 create mode 100644 docs/reports/call-graph-review.md
 create mode 100644 docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
 create mode 100644 docs/reports/what-to-borrow-from-cmm.md

diff --git a/AGENTS.md b/AGENTS.md
index 1c5379b9..77384229 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -39,7 +39,7 @@ when needed.
   operator guide for the `java-codebase-rag` CLI (`init` / `increment` /
   `reprocess` / `erase`, `meta`, `tables`, `diagnose-ignore`,
   `analyze-pr`; hidden `refresh` alias → `reprocess` — see that doc).
-- `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of
+- `docs/CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of
   what to edit when a target tree doesn't match defaults.
 - `tests/README.md` — testing philosophy.
 - **`propose/`** — design proposes. **In-flight** proposes are **`*.md`
@@ -112,7 +112,7 @@ For any non-trivial change, read the relevant doc first instead of
 inferring from code:
 
 - Behaviour / public surface → `README.md`.
-- Brownfield assumptions, role/capability tuning → `CODEBASE_REQUIREMENTS.md`.
+- Brownfield assumptions, role/capability tuning → `docs/CODEBASE_REQUIREMENTS.md`.
 - In-flight design proposes → **`propose/*.md` at the root of `propose/`**
   (not under `propose/completed/`). **List or search** for current names.
 - Why current design exists → `propose/completed/` and `plans/completed/`.
diff --git a/README.md b/README.md
index 0eb1174c..57c333c8 100644
--- a/README.md
+++ b/README.md
@@ -128,7 +128,7 @@ The operator-facing surface is small: pick an index dir, pick an embedding model
 | Understand the graph (nodes, edges, capabilities, ranking) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §3 |
 | Steer a brownfield Java tree (custom stereotypes, non-Spring stacks) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §4 |
 | Control which files the indexer walks | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §5 |
-| Check whether your repo fits this tool's assumptions | [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) |
+| Check whether your repo fits this tool's assumptions | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) |
 
 ---
 
@@ -158,9 +158,9 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi
 | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. |
 | [`docs/skills/java-codebase-explore.md`](./docs/skills/java-codebase-explore.md) | Agent exploration skill (strategy, missions, fallbacks); packaged zip [`docs/skills/java-codebase-explore.zip`](./docs/skills/java-codebase-explore.zip) for Perplexity-style hosts. |
 | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. |
-| [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. |
+| [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. |
 | [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). |
-| [`propose/PRODUCT-VISION.md`](./propose/PRODUCT-VISION.md) | Long-term product direction. |
+| [`docs/PRODUCT-VISION.md`](./docs/PRODUCT-VISION.md) | Long-term product direction. |
 
 ---
 
diff --git a/docs/JAVA-CODEBASE-RAG-CLI.md b/docs/JAVA-CODEBASE-RAG-CLI.md
index 9c5655d1..80a971be 100644
--- a/docs/JAVA-CODEBASE-RAG-CLI.md
+++ b/docs/JAVA-CODEBASE-RAG-CLI.md
@@ -226,5 +226,5 @@ Prefer **`java-codebase-rag reprocess --graph-only`** when you only need Kuzu re
 ## See also
 
 - [README.md](../README.md) — env vars, MCP tool table, ignore layout.
-- [CODEBASE_REQUIREMENTS.md](../CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild.
+- [CODEBASE_REQUIREMENTS.md](./CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild.
 - [MANUAL-VERIFICATION-CHECKLIST.md](./MANUAL-VERIFICATION-CHECKLIST.md) — phased checks that mix CLI + MCP.
diff --git a/docs/reports/call-graph-review.md b/docs/reports/call-graph-review.md
new file mode 100644
index 00000000..e6ed718e
--- /dev/null
+++ b/docs/reports/call-graph-review.md
@@ -0,0 +1,364 @@
+# Call Graph Layer — Code Review
+
+**Repository:** [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag)
+**Commits reviewed:**
+- `b3a15d8` — *call graph layer propose*
+- `fb5473f` — *call graph layer implementation*
+
+**Reference docs:**
+- [`propose/completed/CALL-GRAPH-PROPOSE.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/propose/completed/CALL-GRAPH-PROPOSE.md)
+- [`plans/completed/PLAN-CALL-GRAPH.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/plans/completed/PLAN-CALL-GRAPH.md)
+
+**Test status:** all 24 new call-graph tests pass locally
+(`tests/test_ast_java_calls.py`, `tests/test_call_graph_smoke_roundtrip.py`,
+`tests/test_call_graph_receiver_resolution.py`).
+
+---
+
+## Overall verdict
+
+**Strong, faithfully-scoped implementation.** The proposal is realised as
+written, the receiver-type resolver is well-structured, the schema and edge
+metadata match the design (confidence + strategy + source), and the test
+coverage targets concrete proposal section numbers. Scope discipline is
+visible — no creep into HTTP / async / AOP / traces.
+
+There are **three correctness bugs** that should land as a quick follow-up
+before Phase 3 is closed, plus a handful of design issues worth pushing back
+on. All three bugs share one root cause: **resolution strategy and
+confidence are silently downgraded at edge-emit time when the receiver was
+already resolved successfully.**
+
+---
+
+## What's done well
+
+- **Confidence + strategy tagging is faithful to the design.** Every edge
+  carries (`confidence`, `strategy`, `source='static'`) — clean migration
+  path for trace ingestion later.
+- **Multigraph dedup at write time** (`(src_id, dst_id, arg_count, line)`)
+  is correctly shaped: prevents accidental duplication while preserving
+  overload-ambiguous fan-out at distinct call sites.
+- **Receiver-type resolver** is clear and matches the proposal: scope table
+  built once per method, supertype-bounded lookup, explicit
+  `chained_receiver` phantom path, deterministic phantom IDs.
+- **Receiver-disambiguation discipline.** `_unique_type_simple_resolve`
+  deliberately uses the *type* registry (not a per-method simple-name
+  index). The dedicated test
+  `test_receiver_disambiguation_uses_type_index_not_method_unique` is
+  exactly the right kind of negative test — this is the precise trap
+  CMM-style cascades fall into and the implementation avoids it.
+- **`_method_ids_for_call_graph_needle`** elegantly accepts type FQN,
+  method FQN, or simple method name; fan-out through `DECLARES` from a
+  type needle is the right move and matches §6.1.
+- **`exclude_external` is filter-on-result, not filter-on-store.** Phantoms
+  stay in the graph (so impact analysis can see JDK-adjacent signals), but
+  query consumers get clean lists by default. Matches risk #2 mitigation
+  in the proposal.
+- **Tests target proposal section numbers.** 24 tests, all passing,
+  including a Kuzu round-trip on a real fixture project. The shadowing
+  test (`test_local_shadows_field_same_name_resolves_receiver`) is the
+  kind of edge case that bites in real codebases.
+- **Diagnostics are baked in** — `pass3_calls` prints the chained-phantom
+  percentage as the proposal mandates.
+
+---
+
+## Bugs (must fix)
+
+### B1. Constructor calls always become phantoms when the class has no explicit constructor
+
+**Severity: high — most common Java call site is broken.**
+
+`new Svc()` in `ScopeReceivers.byLocal()` resolves the receiver type to
+`smoke.Svc` correctly. But `Svc` has no explicit constructor in source, so
+`_parse_method` is never invoked for an `<init>`, and no constructor
+`MemberEntry` is created. `_lookup_method_candidates(type='smoke.Svc',
+callee='<init>', argc=0)` finds nothing → fallthrough to phantom at
+`confidence=0.0`.
+
+Confirmed empirically against the smoke fixture:
+
+```
+['smoke.ScopeReceivers#byLocal()',           'smoke.Svc#<init>(0)', 'phantom', False, 0.0]
+['smoke.ScopeReceivers#shadowLocalOverField()', 'smoke.Svc#<init>(0)', 'phantom', False, 0.0]
+```
+
+In a real Spring codebase, **every** `new MyDto()` on a project type
+without a hand-written constructor lands as a phantom, alongside JDK
+calls such as `new HashMap<>()` and `new ArrayList<>()`.
+
+**Fix.** When parsing a `TypeDecl` and discovering no constructor
+declaration, synthesize a default
+`MethodDecl(name="<init>", signature="<init>()", is_constructor=True, ...)`
+with `start_line` / `start_byte` from the type declaration and
+`parameters=[]`. Make sure it gets a `MemberEntry`.
+
+Two corollary checks:
+
+- `_emit_call_edge` for `new Svc()` should then resolve to the synthesized
+  member with `strategy='constructor'` (not `phantom`), `confidence`
+  inherited from the receiver-resolution tier.
+- Confirm existing `INJECTS` / `DECLARES` accounting doesn't double-count
+  the synthesized node.
+
+**Suggested test** — add to `tests/test_call_graph_smoke_roundtrip.py`
+(`test_implicit_default_ctor_is_resolved`):
+
+```java
+public class HasNoCtor {}
+public class Caller { void m() { new HasNoCtor(); } }
+```
+
+Assert: `(Caller#m)-[CALLS {strategy:'constructor', resolved:true}]->(HasNoCtor#<init>())`.
+
+---
+
+### B2. Implicit `super()` for a class that doesn't extend anything is mis-tagged as `phantom`
+
+**Severity: medium — diagnostic regression, not a wrong answer.**
+
+`WildUtils` has an explicit `private WildUtils() {}` constructor with no
+`super(...)` body, so the AST extractor synthesizes the implicit-super
+call site. `_first_supertype_fqn` returns `None` (no `EXTENDS` row →
+there is no `Object` node in the index), so `_resolve_receiver_type`
+returns `(None, "phantom", 0.0)`. Result:
+
+```
+['smoke.WildUtils#WildUtils()', '?super#<init>(0)', 'phantom', False, 0.0]
+```
+
+The proposal §4.2 promises strategy `implicit_super (0.90)` for this case.
+Right now the agent cannot distinguish "implicit super to `Object`" from
+"I have no idea what this call resolved to" — real signal loss.
+
+**Fix.** In `_resolve_receiver_type`, when `expr == 'super'` and
+`_first_supertype_fqn(...) is None`, return
+`("java.lang.Object", "implicit_super", 0.90)`. In `_emit_call_edge`,
+allow a phantom callee (no member resolved on `Object`) but **preserve
+`strategy='implicit_super'` and `confidence=0.90`** instead of overriding
+to `phantom` / `0.0`. This is the same fix-shape as B3 below.
+
+---
+
+### B3. Resolution strategy and confidence are silently overridden to `phantom` / `0.0` when the callee can't be located on a resolved external receiver
+
+**Severity: high — collapses static-import precision when callees are JDK / Spring.**
+
+In `_resolve_and_emit_call`:
+
+```python
+if not candidates:
+    pid = _phantom_method_id(...)
+    _emit_call_edge(..., confidence=0.0, strategy="phantom", resolved=False)
+    return
+```
+
+This branch fires whenever the receiver type *did* resolve (e.g.
+`java.util.Objects` via `static_import`, confidence 0.95) but the callee
+method isn't on a type we indexed. The static-import smoke test confirms it:
+
+```
+requireNonNull edges: 1
+  phantom 0.0 False java.util.Objects#requireNonNull(1)
+```
+
+The README and the MCP instructions both tell agents to use
+`min_confidence=0.9` to filter noise. Under that filter, **every JDK
+static-import call disappears from the graph**, even though the resolver
+*knew* the call's target type with 0.95 confidence.
+
+**Fix.** Decouple the *receiver-resolution strategy/confidence* from the
+*callee-found* boolean. When `candidates` is empty:
+
+- Keep the phantom callee (creating it on the resolved receiver type —
+  already done).
+- Keep `resolved=False` on the edge (the *callee node* is a phantom).
+- **Preserve the receiver-resolution `strat` and `conf`** unless they're
+  `'chained_receiver'`. Specifically: `strategy` stays `'static_import'` /
+  `'static_import_wildcard'` / `'import_map'` / `'same_module'` etc.;
+  `confidence` stays the receiver-tier value.
+
+The only case where `confidence=0.0, strategy='phantom'` is honest is when
+the receiver itself was unresolvable. Distinguishing those two failure
+modes is the whole point of the cascade.
+
+Optional: add a small property `callee_found BOOLEAN` on the edge so a
+query like *"high-confidence edges with phantom callees"* (= calls into
+well-known external libraries) becomes one Cypher predicate.
+
+**Suggested tests:**
+
+- `test_static_import_to_jdk_keeps_high_confidence` — `requireNonNull`
+  edge has `confidence>=0.95` and `strategy='static_import'`, with
+  `resolved=False` on the edge.
+- `test_min_confidence_filter_keeps_high_confidence_static_import_callers`
+  — `find_callers('java.util.Objects#requireNonNull(1)', min_confidence=0.9)`
+  returns the in-repo caller.
+
+---
+
+## Design issues (push back on the proposal here, not just the implementation)
+
+### D1. Phantom-ID `arg_count` semantics are inconsistent across method-references and regular calls
+
+`_phantom_method_id` builds the FQN as `{receiver}#{callee}({arg_count})`.
+For method references the `arg_count` is `-1`. So the same external method
+can exist as both `Foo#bar(2)` and `Foo#bar(-1)` phantom nodes — distinct
+nodes for the same logical target. The dedup key
+`(src_id, dst_id, arg_count, line)` then keeps both edges, doubling the
+graph for code that mixes calls and method references on the same target.
+
+**Recommendation.** Either normalize phantom IDs without `arg_count` for
+method references (`?{recv}#{callee}(?)`) or drop `arg_count` from the
+dedup key and use `(src_id, dst_id, line, byte)` (line+byte already pin a
+unique call site).
+
+---
+
+### D2. Method-reference precision is leaving free wins on the table
+
+Method references that *are* unambiguous on name (single method, no
+overloads) currently still emit with `arg_count=-1`. Cheap precision win,
+no extra resolver complexity: when the receiver type is known and exactly
+one method with `name == callee_simple` exists on the receiver type, pick
+that single match and emit a fully-resolved edge with the method's
+real arity instead of `-1`.
+
+---
+
+### D3. Anonymous-inner-class call attribution does the proposal-correct thing, but the design is questionable
+
+Right now `pingFromAnon()` (called from inside
+`new Runnable() { run() { pingFromAnon(); } }`) is attributed to
+**`NestedCalls#m()`**, the enclosing named method, with
+`strategy='this_super'`. That matches §4.1's wording.
+
+But: the anonymous `Runnable` *does* get parsed as a nested type in
+`_parse_type` (kind `class`). It produces a `MemberEntry` for its
+`run()` method. So the graph has two contradictory facts: the call edge
+goes from `NestedCalls#m`, and the structural fact "there exists a
+`run()` method here" lives on a separate, disconnected anonymous type
+node.
+
+**Recommendation.** Re-attribute calls inside an anonymous-class body to
+the anonymous-class member. The named-enclosing fallback is only needed
+for **lambdas** (which don't synthesize a member) and static / instance
+initializers. For anonymous classes, the call-site naturally belongs to
+the anonymous member. This makes
+`find_callers('OperatorAssignedProcessor.onOperatorAssigned')` find the
+anonymous handler that actually contains the call, instead of the outer
+service method.
+
+---
+
+### D4. `expand_methods` discards confidence on the way out
+
+The output is `list[str]` of type FQNs. There's no way for the search-side
+fusion in `_graph_expand_merge` to weight a CALLS-derived hit lower than
+a structural one. The proposal §6.2 says "merged via existing RRF, no new
+caller-visible parameters" — so RRF treats every reach equally regardless
+of whether it came from a 0.95 import-map edge or a 0.55 suffix edge.
+
+**Recommendation (small).** Have `expand_methods` return
+`list[tuple[str, float]]` (type FQN + max confidence on the discovery
+path), and let `_graph_expand_merge` pass that as the RRF rank weight.
+Internal-only signature change; no MCP surface change.
+
+---
+
+### D5. `trace_flow`'s default change quietly rebudgets stage capacity across two qualitatively different edge sources
+
+`follow_calls=True` is the new default. Existing agent prompts that
+expected type-only stages now get extra entries with
+`via.edge_type='CALLS'`. That's good — agents can infer it. But the
+per-stage cap (`stage_limit`) now budgets across both edge classes, so a
+high-fan-out service can starve INJECTS results in favor of CALLS results.
+
+**Recommendation.** Either:
+
+1. Keep separate budgets (`stage_limit_structural`, `stage_limit_calls`,
+   default to `stage_limit` each), or
+2. Order ingestion to prefer INJECTS / EXTENDS / IMPLEMENTS first, then
+   top up with CALLS until `stage_limit`. The current code already runs
+   the structural query first — just keep the CALLS top-up bounded by
+   `stage_limit - len(stage_results)` instead of a separate
+   `stage_limit * 4` LIMIT.
+
+---
+
+### D6. `_resolve_this_super_field_chain` lacks fixture coverage
+
+The resolver line
+`chain = _resolve_this_super_field_chain(expr, member=member, ast=ast, tables=tables)`
+is a real bonus over what CMM does — if it walks
+`this.fieldA.fieldB.fieldC.method()` correctly. Add a smoke fixture that
+exercises it; none of the existing files do.
+
+---
+
+## Smaller nits
+
+- **N1 — Per-call rebuild of `_scope_table`.** `_resolve_and_emit_call`
+  calls `_scope_table(member, ast, tables)` on every call site.
+  Field / parameter scope is identical for every call inside a single
+  method body — locals only grow as you step through the body. Build it
+  once per `member` in `_resolve_method_calls` and pass it in. On a
+  5-microservice corpus this is the kind of constant-factor cost that
+  can double `pass3_calls` runtime.
+- **N2 — `_lookup_method_candidates`'s `name_only` fallback rule is good,
+  but the strategy logic in `_resolve_and_emit_call` is intricate.**
+  The branch
+  `elif name_only_fb and len(candidates) == 1: edge_strat = strat` is
+  correct but easy to misread — the inline comment is good; consider
+  promoting it to a docstring section.
+- **N3 — `is_static_call` heuristic.** `_infer_static_method_invocation`
+  returns `True` when the receiver starts with an uppercase identifier.
+  For `var Foo = supplier.get();` followed by `Foo.bar()` this
+  misclassifies. Rare in practice, but worth a TODO; conservative fix is
+  to consult the scope table (if `Foo` is in scope as a variable, it's
+  not a static call).
+- **N4 — Ontology guard.** `ONTOLOGY_VERSION` 3 → 4 is set, but confirm
+  `KuzuGraph.get` actually raises on `GraphMeta.ontology_version`
+  mismatch at read time so a stale graph fails loudly (proposal §5.3).
+- **N5 — `pass3_calls` diagnostics.** The log line reports
+  chained-phantom % only. Add the `phantom_other` ratio (the bigger one
+  in real codebases) so you can spot B1 / B3 regressions in the log
+  immediately.
+- **N6 — Method reference inside lambda.** `visit` sets
+  `lam=lam or chained` for method references with a chained qualifier.
+  That conflates "I'm in a lambda" with "this method ref is itself
+  chained." `chained` should propagate as a separate flag, not as
+  `in_lambda`.
+- **N7 — Empty `expr` and `is_static_call=False` branch.** The condition
+  `expr in ("", "this") or (not expr and call.is_static_call is False
+  and not call.receiver_expr)` is redundant when `expr` is a string: the
+  second clause needs `expr == ""`, which the first already matches. Simplify to `expr in ("", "this")`.
+
+---
+
+## Suggested fix order
+
+1. **B1, B2, B3 as one PR** titled
+   *"call graph: faithful confidence preservation across the resolver→writer boundary"*
+   — the three bugs share one architectural fix (don't downgrade
+   strategy / confidence at edge-emit time when the receiver was
+   resolved). Add the suggested tests in the same PR.
+2. **D5 as a separate PR** — `trace_flow` budget split with a regression
+   test that seeds a service whose CALLS fan-out exceeds the structural
+   one.
+3. **D3 (anon-class re-attribution), D4 (`expand_methods` confidence),
+   N1 (scope-table caching) as a small follow-up** before opening the
+   next phase.
+
+---
+
+## Closing note
+
+This is solid Phase-3 work. Land the three bug fixes and the codebase is
+in an excellent spot to start on the next phase — either cross-service
+`HTTP_CALLS` (B6 in
+[`what-to-borrow-from-cmm.md`](./what-to-borrow-from-cmm.md))
+or runtime-trace ingestion (B3 from the same doc). Both will lean on the
+resolver and confidence machinery just built; the bug fixes above make
+that lean trustworthy.
diff --git a/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
new file mode 100644
index 00000000..83a99cae
--- /dev/null
+++ b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md
@@ -0,0 +1,62 @@
+# Design issues: PLAN-BROWNFIELD-ROLE-OVERRIDES (plan / specification)
+
+**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md`  
+**Review date:** 2026-04-26  
+**Scope:** Problems, ambiguities, or gaps in the *written plan* (not the codebase).
+
+---
+
+## 1. Dual pipeline for meta-annotation data (spec gap)
+
+The plan describes building Layer A (meta-annotation reachability) from a two-pass process anchored in `build_ast_graph.py` and `GraphTables`. The chunk-enrichment / Lance path must also apply the same resolution rules, but the plan does **not** require a single shared primitive for “which `@interface` definitions exist in the project.”
+
+A careful reader can infer that graph build and index enrichment should agree, but two independent implementations (graph tables vs. a separate tree walk) are **not** ruled out. If file coverage, exclude patterns, or parse-failure handling differ, Lance and Kuzu can **disagree** on `meta_chain` for the same type. The plan would be stronger with an explicit constraint: e.g. “meta maps MUST be derived from the same file set and exclusion rules as `build_ast_graph` pass1,” or “Lance and Kuzu MUST share one builder function.”
+
+---
+
+## 2. Depth cap for meta-annotation resolution is under-specified
+
+The plan gives a sketch of `_resolve_meta_chain` with `len(seen) > 4` and cycle handling. As written, the `seen` set is used both for **cycle** detection and as a stand-in for **path depth**. On a *linear* chain of meta-annotations, set size tracks depth. On **branching** shapes, set cardinality and “steps from root” diverge, so the sketch does not define a single clear semantics (strict path depth vs. global visit count).
+
+The follow-up test (“six wrappers → `OTHER`”) depends on a precise cap. The plan should name the exact metric (e.g. maximum path length from the start simple name) and the integer bound, so implementers and tests are aligned.
+
+---
+
+## 3. Pre-flight test 9 mixes “unit” and “integration” scope
+
+The pre-flight item asks for a “unit-style” regression but specifies: build a **fresh** Lance index with FQN overrides, **query the table directly**, and then run **`codebase_search(..., capability=...)`** end to end. That is a **multi-layer** test (indexer + storage + search API) and is expensive to run and to keep stable in CI.
+
+A tiered requirement would match intent better: (1) schema / `JavaLanceChunk` field, (2) `process_java_file` row, (3) optional full search. As written, teams may either skip the heavy part or over-invest in flaky integration for what is mainly a **write-path** contract.
+
+---
+
+## 4. “Precedence” vs. “execution order” is correct but error-prone to skim
+
+The plan is internally consistent: execution order is the *reverse* of listed priority, and guards use the **current** `role` after each step. Still, a reader who only scans the “Precedence summary (final)” table may implement **C before FQN** in the wrong direction or mis-order **B vs. A** without reading the “Execution order in code (REQUIRED)” block.
+
+This is a **documentation hazard** in the spec, not a logic error. A short, single bullet at the top (“Apply steps in *only* the order: …; do not reorder”) or a Mermaid sequence diagram would reduce mis-implementation.
+
+---
+
+## 5. Layer A duplicate `@interface` simple names
+
+The plan correctly specifies first-seen-wins and a stderr warning. The **implication** (colliding simple names in different packages map to one `meta_chain` entry) is only obvious if you already know the limits of annotation resolution in this indexer. A one-line “Limitation:” callout in the plan would set expectations for monorepos with same-named annotations.
+
+---
+
+## 6. Rollout vs. single document
+
+The plan calls for three independent PRs (Phase 1 → 2 → 3) while also presenting all phases in one file. That is fine for a complete picture, and the **merge strategy** (one squashed PR vs. three) is a process choice the plan does not need to settle. It should only note that “shippable phases” and “one landing” can conflict in review scope unless branches are cut accordingly.
+
+---
+
+## Summary
+
+| ID | Topic                                      | Severity (spec)    |
+|----|--------------------------------------------|--------------------|
+| 1  | Single source of truth for meta map inputs | High (consistency) |
+| 2  | Depth / cycle semantics                    | Medium             |
+| 3  | Pre-flight test cost / tiers               | Low–medium         |
+| 4  | Precedence skimming hazard                 | Low                |
+| 5  | Duplicate simple-name limits               | Low                |
+| 6  | Multi-PR vs. one doc                       | Process only       |
diff --git a/docs/reports/what-to-borrow-from-cmm.md b/docs/reports/what-to-borrow-from-cmm.md
new file mode 100644
index 00000000..e2258de3
--- /dev/null
+++ b/docs/reports/what-to-borrow-from-cmm.md
@@ -0,0 +1,247 @@
+# What to Borrow from Codebase-Memory MCP
+
+A focused, prioritized guide for evolving `java-codebase-rag` (AMA agent) by adopting proven patterns from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) (paper: [arXiv:2603.27277](https://arxiv.org/abs/2603.27277)) — without giving up your Spring-aware, hybrid (vector + graph) edge.
+
+> **Guiding principle.** CMM optimizes for *token efficiency at acceptable quality* across 66 languages. Your AMA agent optimizes for *answer quality on Spring/Java microservices* via hybrid retrieval. Borrow CMM's structural mechanics; keep your semantic / role-aware layer as the differentiator.
+
+---
+
+## Snapshot — where each tool wins
+
+| Layer | Your AMA agent | Codebase-Memory MCP | Action |
+|---|---|---|---|
+| Java/Spring DI semantics | Strong (`@Autowired`, `@Inject`, Lombok, `@FeignClient`) | None | Keep yours |
+| Vector / hybrid retrieval (LanceDB + RRF + `graph_expand`) | Yes | None | Keep yours |
+| Role / capability ontology (`CONTROLLER`, `MESSAGE_LISTENER`, ...) | Yes | None | Keep yours |
+| Microservice topology + brownfield overrides | Yes | Generic `Project` only | Keep yours |
+| `CALLS` / `HTTP_CALLS` / `ASYNC_CALLS` resolution | Roadmap (Phase 3) | Shipped, mature | **Borrow** |
+| `Route` as first-class node | Roadmap | Shipped | **Borrow** |
+| Cross-repo / cross-service edges | Roadmap | Shipped (`pass_cross_repo`) | **Borrow** |
+| Runtime trace ingestion | None | Shipped (`ingest_traces`) | **Borrow** |
+| Git-diff impact + risk classification | Partial (`impact_analysis`) | Shipped (`detect_changes`) | **Borrow** |
+| Layered ignore (`.gitignore` + project ignore) | Constant list | Layered (`.cbmignore`) | **Borrow** |
+| Louvain community detection | None | Shipped | **Borrow (Phase 4)** |
+| Dead-code detection | None | Shipped | **Borrow (Phase 3)** |
+| 66-language tree-sitter grammars | Java only | Yes | Skip (off-strategy) |
+| Single static binary distribution | Python venv | Yes | Skip until Phase 5+ |
+| 3D graph UI | None | Yes | Skip |
+| `get_architecture` mega-tool | Split into small tools | One bundled tool | Skip — keep yours |
+
+---
+
+## Tier 1 — Borrow now (cheap, high impact)
+
+### B1. Confidence-scored CALLS resolution cascade
+
+CMM's [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) and [`extract_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_calls.c) resolve calls via a deterministic cascade. Adopt the **shape**, not the C code.
+
+**What to lift:**
+
+- A 4-strategy cascade with explicit confidence values:
+  1. Import-map resolved (`0.95`)
+  2. Same-module / same-package (`0.90`)
+  3. Globally unique simple name (`0.75`)
+  4. Suffix / fuzzy match (`0.55`)
+- A `confidence` property on every `CALLS` edge so downstream tools (and the MCP agent) can filter (`WHERE c.confidence >= 0.8`).
+- A `source` property: `"static"` vs `"trace"` vs `"di_proxy"`.
+
+**Why now:** Add these properties when you create the Kuzu schema for Phase 3; retrofitting columns later is painful.
+
+**Suggested Kuzu DDL:**
+
+```sql
+CREATE REL TABLE CALLS (
+    FROM Method TO Method,
+    confidence DOUBLE,         -- 0.55 .. 1.0
+    source     STRING,         -- 'static' | 'trace' | 'di_proxy'
+    strategy   STRING,         -- 'import_map' | 'same_module' | 'unique_name' | 'suffix'
+    call_site  STRING          -- file:line
+);
+```
+
+---
+
+### B2. `Route` as a first-class node
+
+CMM models REST endpoints and message channels as a single `Route` label so that *any* call site can attach to *any* endpoint via `HTTP_CALLS` / `ASYNC_CALLS`. See [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c).
+
+**What to lift:**
+
+- Adopt the **`Route`** label (instead of `RestEndpoint` from your current PRODUCT-VISION) — keeps you semantically interoperable if anyone runs both MCPs in parallel.
+- Properties: `path`, `method`, `framework` (`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`), `broker` (for async), `service` (microservice name).
+- Edges:
+  - `(Method)-[:EXPOSES]->(Route)` for `@RequestMapping`/`@KafkaListener`
+  - `(Method)-[:HTTP_CALLS]->(Route)` for `RestTemplate`/`WebClient`/`@FeignClient`
+  - `(Method)-[:ASYNC_CALLS]->(Route)` for `KafkaTemplate.send`/`StreamBridge.send`
+- A normalization rule: `/api/users/{id}` and `/api/users/123` collapse to the same `Route` (path-template canonicalization).
+
+---
+
+### B3. Runtime trace ingestion (`ingest_traces`)
+
+This is the single biggest quality lever you don't have yet. Static analysis misses Spring AOP proxies, polymorphic dispatch, reflection, and event-driven flows; runtime traces record whichever of these actually execute.
+
+**What to lift:**
+
+- A new MCP tool `ingest_traces(spans: List[Span], source: str)`.
+- Accept OpenTelemetry / Sleuth / Micrometer JSON natively.
+- For each `(parent_span, child_span)` pair, emit `(caller:Method)-[:CALLS {source:"trace", confidence:1.0}]->(callee:Method)`.
+- For HTTP client spans, emit `(caller)-[:HTTP_CALLS]->(Route)` using `http.url` + `http.method` to match an existing `Route` node.
+- Deduplicate via `(source_id, target_id, source)` so re-ingesting traces is idempotent.
+
+**Why this matters:** Lifts Phase 3 from "static approximation" to "ground-truth where traces exist, static elsewhere" — and the agent can prefer `confidence:1.0` edges automatically.
+
+---
+
+### B4. Git-diff impact mapping with risk score
+
+CMM's [`detect_changes`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) maps a diff to affected symbols and a blast radius. You already have `impact_analysis` — make it diff-driven and add risk classification.
+
+**What to lift:**
+
+- New MCP tool `analyze_pr`, taking either a `diff: str` or a `git_ref: str`:
+  1. Parse `git diff` line ranges per file
+  2. Map line ranges → chunks → graph nodes (functions/methods)
+  3. Run your existing reverse closure
+  4. Return `{ changed_nodes, blast_radius, risk_score, risk_level }`
+- Risk formula (start simple, tune later):
+
+```
+risk = log10(1 + downstream_consumers) * role_weight * cross_service_factor
+
+role_weight        = { CONTROLLER:1.5, SERVICE:1.2, REPOSITORY:1.0, CONFIG:1.8, ENTITY:1.3, ... }
+cross_service_factor = 1.0 if changes only touch one microservice, 2.0 otherwise
+risk_level         = "low" (<1.0), "medium" (1.0..2.5), "high" (>2.5)
+```
+
+- Output usable directly in PR review or CI gating.
+
+---
+
+### B5. Layered ignore patterns
+
+CMM uses **hardcoded patterns → `.gitignore` hierarchy → `.cbmignore`** ([`discover/`](https://github.com/DeusData/codebase-memory-mcp/tree/master/src/discover)). Cleaner than your current `COMMON_EXCLUDED_PATH_PATTERNS` constant.
+
+**What to lift:**
+
+- Layer order:
+  1. Hardcoded must-skip (`.git`, `node_modules`, `target`, `build`, `out`, `.idea`, `.gradle`, `bin`)
+  2. Walk up `.gitignore` files from each indexed directory
+  3. Project-level `.lancedb-mcp.yml`'s `ignore:` list
+  4. NEW: optional `.lancedb-mcp-ignore` file with gitignore syntax
+- Always skip symlinks (cycle protection).
+- Reuse `pathspec` (Python) — it's the gitignore-spec-compliant matcher.
+
+---
+
+## Tier 2 — Borrow during Phase 2 / 3
+
+### B6. Cross-repo / cross-service edges
+
+CMM's [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) matches an `HTTP_CALLS` edge in service A to a `Route` node in service B and creates a `CROSS_HTTP_CALLS` edge. This is the killer feature for a multi-microservice AMA.
+
+**What to lift:**
+
+- After per-service indexing, run a global pass:
+  - For each `HTTP_CALLS` edge with `path` + `method`, find the matching `Route` node in any other indexed service.
+  - Emit `(callerMethod)-[:HTTP_CALLS]->(Route)<-[:EXPOSES]-(calleeMethod)` so traversal in either direction works.
+- Same for async: match `topic`/`queue` strings in `KafkaTemplate.send` calls to `@KafkaListener` `Route` nodes.
+- Path template matching: `GET /api/orders/{id}` matches a call to `GET /api/orders/123` — use a `path_pattern` regex stored on the `Route`.
+
+**Killer query unlocked:** *"What breaks if I rename `POST /api/orders` in `order-service`?"* → traverse `Route` → cross-service `HTTP_CALLS` → caller methods → reverse closure → affected controllers in `checkout-service`.
+
+---
+
+### B7. Louvain community detection
+
+CMM runs Louvain over `CALLS` to discover functional modules. Useful for onboarding and architecture pitches.
+
+**What to lift:**
+
+- After Phase 3 `CALLS` lands, run Louvain on the call subgraph (use `python-igraph` or NetworkX's `louvain_communities`).
+- Store `cluster_id` and `cluster_size` as `Method` properties.
+- New MCP tool `find_module_clusters(min_size: int)` returning ranked clusters with their dominant role mix and entry methods.
+- Bonus: weight edges by call frequency from traces (B3) for higher-quality partitions.
+
+---
+
+### B8. Dead-code detection
+
+Trivial once `CALLS` exists, but valuable for cleanup and consulting deliverables.
+
+**What to lift:**
+
+- New MCP tool `find_dead_code(exclude_entry_points: bool = true)`.
+- Definition: `Method` with zero incoming `CALLS` and zero outgoing `EXPOSES`.
+- Entry-point predicates to exclude:
+  - Spring stereotypes that auto-invoke: `@Scheduled`, `@PostConstruct`, `@EventListener`, `@KafkaListener`, `@RabbitListener`, `@JmsListener`
+  - HTTP entry points: any method with an `EXPOSES` edge
+  - Test methods: `@Test`, `@ParameterizedTest`, lifecycle annotations
+  - `public static void main(String[])`
+- Cypher (one query):
+
+```cypher
+MATCH (m:Method)
+WHERE NOT (m)<-[:CALLS]-()
+  AND NOT (m)-[:EXPOSES]->()
+  AND NOT m.is_entry_point
+RETURN m.qualified_name, m.role, m.file, m.line
+ORDER BY m.role, m.qualified_name
+```
+
+---
+
+## Tier 3 — Borrow later or skip
+
+### Borrow only if you go poly-language (Phase 5+)
+
+- **B9. Multi-grammar indexing.** CMM ships 66 vendored grammars. Adopt only if you sell to non-Java SMBs.
+- **B10. Static binary distribution.** Compelling for SMB clients ("download → run"). Not relevant while you're a Python venv.
+
+### Skip (don't fit your strategy)
+
+- **`get_architecture` mega-tool.** Your split tools (`graph_meta`, `list_by_role`, `list_by_capability`) are more agent-friendly because each is named and small. The agent picks better when tool intent is narrow.
+- **3D graph UI.** Not the differentiator. If you need visualization, render Kuzu subgraphs to Mermaid or Graphviz on demand from a tool — far less code, embeds in chat.
+- **Their ADR module.** Markdown folder + your existing search is enough. Adding ADR CRUD is scope creep.
+- **CMM's mini-Cypher executor.** You already have Kuzu — strictly more capable.
+
+---
+
+## Suggested roadmap reorder
+
+A revised ordering that front-loads borrowed pieces with the highest ROI:
+
+| Phase | Goal | Borrowed items |
+|---|---|---|
+| **2** (now) | `Route` nodes + `HTTP_CALLS` / `ASYNC_CALLS` from Spring/Feign/Kafka, with `confidence` columns | B2 |
+| **2.5** | `ingest_traces` MCP tool (cheap, huge quality lift) | B3 |
+| **3** | Static `CALLS` with 4-strategy cascade; `find_callers` / `find_callees`; dead code | B1, B8 |
+| **3.5** | `pass_cross_repo`-style cross-service edges | B6 |
+| **4** | `analyze_pr` (diff → impact + risk); Louvain clusters | B4, B7 |
+| **5** | Eval harness; head-to-head benchmark vs. CMM on Java repos | — |
+| **5+** | Optional poly-language grammars; static-binary packaging | B9, B10 |
+
+Layered ignores (B5) can land anywhere — drop it in alongside the next indexer change.
+
+---
+
+## Strategic notes
+
+- **Run both MCPs in parallel as a zero-integration option.** `.mcp.json` supports multiple servers. Let your tool answer Java and architectural queries, and let CMM handle non-Java or generic structural queries once you touch polyglot codebases.
+- **Use the comparison itself as a portfolio asset.** When you start pitching SMB clients on AI automation, "I built a Spring-aware hybrid retrieval system that beats the published Codebase-Memory baseline on Java microservice questions" — with numbers from your Phase 5 eval harness — is a credible artifact. Few consultants can show that.
+- **Don't fork CMM.** It's MIT-licensed C with vendored grammars; maintenance cost is high and the code style diverges from your Python stack. Read it as documentation, port the patterns.
+
+---
+
+## References
+
+- Codebase-Memory MCP source — [github.com/DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp)
+- Paper — [Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP (arXiv:2603.27277)](https://arxiv.org/abs/2603.27277)
+- Your repo — [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag)
+- Key CMM files referenced above:
+  - [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) — call resolution
+  - [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c) — route nodes
+  - [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) — cross-service edges
+  - [`pass_gitdiff.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) — git diff impact
+  - [`extract_channels.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_channels.c) — async patterns
+  - [`service_patterns.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/service_patterns.c) — framework markers
diff --git a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md
index bf079cee..2dc4e914 100644
--- a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md
+++ b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md
@@ -279,7 +279,7 @@ Headline items:
 2. `README.md`: delete v1 tool reference; promote "v2 navigation tools
    (preview)" to primary `### Tool reference`. Keep ops tools listed as
    "operational — moving to `user-rag` CLI in next release".
-3. `propose/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per
+3. `docs/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per
    propose §11 mapping).
 4. Delete `tests/test_mcp_v2_equivalence.py` entirely — v1 no longer exists.
 5. Update `tests/test_server.py` (or add if missing) tool-count assertion to:
@@ -322,9 +322,9 @@ grep -nE "^### Tool reference" README.md
 - [ ] `tests/test_mcp_v2_equivalence.py` does not exist.
 - [ ] README §"Tool reference" lists exactly the 4 v2 tools as primary; ops
       tools noted as transitional.
-- [ ] `propose/PRODUCT-VISION.md` example invocations updated to v2.
+- [ ] `docs/PRODUCT-VISION.md` example invocations updated to v2.
 - [ ] Diff is confined to deliverables in this prompt (`server.py`, `README.md`,
-      `propose/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`,
+      `docs/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`,
       `tests/test_server.py` or equivalent surface-assertion test), plus
       narrowly-related test harness/import updates required to make those
       changes pass.
@@ -337,7 +337,7 @@ grep -nE "^### Tool reference" README.md
 - [`plans/PLAN-MCP-API-V2.md` § PR-V2-3](./PLAN-MCP-API-V2.md#pr-v2-3--delete-v1-navigation-tools)
   — list of the 18 tools to delete.
 - [`propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md`](../../propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md)
-  §11 mapping table — for rewriting `propose/PRODUCT-VISION.md` examples.
+  §11 mapping table — for rewriting `docs/PRODUCT-VISION.md` examples.
 - `server.py` history (git log) — to identify each tool's helper-function
   graveyard.
 
diff --git a/plans/completed/AGENT-PROMPTS-TIER1B.md b/plans/completed/AGENT-PROMPTS-TIER1B.md
index 23e2458c..b8e420c3 100644
--- a/plans/completed/AGENT-PROMPTS-TIER1B.md
+++ b/plans/completed/AGENT-PROMPTS-TIER1B.md
@@ -575,7 +575,7 @@ Concretely:
   plan §5.3) — two services + a third "ambiguous" controller.
 - Create `tests/test_call_edge_matching.py` with cases 32–40.
 - Extend `tests/test_mcp_tools.py` with cases 41–48.
-- Flip `propose/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows
+- Flip `docs/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows
   from *planned* to *shipped*.
 
 ## Out of scope (do NOT touch)
@@ -620,7 +620,7 @@ don't ship it.
 11. New fixture `tests/fixtures/cross_service_smoke/`.
 12. New test file `tests/test_call_edge_matching.py` with cases 32–40.
 13. Cases 41–48 added to `tests/test_mcp_tools.py`.
-14. `propose/PRODUCT-VISION.md` flipped (planned → shipped).
+14. `docs/PRODUCT-VISION.md` flipped (planned → shipped).
 15. `README.md` MCP tools section updated.
 
 ## Tests
@@ -681,7 +681,7 @@ Expected: at least one caller from `chat-assign` with `match='cross_service'`.
 - [ ] Sentinel greps return expected results.
 - [ ] No file outside `build_ast_graph.py`, `kuzu_queries.py`,
       `server.py`, `pr_analysis.py`, `README.md`,
-      `propose/PRODUCT-VISION.md`, and the new `tests/` paths is
+      `docs/PRODUCT-VISION.md`, and the new `tests/` paths is
       modified.
 - [ ] PR description includes the scope statement, the manual evidence
       output (pass6 log + meta() snippet + find_route_callers output),
diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md
index 13ec61f3..e87ee7e4 100644
--- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md
+++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md
@@ -6,7 +6,7 @@ Status: **completed** — applied. Companion document to
 ## Why this file exists
 
 The brownfield plan grew through two review rounds; the second review
-(`reports/review/active/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`)
+(`docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`)
 flagged design issues that were folded back into the plan in-place.
 Once they're inlined, they stop standing out — but they are exactly
 the parts an implementer is most likely to skim past or get wrong,
diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md
index b6ef32f6..96bc081c 100644
--- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md
+++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md
@@ -805,7 +805,7 @@ order in this list is the only correct interleaving; do not reorder.
 - `README.md` — new section "Brownfield overrides" walking through Layer
   B (config), with a complete example block. Mention Layer C as the last
   resort, with the four interface declarations to copy-paste.
-- `CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note
+- `docs/CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note
   the override layers exist.
 - MCP server `instructions` string in `server.py` — one extra sentence
   noting that "role and capability inference can be customised per-project
diff --git a/plans/completed/PLAN-CALL-GRAPH.md b/plans/completed/PLAN-CALL-GRAPH.md
index bcde568e..3880d15e 100644
--- a/plans/completed/PLAN-CALL-GRAPH.md
+++ b/plans/completed/PLAN-CALL-GRAPH.md
@@ -460,7 +460,7 @@ Additions (~80 lines, no removals):
    - How to filter by `min_confidence`.
    - Why phantoms aren't dropped at index time.
 
-### 8. `CODEBASE_REQUIREMENTS.md`
+### 8. `docs/CODEBASE_REQUIREMENTS.md`
 
 Add a "Call graph" note listing the tree-sitter node types the extractor
 depends on:
@@ -585,7 +585,7 @@ Single PR. Breaking changes:
 | 8 | Augment `_graph_expand_merge` to also call `expand_methods`. | `search_lancedb.py` | Graph-expand results include method-reachable chunks on the smoke corpus. |
 | 9 | Add MCP tools (`find_callers`, `find_callees`), `follow_calls` param on `trace_flow`, update `_INSTRUCTIONS`. | `server.py` | `test_mcp_tools.py` additions pass. |
 | 10 | Update tests: new files + extend `test_ast_graph_build.py` / `test_kuzu_queries.py` / `test_mcp_tools.py`. | `tests/` | `pytest` green. |
-| 11 | Update `README.md` + `CODEBASE_REQUIREMENTS.md`. | docs | Manual review. |
+| 11 | Update `README.md` + `docs/CODEBASE_REQUIREMENTS.md`. | docs | Manual review. |
 | 12 | Confirm `propose/completed/CALL-GRAPH-PROPOSE.md` is the only active call-graph proposal (old deferred draft already removed; git history retains it). | `propose/` | Directory listing shows a single call-graph proposal. |
 
 ## Out of scope (for this plan, tracked elsewhere)
diff --git a/plans/completed/PLAN-CAPABILITIES-MODEL.md b/plans/completed/PLAN-CAPABILITIES-MODEL.md
index 7c8d9e65..6707a17e 100644
--- a/plans/completed/PLAN-CAPABILITIES-MODEL.md
+++ b/plans/completed/PLAN-CAPABILITIES-MODEL.md
@@ -453,7 +453,7 @@ callers see them in results.
 - `README.md` — add a section "Capabilities" describing the multi-tag
   axis, the initial capability set, and `list_by_capability`. Keep the
   existing "Roles" section intact.
-- `CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice
+- `docs/CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice
   and the deferred per-method storage (link to this plan).
 - MCP server `instructions` string in `server.py` — one extra sentence
   pointing at `list_by_capability` for behavioural questions about
diff --git a/plans/completed/PLAN-CLI-SCENARIOS.md b/plans/completed/PLAN-CLI-SCENARIOS.md
index c07e3011..d1156d8e 100644
--- a/plans/completed/PLAN-CLI-SCENARIOS.md
+++ b/plans/completed/PLAN-CLI-SCENARIOS.md
@@ -73,7 +73,7 @@ before PR-CLI-2 so contributors exercising new subcommands do not pay multi-seco
 | --- | --- | --- | --- | --- | --- |
 | **PR-CLI-1** | Land / freeze propose (doc-only merge of `CLI-SCENARIOS-PROPOSE.md` if not already on `master`) | none | `propose/completed/CLI-SCENARIOS-PROPOSE.md` (status bump); `plans/completed/PLAN-CLI-SCENARIOS.md` (tracking) | n/a | none |
 | **PR-CLI-2** | Full implementation: lifecycle handlers, env + YAML + index layout, package rename, `server.py` / indexer / path helpers, **`mcp_v2.py`**, **`path_filtering.py`** (`.lancedb-mcp/ignore` → `.java-codebase-rag/ignore`), help redesign, tracking issue constant, user-visible stderr hints; **`mcp.json.example`** env keys = source of truth | none | `pyproject.toml`, package dir rename, `server.py`, `mcp_v2.py`, `java_codebase_rag/cli.py`, `java_index_flow_lancedb.py`, `graph_enrich.py`, `path_filtering.py`, `search_lancedb.py`, `kuzu_queries.py`, `build_ast_graph.py`, tests, `mcp.json.example`, `.gitignore`, any other `user_rag` / env / path references in Python | unit + integration + help-structure test (see below) | PR-CLI-1 merged |
-| **PR-CLI-3** | Doc and example sweep + **`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged |
+| **PR-CLI-3** | Doc and example sweep + **`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged |
 
 Landing order: **PR-CLI-1 → PR-CLI-2 → PR-CLI-3**.
 
@@ -284,7 +284,7 @@ Follow the **explicit file list** in propose §6 (`README.md`,
 `paper.pdf`, `AGENTS.md`, **`.cursor/rules/*.mdc`** (agent rules audit),
 `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (comments only — keys from PR-CLI-2),
 `propose/INDEX-AUTO-MODE-PROPOSE.md`,
-`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `propose/PRODUCT-VISION.md`,
+`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `docs/PRODUCT-VISION.md`,
 `.gitignore`).
 
 Add **Migration from legacy names** sections with explicit `mv` commands
@@ -314,7 +314,7 @@ Add **Migration from legacy names** sections with explicit `mv` commands
 | # | Step | File(s) | Done when |
 | --- | --- | --- | --- |
 | 1 | README + CLI operator guide | `README.md`, `docs/JAVA-CODEBASE-RAG-CLI.md` | New subcommand table + 5 env vars + migration |
-| 2 | Agent + checklist + requirements | `docs/*`, `CODEBASE_REQUIREMENTS.md` | No stale operator paths |
+| 2 | Agent + checklist + requirements | `docs/*`, `docs/CODEBASE_REQUIREMENTS.md` | No stale operator paths |
 | 3 | Paper + proposes + example MCP JSON | `docs/paper/`, `propose/*`, `mcp.json.example` | PDF rebuilt; examples updated |
 | 4 | Acceptance grep | repo root | Reviewer sign-off |
 
diff --git a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md
index bf2f2001..4edbcd08 100644
--- a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md
+++ b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md
@@ -188,7 +188,7 @@ Six references at `server.py:49, 689, 1141, 1338, 1342, 1418`:
 - Line 1342: docstring `"...FEIGN_CLIENT/REPOSITORY/MAPPER..."` → `"...CLIENT/REPOSITORY/MAPPER..."`
 - Line 1418: `entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]` → `[..., "CLIENT"]`
 
-#### Change 6: Update `README.md` and `CODEBASE_REQUIREMENTS.md`
+#### Change 6: Update `README.md` and `docs/CODEBASE_REQUIREMENTS.md`
 
 `README.md`:
 - Line 137: `trace_flow` description's stage chain `FEIGN_CLIENT/REPOSITORY/MAPPER` → `CLIENT/REPOSITORY/MAPPER`
@@ -338,7 +338,7 @@ description.
 - [ ] `_ROLE_SCORE_WEIGHTS["CLIENT"] = 0.06` (was `FEIGN_CLIENT`) (`search_lancedb.py:188`)
 - [ ] Six `server.py` literal references updated (lines 49, 689, 1141, 1338, 1342, 1418)
 - [ ] `README.md` updated (3 lines + brownfield note)
-- [ ] `CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347)
+- [ ] `docs/CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347)
 - [ ] `tests/test_lancedb_e2e.py:342` allow-list updated
 - [ ] `ONTOLOGY_VERSION` bumped 8 → 9 with phase-comment update
 - [ ] All 9 new tests in `tests/test_client_role_rename.py` pass
diff --git a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md
index 266edf78..a966c4d9 100644
--- a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md
+++ b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md
@@ -31,7 +31,7 @@ Depends on: **none** (lands on current `master`).
 | PR | Scope | Ontology bump | Files touched (approx) | Test buckets | Independent of |
 | --- | --- | --- | --- | --- | --- |
 | **PR-1** | `CodebaseHttpMethod.java` stub under route fixtures; **parameterized** structured stderr emitter (new small module and/or `build_ast_graph.py`) — **no production call sites** | **none** | `tests/fixtures/brownfield_route_stubs/...`, emitter module, **1** new test | Unit test exercises emitter directly (INFO + WARN shapes as needed) | none |
-| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged |
+| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `docs/CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `docs/CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged |
 | **PR-3** | Agent docs + v2 addendum only | **none** | `docs/AGENT-GUIDE.md`, `docs/skills/java-codebase-explore.md` (if needed), `propose/completed/BROWNFIELD-ANNOTATIONS-V2-ADDENDUM-HTTP-METHOD-ENUM.md` | Docs-only CI | PR-2 merged |
 
 Landing order: **PR-1 → PR-2 → PR-3**.
@@ -136,7 +136,7 @@ Landing order: **PR-1 → PR-2 → PR-3**.
 - `tests/test_assign_endpoint_client_extraction.py` — Feign + JAX-RS mirror; `method = HttpMethod.POST` may need alignment with **`CodebaseHttpMethod.POST`** on brownfield annotation per surface rules.
 - `tests/test_cross_service_resolution_flag.py` — generated Java strings.
 
-### 8. `README.md` and `CODEBASE_REQUIREMENTS.md`
+### 8. `README.md` and `docs/CODEBASE_REQUIREMENTS.md`
 
 - Replace annotation names and examples with `@CodebaseHttpClient` / enum `method`; add **Re-index required** callout for PR-2 + ontology **12**.
 
@@ -185,7 +185,7 @@ Landing order: **PR-1 → PR-2 → PR-3**.
 | 3 | Merge HTTP replace (behaviour only) | `graph_enrich.py` | Replace pattern; **zero** shadowing logs from this file |
 | 4 | Verbose plumbing if needed | `build_ast_graph.py` → parse entry | Shadowing INFO respects volume gate |
 | 5 | `meta_chain` + log strings | `graph_enrich.py` | Grep clean for old simple names |
-| 6 | Tests + docs | `tests/*`, `README.md`, `CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs |
+| 6 | Tests + docs | `tests/*`, `README.md`, `docs/CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs |
 | 7 | Ontology + meta test | `ast_java.py`, `tests/test_call_edges_e2e.py` | `meta()` reports 12 |
 
 ---
diff --git a/plans/completed/PLAN-MCP-API-V2.md b/plans/completed/PLAN-MCP-API-V2.md
index 01c2ba9f..eaa3be60 100644
--- a/plans/completed/PLAN-MCP-API-V2.md
+++ b/plans/completed/PLAN-MCP-API-V2.md
@@ -356,7 +356,7 @@ For each, also delete:
   "operational — moving to `user-rag` CLI in next release". This is a one-PR
   transition state.
 
-### 3. `propose/PRODUCT-VISION.md` — agent-recipe examples
+### 3. `docs/PRODUCT-VISION.md` — agent-recipe examples
 
 - Update any example invocations from v1 to v2. Search for `find_callers`,
   `list_routes`, `list_clients`, `find_route_*`, `trace_*`, `impact_*` and
@@ -551,7 +551,7 @@ DoD is the delta + suite-green, not an absolute total.
 | v2 handlers diverge in behaviour from v1 | V2-1 | Equivalence tests (14 of them) compare returned id sets directly. Drift is caught at PR review. |
 | `direction`/`edge_types` required-field change breaks existing clients | V2-1 | No existing clients — confirmed by Dmitry ("nobody uses this MCP bundle yet"). Tests assert `ValidationError` is raised, which is the contract. |
 | `describe.edge_summary` adds N round-trips per call | V2-2 | Single grouped count query, not 9 round-trips. Test asserts call count via Kuzu connection mock. |
-| Removing v1 tools breaks the agent system prompt | V2-3 | `propose/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). |
+| Removing v1 tools breaks the agent system prompt | V2-3 | `docs/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). |
 | CLI subprocess tests are slow / flaky | V2-4 | Each subprocess invocation hits a pre-built fixture under `/tmp`; no rebuilds inside tests. Targeted at < 5s total. |
 | `pyproject.toml` package layout breaks the existing flat-script bundle | V2-4 | Today's `packages = []` is intentional; we promote it to `packages = ["user_rag"]` only — root scripts (`server.py`, `build_ast_graph.py`, etc.) stay outside the package. Tested by `pip install .` succeeding. |
 | `pr-review` skill under `.cursor/skills/pr-review/` still calls `analyze_pr` MCP after V2-4 | V2-4 | PR description includes a manual TODO to update that skill. CLI version of the call is documented in README's "Migration from v1" subsection. |
diff --git a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md
index 3ed41801..b36e4ee0 100644
--- a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md
+++ b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md
@@ -163,7 +163,7 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this:
 | `server.py` | `_cocoindex_subprocess_env(root)` + pass `env=` to CocoIndex subprocess |
 | `README.md` | Env table + `refresh_code_index` subprocess note |
 | `mcp.json.example` | Example `LANCEDB_MCP_PROJECT_ROOT` path |
-| `CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency |
+| `docs/CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency |
 | `tests/test_mcp_tools.py` | `test_cocoindex_subprocess_env_sets_project_root` |
 
 ---
@@ -172,5 +172,5 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this:
 
 - [x] Step 1: Modify `java_index_flow_lancedb.py`
 - [x] Step 2: Modify `server.py` (including `_cocoindex_subprocess_env` helper + unit test)
-- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `CODEBASE_REQUIREMENTS.md`, flow docstring)
+- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `docs/CODEBASE_REQUIREMENTS.md`, flow docstring)
 - [x] Testing: `test_cocoindex_subprocess_env_sets_project_root`; heavy e2e unchanged (`cwd` + no env → `.` root)
diff --git a/plans/completed/PLAN-TIER1B-COMPLETION.md b/plans/completed/PLAN-TIER1B-COMPLETION.md
index aa32e46a..09b9acd1 100644
--- a/plans/completed/PLAN-TIER1B-COMPLETION.md
+++ b/plans/completed/PLAN-TIER1B-COMPLETION.md
@@ -938,7 +938,7 @@ path in both services. Used by tests 32, 34, 37.
 | 10 | Extend `analyze_pr`: `cross_service_callers_count` per changed symbol  | `server.py`, `pr_analysis.py` | test 47 passes                    |
 | 11 | Risk-score weight bump in `pr_analysis.py`                             | `pr_analysis.py`         | docstring + unit test                  |
 | 12 | Create `tests/fixtures/cross_service_smoke/`                           | `tests/fixtures/...`     | fixture files in place                 |
-| 13 | Update `README.md` MCP tools section + `propose/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `propose/PRODUCT-VISION.md` | manual review            |
+| 13 | Update `README.md` MCP tools section + `docs/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `docs/PRODUCT-VISION.md` | manual review            |
 
 ---
 
@@ -988,7 +988,7 @@ path in both services. Used by tests 32, 34, 37.
 6. Existing MCP tools extended (`impact_analysis`, `trace_flow`,
    `analyze_pr`).
 7. `README.md` updated for caller-side edges, brownfield clients,
-   match outcomes; `propose/PRODUCT-VISION.md` flips
+   match outcomes; `docs/PRODUCT-VISION.md` flips
    `HTTP_CALLS` / `ASYNC_CALLS` from *planned* to *shipped*.
 8. Each PR's description quotes the relevant stats from a manual run
    on bank-chat-system as evidence.
diff --git a/propose/completed/CALL-GRAPH-PROPOSE.md b/propose/completed/CALL-GRAPH-PROPOSE.md
index 9f2a6028..a37284af 100644
--- a/propose/completed/CALL-GRAPH-PROPOSE.md
+++ b/propose/completed/CALL-GRAPH-PROPOSE.md
@@ -4,7 +4,7 @@ Status: **completed** — shipped (static intra-JVM `CALLS` + `DECLARES`; plan:
 [`plans/completed/PLAN-CALL-GRAPH.md`](../../plans/completed/PLAN-CALL-GRAPH.md) for the
 step-by-step implementation.
 
-This proposal realises **point 4 of `PRODUCT-VISION.md`** ("Adding a Call
+This proposal realises **point 4 of `docs/PRODUCT-VISION.md`** ("Adding a Call
 Graph Layer") with a deliberately narrow scope: **static, intra-JVM
 method-to-method edges**. Cross-service HTTP/async, AOP-proxy resolution,
 and runtime-trace ingestion are explicit non-goals of this phase.
@@ -535,7 +535,7 @@ round-trip test.
   0.5 day: server surface + search-side expansion).
 - Validation on `bank-chat-system` + micro-fixture: **1 day**
   (unit + integration + regression run; manual trace_flow spot-checks).
-- Documentation update (`README.md`, `CODEBASE_REQUIREMENTS.md`, MCP
+- Documentation update (`README.md`, `docs/CODEBASE_REQUIREMENTS.md`, MCP
   instructions): **2 hours**.
 
 Total: **3–4 working days** including tests and docs.
diff --git a/propose/completed/CLI-SCENARIOS-PROPOSE.md b/propose/completed/CLI-SCENARIOS-PROPOSE.md
index 42e58875..f68ca868 100644
--- a/propose/completed/CLI-SCENARIOS-PROPOSE.md
+++ b/propose/completed/CLI-SCENARIOS-PROPOSE.md
@@ -312,11 +312,11 @@ Open a GitHub issue titled **"AST graph (Kuzu) incremental rebuild"** referencin
 - `docs/paper/paper.tex` — architecture paper updated for new CLI verbs / env vars / file paths; rebuild `paper.pdf` (Russian translation `paper_ru.tex` is a standalone artifact outside the repo and is not in scope).
 - `AGENTS.md` — CLI doc reference + any `refresh` mention.
 - `.cursor/rules/*.mdc` — agent workflow / env / CLI contract; see **Agent rules audit** below (must match post-rename surface).
-- `CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated.
+- `docs/CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated.
 - `mcp.json.example` — **PR-CLI-3 is a second pass only:** PR-CLI-2 updates this file so **env keys match the live server**; PR-CLI-3 reconciles comments, examples, and any doc drift — **no conflicting edits**; if both PRs touch it, **PR-CLI-2 wins** for structure, PR-CLI-3 for prose polish.
 - `propose/INDEX-AUTO-MODE-PROPOSE.md` — one-line note that `refresh` is being renamed to `reprocess`.
 - `propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md` — one-line note that the new tracking issue (created in PR-CLI-2) is the user-facing handle.
-- `propose/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference.
+- `docs/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference.
 - `.gitignore` — add `.java-codebase-rag/`, keep `lancedb_data/` for grace-period cleanup, or remove if PR-CLI-2 drops that grace-period entry.
 
 **Agent rules audit (PR-CLI-3, manual checklist — use together with acceptance grep below):**
@@ -342,7 +342,7 @@ Expected output after PR-CLI-3 (docs + rules):
 
 The startup-slowness fix (deferred imports in `cli.py`) is a **separate, prior PR** outside this migration; it does not change the surface and should land before PR-CLI-2 so contributors testing the new subcommands aren't taxed by the multi-second startup.
 
-**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section.
+**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section.
 
 ---
 
diff --git a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md
index 67ef8d3b..806aa3e5 100644
--- a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md
+++ b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md
@@ -239,7 +239,7 @@ Verified count: **5 production files, ~12 references**:
 | `server.py` | 49, 687, 1138, 1335, 1339, 1415 | MCP tool docstrings, `role` enum strings, entry-role filter |
 | `tests/test_lancedb_e2e.py` | 342 | One assertion |
 
-Plus docs: `README.md`, `CODEBASE_REQUIREMENTS.md`. Doc sweep is
+Plus docs: `README.md`, `docs/CODEBASE_REQUIREMENTS.md`. Doc sweep is
 straightforward.
 
 ### 4.4 Brownfield input today
@@ -398,7 +398,7 @@ story as every other graph-shape change).
 - README: rename role table; mention `CLIENT` + `HTTP_CLIENT` capability;
   document the `MESSAGE_PRODUCER` capability that already exists for
   symmetry.
-- `CODEBASE_REQUIREMENTS.md`: rename references.
+- `docs/CODEBASE_REQUIREMENTS.md`: rename references.
 - `propose/DEFERRED-REST-CLIENT-MIGRATION-PROPOSE.md`: **delete** (this
   proposal supersedes it; the rename-vs-capability decision is
   reversed by current architecture).
diff --git a/propose/completed/TIER1-COMPLETION-PROPOSE.md b/propose/completed/TIER1-COMPLETION-PROPOSE.md
index c3b1a541..25585df8 100644
--- a/propose/completed/TIER1-COMPLETION-PROPOSE.md
+++ b/propose/completed/TIER1-COMPLETION-PROPOSE.md
@@ -1,7 +1,7 @@
 # Tier 1 completion — proposal (shipped)
 
 Status: **completed — shipped via PR-A1 → PR-C** (merged 2026-04 → 2026-05). Moved to `propose/completed/` after PR-D3 (Tier 1B) landed. Pairs with the borrow guide
-[`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md)
+[`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md)
 and follows on from the completed
 [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md).
 
@@ -859,10 +859,10 @@ Independent PRs, but a sensible review order:
   section with `route_overrides` examples and `@CodebaseRoute`
   source stub** — same shape as the existing `role_overrides` /
   `@CodebaseRole` material.
-- `CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var
+- `docs/CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var
   table (`.lancedb-mcp-ignore` mention). Document the route resolver
   five-layer composition table from §4.6.4.
-- `propose/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note
+- `docs/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note
   B2b + B6 as the next proposal.
 
 ---
@@ -902,7 +902,7 @@ Open questions to settle during implementation, not now:
 - [ ] `ONTOLOGY_VERSION` bumped 4 → 5; stale-graph guard test added.
 - [ ] README brownfield section extended with `route_overrides` and
   `@CodebaseRoute` examples.
-- [ ] `CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer
+- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer
   composition table.
 - [ ] No regressions in existing role / capability resolution
   (run the existing brownfield test suite).
@@ -918,7 +918,7 @@ Open questions to settle during implementation, not now:
 - [ ] Old `compile_excluded_glob_patterns` call sites replaced (3 of
   them).
 - [ ] `graph_meta` exposes `ignore_layers`.
-- [ ] `CODEBASE_REQUIREMENTS.md` documents the layer order.
+- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the layer order.
 
 ---
 
@@ -947,9 +947,9 @@ follow-ups, in order of leverage:
 ## 11. References
 
 - [`TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md`](TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md) - B2b + B6 propose
-- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5).
+- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5).
 - [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md) — completed call-graph proposal; same shape & style.
-- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants.
+- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants.
 - [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — **mandatory reading** for the implementer of §4.6 (brownfield route resolver mirrors this design).
 - `graph_enrich.py` §"brownfield role / capability overrides" — the
   existing implementation B2a extends.
diff --git a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md
index 0a4f0150..b78e3e1a 100644
--- a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md
+++ b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md
@@ -21,9 +21,9 @@ Before working on this proposal, read in order:
 
 1. [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) §4
    (B2a `Route` + `EXPOSES`) — defines every join key used here.
-2. [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md)
+2. [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md)
    §B2 (Route shape) and §B6 (cross-service edges).
-3. [`reports/call-graph-review.md`](../../reports/call-graph-review.md)
+3. [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md)
    — same correctness invariants apply (microservice scoping,
    confidence semantics, phantom-id collisions).
 4. [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md)
@@ -466,7 +466,7 @@ but conservative.
 ## 12. References
 
 - [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) — B2a, B4, B5 (active).
-- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) §B2, §B6.
-- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — invariants this proposal must not regress.
+- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) §B2, §B6.
+- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — invariants this proposal must not regress.
 - [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — mandatory reading for §6.
-- [`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*.
+- [`docs/PRODUCT-VISION.md`](../../docs/PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*.
diff --git a/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
index 81e8b092..73101047 100644
--- a/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
+++ b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md
@@ -5,7 +5,7 @@ User-facing tracking for graph-side incremental work: GitHub issue **#73** (link
 Pairs with the focused MCP-tool proposal
 [`propose/INDEX-AUTO-MODE-PROPOSE.md`](INDEX-AUTO-MODE-PROPOSE.md)
 (decision engine for `refresh_code_index`) and supersedes its
-"future Kuzu work" footnote in [`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §99.
+"future Kuzu work" footnote in [`docs/PRODUCT-VISION.md`](../../docs/PRODUCT-VISION.md) §99.
 
 This is a **proposal**, not an implementable plan. After review and
 scoping decisions (the §11 [TBD] list), an implementable

From 67ce67b3be05c57db348ab90daf5f46d9e753f1b Mon Sep 17 00:00:00 2001
From: Dmitry Teryaev <doudmitry@gmail.com>
Date: Sun, 24 May 2026 13:11:17 +0300
Subject: [PATCH 3/3] fix: update CONFIGURATION.md link in
 CODEBASE_REQUIREMENTS.md

The link pointed to ./docs/CONFIGURATION.md, which resolves to
docs/docs/CONFIGURATION.md from within the docs/ directory. Changed it
to ./CONFIGURATION.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 docs/CODEBASE_REQUIREMENTS.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/CODEBASE_REQUIREMENTS.md b/docs/CODEBASE_REQUIREMENTS.md
index 01737872..8442ba20 100644
--- a/docs/CODEBASE_REQUIREMENTS.md
+++ b/docs/CODEBASE_REQUIREMENTS.md
@@ -173,7 +173,7 @@ and `capabilities`, register inbound routes, and register outbound
 clients/producers for a given repo via `.java-codebase-rag.yml` at the project
 root (`role_overrides:`, `route_overrides:`, `http_client_overrides:`,
 `async_producer_overrides:`) and/or by copying the in-source stubs from
-[`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) into your sources:
+[`docs/CONFIGURATION.md`](./CONFIGURATION.md) into your sources:
 
 - `@CodebaseRole` / `@CodebaseCapability` / `@CodebaseCapabilities`
   (class-level role + capabilities) — see `docs/CONFIGURATION.md` §4.3.