From 170ebea56c575c443b80e2d8a78dc4d5aa24c1f3 Mon Sep 17 00:00:00 2001 From: Dmitry Teryaev Date: Sun, 24 May 2026 12:57:01 +0300 Subject: [PATCH 1/3] chore: cleanup proposals, plans, and reports folder structure - Move completed proposes (HINTS-STRUCTURED-LABEL) to propose/completed/ - Move stale proposes to propose/stale/ (ENHANCED-ROLE-RECOGNITION, INDEX-AUTO-MODE, TIER2-INCREMENTAL-REBUILD, RANKING-MICROSERVICE) - Move completed plans to plans/completed/ (PLAN-DESCRIBE-HINTS-STRUCTURAL, AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL) - Create propose/active/ and plans/active/ for future active items - Move PRODUCT-VISION.md to docs/ - Move CODEBASE_REQUIREMENTS.md to docs/ - Remove reports/ folder (call-graph-review.md, what-to-borrow-from-cmm.md, review/) Co-Authored-By: Claude Opus 4.7 --- .../CODEBASE_REQUIREMENTS.md | 0 {propose => docs}/PRODUCT-VISION.md | 0 ...AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md | 0 .../PLAN-DESCRIBE-HINTS-STRUCTURAL.md | 0 .../{ => completed}/HINTS-STRUCTURED-LABEL.md | 0 .../ENHANCED-ROLE-RECOGNITION-PROPOSE.md | 0 .../{ => stale}/INDEX-AUTO-MODE-PROPOSE.md | 0 .../RANKING-MICROSERVICE-PROPOSE.md | 0 .../TIER2-INCREMENTAL-REBUILD-PROPOSE.md | 0 reports/call-graph-review.md | 364 --------------- ...BROWNFIELD-ROLE-OVERRIDES-design-issues.md | 62 --- ...LD-ROLE-OVERRIDES-implementation-issues.md | 115 ----- ...PLAN-CAPABILITIES-MODEL-implement-fixes.md | 431 ------------------ ...LAN-CAPABILITIES-MODEL-implement-report.md | 140 ------ reports/what-to-borrow-from-cmm.md | 247 ---------- 15 files changed, 1359 deletions(-) rename CODEBASE_REQUIREMENTS.md => docs/CODEBASE_REQUIREMENTS.md (100%) rename {propose => docs}/PRODUCT-VISION.md (100%) rename plans/{ => completed}/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md (100%) rename plans/{ => completed}/PLAN-DESCRIBE-HINTS-STRUCTURAL.md (100%) rename propose/{ => completed}/HINTS-STRUCTURED-LABEL.md (100%) rename propose/{ => stale}/ENHANCED-ROLE-RECOGNITION-PROPOSE.md (100%) rename propose/{ => 
stale}/INDEX-AUTO-MODE-PROPOSE.md (100%) rename propose/{ => stale}/RANKING-MICROSERVICE-PROPOSE.md (100%) rename propose/{ => stale}/TIER2-INCREMENTAL-REBUILD-PROPOSE.md (100%) delete mode 100644 reports/call-graph-review.md delete mode 100644 reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md delete mode 100644 reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md delete mode 100644 reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md delete mode 100644 reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md delete mode 100644 reports/what-to-borrow-from-cmm.md diff --git a/CODEBASE_REQUIREMENTS.md b/docs/CODEBASE_REQUIREMENTS.md similarity index 100% rename from CODEBASE_REQUIREMENTS.md rename to docs/CODEBASE_REQUIREMENTS.md diff --git a/propose/PRODUCT-VISION.md b/docs/PRODUCT-VISION.md similarity index 100% rename from propose/PRODUCT-VISION.md rename to docs/PRODUCT-VISION.md diff --git a/plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md similarity index 100% rename from plans/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md rename to plans/completed/AGENT-PROMPTS-DESCRIBE-HINTS-STRUCTURAL.md diff --git a/plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md b/plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md similarity index 100% rename from plans/PLAN-DESCRIBE-HINTS-STRUCTURAL.md rename to plans/completed/PLAN-DESCRIBE-HINTS-STRUCTURAL.md diff --git a/propose/HINTS-STRUCTURED-LABEL.md b/propose/completed/HINTS-STRUCTURED-LABEL.md similarity index 100% rename from propose/HINTS-STRUCTURED-LABEL.md rename to propose/completed/HINTS-STRUCTURED-LABEL.md diff --git a/propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md b/propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md similarity index 100% rename from propose/ENHANCED-ROLE-RECOGNITION-PROPOSE.md rename to propose/stale/ENHANCED-ROLE-RECOGNITION-PROPOSE.md diff --git 
a/propose/INDEX-AUTO-MODE-PROPOSE.md b/propose/stale/INDEX-AUTO-MODE-PROPOSE.md similarity index 100% rename from propose/INDEX-AUTO-MODE-PROPOSE.md rename to propose/stale/INDEX-AUTO-MODE-PROPOSE.md diff --git a/propose/RANKING-MICROSERVICE-PROPOSE.md b/propose/stale/RANKING-MICROSERVICE-PROPOSE.md similarity index 100% rename from propose/RANKING-MICROSERVICE-PROPOSE.md rename to propose/stale/RANKING-MICROSERVICE-PROPOSE.md diff --git a/propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md similarity index 100% rename from propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md rename to propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md diff --git a/reports/call-graph-review.md b/reports/call-graph-review.md deleted file mode 100644 index e6ed718e..00000000 --- a/reports/call-graph-review.md +++ /dev/null @@ -1,364 +0,0 @@ -# Call Graph Layer — Code Review - -**Repository:** [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) -**Commits reviewed:** -- `b3a15d8` — *call graph layer propose* -- `fb5473f` — *call graph layer implementation* - -**Reference docs:** -- [`propose/completed/CALL-GRAPH-PROPOSE.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/propose/completed/CALL-GRAPH-PROPOSE.md) -- [`plans/completed/PLAN-CALL-GRAPH.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/plans/completed/PLAN-CALL-GRAPH.md) - -**Test status:** all 24 new call-graph tests pass locally -(`tests/test_ast_java_calls.py`, `tests/test_call_graph_smoke_roundtrip.py`, -`tests/test_call_graph_receiver_resolution.py`). - ---- - -## Overall verdict - -**Strong, faithfully-scoped implementation.** The proposal is realised as -written, the receiver-type resolver is well-structured, the schema and edge -metadata match the design (confidence + strategy + source), and the test -coverage targets concrete proposal section numbers. 
Scope discipline is -visible — no creep into HTTP / async / AOP / traces. - -There are **three correctness bugs** that should land as a quick follow-up -before Phase 3 is closed, plus a handful of design issues worth pushing back -on. All three bugs share one root cause: **resolution strategy and -confidence are silently downgraded at edge-emit time when the receiver was -already resolved successfully.** - ---- - -## What's done well - -- **Confidence + strategy tagging is faithful to the design.** Every edge - carries (`confidence`, `strategy`, `source='static'`) — clean migration - path for trace ingestion later. -- **Multigraph dedup at write time** (`(src_id, dst_id, arg_count, line)`) - is correctly shaped: prevents accidental duplication while preserving - overload-ambiguous fan-out at distinct call sites. -- **Receiver-type resolver** is clear and matches the proposal: scope table - built once per method, supertype-bounded lookup, explicit - `chained_receiver` phantom path, deterministic phantom IDs. -- **Receiver-disambiguation discipline.** `_unique_type_simple_resolve` - deliberately uses the *type* registry (not a per-method simple-name - index). The dedicated test - `test_receiver_disambiguation_uses_type_index_not_method_unique` is - exactly the right kind of negative test — this is the precise trap - CMM-style cascades fall into and the implementation avoids it. -- **`_method_ids_for_call_graph_needle`** elegantly accepts type FQN, - method FQN, or simple method name; fan-out through `DECLARES` from a - type needle is the right move and matches §6.1. -- **`exclude_external` is filter-on-result, not filter-on-store.** Phantoms - stay in the graph (so impact analysis can see JDK-adjacent signals), but - query consumers get clean lists by default. Matches risk #2 mitigation - in the proposal. -- **Tests target proposal section numbers.** 24 tests, all passing, - including a Kuzu round-trip on a real fixture project. 
The shadowing - test (`test_local_shadows_field_same_name_resolves_receiver`) is the - kind of edge case that bites in real codebases. -- **Diagnostics are baked in** — `pass3_calls` prints the chained-phantom - percentage as the proposal mandates. - ---- - -## Bugs (must fix) - -### B1. Constructor calls always become phantoms when the class has no explicit constructor - -**Severity: high — most common Java call site is broken.** - -`new Svc()` in `ScopeReceivers.byLocal()` resolves the receiver type to -`smoke.Svc` correctly. But `Svc` has no explicit constructor in source, so -`_parse_method` is never invoked for an ``, and no constructor -`MemberEntry` is created. `_lookup_method_candidates(type='smoke.Svc', -callee='', argc=0)` finds nothing → fallthrough to phantom at -`confidence=0.0`. - -Confirmed empirically against the smoke fixture: - -``` -['smoke.ScopeReceivers#byLocal()', 'smoke.Svc#(0)', 'phantom', False, 0.0] -['smoke.ScopeReceivers#shadowLocalOverField()', 'smoke.Svc#(0)', 'phantom', False, 0.0] -``` - -In a real Spring codebase, **every** `new MyDto()`, `new HashMap<>()`, -`new ArrayList<>()` on a project type without a hand-written constructor -lands as a phantom. - -**Fix.** When parsing a `TypeDecl` and discovering no constructor -declaration, synthesize a default -`MethodDecl(name="", signature="()", is_constructor=True, ...)` -with `start_line` / `start_byte` from the type declaration and -`parameters=[]`. Make sure it gets a `MemberEntry`. - -Two corollary checks: - -- `_emit_call_edge` for `new Svc()` should then resolve to the synthesized - member with `strategy='constructor'` (not `phantom`), `confidence` - inherited from the receiver-resolution tier. -- Confirm existing `INJECTS` / `DECLARES` accounting doesn't double-count - the synthesized node. 
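
The B1 fix above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `TypeDecl`, `MethodDecl`, the `kind` field, `CTOR_NAME`, and `ensure_default_ctor` are simplified, hypothetical stand-ins, and restricting synthesis to plain classes is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed constructor marker; the project's actual name may differ.
CTOR_NAME = "<init>"


@dataclass
class MethodDecl:
    # Hypothetical, simplified stand-in for the project's method record.
    name: str
    signature: str
    is_constructor: bool = False
    start_line: int = 0
    parameters: list[str] = field(default_factory=list)


@dataclass
class TypeDecl:
    # Hypothetical, simplified stand-in for the project's type record.
    fqn: str
    kind: str  # "class", "interface", ...
    start_line: int
    methods: list[MethodDecl] = field(default_factory=list)


def ensure_default_ctor(decl: TypeDecl) -> MethodDecl | None:
    """Append a synthesized no-arg constructor when a class declares none.

    Returns the synthesized member, or None when nothing was added.
    """
    if decl.kind != "class":
        return None  # assumption: only plain classes get a synthesized ctor
    if any(m.is_constructor for m in decl.methods):
        return None  # an explicit constructor exists, so Java adds no default
    ctor = MethodDecl(
        name=CTOR_NAME,
        signature="()",
        is_constructor=True,
        start_line=decl.start_line,  # reuse the type declaration's position
    )
    decl.methods.append(ctor)
    return ctor
```

Calling the helper twice on the same type adds the member only once, which is one way to keep the `DECLARES` accounting from double-counting the synthesized node.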
- -**Suggested test** — add to `tests/test_call_graph_smoke_roundtrip.py` -(`test_implicit_default_ctor_is_resolved`): - -```java -public class HasNoCtor {} -public class Caller { void m() { new HasNoCtor(); } } -``` - -Assert: `(Caller#m)-[CALLS {strategy:'constructor', resolved:true}]->(HasNoCtor#())`. - ---- - -### B2. Implicit `super()` for a class that doesn't extend anything is mis-tagged as `phantom` - -**Severity: medium — diagnostic regression, not a wrong answer.** - -`WildUtils` has an explicit `private WildUtils() {}` constructor with no -`super(...)` body, so the AST extractor synthesizes the implicit-super -call site. `_first_supertype_fqn` returns `None` (no `EXTENDS` row → -there is no `Object` node in the index), so `_resolve_receiver_type` -returns `(None, "phantom", 0.0)`. Result: - -``` -['smoke.WildUtils#WildUtils()', '?super#(0)', 'phantom', False, 0.0] -``` - -The proposal §4.2 promises strategy `implicit_super (0.90)` for this case. -Right now the agent cannot distinguish "implicit super to `Object`" from -"I have no idea what this call resolved to" — real signal loss. - -**Fix.** In `_resolve_receiver_type`, when `expr == 'super'` and -`_first_supertype_fqn(...) is None`, return -`("java.lang.Object", "implicit_super", 0.90)`. In `_emit_call_edge`, -allow phantom callee (no member resolved on `Object`) but **preserve -`strategy='implicit_super'` and `confidence=0.90`** instead of overriding -to `phantom` / `0.0`. This is the same fix-shape as B3 below. - ---- - -### B3. Resolution strategy and confidence are silently overridden to `phantom` / `0.0` when the callee can't be located on a resolved external receiver - -**Severity: high — collapses static-import precision when callees are JDK / Spring.** - -In `_resolve_and_emit_call`: - -```python -if not candidates: - pid = _phantom_method_id(...) 
- _emit_call_edge(..., confidence=0.0, strategy="phantom", resolved=False) - return -``` - -This branch fires whenever the receiver type *did* resolve (e.g. -`java.util.Objects` via `static_import`, confidence 0.95) but the callee -method isn't on a type we indexed. The static-import smoke test confirms it: - -``` -requireNonNull edges: 1 - phantom 0.0 False java.util.Objects#requireNonNull(1) -``` - -The README and the MCP instructions both tell agents to use -`min_confidence=0.9` to filter noise. Under that filter, **every JDK -static-import call disappears from the graph**, even though the resolver -*knew* the call's target type with 0.95 confidence. - -**Fix.** Decouple the *receiver-resolution strategy/confidence* from the -*callee-found* boolean. When `candidates` is empty: - -- Keep the phantom callee (creating it on the resolved receiver type — - already done). -- Keep `resolved=False` on the edge (the *callee node* is a phantom). -- **Preserve the receiver-resolution `strat` and `conf`** unless they're - `'chained_receiver'`. Specifically: `strategy` stays `'static_import'` / - `'static_import_wildcard'` / `'import_map'` / `'same_module'` etc.; - `confidence` stays the receiver-tier value. - -The only case where `confidence=0.0, strategy='phantom'` is honest is when -the receiver itself was unresolvable. Distinguishing those two failure -modes is the whole point of the cascade. - -Optional: add a small property `callee_found BOOLEAN` on the edge so a -query like *"high-confidence edges with phantom callees"* (= calls into -well-known external libraries) becomes one Cypher predicate. - -**Suggested tests:** - -- `test_static_import_to_jdk_keeps_high_confidence` — `requireNonNull` - edge has `confidence>=0.95` and `strategy='static_import'`, with - `resolved=False` on the edge. -- `test_min_confidence_filter_keeps_high_confidence_static_import_callers` - — `find_callers('java.util.Objects#requireNonNull(1)', min_confidence=0.9)` - returns the in-repo caller. 
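
The decoupling described in the B3 fix might look like the sketch below. `EdgeMeta` and `edge_meta_for_missing_callee` are hypothetical names introduced only for illustration. Falling back to `phantom` / `0.0` for `chained_receiver` is an assumption, since the review only says that case is excluded from preservation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeMeta:
    # Hypothetical bundle of the edge properties discussed in the review.
    strategy: str
    confidence: float
    resolved: bool


def edge_meta_for_missing_callee(
    receiver_type: str | None, strat: str, conf: float
) -> EdgeMeta:
    """Edge metadata when no callee candidate was found in the index.

    The callee node is a phantom either way, so `resolved` is always False.
    Strategy and confidence are downgraded only when the receiver itself
    was not resolved (or, by assumption here, was a chained receiver).
    """
    if receiver_type is None or strat == "chained_receiver":
        return EdgeMeta(strategy="phantom", confidence=0.0, resolved=False)
    # Receiver resolved: keep its tier so min_confidence filters still match.
    return EdgeMeta(strategy=strat, confidence=conf, resolved=False)
```

With this shape, a `static_import` call into `java.util.Objects` keeps `confidence=0.95` and survives a `min_confidence=0.9` filter even though its callee node is a phantom.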
- ---- - -## Design issues (push back on the proposal here, not just the implementation) - -### D1. Phantom-ID `arg_count` semantics are inconsistent across method-references and regular calls - -`_phantom_method_id` builds the FQN as `{receiver}#{callee}({arg_count})`. -For method references the `arg_count` is `-1`. So the same external method -can exist as both `Foo#bar(2)` and `Foo#bar(-1)` phantom nodes — distinct -nodes for the same logical target. The dedup key -`(src_id, dst_id, arg_count, line)` then keeps both edges, doubling the -graph for code that mixes calls and method references on the same target. - -**Recommendation.** Either normalize phantom IDs without `arg_count` for -method references (`?{recv}#{callee}(?)`) or drop `arg_count` from the -dedup key and use `(src_id, dst_id, line, byte)` (line+byte already pin a -unique call site). - ---- - -### D2. Method-reference precision is leaving free wins on the table - -Method references that *are* unambiguous on name (single method, no -overloads) currently still emit with `arg_count=-1`. Cheap precision win, -no extra resolver complexity: when the receiver type is known and exactly -one method with `name == callee_simple` exists on the receiver type, pick -that single-arity match and emit a fully-resolved edge with the receiver's -real arity instead of `-1`. - ---- - -### D3. Anonymous-inner-class call attribution does the proposal-correct thing, but the design is questionable - -Right now `pingFromAnon()` (called from inside -`new Runnable() { run() { pingFromAnon(); } }`) is attributed to -**`NestedCalls#m()`**, the enclosing named method, with -`strategy='this_super'`. That matches §4.1's wording. - -But: the anonymous `Runnable` *does* get parsed as a nested type in -`_parse_type` (kind `class`). It produces a `MemberEntry` for its -`run()` method. 
So the graph has two contradictory facts: the call edge -goes from `NestedCalls#m`, and the structural fact "there exists a -`run()` method here" lives on a separate, disconnected anonymous type -node. - -**Recommendation.** Re-attribute calls inside an anonymous-class body to -the anonymous-class member. The named-enclosing fallback is only needed -for **lambdas** (which don't synthesize a member) and static / instance -initializers. For anonymous classes, the call-site naturally belongs to -the anonymous member. This makes -`find_callers('OperatorAssignedProcessor.onOperatorAssigned')` find the -anonymous handler that actually contains the call, instead of the outer -service method. - ---- - -### D4. `expand_methods` discards confidence on the way out - -The output is `list[str]` of type FQNs. There's no way for the search-side -fusion in `_graph_expand_merge` to weight a CALLS-derived hit lower than -a structural one. The proposal §6.2 says "merged via existing RRF, no new -caller-visible parameters" — so RRF treats every reach equally regardless -of whether it came from a 0.95 import-map edge or a 0.55 suffix edge. - -**Recommendation (small).** Have `expand_methods` return -`list[tuple[str, float]]` (type FQN + max confidence on the discovery -path), and let `_graph_expand_merge` pass that as the RRF rank weight. -Internal-only signature change; no MCP surface change. - ---- - -### D5. `trace_flow`'s default change quietly rebudgets stage capacity across two qualitatively different edge sources - -`follow_calls=True` is the new default. Existing agent prompts that -expected type-only stages now get extra entries with -`via.edge_type='CALLS'`. That's good — agents can infer it. But the -per-stage cap (`stage_limit`) now budgets across both edge classes, so a -high-fan-out service can starve INJECTS results in favor of CALLS results. - -**Recommendation.** Either: - -1. 
Keep separate budgets (`stage_limit_structural`, `stage_limit_calls`, - default to `stage_limit` each), or -2. Order ingestion to prefer INJECTS / EXTENDS / IMPLEMENTS first, then - top up with CALLS until `stage_limit`. The current code already runs - the structural query first — just keep the CALLS top-up bounded by - `stage_limit - len(stage_results)` instead of a separate - `stage_limit * 4` LIMIT. - ---- - -### D6. `_resolve_this_super_field_chain` lacks fixture coverage - -The resolver line -`chain = _resolve_this_super_field_chain(expr, member=member, ast=ast, tables=tables)` -is a real bonus over what CMM does — if it walks -`this.fieldA.fieldB.fieldC.method()` correctly. Add a smoke fixture that -exercises it; none of the existing files do. - ---- - -## Smaller nits - -- **N1 — Per-call rebuild of `_scope_table`.** `_resolve_and_emit_call` - calls `_scope_table(member, ast, tables)` on every call site. - Field / parameter scope is identical for every call inside a single - method body — locals only grow as you step through the body. Build it - once per `member` in `_resolve_method_calls` and pass it in. On a - 5-microservice corpus this is the kind of constant-factor that doubles - `pass3_calls` runtime. -- **N2 — `_lookup_method_candidates`'s `name_only` fallback rule is good, - but the strategy logic in `_resolve_and_emit_call` is intricate.** - The branch - `elif name_only_fb and len(candidates) == 1: edge_strat = strat` is - correct but easy to misread — the inline comment is good; consider - promoting it to a docstring section. -- **N3 — `is_static_call` heuristic.** `_infer_static_method_invocation` - returns `True` when the receiver starts with an uppercase identifier. - For `var Foo = supplier.get();` followed by `Foo.bar()` this - misclassifies. Rare in practice, but worth a TODO; conservative fix is - to consult the scope table (if `Foo` is in scope as a variable, it's - not a static call). 
-- **N4 — Ontology guard.** `ONTOLOGY_VERSION` 3 → 4 is set, but confirm - `KuzuGraph.get` actually raises on `GraphMeta.ontology_version` - mismatch at read time so a stale graph fails loudly (proposal §5.3). -- **N5 — `pass3_calls` diagnostics.** The log line reports - chained-phantom % only. Add the `phantom_other` ratio (the bigger one - in real codebases) so you can spot B1 / B3 regressions in the log - immediately. -- **N6 — Method reference inside lambda.** `visit` sets - `lam=lam or chained` for method references with a chained qualifier. - That conflates "I'm in a lambda" with "this method ref is itself - chained." `chained` should propagate as a separate flag, not as - `in_lambda`. -- **N7 — Empty `expr` and `is_static_call=False` branch.** The condition - `expr in ("", "this") or (not expr and call.is_static_call is False - and not call.receiver_expr)` is redundant: if `expr == ""` the second - clause is also true. Simplify to `expr in ("", "this")`. - ---- - -## Suggested fix order - -1. **B1, B2, B3 as one PR** titled - *"call graph: faithful confidence preservation across the resolver→writer boundary"* - — the three bugs share one architectural fix (don't downgrade - strategy / confidence at edge-emit time when the receiver was - resolved). Add the suggested tests in the same PR. -2. **D5 as a separate PR** — `trace_flow` budget split with a regression - test that seeds a service whose CALLS fan-out exceeds the structural - one. -3. **D3 (anon-class re-attribution), D4 (`expand_methods` confidence), - N1 (scope-table caching) as a small follow-up** before opening the - next phase. - ---- - -## Closing note - -This is solid Phase-3 work. Land the three bug fixes and the codebase is -in an excellent spot to start on the next phase — either cross-service -`HTTP_CALLS` (B6 / B7 in -[`what-to-borrow-from-cmm.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/tmp/what-to-borrow-from-cmm.md)) -or runtime-trace ingestion (B3 from the same doc). 
Both will lean on the -resolver and confidence machinery just built; the bug fixes above make -that lean trustworthy. diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md deleted file mode 100644 index 83a99cae..00000000 --- a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md +++ /dev/null @@ -1,62 +0,0 @@ -# Design issues: PLAN-BROWNFIELD-ROLE-OVERRIDES (plan / specification) - -**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md` -**Review date:** 2026-04-26 -**Scope:** Problems, ambiguities, or gaps in the *written plan* (not the codebase). - ---- - -## 1. Dual pipeline for meta-annotation data (spec gap) - -The plan describes building Layer A (meta-annotation reachability) from a two-pass process anchored in `build_ast_graph.py` and `GraphTables`. The chunk-enrichment / Lance path must also apply the same resolution rules, but the plan does **not** require a single shared primitive for “which `@interface` definitions exist in the project.” - -A careful reader can infer that graph build and index enrichment should agree, but two independent implementations (graph tables vs. a separate tree walk) are **not** ruled out. If file coverage, exclude patterns, or parse-failure handling differ, Lance and Kuzu can **disagree** on `meta_chain` for the same type. The plan would be stronger with an explicit constraint: e.g. “meta maps MUST be derived from the same file set and exclusion rules as `build_ast_graph` pass1,” or “Lance and Kuzu MUST share one builder function.” - ---- - -## 2. Depth cap for meta-annotation resolution is under-specified - -The plan gives a sketch of `_resolve_meta_chain` with `len(seen) > 4` and cycle handling. As written, the `seen` set is used both for **cycle** detection and as a stand-in for **path depth**. On a *linear* chain of meta-annotations, set size tracks depth. 
On **branching** shapes, set cardinality and “steps from root” diverge, so the sketch does not define a single clear semantics (strict path depth vs. global visit count). - -The follow-up test (“six wrappers → `OTHER`”) depends on a precise cap. The plan should name the exact metric (e.g. maximum path length from the start simple name) and the integer bound, so implementers and tests are aligned. - ---- - -## 3. Pre-flight test 9 mixes “unit” and “integration” scope - -The pre-flight item asks for a “unit-style” regression but specifies: build a **fresh** Lance index with FQN overrides, **query the table directly**, and then run **`codebase_search(..., capability=...)`** end to end. That is a **multi-layer** test (indexer + storage + search API) and is expensive to run and to keep stable in CI. - -A tiered requirement would match intent better: (1) schema / `JavaLanceChunk` field, (2) `process_java_file` row, (3) optional full search. As written, teams may either skip the heavy part or over-invest in flaky integration for what is mainly a **write-path** contract. - ---- - -## 4. “Precedence” vs. “execution order” is correct but error-prone to skim - -The plan is internally consistent: execution order is the *reverse* of listed priority, and guards use the **current** `role` after each step. Still, a reader who only scans the “Precedence summary (final)” table may implement **C before FQN** in the wrong direction or mis-order **B vs. A** without reading the “Execution order in code (REQUIRED)” block. - -This is a **documentation hazard** in the spec, not a logic error. A short, single bullet at the top (“Apply steps in *only* the order: …; do not reorder”) or a Mermaid sequence diagram would reduce mis-implementation. - ---- - -## 5. Layer A duplicate `@interface` simple names - -The plan correctly specifies first-seen-wins and a stderr warning. 
The **implication** (colliding simple names in different packages map to one `meta_chain` entry) is only obvious if you already know Java’s annotation resolution limits in this indexer. A one-line “Limitation:” callout in the plan would set expectations for monorepos with same-named annotations. - ---- - -## 6. Rollout vs. single document - -The plan says three independent PRs (Phase 1 → 2 → 3) while also presenting all phases in one file. That is fine for a complete picture, but the **merge strategy** (squashed single PR vs. three) is a process choice the plan does not need to fix—only note that “shippable phases” and “one landing” can conflict in review scope unless branches are cut accordingly. - ---- - -## Summary - -| ID | Topic | Severity (spec) | -|----|------------------------------|-----------------| -| 1 | Single source of truth for meta map inputs | High (consistency) | -| 2 | Depth / cycle semantics | Medium | -| 3 | Pre-flight test cost / tiers | Low–medium | -| 4 | Precedence skimming hazard | Low | -| 5 | Duplicate simple-name limits | Low | -| 6 | Multi-PR vs one doc | Process only | diff --git a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md b/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md deleted file mode 100644 index b7035db8..00000000 --- a/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-implementation-issues.md +++ /dev/null @@ -1,115 +0,0 @@ -# Implementation issues: PLAN-BROWNFIELD-ROLE-OVERRIDES - -**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md` -**Review date:** 2026-04-26 -**Scope:** Gaps, mistakes, and risks in the *implementation* as compared to the plan’s stated behaviour and test matrix. Code and tests are assumed to live under `mcp_lancedb_bundle/`. - -**Tests run:** `pytest tests/test_brownfield_overrides.py` — 20 passed (at time of review). - ---- - -## 1. Pre-flight LanceDB + search regression is incomplete vs. 
plan - -The plan’s pre-flight test (item 9) requires, in order: build a **fresh** Lance index using the real pipeline with FQN `role`/`capabilities`, assert rows on **direct table** read, then **`codebase_search(..., capability=...)`** and assert the type is returned. - -**Implemented approximations:** - -- **`enrich_chunk` + YAML** — checks resolver + chunk path; does not exercise `process_java_file` / `JavaLanceChunk` materialisation end-to-end. -- **Raw LanceDB / PyArrow** — proves `list(string)` round-trip, not the CocoIndex / `JavaLanceChunk` row shape. -- **Dataclass introspection** — confirms `JavaLanceChunk` has a `capabilities` field only. - -**Risk:** A regression that removes or mis-wires the CocoIndex write path could slip past the suite; `codebase_search` capability filtering is not covered by the brownfield tests. - -**Severity:** Medium (guards are weaker than specified, not a known production bug in the reviewed snapshot). - ---- - -## 2. “Malformed YAML” test does not use malformed YAML - -The plan (Phase 1, test 8) calls for malformed YAML to yield empty overrides without crashing. - -**Current behaviour:** a test exercises loading from a **non-existent** path, which is closer to “missing file” than “invalid YAML.” Invalid YAML in an existing file is only covered implicitly by the loader’s `except` branch, not by a named test. - -**Follow-up:** add a `tmp_path` file with content that is not valid YAML, or rename the test to match “missing config / empty” semantics. - -**Severity:** Low (behaviour likely correct; test is mis-specified or misnamed). - ---- - -## 3. Phase 2 test matrix: gaps - -The plan’s Phase 2 test list includes scenarios **not** present in `tests/test_brownfield_overrides.py` (at review time): - -- **Cyclic** meta-annotation graph (A ↔ B): no crash, role remains `OTHER`. -- **Long chain** (e.g. six wrappers): after depth cap, role `OTHER` (or whatever the spec fixes). 
-- **FQN + meta + Layer B together:** FQN should still win; explicit per-class config overrides automatic meta and annotation maps. - -**Covered and notable:** B-beats-A regression, two-hop to `SERVICE`, method-level meta to capability, basic `@Service` on custom `@interface`. - -**Severity:** Medium for **cycle** and **depth** (guard against stack bugs and cap drift); **low** for the FQN interaction if hand-tested or covered elsewhere (not verified here). - ---- - -## 4. Phase 3 test matrix: minor gaps - -The plan asks for: - -- **Additive capability** — `@CodebaseCapability` in addition to AST-inferred capabilities (e.g. alongside a Spring stereotype). -- **Two separate** `@CodebaseCapability` annotations on the same class, as well as the **container** form. - -**Current coverage** focuses on `CodebaseRole` variants, invalid role warnings, and **`@CodebaseCapabilities({...})` container** with two inner values. The **stacked** `@CodebaseCapability` / `@CodebaseCapability` case is not clearly duplicated as a dedicated test; additive-on-AST is not isolated. - -**Severity:** Low (behaviour is straightforward from code structure; risk is **regression** in parser or resolver order, not a known bug). - ---- - -## 5. Possible Lance vs. Kuzu disagreement on meta maps - -**Implementation detail:** the graph writer derives annotation declarations from **in-memory graph tables**; **`enrich_chunk`** builds meta from a **separate** full-disk walk (`_collect_annotation_decls_from_disk` + cache). - -If the two ever differ (excludes, parse errors, or partial scans), the **same** Java type could get **different** Layer A results in Kuzu than on Lance chunks. The plan’s intent is consistency across stores; this is an **integration consistency** risk, not a single-file bug. - -**Severity:** Low until observed in a real project; worth monitoring or converging the two inputs. - ---- - -## 6. Depth cap semantics (implementation) vs. 
plan’s sketch - -The resolver’s recursive walk uses a **path set** and stops when `len(path) > 4`. The plan’s pseudocode used a slightly different shape (`seen` and `len(seen) > 4`). - -**Risk:** off-by-one vs. the plan’s “depth 4 / six links `OTHER`” without an automated test (see §3), so behaviour could drift in a refactor. - -**Severity:** Low–medium, mitigated if Phase 2 depth test is added. - ---- - -## 7. Kuzu member nodes and capabilities - -`Symbol` rows for **methods** use `_node_row` defaults (`capabilities: []`, `role: "OTHER"`) and do not run the brownfield resolver per method. The plan is **type-centric**; this is not a plan violation, but any future expectation of “method symbol capabilities in the graph” would be unmet. - -**Severity:** N/A for current plan; documentation only if users assume otherwise. - ---- - -## Summary - -| ID | Topic | Severity | -|----|--------------------------------------|------------| -| 1 | Pre-flight E2E (index + search) | Medium | -| 2 | Malformed YAML test naming / body | Low | -| 3 | Phase 2: cycle, depth, FQN+meta tests| Medium (partial) | -| 4 | Phase 3: stacked caps + AST additive | Low | -| 5 | Meta map source: graph vs. disk | Low (consistency) | -| 6 | Depth cap without test | Low–medium | -| 7 | Method `Symbol` rows / capabilities | N/A | - ---- - -## What was in good shape (for balance) - -- `BrownfieldOverrides` loader, validation against shared ontology, stderr warnings for unknowns. -- `resolve_role_and_capabilities` execution order and **B-before-A** semantics with **OTHER** guards; FQN and `@CodebaseRole` ordering relative to C. -- `AnnotationRef.arguments` and `CodebaseCapabilities` value extraction in `ast_java.py`. -- Wiring: `build_ast_graph` type nodes, `enrich_chunk`, `JavaLanceChunk` + `process_java_file` for `capabilities`. -- README, CODEBASE_REQUIREMENTS, and MCP `instructions` mention customisation. -- B-beats-A regression test is present (critical for the plan’s execution-order invariant). 
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md deleted file mode 100644 index eb808f52..00000000 --- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-fixes.md +++ /dev/null @@ -1,431 +0,0 @@ -# PLAN-CAPABILITIES-MODEL — implementation fixes - -**Inputs:** `reports/review/PLAN-CAPABILITIES-MODEL-implement-report.md` + -designer review of that report. -**Plan file (now amended):** `plans/PLAN-CAPABILITIES-MODEL.md`. Re-read -the plan first — it has two new sections (**Filter strategy** and the -expanded **`trace_flow` seeding** subsection) that change how some of -the original instructions should be implemented. -**Goal of this pass:** close 7 issues from the review **plus** correct a -design-level flaw the review surfaced but did not call out — the -existing four `capability` filters use naive post-filter, which silently -under-delivers results against the `limit` contract. - -Apply the fixes in priority order. Run the full test suite after each -group; do not bundle group D into A/B/C. - ---- - -## Group A — `codebase_search` response & filter (Issues 1, 2 + design correction) - -### A.1 Add `capabilities` to `CodeChunkHit` - -**File:** `server.py` - -In the `CodeChunkHit` Pydantic model (currently around line 65), add the -field next to `annotations_on_type` / `symbols`: - -```python -capabilities: list[str] = Field( - default_factory=list, - description=( - "Multi-tag capabilities derived from method/type annotations " - "and injected types (MESSAGE_LISTENER, MESSAGE_PRODUCER, " - "SCHEDULED_TASK, EXCEPTION_HANDLER). A class can carry several." - ), -) -``` - -In `_rows_to_hits` (around line 402), populate it alongside the other -list fields: - -```python -capabilities=_clean_str_list(r.get("capabilities")), -``` - -`_clean_str_list` already handles the legacy-string / native-list dual -shape — no new helper needed. 
- -`JAVA_ENRICHED_COLUMNS` already includes `"capabilities"` -(`search_lancedb.py` line 37), so the column is fetched when present. -The schema-presence guard on line 459 means stale indexes without the -column degrade gracefully. - -### A.2 Add `capability` filter to `codebase_search` (storage-pushdown) - -**Files:** `search_lancedb.py`, `server.py` - -This is **not** a post-filter. The plan's amended **Filter strategy** -section is explicit: post-filter without over-fetch widening violates -the `limit` contract and is rejected. - -#### Step A.2.1 — extend `_build_extra_predicates` - -In `search_lancedb.py` (around line 65), accept a new keyword: - -```python -def _build_extra_predicates( - *, - columns: set[str], - role: str | None, - module: str | None, - microservice: str | None, - package_prefix: str | None, - fqn_in: list[str] | None, - role_in: list[str] | None = None, - exclude_roles: list[str] | None = None, - capability: str | None = None, # NEW - capability_in: list[str] | None = None, # NEW — used by trace_flow seeding -) -> list[str]: - ... -``` - -Emit a list-contains predicate when the column exists. **Verify the -exact LanceDB SQL syntax for the project's installed version** before -wiring — likely candidates, in order of compatibility: - -```python -# Preferred (Lance >=0.10): -preds.append(f"array_has(capabilities, '{_escape_sql_str(capability)}')") -# Fallback if array_has unavailable: -preds.append(f"array_position(capabilities, '{_escape_sql_str(capability)}') >= 0") -# Last resort (some Lance builds): -preds.append(f"'{_escape_sql_str(capability)}' = ANY(capabilities)") -``` - -Run a tiny ad-hoc query against the local index to confirm which form -parses. Pick one and use it consistently. 
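
The "try each form, keep the first that parses" step above can be sketched as a small probe. This is only an illustration under stated assumptions: `pick_predicate_template`, `CANDIDATE_TEMPLATES`, and the `try_query` callback are hypothetical names that do not exist in the project, and the callback is expected to run one filtered query against the local index and raise when the predicate is rejected.

```python
from typing import Callable, Iterable

# Candidate predicate templates, in the same order as the list above.
# `{value}` is replaced with an already-escaped capability literal.
CANDIDATE_TEMPLATES = (
    "array_has(capabilities, '{value}')",
    "array_position(capabilities, '{value}') >= 0",
    "'{value}' = ANY(capabilities)",
)


def pick_predicate_template(
    try_query: Callable[[str], object],
    templates: Iterable[str] = CANDIDATE_TEMPLATES,
    probe_value: str = "SCHEDULED_TASK",
) -> str:
    """Return the first template whose rendered predicate `try_query` accepts.

    `try_query` is a hypothetical callback: it runs one filtered query and
    raises if the installed engine cannot parse the predicate.
    """
    errors = []
    for template in templates:
        predicate = template.format(value=probe_value)
        try:
            try_query(predicate)
        except Exception as exc:  # parser errors differ between versions
            errors.append(f"{predicate!r}: {exc}")
            continue
        return template
    raise RuntimeError("no candidate predicate parsed:\n" + "\n".join(errors))
```

Against a real table, `try_query` could plausibly be something like `lambda p: table.search().where(p).limit(1).to_list()`, but that call chain is itself an assumption to check against the installed LanceDB version. Whichever template wins should be recorded in a comment on `_build_extra_predicates`.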
- -For the multi-value variant (`capability_in`, used only by `trace_flow` -seeding — see Group B), build a disjunction: - -```python -if capability_in and "capabilities" in columns: - parts = [ - f"array_has(capabilities, '{_escape_sql_str(c)}')" - for c in capability_in - ] - preds.append("(" + " OR ".join(parts) + ")") -``` - -Both predicates must be conditioned on `"capabilities" in columns` so -older indexes lacking the column still answer queries (filter ignored). - -#### Step A.2.2 — surface in `run_search` - -`run_search` (around line 722) gains a `capability: str | None = None` -parameter and forwards it to `_build_extra_predicates`. Same for -`capability_in: list[str] | None = None`. No other ranking change. - -#### Step A.2.3 — surface in `codebase_search` MCP tool - -In `server.py::codebase_search` (around line 488), add the parameter -next to `role`: - -```python -capability: str | None = Field( - default=None, - description=( - "Java only: AND-filter to chunks whose enclosing type carries " - "this capability (MESSAGE_LISTENER|MESSAGE_PRODUCER|" - "SCHEDULED_TASK|EXCEPTION_HANDLER). Use `list_by_capability` " - "for graph-only queries." - ), -), -``` - -Forward to `run_search(..., capability=capability, ...)`. - -### A.3 Update unit + integration tests - -- Extend `tests/test_lancedb_e2e.py` with the **`limit` contract** - assertion (plan test #3): a fixture with 50 `@Service` classes of - which 5 are also `MESSAGE_PRODUCER`; `list_by_role("SERVICE", - capability="MESSAGE_PRODUCER", limit=50)` must return exactly the 5. - Same shape for `codebase_search(..., capability=...)` (plan test #6). - ---- - -## Group B — `trace_flow` capability seeding coordination (Issue 4 + design fix) - -This is the design gap the review surfaced. 
The implementer faithfully -wrote the Kuzu OR predicate the plan asked for, but the LanceDB -pre-filter in `server.py::trace_flow` discards capability-only -entrypoints (role=OTHER, capability=SCHEDULED_TASK) before the Kuzu -seed query ever sees their FQNs. **Both sides must learn about -capabilities together.** - -The plan's amended **`trace_flow` seeding** subsection is now explicit -about this. The Kuzu side is already implemented; only the LanceDB side -needs work. - -### B.1 Widen the LanceDB seed pre-filter - -**File:** `server.py` - -In `trace_flow` (around line 880), the existing seed helper is: - -```python -entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"] - -def _seed(role_allowlist: list[str] | None) -> list[dict[str, Any]]: - return run_search( - ... - role_in=role_allowlist, - exclude_roles=None if role_allowlist else sorted(baseline_excludes), - ) -``` - -Extend it to also pass a capability allowlist. Match the Kuzu side -exactly, using `["MESSAGE_LISTENER", "SCHEDULED_TASK"]`: - -```python -entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"] -entry_capabilities = ["MESSAGE_LISTENER", "SCHEDULED_TASK"] - -def _seed(role_allowlist: list[str] | None, - capability_allowlist: list[str] | None) -> list[dict[str, Any]]: - return run_search( - ... - role_in=role_allowlist, - capability_in=capability_allowlist, - exclude_roles=( - None if (role_allowlist or capability_allowlist) - else sorted(baseline_excludes) - ), - ) -``` - -Then in the calling code: - -```python -# First pass: restricted to entrypoint-like role OR entrypoint capability. -seed_rows = await asyncio.to_thread(_seed, entry_roles, entry_capabilities) -if not seed_rows: - seed_rows = await asyncio.to_thread(_seed, None, None) -``` - -The `OR` semantics between `role_in` and `capability_in` must be produced -inside `_build_extra_predicates`, because the predicates it returns are separate strings -joined with `AND` at the top level. 
To get the right semantics -(role-OR-capability rather than role-AND-capability), emit a *combined -disjunction* when both are set: - -```python -# In _build_extra_predicates: -role_pred = None -if role_in and "role" in columns: - vals = ", ".join(f"'{_escape_sql_str(v)}'" for v in role_in) - role_pred = f"role IN ({vals})" - -cap_pred = None -if capability_in and "capabilities" in columns: - parts = [ - f"array_has(capabilities, '{_escape_sql_str(c)}')" - for c in capability_in - ] - cap_pred = "(" + " OR ".join(parts) + ")" - -if role_pred and cap_pred: - preds.append(f"({role_pred} OR {cap_pred})") -elif role_pred: - preds.append(role_pred) -elif cap_pred: - preds.append(cap_pred) -``` - -The standalone `role_in` / single `capability` cases keep their existing -behaviour (each emitted independently as before). Only the *paired -seeding case* triggers the OR composition. - -### B.2 Verify with a fixture - -Add to `tests/test_lancedb_e2e.py` (plan test #5): a fixture class -implementing `org.quartz.Job` with **no** Spring stereotype. Confirm -that `trace_flow("scheduled order cleanup", ...)` returns this class as -a stage-0 seed. Without B.1 it will not — that is the regression guard. - ---- - -## Group C — `find_*` and `list_by_*` storage pushdown (Issue 3 + design fix) - -The four already-landed `capability` filters -(`find_implementors`, `find_subclasses`, `list_by_role`, -`list_by_annotation`) use naive post-filter. `find_injectors` is -missing the parameter entirely. Both flaws fix together by switching to -storage pushdown in Kuzu. 
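
Why naive post-filtering breaks the `limit` contract can be shown with a toy model. The sketch below is not project code: `ROWS`, `query_store`, `post_filter`, and `pushdown` are hypothetical stand-ins, with `query_store` playing the role of a store that applies its `WHERE` clause before `LIMIT`.

```python
from typing import Optional

# Hypothetical fixture: 50 SERVICE rows; only the last 5 carry MESSAGE_PRODUCER.
ROWS = [
    {
        "fqn": f"com.example.Service{i}",
        "role": "SERVICE",
        "capabilities": ["MESSAGE_PRODUCER"] if i >= 45 else [],
    }
    for i in range(50)
]


def query_store(role: str, limit: int, capability: Optional[str] = None) -> list:
    """Stand-in for the storage layer: filter first, then apply the limit."""
    matched = [
        r for r in ROWS
        if r["role"] == role
        and (capability is None or capability in r["capabilities"])
    ]
    return matched[:limit]


def post_filter(role: str, capability: str, limit: int) -> list:
    """Naive approach: fetch `limit` rows, then drop non-matching ones in Python."""
    rows = query_store(role, limit)
    return [r for r in rows if capability in r["capabilities"]]


def pushdown(role: str, capability: str, limit: int) -> list:
    """Storage pushdown: the capability predicate is evaluated before the limit."""
    return query_store(role, limit, capability=capability)
```

With `limit=10`, `post_filter("SERVICE", "MESSAGE_PRODUCER", 10)` returns nothing because the matching rows sit outside the first ten, while `pushdown` returns all five. With `limit=50` both return five, so a fixture needs matching rows outside the first `limit` candidates to tell the two implementations apart.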
- -### C.1 Push the `capability` filter into `KuzuGraph` methods - -**File:** `kuzu_queries.py` - -For each of the five graph methods consumed by these tools -(`find_implementors`, `find_subclasses`, `find_injectors`, -`list_by_role`, `list_by_annotation`), add an optional `capability` -parameter: - -```python -def list_by_role( - self, role: str, *, - module: str | None = None, - microservice: str | None = None, - capability: str | None = None, # NEW - limit: int = 100, -) -> list[SymbolHit]: - filters = ["s.role = $role"] - params: dict[str, Any] = {"role": role} - if capability: - filters.append("$capability IN s.capabilities") - params["capability"] = capability - filters.extend(_scope_filters("s", module=module, microservice=microservice, params=params)) - where = " AND ".join(filters) - query = f"MATCH (s:Symbol) WHERE {where} RETURN {_SYMBOL_RETURN} LIMIT {int(limit)}" - return [_row_to_symbol(r) for r in self._rows(query, params)] -``` - -Same shape for `list_by_annotation`, `find_implementors`, -`find_subclasses`. Apply the predicate against the result-node alias -(`s` for the `list_by_*` queries; whatever alias is used in the -implementor / subclass query). The `LIMIT` clause **must** come after -the capability filter — Kuzu's planner handles this automatically once -it's part of `WHERE`. - -For `find_injectors`, the result is an *edge* between two `Symbol` -nodes (`src` injects `dst`). The user-relevant capability is on the -**consumer** (`src`): - -```python -def find_injectors( - self, name: str, *, - module: str | None = None, - microservice: str | None = None, - capability: str | None = None, # NEW - limit: int = 100, -) -> list[EdgeHit]: - # ... existing query that binds (src)-[:INJECTS]->(dst) ... - if capability: - filters.append("$capability IN src.capabilities") - params["capability"] = capability - ... 
-``` - -### C.2 Replace post-filter with parameter pass-through in `server.py` - -For each of the five tools (`find_implementors`, `find_subclasses`, -`find_injectors`, `list_by_role`, `list_by_annotation`): - -- Remove the post-filter line `rows = [r for r in rows if capability in r.capabilities]`. -- Pass `capability=capability` to the corresponding `KuzuGraph` method. -- For `find_injectors` (Issue 3): add the `capability` parameter to - the tool signature in the first place. Reuse the same - `Field(default=None, description=...)` shape as the other four. Pass - through to `graph.find_injectors(..., capability=capability)`. - -`list_by_capability` is unaffected — it already pushes down via Cypher. - -### C.3 Tests - -Convert the existing `capability` post-filter tests to assert -pushdown semantics: build a fixture with N=50 services of which only 5 -have the requested capability, request `limit=50`, expect exactly 5 -results. The previous post-filter implementation would also pass this -specific shape, but a stronger fixture (50 services, capability=Y on 5 -services that are *not* in the first 50 vector hits or graph rows) -will distinguish the two implementations. Pick the stronger fixture. - ---- - -## Group D — Documentation (Issues 5, 6) - -### D.1 `README.md` - -Add a new section **after** the existing "Roles" section, before the -search-tools section. Suggested skeleton: - -```markdown -## Capabilities - -In addition to the single primary `role` per Java type, the indexer -extracts a multi-tag `capabilities: list[str]` field from method-level -annotations, type-level annotations, injected types, and supertypes. -A type can carry zero or many capabilities. Capabilities never -*replace* the role; they augment it. 
- -| Capability | Trigger | -|---|---| -| `MESSAGE_LISTENER` | `@KafkaListener`, `@RabbitListener`, `@JmsListener`, `@SqsListener`, `@EventListener`, `@StreamListener` on any method | -| `MESSAGE_PRODUCER` | type injects `KafkaTemplate`, `RabbitTemplate`, `JmsTemplate`, `StreamBridge`, or `ApplicationEventPublisher` | -| `SCHEDULED_TASK` | `@Scheduled` on any method, or class implements `org.quartz.Job` | -| `EXCEPTION_HANDLER`| `@ControllerAdvice`, `@RestControllerAdvice`, or any method with `@ExceptionHandler` | - -Use `list_by_capability` to enumerate types carrying a capability, or -pass `capability=...` to `codebase_search` / `list_by_role` / -`list_by_annotation` / `find_*` to AND-filter results. -``` - -### D.2 `CODEBASE_REQUIREMENTS.md` - -Add a short note under the role-inference section: - -```markdown -Capabilities are derived at the **type level**: method-level annotation -evidence is aggregated up to the enclosing type. Per-method capability -storage is intentionally out of scope for the current ontology -(version 3) — see `plans/PLAN-CAPABILITIES-MODEL.md`. The deferred -call-graph layer (`propose/DEFERRED-CALL-GRAPH-PROPOSE.md`) is the -designated place to revisit method-granularity if the need arises. -``` - ---- - -## Group E — Style nit (Issue 7) - -**File:** `ast_java.py`, around line 113. - -Insert a single blank line between `_SUPERTYPE_TO_CAPABILITY` and -`_TYPE_KINDS`. No other change. Verify by running the existing -formatter / linter the project uses. - ---- - -## Acceptance checklist - -Run before declaring done: - -- [ ] **Group A:** `codebase_search` returns `capabilities` per hit; - `capability` filter present and pushed down; `limit` contract - test passes (50 services / 5 producers / `limit=50` → exactly 5). -- [ ] **Group B:** `trace_flow` returns a Quartz `Job` implementor - (role=OTHER, capability=SCHEDULED_TASK) as a stage-0 seed. 
-- [ ] **Group C:** all five graph-backed tools push the `capability` - filter into Cypher; `find_injectors` has the parameter; no Python - post-filter on `r.capabilities` remains in `server.py` for these - tools (verify with `rg "for r in rows if capability in" server.py` - → no matches). -- [ ] **Group D:** `README.md` has a "Capabilities" section; - `CODEBASE_REQUIREMENTS.md` notes the type-level granularity. -- [ ] **Group E:** blank line restored. -- [ ] All existing tests still pass. -- [ ] New tests cover (a) `limit` contract, (b) capability-only - `trace_flow` seeding, (c) `codebase_search` capability filter. -- [ ] No new ontology bump (still `3`); no unrelated API changes. - -## Notes for the implementer - -- The plan was updated alongside this fix list. **Re-read - `plans/PLAN-CAPABILITIES-MODEL.md`** — the **Filter strategy** and - **`trace_flow` seeding** sections are new and binding. Anything in - this file that conflicts with the plan, the plan wins. -- The reviewer attributed Issue 4 (`trace_flow` dead code) to - implementation. It's actually a plan gap — the plan asked for a - Kuzu change without specifying the LanceDB coordination. Group B - closes that gap. You did not do anything wrong on that one; you - faithfully implemented what the plan said. The plan is now - complete. -- Verify LanceDB array-predicate syntax against the project's - installed Lance version *before* writing the predicate. If the - preferred form (`array_has`) is unavailable, document the chosen - fallback in a comment on `_build_extra_predicates`. -- `find_injectors`' `capability` semantic (consumer side, not target) - is a deliberate API decision; surface it in the Pydantic - description string so callers don't guess wrong. 
diff --git a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md b/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md deleted file mode 100644 index d6454873..00000000 --- a/reports/review/completed/PLAN-CAPABILITIES-MODEL-implement-report.md +++ /dev/null @@ -1,140 +0,0 @@ -# Implementation Review: PLAN-CAPABILITIES-MODEL - -**Plan file:** `plans/PLAN-CAPABILITIES-MODEL.md` -**Review date:** 2026-04-26 -**Status:** Partially implemented (3 hard misses, 1 design gap, 2 doc gaps, 1 style nit) - ---- - -## Summary - -The core capability machinery is correctly implemented: -- `ONTOLOGY_VERSION` bumped 2 → 3 in `ast_java.py` -- All four detector tables (`_METHOD_ANN_TO_CAPABILITY`, `_TYPE_ANN_TO_CAPABILITY`, `_INJECTED_TYPES_TO_CAPABILITY`, `_SUPERTYPE_TO_CAPABILITY`) are present with the right entries -- `TypeDecl.capabilities` field added; populated by `infer_capabilities_for_type` after construction in `_parse_type` -- `infer_capabilities_for_type` and all tables exported in `__all__` -- `ChunkEnrichment.capabilities` plumbed from `encl.capabilities` in `graph_enrich.py` -- `Symbol` schema extended with `capabilities STRING[]`; `_node_row` defaults and `_CREATE_SYMBOL` Cypher updated; type nodes write `list(d.capabilities)`; phantoms carry `"capabilities": []` -- `SymbolHit.capabilities` field added; `_symbol_return_for` and `_row_to_symbol` updated -- `list_by_capability` added to `KuzuGraph` with correct `list_contains` Cypher -- `list_by_capability` MCP tool added to `server.py` -- `capability` post-filter parameter added to `find_implementors`, `find_subclasses`, `list_by_role`, `list_by_annotation` -- `capabilities: list[str]` added to `SymbolDto` -- `_INSTRUCTIONS` and `trace_flow` tool description updated to mention capabilities -- `"capabilities"` added to `JAVA_ENRICHED_COLUMNS` in `search_lancedb.py` -- Version guard in `KuzuGraph.get` raises on ontology mismatch -- Unit tests in `tests/test_ast_java_capabilities.py` 
cover all 9 plan scenarios -- `test_symbol_has_capabilities_column` regression guard added to `test_ast_graph_build.py` - ---- - -## Issues - -### Issue 1 — `CodeChunkHit` missing `capabilities` field (Hard miss) - -**File:** `server.py` - -`JAVA_ENRICHED_COLUMNS` in `search_lancedb.py` includes `"capabilities"` so the value is fetched from LanceDB, but `CodeChunkHit` has no `capabilities` field and `_rows_to_hits` never maps it. The plan explicitly requires: - -> Plumb `capabilities` through whatever Pydantic / dataclass models the search path uses to surface Java hits, so callers see them in results. - -**Fix needed:** Add `capabilities: list[str] = Field(default_factory=list)` to `CodeChunkHit`, and map it in `_rows_to_hits` via `_clean_str_list(r.get("capabilities"))`. - ---- - -### Issue 2 — `codebase_search` missing `capability` filter parameter (Hard miss) - -**File:** `server.py` - -The plan says: - -> In `codebase_search`, `find_*`, `list_by_role`, add an optional parameter `capability: str | None` that, when set, AND-filters results to those carrying that capability. (Implementation: post-filter on the returned `SymbolHit.capabilities` list — no Cypher change needed.) - -`list_by_role`, `find_implementors`, `find_subclasses`, and `list_by_annotation` all received the parameter. `codebase_search` did not. - -Note: for `codebase_search` the post-filter would operate on `CodeChunkHit.capabilities` (which also depends on Issue 1 being fixed first). - -**Fix needed:** Add `capability: str | None = Field(default=None, description="...")` to `codebase_search`; post-filter `hits` to `[h for h in hits if capability in h.capabilities]` when `capability` is set. - ---- - -### Issue 3 — `find_injectors` missing `capability` parameter (Hard miss) - -**File:** `server.py` - -The plan says "In `codebase_search`, `find_*`, …". `find_injectors` is a `find_*` tool and did not receive the parameter. The other two `find_*` tools (`find_implementors`, `find_subclasses`) did. 
- -For `find_injectors` the natural semantic is to filter on the injecting symbol (consumer): keep edges where `edge.src.capabilities` contains the requested capability. - -**Fix needed:** Add `capability: str | None = Field(default=None, …)` to `find_injectors`; post-filter `edges` to those where `capability in e.src.capabilities`. - ---- - -### Issue 4 — Kuzu capability-OR in `_run_seed_query` is effectively dead code (Design gap) - -**File:** `kuzu_queries.py` + `server.py` - -`_run_seed_query` (kuzu_queries.py) correctly adds: - -```python -f"(s.role IN $entry_roles OR {cap_predicates})" -``` - -However, in `server.py`'s `trace_flow`, the first pass already filters LanceDB results with `role_in=["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]`. Every FQN that arrives at Kuzu's seed query therefore already has a role in `_ENTRYPOINT_ROLES`, making the `OR cap_predicates` branch unreachable for any class with role `OTHER`. - -Concretely: a plain `Job` implementor (role `OTHER`, capability `SCHEDULED_TASK`) is excluded by the LanceDB role filter before the Kuzu check ever sees it. The plan's stated test case #4 ("returns the `MESSAGE_LISTENER` class as a stage-0 seed even when its primary role is `SERVICE`") does work because `SERVICE` is in `entry_roles`. But the broader intent — expanding seeding beyond role boundaries via capabilities — is not achieved. - -**Fix needed:** In `server.py`'s `trace_flow`, add a third LanceDB seed pass that searches without role restriction but filters on known entry-capability values (`MESSAGE_LISTENER`, `SCHEDULED_TASK`) using a LanceDB predicate on the `capabilities` column, then merges unique FQNs into the seed set before calling `graph.trace_flow`. - ---- - -### Issue 5 — `README.md` not updated (Plan requirement skipped) - -**File:** `README.md` - -The plan requires: - -> `README.md` — add a section "Capabilities" describing the multi-tag axis, the initial capability set, and `list_by_capability`. 
Keep the existing "Roles" section intact. - -No change was made to `README.md`. - ---- - -### Issue 6 — `CODEBASE_REQUIREMENTS.md` not updated (Plan requirement skipped) - -**File:** `CODEBASE_REQUIREMENTS.md` - -The plan requires: - -> `CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice and the deferred per-method storage (link to this plan). - -No change was made to `CODEBASE_REQUIREMENTS.md`. - ---- - -### Issue 7 — Missing blank line between `_SUPERTYPE_TO_CAPABILITY` and `_TYPE_KINDS` (Style nit) - -**File:** `ast_java.py`, line ~113 - -```python -_SUPERTYPE_TO_CAPABILITY: dict[str, str] = { - "Job": "SCHEDULED_TASK", -} -_TYPE_KINDS = { # <-- no blank line before this -``` - -Every other pair of top-level variables in the file is separated by a blank line. The missing line here was likely a merge artefact. - ---- - -## Priority Order for Fixes - -| # | Severity | File | Description | -|---|----------|------|-------------| -| 1 | High | `server.py` | `CodeChunkHit` missing `capabilities` field | -| 2 | High | `server.py` | `codebase_search` missing `capability` filter | -| 3 | High | `server.py` | `find_injectors` missing `capability` filter | -| 4 | Medium | `server.py` + `kuzu_queries.py` | `trace_flow` capability seeding is dead code for role=OTHER classes | -| 5 | Low | `README.md` | "Capabilities" section not written | -| 6 | Low | `CODEBASE_REQUIREMENTS.md` | Granularity note not added | -| 7 | Nit | `ast_java.py` | Missing blank line between two dict constants | diff --git a/reports/what-to-borrow-from-cmm.md b/reports/what-to-borrow-from-cmm.md deleted file mode 100644 index e2258de3..00000000 --- a/reports/what-to-borrow-from-cmm.md +++ /dev/null @@ -1,247 +0,0 @@ -# What to Borrow from Codebase-Memory MCP - -A focused, prioritized guide for evolving `java-codebase-rag` (AMA agent) by adopting proven patterns from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) (paper: 
[arXiv:2603.27277](https://arxiv.org/abs/2603.27277)) — without giving up your Spring-aware, hybrid (vector + graph) edge. - -> **Guiding principle.** CMM optimizes for *token efficiency at acceptable quality* across 66 languages. Your AMA agent optimizes for *answer quality on Spring/Java microservices* via hybrid retrieval. Borrow CMM's structural mechanics; keep your semantic / role-aware layer as the differentiator. - ---- - -## Snapshot — where each tool wins - -| Layer | Your AMA agent | Codebase-Memory MCP | Action | -|---|---|---|---| -| Java/Spring DI semantics | Strong (`@Autowired`, `@Inject`, Lombok, `@FeignClient`) | None | Keep yours | -| Vector / hybrid retrieval (LanceDB + RRF + `graph_expand`) | Yes | None | Keep yours | -| Role / capability ontology (`CONTROLLER`, `MESSAGE_LISTENER`, ...) | Yes | None | Keep yours | -| Microservice topology + brownfield overrides | Yes | Generic `Project` only | Keep yours | -| `CALLS` / `HTTP_CALLS` / `ASYNC_CALLS` resolution | Roadmap (Phase 3) | Shipped, mature | **Borrow** | -| `Route` as first-class node | Roadmap | Shipped | **Borrow** | -| Cross-repo / cross-service edges | Roadmap | Shipped (`pass_cross_repo`) | **Borrow** | -| Runtime trace ingestion | None | Shipped (`ingest_traces`) | **Borrow** | -| Git-diff impact + risk classification | Partial (`impact_analysis`) | Shipped (`detect_changes`) | **Borrow** | -| Layered ignore (`.gitignore` + project ignore) | Constant list | Layered (`.cbmignore`) | **Borrow** | -| Louvain community detection | None | Shipped | **Borrow (Phase 4)** | -| Dead-code detection | None | Shipped | **Borrow (Phase 4)** | -| 66-language tree-sitter grammars | Java only | Yes | Skip (off-strategy) | -| Single static binary distribution | Python venv | Yes | Skip until Phase 5+ | -| 3D graph UI | None | Yes | Skip | -| `get_architecture` mega-tool | Split into small tools | One bundled tool | Skip — keep yours | - ---- - -## Tier 1 — Borrow now (cheap, high impact) - -### B1. 
Confidence-scored CALLS resolution cascade - -CMM's [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) and [`extract_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_calls.c) resolve calls via a deterministic cascade. Adopt the **shape**, not the C code. - -**What to lift:** - -- A 4-strategy cascade with explicit confidence values: - 1. Import-map resolved (`0.95`) - 2. Same-module / same-package (`0.90`) - 3. Globally unique simple name (`0.75`) - 4. Suffix / fuzzy match (`0.55`) -- A `confidence` property on every `CALLS` edge so downstream tools (and the MCP agent) can filter (`WHERE c.confidence >= 0.8`). -- A `source` property: `"static"` vs `"trace"` vs `"di_proxy"`. - -**Why now:** Add the property when you create the Kuzu schema for Phase 3 — retrofitting columns later is painful. - -**Suggested Kuzu DDL:** - -```sql -CREATE REL TABLE CALLS ( - FROM Method TO Method, - confidence DOUBLE, -- 0.55 .. 1.0 - source STRING, -- 'static' | 'trace' | 'di_proxy' - strategy STRING, -- 'import_map' | 'same_module' | 'unique_name' | 'suffix' - call_site STRING -- file:line -); -``` - ---- - -### B2. `Route` as a first-class node - -CMM models REST endpoints and message channels as a single `Route` label so that *any* call site can attach to *any* endpoint via `HTTP_CALLS` / `ASYNC_CALLS`. See [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c). - -**What to lift:** - -- Adopt the **`Route`** label (instead of `RestEndpoint` from your current PRODUCT-VISION) — keeps you semantically interoperable if anyone runs both MCPs in parallel. -- Properties: `path`, `method`, `framework` (`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`), `broker` (for async), `service` (microservice name). 
-- Edges: - - `(Method)-[:EXPOSES]->(Route)` for `@RequestMapping`/`@KafkaListener` - - `(Method)-[:HTTP_CALLS]->(Route)` for `RestTemplate`/`WebClient`/`@FeignClient` - - `(Method)-[:ASYNC_CALLS]->(Route)` for `KafkaTemplate.send`/`StreamBridge.send` -- A normalization rule: `/api/users/{id}` and `/api/users/123` collapse to the same `Route` (path-template canonicalization). - ---- - -### B3. Runtime trace ingestion (`ingest_traces`) - -This is the single biggest quality lever you don't have yet. Static analysis misses Spring AOP proxies, polymorphic dispatch, reflection, and event-driven flows — runtime traces capture all of them. - -**What to lift:** - -- A new MCP tool `ingest_traces(spans: List[Span], source: str)`. -- Accept OpenTelemetry / Sleuth / Micrometer JSON natively. -- For each `(parent_span, child_span)` pair, emit `(caller:Method)-[:CALLS {source:"trace", confidence:1.0}]->(callee:Method)`. -- For HTTP client spans, emit `(caller)-[:HTTP_CALLS]->(Route)` using `http.url` + `http.method` to match an existing `Route` node. -- Deduplicate via `(source_id, target_id, source)` so re-ingesting traces is idempotent. - -**Why this matters:** Lifts Phase 3 from "static approximation" to "ground-truth where traces exist, static elsewhere" — and the agent can prefer `confidence:1.0` edges automatically. - ---- - -### B4. Git-diff impact mapping with risk score - -CMM's [`detect_changes`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) maps a diff to affected symbols and a blast radius. You already have `impact_analysis` — make it diff-driven and add risk classification. - -**What to lift:** - -- New MCP tool `analyze_pr(diff: str | git_ref: str)`: - 1. Parse `git diff` line ranges per file - 2. Map line ranges → chunks → graph nodes (functions/methods) - 3. Run your existing reverse closure - 4. 
Return `{ changed_nodes, blast_radius, risk_score, risk_level }` -- Risk formula (start simple, tune later): - -``` -risk = log10(1 + downstream_consumers) * role_weight * cross_service_factor - -role_weight = { CONTROLLER:1.5, SERVICE:1.2, REPOSITORY:1.0, CONFIG:1.8, ENTITY:1.3, ... } -cross_service_factor = 1.0 if changes only touch one microservice, 2.0 otherwise -risk_level = "low" (<1.0), "medium" (1.0..2.5), "high" (>2.5) -``` - -- Output usable directly in PR review or CI gating. - ---- - -### B5. Layered ignore patterns - -CMM uses **hardcoded patterns → `.gitignore` hierarchy → `.cbmignore`** ([`discover/`](https://github.com/DeusData/codebase-memory-mcp/tree/master/src/discover)). Cleaner than your current `COMMON_EXCLUDED_PATH_PATTERNS` constant. - -**What to lift:** - -- Layer order: - 1. Hardcoded must-skip (`.git`, `node_modules`, `target`, `build`, `out`, `.idea`, `.gradle`, `bin`) - 2. Walk up `.gitignore` files from each indexed directory - 3. Project-level `.lancedb-mcp.yml`'s `ignore:` list - 4. NEW: optional `.lancedb-mcp-ignore` file with gitignore syntax -- Always skip symlinks (cycle protection). -- Reuse `pathspec` (Python) — it's the gitignore-spec-compliant matcher. - ---- - -## Tier 2 — Borrow during Phase 2 / 3 - -### B6. Cross-repo / cross-service edges - -CMM's [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) matches an `HTTP_CALLS` edge in service A to a `Route` node in service B and creates a `CROSS_HTTP_CALLS` edge. This is the killer feature for a multi-microservice AMA. - -**What to lift:** - -- After per-service indexing, run a global pass: - - For each `HTTP_CALLS` edge with `path` + `method`, find the matching `Route` node in any other indexed service. - - Emit `(callerMethod)-[:HTTP_CALLS]->(Route)<-[:EXPOSES]-(calleeMethod)` (the same edge names as in B2) so traversal in either direction works. 
-- Same for async: match `topic`/`queue` strings in `KafkaTemplate.send` calls to `@KafkaListener` `Route` nodes. -- Path template matching: `GET /api/orders/{id}` matches a call to `GET /api/orders/123` — use a `path_pattern` regex stored on the `Route`. - -**Killer query unlocked:** *"What breaks if I rename `POST /api/orders` in `order-service`?"* → traverse `Route` → cross-service `HTTP_CALLS` → caller methods → reverse closure → affected controllers in `checkout-service`. - ---- - -### B7. Louvain community detection - -CMM runs Louvain over `CALLS` to discover functional modules. Useful for onboarding and architecture pitches. - -**What to lift:** - -- After Phase 3 `CALLS` lands, run Louvain on the call subgraph (use `python-igraph` or NetworkX's `louvain_communities`). -- Store `cluster_id` and `cluster_size` as `Method` properties. -- New MCP tool `find_module_clusters(min_size: int)` returning ranked clusters with their dominant role mix and entry methods. -- Bonus: weight edges by call frequency from traces (B3) for higher-quality partitions. - ---- - -### B8. Dead-code detection - -Trivial once `CALLS` exists, but valuable for cleanup and consulting deliverables. - -**What to lift:** - -- New MCP tool `find_dead_code(exclude_entry_points: bool = True)`. -- Definition: `Method` with zero incoming `CALLS` and no outgoing `EXPOSES` edge (it exposes no `Route`). 
-- Entry-point predicates to exclude: - - Spring stereotypes that auto-invoke: `@Scheduled`, `@PostConstruct`, `@EventListener`, `@KafkaListener`, `@RabbitListener`, `@JmsListener` - - HTTP entry points: any method with an `EXPOSES` edge - - Test methods: `@Test`, `@ParameterizedTest`, lifecycle annotations - - `public static void main(String[])` -- Cypher (one query): - -```cypher -MATCH (m:Method) -WHERE NOT (m)<-[:CALLS]-() - AND NOT (m)-[:EXPOSES]->() - AND NOT m.is_entry_point -RETURN m.qualified_name, m.role, m.file, m.line -ORDER BY m.role, m.qualified_name -``` - ---- - -## Tier 3 — Borrow later or skip - -### Borrow only if you go poly-language (Phase 5+) - -- **B9. Multi-grammar indexing.** CMM ships 66 grammars vendored. Adopt only if you sell to non-Java SMBs. -- **B10. Static binary distribution.** Compelling for SMB clients ("download → run"). Not relevant while you're a Python venv. - -### Skip (don't fit your strategy) - -- **`get_architecture` mega-tool.** Your split tools (`graph_meta`, `list_by_role`, `list_by_capability`) are more agent-friendly because each is named and small. The agent picks better when tool intent is narrow. -- **3D graph UI.** Not the differentiator. If you need visualization, render Kuzu subgraphs to Mermaid or Graphviz on demand from a tool — far less code, embeds in chat. -- **Their ADR module.** Markdown folder + your existing search is enough. Adding ADR CRUD is scope creep. -- **CMM's mini-Cypher executor.** You already have Kuzu — strictly more capable. 
- ---- - -## Suggested roadmap reorder - -A revised ordering that front-loads borrowed pieces with the highest ROI: - -| Phase | Goal | Borrowed items | -|---|---|---| -| **2** (now) | `Route` nodes + `HTTP_CALLS` / `ASYNC_CALLS` from Spring/Feign/Kafka, with `confidence` columns | B2 | -| **2.5** | `ingest_traces` MCP tool (cheap, huge quality lift) | B3 | -| **3** | Static `CALLS` with 4-strategy cascade; `find_callers` / `find_callees`; dead code | B1, B8 | -| **3.5** | `pass_cross_repo`-style cross-service edges | B6 | -| **4** | `analyze_pr` (diff → impact + risk); Louvain clusters | B4, B7 | -| **5** | Eval harness; head-to-head benchmark vs. CMM on Java repos | — | -| **5+** | Optional poly-language grammars; static-binary packaging | B9, B10 | - -Layered ignores (B5) can land anywhere — drop it in alongside the next indexer change. - ---- - -## Strategic notes - -- **Run both MCPs in parallel as a zero-integration option.** `.mcp.json` supports many servers. Let your tool answer Java/architectural queries; CMM handles non-Java or generic structural queries when you eventually touch poly-glot codebases. Zero integration cost, maximum optionality. -- **Use the comparison itself as a portfolio asset.** When you start pitching SMB clients on AI automation, "I built a Spring-aware hybrid retrieval system that beats the published Codebase-Memory baseline on Java microservice questions" — with numbers from your Phase 5 eval harness — is a credible artifact. Few consultants can show that. -- **Don't fork CMM.** It's MIT-licensed C with vendored grammars; maintenance cost is high and the code style diverges from your Python stack. Read it as documentation, port the patterns. 
- ---- - -## References - -- Codebase-Memory MCP source — [github.com/DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) -- Paper — [Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP (arXiv:2603.27277)](https://arxiv.org/abs/2603.27277) -- Your repo — [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) -- Key CMM files referenced above: - - [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) — call resolution - - [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c) — route nodes - - [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) — cross-service edges - - [`pass_gitdiff.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) — git diff impact - - [`extract_channels.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_channels.c) — async patterns - - [`service_patterns.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/service_patterns.c) — framework markers From 63df99ca6a988ddbcf45190f3937dbf4ce52ceb7 Mon Sep 17 00:00:00 2001 From: Dmitry Teryaev Date: Sun, 24 May 2026 13:05:20 +0300 Subject: [PATCH 2/3] fix: update references and restore deleted reports MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Move reports/ to docs/reports/ to preserve technical content - Update all references to CODEBASE_REQUIREMENTS.md → docs/CODEBASE_REQUIREMENTS.md - Update all references to PRODUCT-VISION.md → docs/PRODUCT-VISION.md - Update reports/ references to docs/reports/ - Fix broken links in README.md, AGENTS.md, and 21 proposal/plan files The deleted reports contained valuable technical content: - what-to-borrow-from-cmm.md: borrowing patterns guide from Codebase-Memory 
MCP - call-graph-review.md: code review with bugs and design invariants - PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md: design issues and gaps Co-Authored-By: Claude Opus 4.7 --- AGENTS.md | 4 +- README.md | 6 +- docs/JAVA-CODEBASE-RAG-CLI.md | 2 +- docs/reports/call-graph-review.md | 364 ++++++++++++++++++ ...BROWNFIELD-ROLE-OVERRIDES-design-issues.md | 62 +++ docs/reports/what-to-borrow-from-cmm.md | 247 ++++++++++++ plans/completed/AGENT-PROMPTS-MCP-API-V2.md | 8 +- plans/completed/AGENT-PROMPTS-TIER1B.md | 6 +- ...-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md | 2 +- .../PLAN-BROWNFIELD-ROLE-OVERRIDES.md | 2 +- plans/completed/PLAN-CALL-GRAPH.md | 4 +- plans/completed/PLAN-CAPABILITIES-MODEL.md | 2 +- plans/completed/PLAN-CLI-SCENARIOS.md | 6 +- plans/completed/PLAN-CLIENT-ROLE-RENAME.md | 4 +- .../completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md | 6 +- plans/completed/PLAN-MCP-API-V2.md | 4 +- .../completed/PLAN-REMOTE-PROJECT-INDEXING.md | 4 +- plans/completed/PLAN-TIER1B-COMPLETION.md | 4 +- propose/completed/CALL-GRAPH-PROPOSE.md | 4 +- propose/completed/CLI-SCENARIOS-PROPOSE.md | 6 +- .../completed/CLIENT-ROLE-RENAME-PROPOSE.md | 4 +- propose/completed/TIER1-COMPLETION-PROPOSE.md | 14 +- .../TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md | 10 +- .../TIER2-INCREMENTAL-REBUILD-PROPOSE.md | 2 +- 24 files changed, 725 insertions(+), 52 deletions(-) create mode 100644 docs/reports/call-graph-review.md create mode 100644 docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md create mode 100644 docs/reports/what-to-borrow-from-cmm.md diff --git a/AGENTS.md b/AGENTS.md index 1c5379b9..77384229 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -39,7 +39,7 @@ when needed. operator guide for the `java-codebase-rag` CLI (`init` / `increment` / `reprocess` / `erase`, `meta`, `tables`, `diagnose-ignore`, `analyze-pr`; hidden `refresh` alias → `reprocess` — see that doc). 
-- `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of +- `docs/CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of what to edit when a target tree doesn't match defaults. - `tests/README.md` — testing philosophy. - **`propose/`** — design proposes. **In-flight** proposes are **`*.md` @@ -112,7 +112,7 @@ For any non-trivial change, read the relevant doc first instead of inferring from code: - Behaviour / public surface → `README.md`. -- Brownfield assumptions, role/capability tuning → `CODEBASE_REQUIREMENTS.md`. +- Brownfield assumptions, role/capability tuning → `docs/CODEBASE_REQUIREMENTS.md`. - In-flight design proposes → **`propose/*.md` at the root of `propose/`** (not under `propose/completed/`). **List or search** for current names. - Why current design exists → `propose/completed/` and `plans/completed/`. diff --git a/README.md b/README.md index 0eb1174c..57c333c8 100644 --- a/README.md +++ b/README.md @@ -128,7 +128,7 @@ The operator-facing surface is small: pick an index dir, pick an embedding model | Understand the graph (nodes, edges, capabilities, ranking) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §3 | | Steer a brownfield Java tree (custom stereotypes, non-Spring stacks) | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §4 | | Control which files the indexer walks | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) §5 | -| Check whether your repo fits this tool's assumptions | [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) | +| Check whether your repo fits this tool's assumptions | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | --- @@ -158,9 +158,9 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. 
| | [`docs/skills/java-codebase-explore.md`](./docs/skills/java-codebase-explore.md) | Agent exploration skill (strategy, missions, fallbacks); packaged zip [`docs/skills/java-codebase-explore.zip`](./docs/skills/java-codebase-explore.zip) for Perplexity-style hosts. | | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. | -| [`CODEBASE_REQUIREMENTS.md`](./CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | +| [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | | [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). | -| [`propose/PRODUCT-VISION.md`](./propose/PRODUCT-VISION.md) | Long-term product direction. | +| [`docs/PRODUCT-VISION.md`](./docs/PRODUCT-VISION.md) | Long-term product direction. | --- diff --git a/docs/JAVA-CODEBASE-RAG-CLI.md b/docs/JAVA-CODEBASE-RAG-CLI.md index 9c5655d1..80a971be 100644 --- a/docs/JAVA-CODEBASE-RAG-CLI.md +++ b/docs/JAVA-CODEBASE-RAG-CLI.md @@ -226,5 +226,5 @@ Prefer **`java-codebase-rag reprocess --graph-only`** when you only need Kuzu re ## See also - [README.md](../README.md) — env vars, MCP tool table, ignore layout. -- [CODEBASE_REQUIREMENTS.md](../CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild. +- [CODEBASE_REQUIREMENTS.md](./CODEBASE_REQUIREMENTS.md) — repo layout, brownfield, when to rebuild. - [MANUAL-VERIFICATION-CHECKLIST.md](./MANUAL-VERIFICATION-CHECKLIST.md) — phased checks that mix CLI + MCP. 
diff --git a/docs/reports/call-graph-review.md b/docs/reports/call-graph-review.md new file mode 100644 index 00000000..e6ed718e --- /dev/null +++ b/docs/reports/call-graph-review.md @@ -0,0 +1,364 @@ +# Call Graph Layer — Code Review + +**Repository:** [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) +**Commits reviewed:** +- `b3a15d8` — *call graph layer propose* +- `fb5473f` — *call graph layer implementation* + +**Reference docs:** +- [`propose/completed/CALL-GRAPH-PROPOSE.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/propose/completed/CALL-GRAPH-PROPOSE.md) +- [`plans/completed/PLAN-CALL-GRAPH.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/plans/completed/PLAN-CALL-GRAPH.md) + +**Test status:** all 24 new call-graph tests pass locally +(`tests/test_ast_java_calls.py`, `tests/test_call_graph_smoke_roundtrip.py`, +`tests/test_call_graph_receiver_resolution.py`). + +--- + +## Overall verdict + +**Strong, faithfully-scoped implementation.** The proposal is realised as +written, the receiver-type resolver is well-structured, the schema and edge +metadata match the design (confidence + strategy + source), and the test +coverage targets concrete proposal section numbers. Scope discipline is +visible — no creep into HTTP / async / AOP / traces. + +There are **three correctness bugs** that should land as a quick follow-up +before Phase 3 is closed, plus a handful of design issues worth pushing back +on. All three bugs share one root cause: **resolution strategy and +confidence are silently downgraded at edge-emit time when the receiver was +already resolved successfully.** + +--- + +## What's done well + +- **Confidence + strategy tagging is faithful to the design.** Every edge + carries (`confidence`, `strategy`, `source='static'`) — clean migration + path for trace ingestion later. 
+- **Multigraph dedup at write time** (`(src_id, dst_id, arg_count, line)`) + is correctly shaped: prevents accidental duplication while preserving + overload-ambiguous fan-out at distinct call sites. +- **Receiver-type resolver** is clear and matches the proposal: scope table + built once per method, supertype-bounded lookup, explicit + `chained_receiver` phantom path, deterministic phantom IDs. +- **Receiver-disambiguation discipline.** `_unique_type_simple_resolve` + deliberately uses the *type* registry (not a per-method simple-name + index). The dedicated test + `test_receiver_disambiguation_uses_type_index_not_method_unique` is + exactly the right kind of negative test — this is the precise trap + CMM-style cascades fall into and the implementation avoids it. +- **`_method_ids_for_call_graph_needle`** elegantly accepts type FQN, + method FQN, or simple method name; fan-out through `DECLARES` from a + type needle is the right move and matches §6.1. +- **`exclude_external` is filter-on-result, not filter-on-store.** Phantoms + stay in the graph (so impact analysis can see JDK-adjacent signals), but + query consumers get clean lists by default. Matches risk #2 mitigation + in the proposal. +- **Tests target proposal section numbers.** 24 tests, all passing, + including a Kuzu round-trip on a real fixture project. The shadowing + test (`test_local_shadows_field_same_name_resolves_receiver`) is the + kind of edge case that bites in real codebases. +- **Diagnostics are baked in** — `pass3_calls` prints the chained-phantom + percentage as the proposal mandates. + +--- + +## Bugs (must fix) + +### B1. Constructor calls always become phantoms when the class has no explicit constructor + +**Severity: high — most common Java call site is broken.** + +`new Svc()` in `ScopeReceivers.byLocal()` resolves the receiver type to +`smoke.Svc` correctly. 
But `Svc` has no explicit constructor in source, so +`_parse_method` is never invoked for an `<init>`, and no constructor +`MemberEntry` is created. `_lookup_method_candidates(type='smoke.Svc', +callee='<init>', argc=0)` finds nothing → fallthrough to phantom at +`confidence=0.0`. + +Confirmed empirically against the smoke fixture: + +``` +['smoke.ScopeReceivers#byLocal()', 'smoke.Svc#<init>(0)', 'phantom', False, 0.0] +['smoke.ScopeReceivers#shadowLocalOverField()', 'smoke.Svc#<init>(0)', 'phantom', False, 0.0] +``` + +In a real Spring codebase, **every** `new MyDto()`, `new HashMap<>()`, +`new ArrayList<>()` on a project type without a hand-written constructor +lands as a phantom. + +**Fix.** When parsing a `TypeDecl` and discovering no constructor +declaration, synthesize a default +`MethodDecl(name="<init>", signature="()", is_constructor=True, ...)` +with `start_line` / `start_byte` from the type declaration and +`parameters=[]`. Make sure it gets a `MemberEntry`. + +Two corollary checks: + +- `_emit_call_edge` for `new Svc()` should then resolve to the synthesized + member with `strategy='constructor'` (not `phantom`), `confidence` + inherited from the receiver-resolution tier. +- Confirm existing `INJECTS` / `DECLARES` accounting doesn't double-count + the synthesized node. + +**Suggested test** — add to `tests/test_call_graph_smoke_roundtrip.py` +(`test_implicit_default_ctor_is_resolved`): + +```java +public class HasNoCtor {} +public class Caller { void m() { new HasNoCtor(); } } +``` + +Assert: `(Caller#m)-[CALLS {strategy:'constructor', resolved:true}]->(HasNoCtor#<init>())`. + +--- + +### B2. Implicit `super()` for a class that doesn't extend anything is mis-tagged as `phantom` + +**Severity: medium — diagnostic regression, not a wrong answer.** + +`WildUtils` has an explicit `private WildUtils() {}` constructor with no +`super(...)` body, so the AST extractor synthesizes the implicit-super +call site.
`_first_supertype_fqn` returns `None` (no `EXTENDS` row → +there is no `Object` node in the index), so `_resolve_receiver_type` +returns `(None, "phantom", 0.0)`. Result: + +``` +['smoke.WildUtils#WildUtils()', '?super#<init>(0)', 'phantom', False, 0.0] +``` + +The proposal §4.2 promises strategy `implicit_super (0.90)` for this case. +Right now the agent cannot distinguish "implicit super to `Object`" from +"I have no idea what this call resolved to" — real signal loss. + +**Fix.** In `_resolve_receiver_type`, when `expr == 'super'` and +`_first_supertype_fqn(...) is None`, return +`("java.lang.Object", "implicit_super", 0.90)`. In `_emit_call_edge`, +allow phantom callee (no member resolved on `Object`) but **preserve +`strategy='implicit_super'` and `confidence=0.90`** instead of overriding +to `phantom` / `0.0`. This is the same fix-shape as B3 below. + +--- + +### B3. Resolution strategy and confidence are silently overridden to `phantom` / `0.0` when the callee can't be located on a resolved external receiver + +**Severity: high — collapses static-import precision when callees are JDK / Spring.** + +In `_resolve_and_emit_call`: + +```python +if not candidates: + pid = _phantom_method_id(...) + _emit_call_edge(..., confidence=0.0, strategy="phantom", resolved=False) + return +``` + +This branch fires whenever the receiver type *did* resolve (e.g. +`java.util.Objects` via `static_import`, confidence 0.95) but the callee +method isn't on a type we indexed. The static-import smoke test confirms it: + +``` +requireNonNull edges: 1 + phantom 0.0 False java.util.Objects#requireNonNull(1) +``` + +The README and the MCP instructions both tell agents to use +`min_confidence=0.9` to filter noise. Under that filter, **every JDK +static-import call disappears from the graph**, even though the resolver +*knew* the call's target type with 0.95 confidence. + +**Fix.** Decouple the *receiver-resolution strategy/confidence* from the +*callee-found* boolean.
When `candidates` is empty: + +- Keep the phantom callee (creating it on the resolved receiver type — + already done). +- Keep `resolved=False` on the edge (the *callee node* is a phantom). +- **Preserve the receiver-resolution `strat` and `conf`** unless they're + `'chained_receiver'`. Specifically: `strategy` stays `'static_import'` / + `'static_import_wildcard'` / `'import_map'` / `'same_module'` etc.; + `confidence` stays the receiver-tier value. + +The only case where `confidence=0.0, strategy='phantom'` is honest is when +the receiver itself was unresolvable. Distinguishing those two failure +modes is the whole point of the cascade. + +Optional: add a small property `callee_found BOOLEAN` on the edge so a +query like *"high-confidence edges with phantom callees"* (= calls into +well-known external libraries) becomes one Cypher predicate. + +**Suggested tests:** + +- `test_static_import_to_jdk_keeps_high_confidence` — `requireNonNull` + edge has `confidence>=0.95` and `strategy='static_import'`, with + `resolved=False` on the edge. +- `test_min_confidence_filter_keeps_high_confidence_static_import_callers` + — `find_callers('java.util.Objects#requireNonNull(1)', min_confidence=0.9)` + returns the in-repo caller. + +--- + +## Design issues (push back on the proposal here, not just the implementation) + +### D1. Phantom-ID `arg_count` semantics are inconsistent across method-references and regular calls + +`_phantom_method_id` builds the FQN as `{receiver}#{callee}({arg_count})`. +For method references the `arg_count` is `-1`. So the same external method +can exist as both `Foo#bar(2)` and `Foo#bar(-1)` phantom nodes — distinct +nodes for the same logical target. The dedup key +`(src_id, dst_id, arg_count, line)` then keeps both edges, doubling the +graph for code that mixes calls and method references on the same target. 
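
To make the duplication in D1 concrete, the Python sketch below rebuilds phantom IDs in the quoted `{receiver}#{callee}({arg_count})` shape. `phantom_method_id` is a stand-in written for illustration, not the repository's `_phantom_method_id`, and the caller ID and line numbers are invented.

```python
# Illustrative stand-in for the ID format quoted in the review:
# {receiver}#{callee}({arg_count}). Not the project's actual helper.
def phantom_method_id(receiver: str, callee: str, arg_count: int) -> str:
    return f"{receiver}#{callee}({arg_count})"


# The review states that method references carry arg_count == -1.
METHOD_REF_ARITY = -1

call_site = phantom_method_id("Foo", "bar", 2)
method_ref = phantom_method_id("Foo", "bar", METHOD_REF_ARITY)

# Edges keyed the way the review describes the dedup key:
# (src_id, dst_id, arg_count, line). Caller ID and lines are hypothetical.
edges = {
    ("Caller#m()", call_site, 2, 10),
    ("Caller#m()", method_ref, METHOD_REF_ARITY, 11),
}

# One logical target, but two distinct phantom nodes survive dedup.
phantom_nodes = {dst_id for _, dst_id, _, _ in edges}
```

Because the arity is baked into the destination ID, no choice of dedup key on the edge can merge the two nodes; the ID itself has to be normalized.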
+ +**Recommendation.** Either normalize phantom IDs without `arg_count` for +method references (`?{recv}#{callee}(?)`) or drop `arg_count` from the +dedup key and use `(src_id, dst_id, line, byte)` (line+byte already pin a +unique call site). + +--- + +### D2. Method-reference precision is leaving free wins on the table + +Method references that *are* unambiguous on name (single method, no +overloads) currently still emit with `arg_count=-1`. Cheap precision win, +no extra resolver complexity: when the receiver type is known and exactly +one method with `name == callee_simple` exists on the receiver type, pick +that single-arity match and emit a fully-resolved edge with the receiver's +real arity instead of `-1`. + +--- + +### D3. Anonymous-inner-class call attribution does the proposal-correct thing, but the design is questionable + +Right now `pingFromAnon()` (called from inside +`new Runnable() { run() { pingFromAnon(); } }`) is attributed to +**`NestedCalls#m()`**, the enclosing named method, with +`strategy='this_super'`. That matches §4.1's wording. + +But: the anonymous `Runnable` *does* get parsed as a nested type in +`_parse_type` (kind `class`). It produces a `MemberEntry` for its +`run()` method. So the graph has two contradictory facts: the call edge +goes from `NestedCalls#m`, and the structural fact "there exists a +`run()` method here" lives on a separate, disconnected anonymous type +node. + +**Recommendation.** Re-attribute calls inside an anonymous-class body to +the anonymous-class member. The named-enclosing fallback is only needed +for **lambdas** (which don't synthesize a member) and static / instance +initializers. For anonymous classes, the call-site naturally belongs to +the anonymous member. This makes +`find_callers('OperatorAssignedProcessor.onOperatorAssigned')` find the +anonymous handler that actually contains the call, instead of the outer +service method. + +--- + +### D4. 
`expand_methods` discards confidence on the way out + +The output is `list[str]` of type FQNs. There's no way for the search-side +fusion in `_graph_expand_merge` to weight a CALLS-derived hit lower than +a structural one. The proposal §6.2 says "merged via existing RRF, no new +caller-visible parameters" — so RRF treats every reach equally regardless +of whether it came from a 0.95 import-map edge or a 0.55 suffix edge. + +**Recommendation (small).** Have `expand_methods` return +`list[tuple[str, float]]` (type FQN + max confidence on the discovery +path), and let `_graph_expand_merge` pass that as the RRF rank weight. +Internal-only signature change; no MCP surface change. + +--- + +### D5. `trace_flow`'s default change quietly rebudgets stage capacity across two qualitatively different edge sources + +`follow_calls=True` is the new default. Existing agent prompts that +expected type-only stages now get extra entries with +`via.edge_type='CALLS'`. That's good — agents can infer it. But the +per-stage cap (`stage_limit`) now budgets across both edge classes, so a +high-fan-out service can starve INJECTS results in favor of CALLS results. + +**Recommendation.** Either: + +1. Keep separate budgets (`stage_limit_structural`, `stage_limit_calls`, + default to `stage_limit` each), or +2. Order ingestion to prefer INJECTS / EXTENDS / IMPLEMENTS first, then + top up with CALLS until `stage_limit`. The current code already runs + the structural query first — just keep the CALLS top-up bounded by + `stage_limit - len(stage_results)` instead of a separate + `stage_limit * 4` LIMIT. + +--- + +### D6. `_resolve_this_super_field_chain` lacks fixture coverage + +The resolver line +`chain = _resolve_this_super_field_chain(expr, member=member, ast=ast, tables=tables)` +is a real bonus over what CMM does — if it walks +`this.fieldA.fieldB.fieldC.method()` correctly. Add a smoke fixture that +exercises it; none of the existing files do. 
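
As a rough picture of what a D6 fixture would have to exercise, here is a minimal Python sketch of a field-chain walk. It assumes a lookup table from `(owner type, field name)` to field type; the function name, the table shape, and the `app.*` types are all hypothetical and say nothing about how `_resolve_this_super_field_chain` is actually written.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup: (declaring type FQN, field name) -> field type FQN.
FieldTypes = Dict[Tuple[str, str], str]


def resolve_field_chain(expr: str, owner_type: str,
                        field_types: FieldTypes) -> Optional[str]:
    """Walk `this.a.b.c` one field at a time; return the final type or None."""
    parts = expr.split(".")
    if parts[0] != "this":
        return None  # this sketch only handles chains rooted at `this`
    current = owner_type
    for field_name in parts[1:]:
        next_type = field_types.get((current, field_name))
        if next_type is None:
            return None  # an unknown hop breaks the chain
        current = next_type
    return current


# Fixture-shaped data: A.fieldA -> B, B.fieldB -> C, C.fieldC -> D.
FIELDS: FieldTypes = {
    ("app.A", "fieldA"): "app.B",
    ("app.B", "fieldB"): "app.C",
    ("app.C", "fieldC"): "app.D",
}

resolved = resolve_field_chain("this.fieldA.fieldB.fieldC", "app.A", FIELDS)
broken = resolve_field_chain("this.fieldA.missing", "app.A", FIELDS)
```

A real fixture would want at least these two shapes: a three-hop chain that resolves, and a chain with one unknown hop that must not resolve to a guessed type.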
+ +--- + +## Smaller nits + +- **N1 — Per-call rebuild of `_scope_table`.** `_resolve_and_emit_call` + calls `_scope_table(member, ast, tables)` on every call site. + Field / parameter scope is identical for every call inside a single + method body — locals only grow as you step through the body. Build it + once per `member` in `_resolve_method_calls` and pass it in. On a + 5-microservice corpus this is the kind of constant-factor that doubles + `pass3_calls` runtime. +- **N2 — `_lookup_method_candidates`'s `name_only` fallback rule is good, + but the strategy logic in `_resolve_and_emit_call` is intricate.** + The branch + `elif name_only_fb and len(candidates) == 1: edge_strat = strat` is + correct but easy to misread — the inline comment is good; consider + promoting it to a docstring section. +- **N3 — `is_static_call` heuristic.** `_infer_static_method_invocation` + returns `True` when the receiver starts with an uppercase identifier. + For `var Foo = supplier.get();` followed by `Foo.bar()` this + misclassifies. Rare in practice, but worth a TODO; conservative fix is + to consult the scope table (if `Foo` is in scope as a variable, it's + not a static call). +- **N4 — Ontology guard.** `ONTOLOGY_VERSION` 3 → 4 is set, but confirm + `KuzuGraph.get` actually raises on `GraphMeta.ontology_version` + mismatch at read time so a stale graph fails loudly (proposal §5.3). +- **N5 — `pass3_calls` diagnostics.** The log line reports + chained-phantom % only. Add the `phantom_other` ratio (the bigger one + in real codebases) so you can spot B1 / B3 regressions in the log + immediately. +- **N6 — Method reference inside lambda.** `visit` sets + `lam=lam or chained` for method references with a chained qualifier. + That conflates "I'm in a lambda" with "this method ref is itself + chained." `chained` should propagate as a separate flag, not as + `in_lambda`. 
+- **N7 — Empty `expr` and `is_static_call=False` branch.** The condition + `expr in ("", "this") or (not expr and call.is_static_call is False + and not call.receiver_expr)` is redundant: if `expr == ""` the second + clause is also true. Simplify to `expr in ("", "this")`. + +--- + +## Suggested fix order + +1. **B1, B2, B3 as one PR** titled + *"call graph: faithful confidence preservation across the resolver→writer boundary"* + — the three bugs share one architectural fix (don't downgrade + strategy / confidence at edge-emit time when the receiver was + resolved). Add the suggested tests in the same PR. +2. **D5 as a separate PR** — `trace_flow` budget split with a regression + test that seeds a service whose CALLS fan-out exceeds the structural + one. +3. **D3 (anon-class re-attribution), D4 (`expand_methods` confidence), + N1 (scope-table caching) as a small follow-up** before opening the + next phase. + +--- + +## Closing note + +This is solid Phase-3 work. Land the three bug fixes and the codebase is +in an excellent spot to start on the next phase — either cross-service +`HTTP_CALLS` (B6 / B7 in +[`what-to-borrow-from-cmm.md`](https://github.com/HumanBean17/java-codebase-rag/blob/master/tmp/what-to-borrow-from-cmm.md)) +or runtime-trace ingestion (B3 from the same doc). Both will lean on the +resolver and confidence machinery just built; the bug fixes above make +that lean trustworthy. diff --git a/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md new file mode 100644 index 00000000..83a99cae --- /dev/null +++ b/docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md @@ -0,0 +1,62 @@ +# Design issues: PLAN-BROWNFIELD-ROLE-OVERRIDES (plan / specification) + +**Plan file:** `plans/todo/PLAN-BROWNFIELD-ROLE-OVERRIDES.md` +**Review date:** 2026-04-26 +**Scope:** Problems, ambiguities, or gaps in the *written plan* (not the codebase). 
+ +--- + +## 1. Dual pipeline for meta-annotation data (spec gap) + +The plan describes building Layer A (meta-annotation reachability) from a two-pass process anchored in `build_ast_graph.py` and `GraphTables`. The chunk-enrichment / Lance path must also apply the same resolution rules, but the plan does **not** require a single shared primitive for “which `@interface` definitions exist in the project.” + +A careful reader can infer that graph build and index enrichment should agree, but two independent implementations (graph tables vs. a separate tree walk) are **not** ruled out. If file coverage, exclude patterns, or parse-failure handling differ, Lance and Kuzu can **disagree** on `meta_chain` for the same type. The plan would be stronger with an explicit constraint: e.g. “meta maps MUST be derived from the same file set and exclusion rules as `build_ast_graph` pass1,” or “Lance and Kuzu MUST share one builder function.” + +--- + +## 2. Depth cap for meta-annotation resolution is under-specified + +The plan gives a sketch of `_resolve_meta_chain` with `len(seen) > 4` and cycle handling. As written, the `seen` set is used both for **cycle** detection and as a stand-in for **path depth**. On a *linear* chain of meta-annotations, set size tracks depth. On **branching** shapes, set cardinality and “steps from root” diverge, so the sketch does not define a single clear semantics (strict path depth vs. global visit count). + +The follow-up test (“six wrappers → `OTHER`”) depends on a precise cap. The plan should name the exact metric (e.g. maximum path length from the start simple name) and the integer bound, so implementers and tests are aligned. + +--- + +## 3. Pre-flight test 9 mixes “unit” and “integration” scope + +The pre-flight item asks for a “unit-style” regression but specifies: build a **fresh** Lance index with FQN overrides, **query the table directly**, and then run **`codebase_search(..., capability=...)`** end to end. 
That is a **multi-layer** test (indexer + storage + search API) and is expensive to run and to keep stable in CI. + +A tiered requirement would match intent better: (1) schema / `JavaLanceChunk` field, (2) `process_java_file` row, (3) optional full search. As written, teams may either skip the heavy part or over-invest in flaky integration for what is mainly a **write-path** contract. + +--- + +## 4. “Precedence” vs. “execution order” is correct but error-prone to skim + +The plan is internally consistent: execution order is the *reverse* of listed priority, and guards use the **current** `role` after each step. Still, a reader who only scans the “Precedence summary (final)” table may implement **C before FQN** in the wrong direction or mis-order **B vs. A** without reading the “Execution order in code (REQUIRED)” block. + +This is a **documentation hazard** in the spec, not a logic error. A short, single bullet at the top (“Apply steps in *only* the order: …; do not reorder”) or a Mermaid sequence diagram would reduce mis-implementation. + +--- + +## 5. Layer A duplicate `@interface` simple names + +The plan correctly specifies first-seen-wins and a stderr warning. The **implication** (colliding simple names in different packages map to one `meta_chain` entry) is only obvious if you already know Java’s annotation resolution limits in this indexer. A one-line “Limitation:” callout in the plan would set expectations for monorepos with same-named annotations. + +--- + +## 6. Rollout vs. single document + +The plan says three independent PRs (Phase 1 → 2 → 3) while also presenting all phases in one file. That is fine for a complete picture, but the **merge strategy** (squashed single PR vs. three) is a process choice the plan does not need to fix—only note that “shippable phases” and “one landing” can conflict in review scope unless branches are cut accordingly. 
+ +--- + +## Summary + +| ID | Topic | Severity (spec) | +|----|------------------------------|-----------------| +| 1 | Single source of truth for meta map inputs | High (consistency) | +| 2 | Depth / cycle semantics | Medium | +| 3 | Pre-flight test cost / tiers | Low–medium | +| 4 | Precedence skimming hazard | Low | +| 5 | Duplicate simple-name limits | Low | +| 6 | Multi-PR vs one doc | Process only | diff --git a/docs/reports/what-to-borrow-from-cmm.md b/docs/reports/what-to-borrow-from-cmm.md new file mode 100644 index 00000000..e2258de3 --- /dev/null +++ b/docs/reports/what-to-borrow-from-cmm.md @@ -0,0 +1,247 @@ +# What to Borrow from Codebase-Memory MCP + +A focused, prioritized guide for evolving `java-codebase-rag` (AMA agent) by adopting proven patterns from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) (paper: [arXiv:2603.27277](https://arxiv.org/abs/2603.27277)) — without giving up your Spring-aware, hybrid (vector + graph) edge. + +> **Guiding principle.** CMM optimizes for *token efficiency at acceptable quality* across 66 languages. Your AMA agent optimizes for *answer quality on Spring/Java microservices* via hybrid retrieval. Borrow CMM's structural mechanics; keep your semantic / role-aware layer as the differentiator. + +--- + +## Snapshot — where each tool wins + +| Layer | Your AMA agent | Codebase-Memory MCP | Action | +|---|---|---|---| +| Java/Spring DI semantics | Strong (`@Autowired`, `@Inject`, Lombok, `@FeignClient`) | None | Keep yours | +| Vector / hybrid retrieval (LanceDB + RRF + `graph_expand`) | Yes | None | Keep yours | +| Role / capability ontology (`CONTROLLER`, `MESSAGE_LISTENER`, ...) 
| Yes | None | Keep yours | +| Microservice topology + brownfield overrides | Yes | Generic `Project` only | Keep yours | +| `CALLS` / `HTTP_CALLS` / `ASYNC_CALLS` resolution | Roadmap (Phase 3) | Shipped, mature | **Borrow** | +| `Route` as first-class node | Roadmap | Shipped | **Borrow** | +| Cross-repo / cross-service edges | Roadmap | Shipped (`pass_cross_repo`) | **Borrow** | +| Runtime trace ingestion | None | Shipped (`ingest_traces`) | **Borrow** | +| Git-diff impact + risk classification | Partial (`impact_analysis`) | Shipped (`detect_changes`) | **Borrow** | +| Layered ignore (`.gitignore` + project ignore) | Constant list | Layered (`.cbmignore`) | **Borrow** | +| Louvain community detection | None | Shipped | **Borrow (Phase 4)** | +| Dead-code detection | None | Shipped | **Borrow (Phase 4)** | +| 66-language tree-sitter grammars | Java only | Yes | Skip (off-strategy) | +| Single static binary distribution | Python venv | Yes | Skip until Phase 5+ | +| 3D graph UI | None | Yes | Skip | +| `get_architecture` mega-tool | Split into small tools | One bundled tool | Skip — keep yours | + +--- + +## Tier 1 — Borrow now (cheap, high impact) + +### B1. Confidence-scored CALLS resolution cascade + +CMM's [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) and [`extract_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_calls.c) resolve calls via a deterministic cascade. Adopt the **shape**, not the C code. + +**What to lift:** + +- A 4-strategy cascade with explicit confidence values: + 1. Import-map resolved (`0.95`) + 2. Same-module / same-package (`0.90`) + 3. Globally unique simple name (`0.75`) + 4. Suffix / fuzzy match (`0.55`) +- A `confidence` property on every `CALLS` edge so downstream tools (and the MCP agent) can filter (`WHERE c.confidence >= 0.8`). +- A `source` property: `"static"` vs `"trace"` vs `"di_proxy"`. 
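The cascade above can be sketched in a few lines of Python. This is a minimal illustration, not CMM's implementation and not existing `java-codebase-rag` code: `resolve_call`, `Resolution`, `import_map`, and `known_methods` are hypothetical names, and only the four strategy labels and their confidence values come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Confidence per strategy, taken from the cascade described above.
STRATEGY_CONFIDENCE = {
    "import_map": 0.95,
    "same_module": 0.90,
    "unique_name": 0.75,
    "suffix": 0.55,
}


@dataclass(frozen=True)
class Resolution:
    """Hypothetical result record; the fields mirror the proposed CALLS edge properties."""
    target: str
    strategy: str
    confidence: float
    source: str = "static"


def _hit(target: str, strategy: str) -> Resolution:
    return Resolution(target, strategy, STRATEGY_CONFIDENCE[strategy])


def resolve_call(
    callee_name: str,            # e.g. "OrderService.save" (Class.method)
    caller_package: str,         # package of the calling class
    import_map: Dict[str, str],  # "Class.method" -> fully qualified name
    known_methods: Set[str],     # every indexed fully qualified method name
) -> Optional[Resolution]:
    """Try each strategy in order and return the first match, or None."""
    # 1. Import-map resolved.
    fqn = import_map.get(callee_name)
    if fqn is not None and fqn in known_methods:
        return _hit(fqn, "import_map")

    # 2. Same package as the caller.
    candidate = f"{caller_package}.{callee_name}"
    if candidate in known_methods:
        return _hit(candidate, "same_module")

    # 3. Exactly one indexed method ends with "Class.method".
    exact = sorted(m for m in known_methods if m.endswith("." + callee_name))
    if len(exact) == 1:
        return _hit(exact[0], "unique_name")

    # 4. Fall back to the bare method name; sorting keeps the pick deterministic.
    method_only = callee_name.rsplit(".", 1)[-1]
    fuzzy = sorted(m for m in known_methods if m.endswith("." + method_only))
    if fuzzy:
        return _hit(fuzzy[0], "suffix")

    return None
```

Returning the strategy name alongside the confidence keeps the edge self-describing, so a later query can filter on either field without re-deriving how the edge was resolved.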
+ +**Why now:** Add the property when you create the Kuzu schema for Phase 3 — retrofitting columns later is painful. + +**Suggested Kuzu DDL:** + +```sql +CREATE REL TABLE CALLS ( + FROM Method TO Method, + confidence DOUBLE, -- 0.55 .. 1.0 + source STRING, -- 'static' | 'trace' | 'di_proxy' + strategy STRING, -- 'import_map' | 'same_module' | 'unique_name' | 'suffix' + call_site STRING -- file:line +); +``` + +--- + +### B2. `Route` as a first-class node + +CMM models REST endpoints and message channels as a single `Route` label so that *any* call site can attach to *any* endpoint via `HTTP_CALLS` / `ASYNC_CALLS`. See [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c). + +**What to lift:** + +- Adopt the **`Route`** label (instead of `RestEndpoint` from your current PRODUCT-VISION) — keeps you semantically interoperable if anyone runs both MCPs in parallel. +- Properties: `path`, `method`, `framework` (`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`), `broker` (for async), `service` (microservice name). +- Edges: + - `(Method)-[:EXPOSES]->(Route)` for `@RequestMapping`/`@KafkaListener` + - `(Method)-[:HTTP_CALLS]->(Route)` for `RestTemplate`/`WebClient`/`@FeignClient` + - `(Method)-[:ASYNC_CALLS]->(Route)` for `KafkaTemplate.send`/`StreamBridge.send` +- A normalization rule: `/api/users/{id}` and `/api/users/123` collapse to the same `Route` (path-template canonicalization). + +--- + +### B3. Runtime trace ingestion (`ingest_traces`) + +This is the single biggest quality lever you don't have yet. Static analysis misses Spring AOP proxies, polymorphic dispatch, reflection, and event-driven flows — runtime traces capture all of them. + +**What to lift:** + +- A new MCP tool `ingest_traces(spans: List[Span], source: str)`. +- Accept OpenTelemetry / Sleuth / Micrometer JSON natively. 
+- For each `(parent_span, child_span)` pair, emit `(caller:Method)-[:CALLS {source:"trace", confidence:1.0}]->(callee:Method)`. +- For HTTP client spans, emit `(caller)-[:HTTP_CALLS]->(Route)` using `http.url` + `http.method` to match an existing `Route` node. +- Deduplicate via `(source_id, target_id, source)` so re-ingesting traces is idempotent. + +**Why this matters:** Lifts Phase 3 from "static approximation" to "ground-truth where traces exist, static elsewhere" — and the agent can prefer `confidence:1.0` edges automatically. + +--- + +### B4. Git-diff impact mapping with risk score + +CMM's [`detect_changes`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) maps a diff to affected symbols and a blast radius. You already have `impact_analysis` — make it diff-driven and add risk classification. + +**What to lift:** + +- New MCP tool `analyze_pr(diff: str | git_ref: str)`: + 1. Parse `git diff` line ranges per file + 2. Map line ranges → chunks → graph nodes (functions/methods) + 3. Run your existing reverse closure + 4. Return `{ changed_nodes, blast_radius, risk_score, risk_level }` +- Risk formula (start simple, tune later): + +``` +risk = log10(1 + downstream_consumers) * role_weight * cross_service_factor + +role_weight = { CONTROLLER:1.5, SERVICE:1.2, REPOSITORY:1.0, CONFIG:1.8, ENTITY:1.3, ... } +cross_service_factor = 1.0 if changes only touch one microservice, 2.0 otherwise +risk_level = "low" (<1.0), "medium" (1.0..2.5), "high" (>2.5) +``` + +- Output usable directly in PR review or CI gating. + +--- + +### B5. Layered ignore patterns + +CMM uses **hardcoded patterns → `.gitignore` hierarchy → `.cbmignore`** ([`discover/`](https://github.com/DeusData/codebase-memory-mcp/tree/master/src/discover)). Cleaner than your current `COMMON_EXCLUDED_PATH_PATTERNS` constant. + +**What to lift:** + +- Layer order: + 1. Hardcoded must-skip (`.git`, `node_modules`, `target`, `build`, `out`, `.idea`, `.gradle`, `bin`) + 2. 
Walk up `.gitignore` files from each indexed directory + 3. Project-level `.lancedb-mcp.yml`'s `ignore:` list + 4. NEW: optional `.lancedb-mcp-ignore` file with gitignore syntax +- Always skip symlinks (cycle protection). +- Reuse `pathspec` (Python) — it's the gitignore-spec-compliant matcher. + +--- + +## Tier 2 — Borrow during Phase 2 / 3 + +### B6. Cross-repo / cross-service edges + +CMM's [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) matches an `HTTP_CALLS` edge in service A to a `Route` node in service B and creates a `CROSS_HTTP_CALLS` edge. This is the killer feature for a multi-microservice AMA. + +**What to lift:** + +- After per-service indexing, run a global pass: + - For each `HTTP_CALLS` edge with `path` + `method`, find the matching `Route` node in any other indexed service. + - Emit `(callerMethod)-[:HTTP_CALLS]->(Route)<-[:EXPOSES]-(calleeMethod)` so traversal in either direction works. +- Same for async: match `topic`/`queue` strings in `KafkaTemplate.send` calls to `@KafkaListener` `Route` nodes. +- Path template matching: `GET /api/orders/{id}` matches a call to `GET /api/orders/123` — use a `path_pattern` regex stored on the `Route`. + +**Killer query unlocked:** *"What breaks if I rename `POST /api/orders` in `order-service`?"* → traverse `Route` → cross-service `HTTP_CALLS` → caller methods → reverse closure → affected controllers in `checkout-service`. + +--- + +### B7. Louvain community detection + +CMM runs Louvain over `CALLS` to discover functional modules. Useful for onboarding and architecture pitches. + +**What to lift:** + +- After Phase 3 `CALLS` lands, run Louvain on the call subgraph (use `python-igraph` or `networkx`, which ships `louvain_communities`). +- Store `cluster_id` and `cluster_size` as `Method` properties. +- New MCP tool `find_module_clusters(min_size: int)` returning ranked clusters with their dominant role mix and entry methods. 
+- Bonus: weight edges by call frequency from traces (B3) for higher-quality partitions. + +--- + +### B8. Dead-code detection + +Trivial once `CALLS` exists, but valuable for cleanup and consulting deliverables. + +**What to lift:** + +- New MCP tool `find_dead_code(exclude_entry_points: bool = True)`. +- Definition: `Method` with zero incoming `CALLS` and no outgoing `EXPOSES` edge. +- Entry-point predicates to exclude: + - Spring stereotypes that auto-invoke: `@Scheduled`, `@PostConstruct`, `@EventListener`, `@KafkaListener`, `@RabbitListener`, `@JmsListener` + - HTTP entry points: any method with an `EXPOSES` edge + - Test methods: `@Test`, `@ParameterizedTest`, lifecycle annotations + - `public static void main(String[])` +- Cypher (one query): + +```cypher +MATCH (m:Method) +WHERE NOT (m)<-[:CALLS]-() + AND NOT (m)-[:EXPOSES]->() + AND NOT m.is_entry_point +RETURN m.qualified_name, m.role, m.file, m.line +ORDER BY m.role, m.qualified_name +``` + +--- + +## Tier 3 — Borrow later or skip + +### Borrow only if you go poly-language (Phase 5+) + +- **B9. Multi-grammar indexing.** CMM ships 66 grammars vendored. Adopt only if you sell to non-Java SMBs. +- **B10. Static binary distribution.** Compelling for SMB clients ("download → run"). Not relevant while you're a Python venv. + +### Skip (don't fit your strategy) + +- **`get_architecture` mega-tool.** Your split tools (`graph_meta`, `list_by_role`, `list_by_capability`) are more agent-friendly because each is named and small. The agent picks better when tool intent is narrow. +- **3D graph UI.** Not the differentiator. If you need visualization, render Kuzu subgraphs to Mermaid or Graphviz on demand from a tool — far less code, embeds in chat. +- **Their ADR module.** Markdown folder + your existing search is enough. Adding ADR CRUD is scope creep. +- **CMM's mini-Cypher executor.** You already have Kuzu — strictly more capable. 
+ +--- + +## Suggested roadmap reorder + +A revised ordering that front-loads borrowed pieces with the highest ROI: + +| Phase | Goal | Borrowed items | +|---|---|---| +| **2** (now) | `Route` nodes + `HTTP_CALLS` / `ASYNC_CALLS` from Spring/Feign/Kafka, with `confidence` columns | B2 | +| **2.5** | `ingest_traces` MCP tool (cheap, huge quality lift) | B3 | +| **3** | Static `CALLS` with 4-strategy cascade; `find_callers` / `find_callees`; dead code | B1, B8 | +| **3.5** | `pass_cross_repo`-style cross-service edges | B6 | +| **4** | `analyze_pr` (diff → impact + risk); Louvain clusters | B4, B7 | +| **5** | Eval harness; head-to-head benchmark vs. CMM on Java repos | — | +| **5+** | Optional poly-language grammars; static-binary packaging | B9, B10 | + +Layered ignores (B5) can land anywhere — drop it in alongside the next indexer change. + +--- + +## Strategic notes + +- **Run both MCPs in parallel as a zero-integration option.** `.mcp.json` supports many servers. Let your tool answer Java/architectural queries; CMM handles non-Java or generic structural queries when you eventually touch poly-glot codebases. Zero integration cost, maximum optionality. +- **Use the comparison itself as a portfolio asset.** When you start pitching SMB clients on AI automation, "I built a Spring-aware hybrid retrieval system that beats the published Codebase-Memory baseline on Java microservice questions" — with numbers from your Phase 5 eval harness — is a credible artifact. Few consultants can show that. +- **Don't fork CMM.** It's MIT-licensed C with vendored grammars; maintenance cost is high and the code style diverges from your Python stack. Read it as documentation, port the patterns. 
+ +--- + +## References + +- Codebase-Memory MCP source — [github.com/DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) +- Paper — [Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP (arXiv:2603.27277)](https://arxiv.org/abs/2603.27277) +- Your repo — [HumanBean17/java-codebase-rag](https://github.com/HumanBean17/java-codebase-rag) +- Key CMM files referenced above: + - [`pass_calls.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_calls.c) — call resolution + - [`pass_route_nodes.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_route_nodes.c) — route nodes + - [`pass_cross_repo.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_cross_repo.c) — cross-service edges + - [`pass_gitdiff.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/src/pipeline/pass_gitdiff.c) — git diff impact + - [`extract_channels.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/extract_channels.c) — async patterns + - [`service_patterns.c`](https://github.com/DeusData/codebase-memory-mcp/blob/master/internal/cbm/service_patterns.c) — framework markers diff --git a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md index bf079cee..2dc4e914 100644 --- a/plans/completed/AGENT-PROMPTS-MCP-API-V2.md +++ b/plans/completed/AGENT-PROMPTS-MCP-API-V2.md @@ -279,7 +279,7 @@ Headline items: 2. `README.md`: delete v1 tool reference; promote "v2 navigation tools (preview)" to primary `### Tool reference`. Keep ops tools listed as "operational — moving to `user-rag` CLI in next release". -3. `propose/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per +3. `docs/PRODUCT-VISION.md`: rewrite v1 example invocations to v2 (per propose §11 mapping). 4. Delete `tests/test_mcp_v2_equivalence.py` entirely — v1 no longer exists. 5. 
Update `tests/test_server.py` (or add if missing) tool-count assertion to: @@ -322,9 +322,9 @@ grep -nE "^### Tool reference" README.md - [ ] `tests/test_mcp_v2_equivalence.py` does not exist. - [ ] README §"Tool reference" lists exactly the 4 v2 tools as primary; ops tools noted as transitional. -- [ ] `propose/PRODUCT-VISION.md` example invocations updated to v2. +- [ ] `docs/PRODUCT-VISION.md` example invocations updated to v2. - [ ] Diff is confined to deliverables in this prompt (`server.py`, `README.md`, - `propose/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`, + `docs/PRODUCT-VISION.md`, deleted `tests/test_mcp_v2_equivalence.py`, `tests/test_server.py` or equivalent surface-assertion test), plus narrowly-related test harness/import updates required to make those changes pass. @@ -337,7 +337,7 @@ grep -nE "^### Tool reference" README.md - [`plans/PLAN-MCP-API-V2.md` § PR-V2-3](./PLAN-MCP-API-V2.md#pr-v2-3--delete-v1-navigation-tools) — list of the 18 tools to delete. - [`propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md`](../../propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md) - §11 mapping table — for rewriting `propose/PRODUCT-VISION.md` examples. + §11 mapping table — for rewriting `docs/PRODUCT-VISION.md` examples. - `server.py` history (git log) — to identify each tool's helper-function graveyard. diff --git a/plans/completed/AGENT-PROMPTS-TIER1B.md b/plans/completed/AGENT-PROMPTS-TIER1B.md index 23e2458c..b8e420c3 100644 --- a/plans/completed/AGENT-PROMPTS-TIER1B.md +++ b/plans/completed/AGENT-PROMPTS-TIER1B.md @@ -575,7 +575,7 @@ Concretely: plan §5.3) — two services + a third "ambiguous" controller. - Create `tests/test_call_edge_matching.py` with cases 32–40. - Extend `tests/test_mcp_tools.py` with cases 41–48. -- Flip `propose/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows +- Flip `docs/PRODUCT-VISION.md` `HTTP_CALLS` / `ASYNC_CALLS` rows from *planned* to *shipped*. 
## Out of scope (do NOT touch) @@ -620,7 +620,7 @@ don't ship it. 11. New fixture `tests/fixtures/cross_service_smoke/`. 12. New test file `tests/test_call_edge_matching.py` with cases 32–40. 13. Cases 41–48 added to `tests/test_mcp_tools.py`. -14. `propose/PRODUCT-VISION.md` flipped (planned → shipped). +14. `docs/PRODUCT-VISION.md` flipped (planned → shipped). 15. `README.md` MCP tools section updated. ## Tests @@ -681,7 +681,7 @@ Expected: at least one caller from `chat-assign` with `match='cross_service'`. - [ ] Sentinel greps return expected results. - [ ] No file outside `build_ast_graph.py`, `kuzu_queries.py`, `server.py`, `pr_analysis.py`, `README.md`, - `propose/PRODUCT-VISION.md`, and the new `tests/` paths is + `docs/PRODUCT-VISION.md`, and the new `tests/` paths is modified. - [ ] PR description includes the scope statement, the manual evidence output (pass6 log + meta() snippet + find_route_callers output), diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md index 13ec61f3..e87ee7e4 100644 --- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md +++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md @@ -6,7 +6,7 @@ Status: **completed** — applied. Companion document to ## Why this file exists The brownfield plan grew through two review rounds; the second review -(`reports/review/active/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`) +(`docs/reports/review/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-issues.md`) flagged design issues that were folded back into the plan in-place. 
Once they're inlined, they stop standing out — but they are exactly the parts an implementer is most likely to skim past or get wrong, diff --git a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md index b6ef32f6..96bc081c 100644 --- a/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md +++ b/plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES.md @@ -805,7 +805,7 @@ order in this list is the only correct interleaving; do not reorder. - `README.md` — new section "Brownfield overrides" walking through Layer B (config), with a complete example block. Mention Layer C as the last resort, with the four interface declarations to copy-paste. -- `CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note +- `docs/CODEBASE_REQUIREMENTS.md` — expand the role-inference section to note the override layers exist. - MCP server `instructions` string in `server.py` — one extra sentence noting that "role and capability inference can be customised per-project diff --git a/plans/completed/PLAN-CALL-GRAPH.md b/plans/completed/PLAN-CALL-GRAPH.md index bcde568e..3880d15e 100644 --- a/plans/completed/PLAN-CALL-GRAPH.md +++ b/plans/completed/PLAN-CALL-GRAPH.md @@ -460,7 +460,7 @@ Additions (~80 lines, no removals): - How to filter by `min_confidence`. - Why phantoms aren't dropped at index time. -### 8. `CODEBASE_REQUIREMENTS.md` +### 8. `docs/CODEBASE_REQUIREMENTS.md` Add a "Call graph" note listing the tree-sitter node types the extractor depends on: @@ -585,7 +585,7 @@ Single PR. Breaking changes: | 8 | Augment `_graph_expand_merge` to also call `expand_methods`. | `search_lancedb.py` | Graph-expand results include method-reachable chunks on the smoke corpus. | | 9 | Add MCP tools (`find_callers`, `find_callees`), `follow_calls` param on `trace_flow`, update `_INSTRUCTIONS`. | `server.py` | `test_mcp_tools.py` additions pass. 
| | 10 | Update tests: new files + extend `test_ast_graph_build.py` / `test_kuzu_queries.py` / `test_mcp_tools.py`. | `tests/` | `pytest` green. | -| 11 | Update `README.md` + `CODEBASE_REQUIREMENTS.md`. | docs | Manual review. | +| 11 | Update `README.md` + `docs/CODEBASE_REQUIREMENTS.md`. | docs | Manual review. | | 12 | Confirm `propose/completed/CALL-GRAPH-PROPOSE.md` is the only active call-graph proposal (old deferred draft already removed; git history retains it). | `propose/` | Directory listing shows a single call-graph proposal. | ## Out of scope (for this plan, tracked elsewhere) diff --git a/plans/completed/PLAN-CAPABILITIES-MODEL.md b/plans/completed/PLAN-CAPABILITIES-MODEL.md index 7c8d9e65..6707a17e 100644 --- a/plans/completed/PLAN-CAPABILITIES-MODEL.md +++ b/plans/completed/PLAN-CAPABILITIES-MODEL.md @@ -453,7 +453,7 @@ callers see them in results. - `README.md` — add a section "Capabilities" describing the multi-tag axis, the initial capability set, and `list_by_capability`. Keep the existing "Roles" section intact. -- `CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice +- `docs/CODEBASE_REQUIREMENTS.md` — note the type-level granularity choice and the deferred per-method storage (link to this plan). 
- MCP server `instructions` string in `server.py` — one extra sentence pointing at `list_by_capability` for behavioural questions about diff --git a/plans/completed/PLAN-CLI-SCENARIOS.md b/plans/completed/PLAN-CLI-SCENARIOS.md index c07e3011..d1156d8e 100644 --- a/plans/completed/PLAN-CLI-SCENARIOS.md +++ b/plans/completed/PLAN-CLI-SCENARIOS.md @@ -73,7 +73,7 @@ before PR-CLI-2 so contributors exercising new subcommands do not pay multi-seco | --- | --- | --- | --- | --- | --- | | **PR-CLI-1** | Land / freeze propose (doc-only merge of `CLI-SCENARIOS-PROPOSE.md` if not already on `master`) | none | `propose/completed/CLI-SCENARIOS-PROPOSE.md` (status bump); `plans/completed/PLAN-CLI-SCENARIOS.md` (tracking) | n/a | none | | **PR-CLI-2** | Full implementation: lifecycle handlers, env + YAML + index layout, package rename, `server.py` / indexer / path helpers, **`mcp_v2.py`**, **`path_filtering.py`** (`.lancedb-mcp/ignore` → `.java-codebase-rag/ignore`), help redesign, tracking issue constant, user-visible stderr hints; **`mcp.json.example`** env keys = source of truth | none | `pyproject.toml`, package dir rename, `server.py`, `mcp_v2.py`, `java_codebase_rag/cli.py`, `java_index_flow_lancedb.py`, `graph_enrich.py`, `path_filtering.py`, `search_lancedb.py`, `kuzu_queries.py`, `build_ast_graph.py`, tests, `mcp.json.example`, `.gitignore`, any other `user_rag` / env / path references in Python | unit + integration + help-structure test (see below) | PR-CLI-1 merged | -| **PR-CLI-3** | Doc and example sweep + **`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged | +| **PR-CLI-3** | Doc and example sweep + 
**`.cursor/rules/`** + migration sections + acceptance grep; **`mcp.json.example`** comment/example polish only (keys already correct from PR-CLI-2) | none | `README.md`, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (prose only if needed), selected `propose/*.md`, `.gitignore` notes | manual grep audit; `ruff` / `pytest` unchanged by docs | PR-CLI-2 merged | Landing order: **PR-CLI-1 → PR-CLI-2 → PR-CLI-3**. @@ -284,7 +284,7 @@ Follow the **explicit file list** in propose §6 (`README.md`, `paper.pdf`, `AGENTS.md`, **`.cursor/rules/*.mdc`** (agent rules audit), `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` (comments only — keys from PR-CLI-2), `propose/INDEX-AUTO-MODE-PROPOSE.md`, -`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `propose/PRODUCT-VISION.md`, +`propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md`, `docs/PRODUCT-VISION.md`, `.gitignore`). Add **Migration from legacy names** sections with explicit `mv` commands @@ -314,7 +314,7 @@ Add **Migration from legacy names** sections with explicit `mv` commands | # | Step | File(s) | Done when | | --- | --- | --- | --- | | 1 | README + CLI operator guide | `README.md`, `docs/JAVA-CODEBASE-RAG-CLI.md` | New subcommand table + 5 env vars + migration | -| 2 | Agent + checklist + requirements | `docs/*`, `CODEBASE_REQUIREMENTS.md` | No stale operator paths | +| 2 | Agent + checklist + requirements | `docs/*`, `docs/CODEBASE_REQUIREMENTS.md` | No stale operator paths | | 3 | Paper + proposes + example MCP JSON | `docs/paper/`, `propose/*`, `mcp.json.example` | PDF rebuilt; examples updated | | 4 | Acceptance grep | repo root | Reviewer sign-off | diff --git a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md index bf2f2001..4edbcd08 100644 --- a/plans/completed/PLAN-CLIENT-ROLE-RENAME.md +++ b/plans/completed/PLAN-CLIENT-ROLE-RENAME.md @@ -188,7 +188,7 @@ Six references at `server.py:49, 689, 1141, 1338, 1342, 1418`: - Line 1342: 
docstring `"...FEIGN_CLIENT/REPOSITORY/MAPPER..."` → `"...CLIENT/REPOSITORY/MAPPER..."` - Line 1418: `entry_roles = ["CONTROLLER", "COMPONENT", "SERVICE", "FEIGN_CLIENT"]` → `[..., "CLIENT"]` -#### Change 6: Update `README.md` and `CODEBASE_REQUIREMENTS.md` +#### Change 6: Update `README.md` and `docs/CODEBASE_REQUIREMENTS.md` `README.md`: - Line 137: `trace_flow` description's stage chain `FEIGN_CLIENT/REPOSITORY/MAPPER` → `CLIENT/REPOSITORY/MAPPER` @@ -338,7 +338,7 @@ description. - [ ] `_ROLE_SCORE_WEIGHTS["CLIENT"] = 0.06` (was `FEIGN_CLIENT`) (`search_lancedb.py:188`) - [ ] Six `server.py` literal references updated (lines 49, 689, 1141, 1338, 1342, 1418) - [ ] `README.md` updated (3 lines + brownfield note) -- [ ] `CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347) +- [ ] `docs/CODEBASE_REQUIREMENTS.md` updated (lines 146, 162, 346-347) - [ ] `tests/test_lancedb_e2e.py:342` allow-list updated - [ ] `ONTOLOGY_VERSION` bumped 8 → 9 with phase-comment update - [ ] All 9 new tests in `tests/test_client_role_rename.py` pass diff --git a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md index 266edf78..a966c4d9 100644 --- a/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md +++ b/plans/completed/PLAN-HTTP-ROUTE-METHOD-ENUM.md @@ -31,7 +31,7 @@ Depends on: **none** (lands on current `master`). 
| PR | Scope | Ontology bump | Files touched (approx) | Test buckets | Independent of | | --- | --- | --- | --- | --- | --- | | **PR-1** | `CodebaseHttpMethod.java` stub under route fixtures; **parameterized** structured stderr emitter (new small module and/or `build_ast_graph.py`) — **no production call sites** | **none** | `tests/fixtures/brownfield_route_stubs/...`, emitter module, **1** new test | Unit test exercises emitter directly (INFO + WARN shapes as needed) | none | -| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged | +| **PR-2** | Rename stubs + route stub field type; `ast_java.py` recognition + client `method` enum parse + **extractor-time** INFO shadowing + WARN on string `method`; `graph_enrich.py` HTTP branch replace **without** merge-time shadowing + `meta_chain` / log strings; tighten `test_23`; **new** inbound exclusivity test; README + `docs/CODEBASE_REQUIREMENTS.md` + any other doc hits for examples | **11 → 12** (`ast_java.ONTOLOGY_VERSION`; README / `AGENTS.md` callouts) | `ast_java.py`, `graph_enrich.py`, structured-log module (see PR-2 §4), stubs, `tests/test_*.py` listed below, `README.md`, `docs/CODEBASE_REQUIREMENTS.md`, `build_ast_graph.py` if comment at ~1904 | Full `pytest tests`; new exclusivity + optional shadowing log test | PR-1 merged | | **PR-3** 
| Agent docs + v2 addendum only | **none** | `docs/AGENT-GUIDE.md`, `docs/skills/java-codebase-explore.md` (if needed), `propose/completed/BROWNFIELD-ANNOTATIONS-V2-ADDENDUM-HTTP-METHOD-ENUM.md` | Docs-only CI | PR-2 merged | Landing order: **PR-1 → PR-2 → PR-3**. @@ -136,7 +136,7 @@ Landing order: **PR-1 → PR-2 → PR-3**. - `tests/test_assign_endpoint_client_extraction.py` — Feign + JAX-RS mirror; `method = HttpMethod.POST` may need alignment with **`CodebaseHttpMethod.POST`** on brownfield annotation per surface rules. - `tests/test_cross_service_resolution_flag.py` — generated Java strings. -### 8. `README.md` and `CODEBASE_REQUIREMENTS.md` +### 8. `README.md` and `docs/CODEBASE_REQUIREMENTS.md` - Replace annotation names and examples with `@CodebaseHttpClient` / enum `method`; add **Re-index required** callout for PR-2 + ontology **12**. @@ -185,7 +185,7 @@ Landing order: **PR-1 → PR-2 → PR-3**. | 3 | Merge HTTP replace (behaviour only) | `graph_enrich.py` | Replace pattern; **zero** shadowing logs from this file | | 4 | Verbose plumbing if needed | `build_ast_graph.py` → parse entry | Shadowing INFO respects volume gate | | 5 | `meta_chain` + log strings | `graph_enrich.py` | Grep clean for old simple names | -| 6 | Tests + docs | `tests/*`, `README.md`, `CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs | +| 6 | Tests + docs | `tests/*`, `README.md`, `docs/CODEBASE_REQUIREMENTS.md` | Full pytest; doc examples match stubs | | 7 | Ontology + meta test | `ast_java.py`, `tests/test_call_edges_e2e.py` | `meta()` reports 12 | --- diff --git a/plans/completed/PLAN-MCP-API-V2.md b/plans/completed/PLAN-MCP-API-V2.md index 01c2ba9f..eaa3be60 100644 --- a/plans/completed/PLAN-MCP-API-V2.md +++ b/plans/completed/PLAN-MCP-API-V2.md @@ -356,7 +356,7 @@ For each, also delete: "operational — moving to `user-rag` CLI in next release". This is a one-PR transition state. -### 3. `propose/PRODUCT-VISION.md` — agent-recipe examples +### 3. 
`docs/PRODUCT-VISION.md` — agent-recipe examples - Update any example invocations from v1 to v2. Search for `find_callers`, `list_routes`, `list_clients`, `find_route_*`, `trace_*`, `impact_*` and @@ -551,7 +551,7 @@ DoD is the delta + suite-green, not an absolute total. | v2 handlers diverge in behaviour from v1 | V2-1 | Equivalence tests (14 of them) compare returned id sets directly. Drift is caught at PR review. | | `direction`/`edge_types` required-field change breaks existing clients | V2-1 | No existing clients — confirmed by Dmitry ("nobody uses this MCP bundle yet"). Tests assert `ValidationError` is raised, which is the contract. | | `describe.edge_summary` adds N round-trips per call | V2-2 | Single grouped count query, not 9 round-trips. Test asserts call count via Kuzu connection mock. | -| Removing v1 tools breaks the agent system prompt | V2-3 | `propose/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). | +| Removing v1 tools breaks the agent system prompt | V2-3 | `docs/PRODUCT-VISION.md` and README are updated in the same PR. Agent prompt is separate (not in this repo). | | CLI subprocess tests are slow / flaky | V2-4 | Each subprocess invocation hits a pre-built fixture under `/tmp`; no rebuilds inside tests. Targeted at < 5s total. | | `pyproject.toml` package layout breaks the existing flat-script bundle | V2-4 | Today's `packages = []` is intentional; we promote it to `packages = ["user_rag"]` only — root scripts (`server.py`, `build_ast_graph.py`, etc.) stay outside the package. Tested by `pip install .` succeeding. | | `pr-review` skill under `.cursor/skills/pr-review/` still calls `analyze_pr` MCP after V2-4 | V2-4 | PR description includes a manual TODO to update that skill. CLI version of the call is documented in README's "Migration from v1" subsection. 
| diff --git a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md index 3ed41801..b36e4ee0 100644 --- a/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md +++ b/plans/completed/PLAN-REMOTE-PROJECT-INDEXING.md @@ -163,7 +163,7 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this: | `server.py` | `_cocoindex_subprocess_env(root)` + pass `env=` to CocoIndex subprocess | | `README.md` | Env table + `refresh_code_index` subprocess note | | `mcp.json.example` | Example `LANCEDB_MCP_PROJECT_ROOT` path | -| `CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency | +| `docs/CODEBASE_REQUIREMENTS.md` | Project root / CocoIndex / MCP consistency | | `tests/test_mcp_tools.py` | `test_cocoindex_subprocess_env_sets_project_root` | --- @@ -172,5 +172,5 @@ Users will configure their MCP client (Cursor, Claude Code, etc.) like this: - [x] Step 1: Modify `java_index_flow_lancedb.py` - [x] Step 2: Modify `server.py` (including `_cocoindex_subprocess_env` helper + unit test) -- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `CODEBASE_REQUIREMENTS.md`, flow docstring) +- [x] Step 3: Update documentation (`README.md`, `mcp.json.example`, `docs/CODEBASE_REQUIREMENTS.md`, flow docstring) - [x] Testing: `test_cocoindex_subprocess_env_sets_project_root`; heavy e2e unchanged (`cwd` + no env → `.` root) diff --git a/plans/completed/PLAN-TIER1B-COMPLETION.md b/plans/completed/PLAN-TIER1B-COMPLETION.md index aa32e46a..09b9acd1 100644 --- a/plans/completed/PLAN-TIER1B-COMPLETION.md +++ b/plans/completed/PLAN-TIER1B-COMPLETION.md @@ -938,7 +938,7 @@ path in both services. Used by tests 32, 34, 37. 
| 10 | Extend `analyze_pr`: `cross_service_callers_count` per changed symbol | `server.py`, `pr_analysis.py` | test 47 passes | | 11 | Risk-score weight bump in `pr_analysis.py` | `pr_analysis.py` | docstring + unit test | | 12 | Create `tests/fixtures/cross_service_smoke/` | `tests/fixtures/...` | fixture files in place | -| 13 | Update `README.md` MCP tools section + `propose/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `propose/PRODUCT-VISION.md` | manual review | +| 13 | Update `README.md` MCP tools section + `docs/PRODUCT-VISION.md` (`HTTP_CALLS` planned → shipped) | `README.md`, `docs/PRODUCT-VISION.md` | manual review | --- @@ -988,7 +988,7 @@ path in both services. Used by tests 32, 34, 37. 6. Existing MCP tools extended (`impact_analysis`, `trace_flow`, `analyze_pr`). 7. `README.md` updated for caller-side edges, brownfield clients, - match outcomes; `propose/PRODUCT-VISION.md` flips + match outcomes; `docs/PRODUCT-VISION.md` flips `HTTP_CALLS` / `ASYNC_CALLS` from *planned* to *shipped*. 8. Each PR's description quotes the relevant stats from a manual run on bank-chat-system as evidence. diff --git a/propose/completed/CALL-GRAPH-PROPOSE.md b/propose/completed/CALL-GRAPH-PROPOSE.md index 9f2a6028..a37284af 100644 --- a/propose/completed/CALL-GRAPH-PROPOSE.md +++ b/propose/completed/CALL-GRAPH-PROPOSE.md @@ -4,7 +4,7 @@ Status: **completed** — shipped (static intra-JVM `CALLS` + `DECLARES`; plan: [`plans/completed/PLAN-CALL-GRAPH.md`](../../plans/completed/PLAN-CALL-GRAPH.md) for the step-by-step implementation. -This proposal realises **point 4 of `PRODUCT-VISION.md`** ("Adding a Call +This proposal realises **point 4 of `docs/PRODUCT-VISION.md`** ("Adding a Call Graph Layer") with a deliberately narrow scope: **static, intra-JVM method-to-method edges**. Cross-service HTTP/async, AOP-proxy resolution, and runtime-trace ingestion are explicit non-goals of this phase. @@ -535,7 +535,7 @@ round-trip test. 
0.5 day: server surface + search-side expansion). - Validation on `bank-chat-system` + micro-fixture: **1 day** (unit + integration + regression run; manual trace_flow spot-checks). -- Documentation update (`README.md`, `CODEBASE_REQUIREMENTS.md`, MCP +- Documentation update (`README.md`, `docs/CODEBASE_REQUIREMENTS.md`, MCP instructions): **2 hours**. Total: **3–4 working days** including tests and docs. diff --git a/propose/completed/CLI-SCENARIOS-PROPOSE.md b/propose/completed/CLI-SCENARIOS-PROPOSE.md index 42e58875..f68ca868 100644 --- a/propose/completed/CLI-SCENARIOS-PROPOSE.md +++ b/propose/completed/CLI-SCENARIOS-PROPOSE.md @@ -312,11 +312,11 @@ Open a GitHub issue titled **"AST graph (Kuzu) incremental rebuild"** referencin - `docs/paper/paper.tex` — architecture paper updated for new CLI verbs / env vars / file paths; rebuild `paper.pdf` (Russian translation `paper_ru.tex` is a standalone artifact outside the repo and is not in scope). - `AGENTS.md` — CLI doc reference + any `refresh` mention. - `.cursor/rules/*.mdc` — agent workflow / env / CLI contract; see **Agent rules audit** below (must match post-rename surface). -- `CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated. +- `docs/CODEBASE_REQUIREMENTS.md` — every `.lancedb-mcp.yml` / `LANCEDB_MCP_*` / `lancedb_data` reference updated. - `mcp.json.example` — **PR-CLI-3 is a second pass only:** PR-CLI-2 updates this file so **env keys match the live server**; PR-CLI-3 reconciles comments, examples, and any doc drift — **no conflicting edits**; if both PRs touch it, **PR-CLI-2 wins** for structure, PR-CLI-3 for prose polish. - `propose/INDEX-AUTO-MODE-PROPOSE.md` — one-line note that `refresh` is being renamed to `reprocess`. - `propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md` — one-line note that the new tracking issue (created in PR-CLI-2) is the user-facing handle. 
-- `propose/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference. +- `docs/PRODUCT-VISION.md` — update `lancedb_data` mention (§ about Kuzu's on-disk footprint) and any `refresh` reference. - `.gitignore` — add `.java-codebase-rag/`, keep `lancedb_data/` for grace-period cleanup, or remove if PR-CLI-2 drops that grace-period entry. **Agent rules audit (PR-CLI-3, manual checklist — use together with acceptance grep below):** @@ -342,7 +342,7 @@ Expected output after PR-CLI-3 (docs + rules): The startup-slowness fix (deferred imports in `cli.py`) is a **separate, prior PR** outside this migration; it does not change the surface and should land before PR-CLI-2 so contributors testing the new subcommands aren't taxed by the multi-second startup. -**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section. +**PR-CLI-3 (docs sweep):** README, `docs/*`, `AGENTS.md`, `.cursor/rules/*.mdc`, `docs/CODEBASE_REQUIREMENTS.md`, `mcp.json.example` path placeholders, selected `propose/*.md` one-line notes, `docs/paper/paper.tex` + rebuilt `paper.pdf`, migration `mv` sections, and acceptance grep per the command in this section. --- diff --git a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md index 67ef8d3b..806aa3e5 100644 --- a/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md +++ b/propose/completed/CLIENT-ROLE-RENAME-PROPOSE.md @@ -239,7 +239,7 @@ Verified count: **5 production files, ~12 references**: | `server.py` | 49, 687, 1138, 1335, 1339, 1415 | MCP tool docstrings, `role` enum strings, entry-role filter | | `tests/test_lancedb_e2e.py` | 342 | One assertion | -Plus docs: `README.md`, `CODEBASE_REQUIREMENTS.md`. 
Doc sweep is +Plus docs: `README.md`, `docs/CODEBASE_REQUIREMENTS.md`. Doc sweep is straightforward. ### 4.4 Brownfield input today @@ -398,7 +398,7 @@ story as every other graph-shape change). - README: rename role table; mention `CLIENT` + `HTTP_CLIENT` capability; document the `MESSAGE_PRODUCER` capability that already exists for symmetry. -- `CODEBASE_REQUIREMENTS.md`: rename references. +- `docs/CODEBASE_REQUIREMENTS.md`: rename references. - `propose/DEFERRED-REST-CLIENT-MIGRATION-PROPOSE.md`: **delete** (this proposal supersedes it; the rename-vs-capability decision is reversed by current architecture). diff --git a/propose/completed/TIER1-COMPLETION-PROPOSE.md b/propose/completed/TIER1-COMPLETION-PROPOSE.md index c3b1a541..25585df8 100644 --- a/propose/completed/TIER1-COMPLETION-PROPOSE.md +++ b/propose/completed/TIER1-COMPLETION-PROPOSE.md @@ -1,7 +1,7 @@ # Tier 1 completion — proposal (shipped) Status: **completed — shipped via PR-A1 → PR-C** (merged 2026-04 → 2026-05). Moved to `propose/completed/` after PR-D3 (Tier 1B) landed. Pairs with the borrow guide -[`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) +[`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) and follows on from the completed [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md). @@ -859,10 +859,10 @@ Independent PRs, but a sensible review order: section with `route_overrides` examples and `@CodebaseRoute` source stub** — same shape as the existing `role_overrides` / `@CodebaseRole` material. -- `CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var +- `docs/CODEBASE_REQUIREMENTS.md`: update the schema diagram and the env-var table (`.lancedb-mcp-ignore` mention). Document the route resolver five-layer composition table from §4.6.4. 
-- `propose/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note +- `docs/PRODUCT-VISION.md`: tick B2a / B4 / B5 off the roadmap; note B2b + B6 as the next proposal. --- @@ -902,7 +902,7 @@ Open questions to settle during implementation, not now: - [ ] `ONTOLOGY_VERSION` bumped 4 → 5; stale-graph guard test added. - [ ] README brownfield section extended with `route_overrides` and `@CodebaseRoute` examples. -- [ ] `CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer +- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the §4.6.4 five-layer composition table. - [ ] No regressions in existing role / capability resolution (run the existing brownfield test suite). @@ -918,7 +918,7 @@ Open questions to settle during implementation, not now: - [ ] Old `compile_excluded_glob_patterns` call sites replaced (3 of them). - [ ] `graph_meta` exposes `ignore_layers`. -- [ ] `CODEBASE_REQUIREMENTS.md` documents the layer order. +- [ ] `docs/CODEBASE_REQUIREMENTS.md` documents the layer order. --- @@ -947,9 +947,9 @@ follow-ups, in order of leverage: ## 11. References - [`TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md`](TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md) - B2b + B6 propose -- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5). +- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) — original borrow guide (Tier 1 §B1–B5). - [`propose/completed/CALL-GRAPH-PROPOSE.md`](CALL-GRAPH-PROPOSE.md) — completed call-graph proposal; same shape & style. -- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants. +- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — review that surfaced the resolver / extractor invariants. 
- [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — **mandatory reading** for the implementer of §4.6 (brownfield route resolver mirrors this design). - `graph_enrich.py` §"brownfield role / capability overrides" — the existing implementation B2a extends. diff --git a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md index 0a4f0150..b78e3e1a 100644 --- a/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md +++ b/propose/completed/TIER1B-HTTP-ASYNC-EDGES-PROPOSE.md @@ -21,9 +21,9 @@ Before working on this proposal, read in order: 1. [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) §4 (B2a `Route` + `EXPOSES`) — defines every join key used here. -2. [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) +2. [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) §B2 (Route shape) and §B6 (cross-service edges). -3. [`reports/call-graph-review.md`](../../reports/call-graph-review.md) +3. [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — same correctness invariants apply (microservice scoping, confidence semantics, phantom-id collisions). 4. [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) @@ -466,7 +466,7 @@ but conservative. ## 12. References - [`TIER1-COMPLETION-PROPOSE.md`](TIER1-COMPLETION-PROPOSE.md) — B2a, B4, B5 (active). -- [`reports/what-to-borrow-from-cmm.md`](../../reports/what-to-borrow-from-cmm.md) §B2, §B6. -- [`reports/call-graph-review.md`](../../reports/call-graph-review.md) — invariants this proposal must not regress. +- [`docs/reports/what-to-borrow-from-cmm.md`](../../docs/reports/what-to-borrow-from-cmm.md) §B2, §B6. 
+- [`docs/reports/call-graph-review.md`](../../docs/reports/call-graph-review.md) — invariants this proposal must not regress. - [`plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`](../../plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md) — mandatory reading for §6. -- [`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*. +- [`docs/PRODUCT-VISION.md`](../../docs/PRODUCT-VISION.md) §3 — `HTTP_CALLS` / `ASYNC_CALLS` are listed as *planned*; this proposal flips them to *shipped*. diff --git a/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md index 81e8b092..73101047 100644 --- a/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md +++ b/propose/stale/TIER2-INCREMENTAL-REBUILD-PROPOSE.md @@ -5,7 +5,7 @@ User-facing tracking for graph-side incremental work: GitHub issue **#73** (link Pairs with the focused MCP-tool proposal [`propose/INDEX-AUTO-MODE-PROPOSE.md`](INDEX-AUTO-MODE-PROPOSE.md) (decision engine for `refresh_code_index`) and supersedes its -"future Kuzu work" footnote in [`propose/PRODUCT-VISION.md`](PRODUCT-VISION.md) §99. +"future Kuzu work" footnote in [`docs/PRODUCT-VISION.md`](../../docs/PRODUCT-VISION.md) §99. This is a **proposal**, not an implementable plan. After review and scoping decisions (the §11 [TBD] list), an implementable From 67ce67b3be05c57db348ab90daf5f46d9e753f1b Mon Sep 17 00:00:00 2001 From: Dmitry Teryaev Date: Sun, 24 May 2026 13:11:17 +0300 Subject: [PATCH 3/3] fix: update CONFIGURATION.md link in CODEBASE_REQUIREMENTS.md The link was pointing to ./docs/CONFIGURATION.md which is incorrect from within the docs/ directory. Changed to ./CONFIGURATION.md.
Co-Authored-By: Claude Opus 4.7 --- docs/CODEBASE_REQUIREMENTS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/CODEBASE_REQUIREMENTS.md b/docs/CODEBASE_REQUIREMENTS.md index 01737872..8442ba20 100644 --- a/docs/CODEBASE_REQUIREMENTS.md +++ b/docs/CODEBASE_REQUIREMENTS.md @@ -173,7 +173,7 @@ and `capabilities`, register inbound routes, and register outbound clients/producers for a given repo via `.java-codebase-rag.yml` at the project root (`role_overrides:`, `route_overrides:`, `http_client_overrides:`, `async_producer_overrides:`) and/or by copying the in-source stubs from -[`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) into your sources: +[`docs/CONFIGURATION.md`](./CONFIGURATION.md) into your sources: - `@CodebaseRole` / `@CodebaseCapability` / `@CodebaseCapabilities` (class-level role + capabilities) — see `docs/CONFIGURATION.md` §4.3.