feat(jrag): ship agent-facing jrag CLI (JRAG-CLI plan, 9 PRs)#377

Open
HumanBean17 wants to merge 25 commits into master from feat/cli

@HumanBean17

Owner

jrag — agent-facing CLI (JRAG-CLI plan)

Implements plans/active/PLAN-JRAG-CLI.md — a new jrag console script that gives an AI coding agent one command per engineering intent, taking human-readable identifiers (FQN / simple name / route path / topic) and never raw node IDs. Built as a thin compose-and-render layer over the existing backend (resolve_v2, find_v2/search_v2/describe_v2/neighbors_v2, LadybugGraph, run_search); loads the index in-process per call; no daemon, no ontology bump, no cocoindex dependency.

Resolves #END_OF_PLAN_FULL_SUITE.

What lands (9 PRs)

PR           Scope
PR-JRAG-0a   Single source of truth for shipped skill/agent docs (scripts/sync_agent_artifacts.py + drift gate)
PR-JRAG-0b   Extract resolve_v2 → root resolve_service.py (+ shared graph_types.py); mcp_v2 re-exports
PR-JRAG-1a   jrag entry + Envelope + text/JSON render + resolve-first + status (frozen contract)
PR-JRAG-1b   find (query + filter modes) + inspect
PR-JRAG-2    Listings: routes/clients/producers/topics/jobs/listeners/entities
PR-JRAG-3a   Direct traversals: callers/callees/hierarchy/implementations/subclasses/overrides/overridden-by/dependents/impact/decompose/flow
PR-JRAG-3b   Compose traversals: callees (client/producer) + dependencies + connection + outline + imports
PR-JRAG-4    Orientation (microservices/map/conventions/overview) + search + agent_next_actions (wired into all commands) + packaging
PR-JRAG-5    java-codebase-rag install --surface mcp|cli branching + CLI skill/subagent (fixes the CLI-only update fatal-exit)
PR-JRAG-5 java-codebase-rag install --surface mcp|cli branching + CLI skill/subagent (fixes the CLI-only update fatal-exit)

Every <query> command is resolve-first (one→run, many→candidates+stop, none→not_found; raw IDs never required). --offset exists only on find/search and is rejected everywhere else. Default output is token-lean text; --format json emits the envelope verbatim. truncated is detected via +1-fetch; the hint is "use --offset <N>" on find/search and "narrow your query" on non-offset commands. agent_next_actions is capped at ≤5 entries.
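
The resolve-first contract above can be sketched roughly as follows. This is an illustration only: `resolve_first`, the `resolve`/`run` callables, and the "ambiguous" status label are hypothetical names, not the actual `resolve_query` API in jrag_envelope.py.

```python
from typing import Callable

MAX_CANDIDATES = 10  # the PR-1a commit message says candidates are capped at 10


def resolve_first(query: str,
                  resolve: Callable[[str], list],
                  run: Callable[[dict], dict]) -> dict:
    """Hypothetical wrapper: one -> run, many -> candidates + stop, none -> not_found."""
    matches = resolve(query)
    if not matches:
        return {"status": "not_found", "query": query}
    if len(matches) > 1:
        # Never auto-pick: hand the candidates back and stop.
        return {"status": "ambiguous", "candidates": matches[:MAX_CANDIDATES]}
    return {"status": "ok", **run(matches[0])}
```

The point of routing every command through one wrapper like this is that no handler can reach the backend with an ambiguous or missing node.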

Verification

  • Per-PR reviews: implementer + spec/quality reviewer per PR; all green after fix rounds.
  • Final whole-branch review: cross-PR contract holds (resolve-first everywhere, --offset scope, explicit shape dispatch, lazy-import invariant branch-wide, no cocoindex in the CLI). One Critical (4 advertised-but-unimplemented common flags) + one regression (PR-5 install --verbose mock) — both fixed.
  • Full suite (end-of-plan gate): 1003 passed, 12 failed, 14 skipped. All 12 failures are pre-existing — verified by running them at the branch base b459adc, where they fail identically (test_mcp_tools×4, test_lance_optimize×4, test_mcp_v2 async×3 [need pytest-asyncio], test_cli_progress_stdout_invariant×1). Zero regressions from this branch.
  • scripts/sync_agent_artifacts.py --check green; jrag --help loads no torch/sentence_transformers/mcp_v2/cocoindex.

Accepted deviations from the plan

  • NodeRef in a new graph_types.py (plan said "stays in mcp_v2"). Unsatisfiable as written: ResolveCandidate.node: NodeRef forces load-time resolution, and mcp_v2 ↔ resolve_service re-export is a load-time cycle. graph_types is a leaf module; mcp_v2 re-exports NodeRef for back-compat. Reviewer-confirmed sound.
  • --fuzzy on find deferred (jrag find --fuzzy: name prefix/contains fallback (deferred from PR-JRAG-1b) #375): backend find_by_name_or_fqn is exact-only/Symbol-only; faithful name-prefix/contains needs a backend LIKE change (out of scope for the thin-CLI PRs).
  • topics --consumer-in / listeners --topic-prefix use the EXPOSES edge: the plan's neighbors(producer_ids, "in", ["ASYNC_CALLS"]) was wrong — ASYNC_CALLS is Producer→Route (outbound). Consumers (listeners) reach topics via listener_method -[:EXPOSES]-> Route(topic). Unified in _resolve_topic_consumers.
  • Dropped --brief/--fields/--count/--exists (jrag: implement --count/--exists/--brief/--fields output flags (deferred from JRAG-CLI) #376): registered on the common parser in PR-1a but never implemented; removed (rather than ship advertised-but-broken behavior). Tracked for implementation.

Follow-ups

Merge notes

19 commits (plan + 9 PRs + review fix rounds). Squash-merge OK (matches the repo's recent workflow), or preserve the per-PR commit history — reviewer's call. The propose/plan docs can move to propose/completed/ / plans/completed/ on merge (per AGENTS.md hygiene; tracked in the plan).

🤖 Generated with Claude Code

HumanBean17 and others added 21 commits July 4, 2026 21:07
Adds propose/JRAG-CLI-PROPOSE.md and plans/active/PLAN-JRAG-CLI.md for a
new `jrag` agent-facing CLI: a thin compose-and-render layer over
resolve_v2 + the MCP v2 handlers + LadybugGraph, internalizing resolve so
every command takes a human identifier (FQN / simple name / route path /
topic), never a raw node id.

9 PRs (0a, 0b, 1a, 1b, 2, 3a, 3b, 4, 5); in-process (no daemon); no
ontology bump / re-index. Plan is grounded against current source and was
revised after a 5-subagent adversarial review (6 blockers + ~7 highs
folded in: offset un-globalized, overrides/overridden-by direction fixed,
resolve_operator_config reuse, fd-limit, pydantic->dict boundary, PR-5
completeness).

Co-Authored-By: Claude <noreply@anthropic.com>
Lift the resolve pipeline out of mcp_v2.py into a transport-agnostic,
neutral-named root module so the CLI's resolve-first layer imports
resolve_service and cannot silently re-implement the pipeline.

Moved to resolve_service.py:
- resolve_v2 + ResolveOutput + ResolveCandidate + ResolveStatus
- All resolve-only private helpers (validate/parse/collect/dedupe/rank/finalize)
- Resolve-only constants (_RESOLVE_*, _*_RESOLVE_RETURN projections)

Moved to graph_types.py (neutral shared module, breaks the load-time cycle
that the plan's "NodeRef stays in mcp_v2" line would have created):
- NodeRef (used by Edge.other, describe_v2, and ResolveCandidate/Output)
- StructuredHint (needed by _to_structured_hints)
- Shared helpers: _hints_or_skip, _node_ref_from_row, _resolve_node_kind,
  _node_kind_from_id, _to_structured_hints, set_hints_enabled +
  the _hints_enabled flag (single source of truth — previously duplicated,
  which was a latent bug since server.py only set it on mcp_v2)

mcp_v2.py re-exports resolve_v2/ResolveOutput/ResolveCandidate/ResolveStatus
(from resolve_service) and NodeRef/StructuredHint/set_hints_enabled
(from graph_types) so every existing importer is unchanged. Zero MCP SDK
imports in any of the three modules. No call site changed.

Co-Authored-By: Claude <noreply@anthropic.com>
Ships the frozen foundation every later JRAG-CLI PR builds on:

- `java_codebase_rag/jrag_envelope.py` - lean `@dataclass` Envelope (not pydantic),
  `resolve_query` (resolve-first mapper: one->proceed+file_location, many->candidates
  capped at 10 with reason, none->not_found with `jrag search` hint; auto-pick forbidden),
  `normalize_enum` (+ explicit lookup tables for client_kind/producer_kind/source_layer
  confirmed against java_ontology/graph_enrich), `mark_truncated` (+1-fetch helper),
  `simple_name` (fqn.rsplit('.', 1)[-1] - NodeRef has no `name`), `to_envelope_rows`
  (pydantic->dict boundary via `.model_dump()` once).
- `java_codebase_rag/jrag_render.py` - fresh text renderer (listing omits FQN, traversal
  shows `root:` + edge rows with `conf:` only on CALLS/HTTP_CALLS/ASYNC_CALLS, inspect
  renders ALL keys alphabetical, ambiguous carries reason/no file, scalar fallback),
  `tiered_name` (simple name -> name @service -> FQN), truncated hints
  ("narrow your query" vs "use --offset <N>").
- `java_codebase_rag/jrag.py` - `build_parser` (no global `--offset` - added only to
  find/search in PR-1b/PR-4; only `status` registered in 1a), `_resolve_cfg`
  (reuses cocoindex-free `resolve_operator_config` + `apply_to_os_environ`),
  `_load_graph` (exists check -> `_IndexNotFound`; ontology-mismatch RuntimeError ->
  `_IndexStale`; both surface as actionable envelopes), `main` (raise_fd_limit first;
  argparse.ArgumentError -> exit 1, handler exception -> exit 2 with status:error
  envelope to stdout AND traceback to stderr - deliberate divergence from operator
  CLI's traceback-swallowing), `_console_script_main` (os._exit wrapper for the
  lancedb/pyarrow worker-thread teardown race), `_cmd_status`.
- `pyproject.toml` - `[project.scripts] jrag = "java_codebase_rag.jrag:_console_script_main"`.
- `README.md` - "## jrag (agent CLI, preview)" subsection.
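
A minimal sketch of the exit-code convention described for `main` above (usage error -> exit 1; handler exception -> exit 2 with a status:error envelope on stdout and the traceback on stderr). The `handlers` mapping and the envelope keys are assumptions for illustration; the real `main` also raises the fd limit and loads config first.

```python
import argparse
import json
import sys
import traceback


def main(argv: list, handlers: dict) -> int:
    """Hypothetical main(): usage errors return 1, handler failures return 2."""
    # exit_on_error=False (Python 3.9+) makes argparse raise ArgumentError
    # for errors such as an invalid choice instead of exiting itself.
    parser = argparse.ArgumentParser(prog="jrag", exit_on_error=False)
    parser.add_argument("command", choices=sorted(handlers))
    try:
        args = parser.parse_args(argv)
    except argparse.ArgumentError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    try:
        envelope = handlers[args.command]()
    except Exception as exc:
        # Envelope to stdout for the agent, traceback to stderr for the human.
        traceback.print_exc(file=sys.stderr)
        print(json.dumps({"status": "error", "message": str(exc)}))
        return 2
    print(json.dumps(envelope))
    return 0
```

Keeping the envelope on stdout even on failure means an agent parsing `--format json` output always gets a machine-readable result.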

Lazy-import invariant: `build_parser()` imports no backend modules - the sentinel
`python -c "import java_codebase_rag.jrag as j; j.build_parser()"` loads no
torch/sentence_transformers/mcp_v2. Verified.
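
One way such a sentinel could be checked, sketched under the assumption that the test inspects `sys.modules` after calling `build_parser()`; the helper name `heavy_modules_loaded` is hypothetical and not part of the repo.

```python
import sys

# Module names taken from the invariant stated above.
FORBIDDEN = ("torch", "sentence_transformers", "mcp_v2")


def heavy_modules_loaded(loaded=None, forbidden=FORBIDDEN) -> list:
    """Return forbidden modules (or their submodules) present in `loaded`.

    `loaded` defaults to the names currently in sys.modules.
    """
    names = list(sys.modules) if loaded is None else loaded
    return sorted(
        name for name in names
        if any(name == f or name.startswith(f + ".") for f in forbidden)
    )
```

Matching on `name == f` or the `f + "."` prefix avoids false positives on unrelated packages that merely share a prefix, such as `torchvision`.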

Tests: 42 focused tests across test_jrag_{envelope,render,status}.py - all named
tests 1-22 from the brief plus extras (mark_truncated boundaries, simple_name on
pydantic-via-boundary, tiered_name tiers, status text-format envelope,
--offset rejected on status subparser and before subcommand). ruff clean.
Operator CLI tests still pass 62/62 (no regressions).

Co-Authored-By: Claude <noreply@anthropic.com>
…spatch (PR-JRAG-1a)

Addresses review Important findings + trivially-correct Minors on top of bdfc670.

Important #1 - honest offset-not-global coverage
- Deleted `test_offset_is_not_a_global_flag` (`jrag callers --offset 5`):
  `callers` is not registered in 1a, so argparse rejected the *subcommand*
  before seeing `--offset`. The test passed for the WRONG reason and would
  not catch a regression that added `--offset` to the `common` parent parser.
  The contract is honestly covered by three siblings:
    * `test_offset_not_accepted_on_status_subparser` (status IS registered)
    * `test_offset_not_accepted_before_subcommand` (the key "not on parent"
      test - `jrag --offset 5 status`)
    * `test_jrag_help_lists_status_subcommand` (asserts `--offset` absent
      from `--help`).

Important #2 - stop overloading `edge_summary` as the inspect-shape signal
- Approach (b): made the dispatch signal STRUCTURAL, not name-based.
  `_render_text_shape` now routes to `_render_inspect` when any node carries
  any dict-typed value; `_render_inspect` renders ANY dict-typed value as
  an indented alphabetical section (not just `edge_summary`). `edge_summary`
  is no longer special - it is reserved for PR-JRAG-3 real edge data and is
  one of many possible section sources.
- `_cmd_status` no longer stuffs `{counts, edges}` under `edge_summary`;
  they are top-level dict-valued fields on the index node. Test updated to
  read `index["counts"]` (not `index["edge_summary"]["counts"]`) and assert
  `edge_summary` is NOT populated by status.

Minors:
- jrag_render.py: removed dead `if TYPE_CHECKING: pass` block + unused
  `TYPE_CHECKING` import.
- jrag_envelope.py `to_envelope_rows`: replaced `dict(item)` fallback with
  `raise TypeError(...)` so a non-pydantic/non-dict item surfaces a
  backend-contract bug instead of silently coercing.
- jrag_render.py `_render_text_shape`: added a one-line precedence comment
  documenting that `ok + root` wins over `ok + nested-dict nodes` by design.
- jrag_envelope.py `to_dict`: copy semantics now uniform - all collection
  fields shallow-copied (nodes/edges/candidates/agent_next_actions/warnings)
  so the returned dict is a stable snapshot at call time.
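
The stricter `to_envelope_rows` behavior from the Minors above might look roughly like this. It is a sketch that duck-types on `model_dump` rather than importing pydantic, so the body is an assumption about shape, not the shipped code.

```python
def to_envelope_rows(items):
    """Convert backend results to plain dicts once, at the boundary."""
    rows = []
    for item in items:
        if isinstance(item, dict):
            rows.append(item)
        elif hasattr(item, "model_dump"):  # duck-typed pydantic v2 model
            rows.append(item.model_dump())
        else:
            # Surface a backend-contract bug instead of silently coercing.
            raise TypeError(f"unsupported row type: {type(item).__name__}")
    return rows
```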

Verification:
- Focused: 41/41 pass (down 1 from 42 - the misleading test, expected).
- ruff clean.
- Sentinels green (no mcp / cocoindex import in jrag*.py; build_parser
  loads no torch/sentence_transformers/mcp_v2).
- Smoke: `jrag status` against empty index -> actionable `error: ...` envelope,
  exit 2 (unchanged).

Co-Authored-By: Claude <noreply@anthropic.com>
Re-review flagged a real foot-gun in the structural dispatch I chose in
6c3b58e: "any dict-valued node field -> inspect" fires before the listing
fallback, so a listing envelope whose nodes carry ANY nested-dict field
after .model_dump() (Symbol nodes WILL: source_range, annotations,
capabilities, metadata, etc.) silently routes the whole listing to
_render_inspect, which renders FQN alphabetically - exactly what
_render_listing is contractually forbidden from doing. Silent mis-render,
no error, on a frozen contract.

The fix - make inspect dispatch EXPLICIT:

- render() gains shape: str | None = None. Passing shape="inspect" routes
  to _render_inspect; None falls back to structural inference
  (root -> traversal, nodes/noun -> listing, else scalar). The fuzzy
  any(isinstance(v, dict) for v in n.values()) predicate is GONE.
- _cmd_status now passes shape="inspect" (its rendered output is unchanged
  - the status test still asserts index["ontology_version"] == 17 and
  index["counts"] is non-empty).
- test_render_inspect_edge_summary_alphabetical now passes shape="inspect"
  explicitly (no longer relies on the structural predicate).
- New regression test test_render_listing_with_dict_valued_node_does_not_route_to_inspect
  constructs a listing node with realistic dict-valued fields (annotations,
  source_range - the exact Symbol shape that triggered the foot-gun) and
  asserts the listing contract holds (FQN omitted, only name + @service).
  Verified it would have FAILED under the old any-dict predicate.
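
The explicit dispatch can be illustrated with a small sketch. `pick_renderer` is a hypothetical stand-in for the routing inside `render()`; it returns a label instead of calling a renderer, and the envelope is assumed to be a plain dict.

```python
from __future__ import annotations


def pick_renderer(envelope: dict, shape: str | None = None) -> str:
    """Explicit shape wins; otherwise infer from envelope structure."""
    if shape == "inspect":
        return "inspect"
    if envelope.get("root") is not None:
        return "traversal"
    if envelope.get("nodes") or envelope.get("noun"):
        return "listing"
    return "scalar"


# A listing whose node carries a dict-valued field stays a listing.
listing = {"nodes": [{"name": "Foo", "annotations": {"Service": {}}}]}
assert pick_renderer(listing) == "listing"
assert pick_renderer(listing, shape="inspect") == "inspect"
```

Because node contents are never inspected, adding nested fields to a node cannot change which renderer runs.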

Minors:
- Envelope.to_dict docstring tightened to match behavior. Top-level
  collection fields are shallow-copied, but node/candidate VALUES are shared
  references - mutating a node dict in place DOES propagate to a prior
  snapshot. Docstring now says so and points to copy.deepcopy for callers
  needing true isolation (envelope is short-lived in practice, so shared
  references are not a hazard).
- to_envelope_rows grep confirmed safe: only 2 callers (both in tests,
  passing pydantic NodeRef or plain dict). The TypeError raise on unknown
  types is safe - no production callers in 1a.
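
The copy semantics noted for `Envelope.to_dict` can be demonstrated with a stripped-down stand-in. This `Envelope` has only two fields and is an assumption for illustration, not the real dataclass.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Envelope:
    status: str = "ok"
    nodes: list = field(default_factory=list)

    def to_dict(self) -> dict:
        # Shallow copy: a new list, but the node dicts inside are shared.
        return {"status": self.status, "nodes": list(self.nodes)}


env = Envelope(nodes=[{"fqn": "a.B"}])
snap = env.to_dict()
isolated = copy.deepcopy(snap)

env.nodes.append({"fqn": "a.C"})  # a new element does not reach the snapshot
env.nodes[0]["fqn"] = "changed"   # an in-place edit does reach `snap`

assert len(snap["nodes"]) == 1
assert snap["nodes"][0]["fqn"] == "changed"
assert isolated["nodes"][0]["fqn"] == "a.B"
```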

Verification:
- Focused: 42/42 pass (41 prior + 1 new regression test). ruff clean.
- Sentinels green (no mcp/cocoindex import in jrag*.py; build_parser loads
  no torch/sentence_transformers/mcp_v2; jrag --help in 0.02s).
- Smoke: jrag status against missing index -> actionable error envelope,
  exit 2 (unchanged).

Co-Authored-By: Claude <noreply@anthropic.com>
Implemented find (query mode + filter mode) and inspect commands:

- find has two modes:
  - Query mode (positional <query>): calls g.find_by_name_or_fqn with
    fuzzy fallback and post-filters for role/java-kind/annotation/etc.
  - Filter mode: builds NodeFilter from flags and calls find_v2.
  - Kind inference from domain flags (--http-method→route, etc.)
  - Contradiction detection (error envelope when domain flags conflict)
  - --offset support in filter mode only; limit capped at 499.

- inspect:
  - resolve-first via resolve_query (one/many/none contract)
  - describe_v2 for full node details + edge_summary
  - file_location populated from resolve_query

- Added next_actions_hook no-op stub to jrag_envelope.py (filled by PR-4).

- All 13 tests pass (test_jrag_locate.py) covering exact/fuzzy/find/
  filter/offset/limit/inspect/edge_summary/file_location.

Co-Authored-By: Claude <noreply@anthropic.com>
Review fixes for PR-JRAG-1b:

- Query mode + non-symbol kind (explicit OR inferred from --http-method/
  --client-kind/--producer-kind/etc.) now returns status:error (exit 2)
  explaining find_by_name_or_fqn is Symbol-only, telling the user to drop
  the positional <query> and use filter mode. Previously silently empty.
- Removed --fuzzy flag from the find subparser + the unimplementable
  exact→prefix→contains fallback (find_by_name_or_fqn is exact-only;
  NodeFilter only has fqn_prefix). Added to PLAN-JRAG-CLI Out of scope.
- --framework/--source-layer in query mode now surface a warnings[] entry
  instead of being silently dropped (SymbolHit lacks those fields).
- Cleaned up query-mode kind_map: only symbol sub-kinds from --java-kind
  remain; route/client/producer entries removed (would never match Symbols).
- test_inspect_returns_edge_summary_with_composed_keys now inspects
  ChatAssignmentPort#requestAssignment and asserts OVERRIDDEN_BY composed
  key present with out>0 (was only checking edge_summary is a dict).
- Added comment on the dict→list→truncate→dict flow in _cmd_find_query_mode.

Tests: removed test_find_fuzzy_falls_back_to_prefix; added
test_find_query_mode_with_non_symbol_kind_returns_error and
test_find_query_mode_framework_and_source_layer_warn. 56 passed, ruff clean.

Co-Authored-By: Claude <noreply@anthropic.com>
…helper

Review follow-up on top of ee71fff. Four fixes:

1. (Critical) listeners --topic-prefix now filters for real. The previous
   implementation was a documented no-op stub (SymbolHit carries no topic).
   Edge-model investigation showed the listener->topic path is:
     listener_class -[:DECLARES]-> listener_method -[:EXPOSES]-> Route(topic)
   Resolved via a focused graph._rows() Cypher lookup (same pattern as
   jrag_envelope._node_file_location). neighbors_v2 was infeasible here:
   ASYNC_CALLS run Producer->Route (outbound from producers, so direction="in"
   on producers yields nothing), and the EXPOSES Route's topic property is
   not projected onto the returned NodeRef.

2. (Important) Extracted shared helpers to kill 7x scaffolding duplication:
   _load_graph_or_error (cfg/load/error frame), _clamped_limit (limit clamp),
   _render_listing (+1-fetch truncation + envelope + render),
   _symbol_hit_to_dict (SymbolHit->dict). routes/clients/producers/jobs/entities
   now share the frame; topics and listeners stay bespoke (compose-heavy).

3. (Important) topics now emits a warning when producers lack a topic
   ("N producer(s) had no topic and were excluded") so the empty-topic case
   is distinguishable from the no-producers case.

4. (Minor) Lifted the consumer-fetch limit=100 magic number to the named
   module constant _CONSUMER_FETCH_LIMIT (200), reused by topics --consumer-in
   and the listeners pre-filter fetch.
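
The limit clamp and +1-fetch truncation shared by the listing helpers in item 2 can be sketched as below. The function names drop the leading underscore and the bounds are passed in explicitly, since the real limits are not restated here; treat both as assumptions.

```python
def clamped_limit(requested: int, *, lo: int, hi: int) -> int:
    """Clamp a user-supplied --limit into [lo, hi]."""
    return max(lo, min(requested, hi))


def mark_truncated(rows: list, limit: int) -> tuple:
    """+1-fetch: the caller asks the backend for limit + 1 rows.

    An extra row proves more results exist without a separate count query.
    """
    return rows[:limit], len(rows) > limit
```

A caller would fetch `limit + 1` rows, pass them to `mark_truncated`, and set the envelope's truncated flag from the second return value.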

New test: test_listeners_topic_prefix_narrows asserts the filter narrows
the listener set (3 -> 1) and matches the known fixture pair
(ComplianceReviewListener consumes 'banking.chat.compliance.review').

Co-Authored-By: Claude <noreply@anthropic.com>
…LS-in)

topics --consumer-in was shipping the same silent-wrong-results defect just
fixed for listeners: it traversed ASYNC_CALLS inbound to Producer nodes,
which is the wrong edge model (ASYNC_CALLS run Producer -> Route per
java_ontology.py:415-416), so it returned empty on every graph.

A consumer of a topic IS a listener, so the EXPOSES-based resolver proved
on listeners is the correct path. Three changes:

1. Generalized the listener-topic resolver into _resolve_topic_consumers
   (graph, *, topic, microservice=None, prefix=False) -> list[dict]. Returns
   consumer dicts with id/fqn/kind/microservice. _listener_ids_for_topic_prefix
   is now a thin wrapper that intersects the result with the pre-fetched
   SymbolHit ids.

2. Rewired topics --consumer-in: for each producer-grouped topic, call
   _resolve_topic_consumers(topic=topic_name, microservice=consumer_in) for
   an exact-match lookup. Removed the dead neighbors_v2(producer_ids, "in",
   ["ASYNC_CALLS"]) call and the misleading "retained for graphs where
   inbound ASYNC_CALLS exist" comment.

3. Made the test non-vacuous: test_topics_consumer_in_resolves_consumers_via_exposes
   replaces the rc==0-only test. Fixture reality: no producer topic literal
   overlaps a listener topic literal (unresolved constants vs resolved
   literals), so the test calls _resolve_topic_consumers directly on the known
   resolved pair: topic='banking.chat.compliance.review' + microservice='chat-core'
   must return ComplianceReviewListener (also verified for prefix='banking.chat').
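
To make the edge-direction argument concrete, here is a toy in-memory version of the resolver. The keyword signature mirrors the one quoted above, but the first argument is an edge list rather than a graph, and all node data is invented for illustration (it is not the bank-chat fixture).

```python
def resolve_topic_consumers(edges, *, topic, microservice=None, prefix=False):
    """Toy resolver over (src, edge_type, dst) tuples of node dicts.

    Consumers reach a topic via listener_method -[:EXPOSES]-> Route(topic).
    """
    def topic_matches(route):
        name = route.get("topic", "")
        return name.startswith(topic) if prefix else name == topic

    consumers = []
    for src, edge_type, dst in edges:
        if edge_type != "EXPOSES" or not topic_matches(dst):
            continue
        if microservice is not None and src.get("microservice") != microservice:
            continue
        consumers.append(src)
    return consumers


# Illustrative data only.
route = {"kind": "route", "topic": "orders.created"}
listener = {"fqn": "ExampleListener#handle", "microservice": "svc-a"}
producer = {"kind": "producer", "microservice": "svc-b"}
edges = [(listener, "EXPOSES", route), (producer, "ASYNC_CALLS", route)]

# ASYNC_CALLS points Producer -> Route, so nothing is inbound to the producer.
inbound_to_producer = [e for e in edges if e[1] == "ASYNC_CALLS" and e[2] is producer]
```

In this toy graph `inbound_to_producer` is empty, which is the same reason the old inbound-ASYNC_CALLS traversal returned nothing.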

Co-Authored-By: Claude <noreply@anthropic.com>
…, not ASYNC_CALLS)

Co-Authored-By: Claude <noreply@anthropic.com>
Add 11 resolve-first traversal subcommands to jrag: callers, callees,
hierarchy, implementations, subclasses, overrides, overridden-by,
dependents, impact, decompose, flow. Each resolves via resolve_query,
calls a LadybugGraph method (or neighbors_v2 for the override axis),
then renders via the traversal shape (envelope.root + edge rows).
--offset is registered on no traversal subparser.

Verified every backend signature against source (ladybug_queries.py /
mcp_v2.py / java_ontology.py). Two brief adaptations:
  * find_implementors DOES accept a capability kwarg (brief claimed
    otherwise) -- --capability is PUSHED DOWN on `implementations`.
  * CALLS edges are intra-CODEBASE not intra-SERVICE (ontology:286) --
    flow help/test 15 frame it as a data property, validated by the
    presence of cross-service chat-assign->chat-core endpoints.
OVERRIDES direction confirmed: overrider -> declaration, so out=dispatch
UP (overrides) and in=dispatch DOWN (overridden-by) -- brief correct.
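
The OVERRIDES direction can be illustrated with a toy edge list. The `neighbors` helper here is a hypothetical simplification of a one-hop lookup, not the `neighbors_v2` signature, and the method names are invented.

```python
def neighbors(edges, node, direction, edge_type):
    """One-hop lookup over (src, edge_type, dst) string tuples."""
    if direction == "out":
        return [dst for src, t, dst in edges if t == edge_type and src == node]
    if direction == "in":
        return [src for src, t, dst in edges if t == edge_type and dst == node]
    raise ValueError(f"unknown direction: {direction!r}")


# Edge runs overrider -> declaration.
edges = [("Impl#run", "OVERRIDES", "Base#run")]

overrides = neighbors(edges, "Impl#run", "out", "OVERRIDES")      # dispatch UP
overridden_by = neighbors(edges, "Base#run", "in", "OVERRIDES")   # dispatch DOWN
```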

--service is a client-side post-filter + warnings[] on callers-Route
(find_route_callers ignores microservice once route_id set) and on
impact (impact_analysis has no microservice param). --include-external
symmetric on callers/callees; --depth clamped 1..3 on decompose,
--max-hops clamped 1..8 on flow.

Fix: ambiguous resolve (rc=0) no longer falls through to the backend;
all 11 handlers check `if rrc or node is None:`.

tests/test_jrag_traversal_direct.py: 17 named tests (bank-chat fixture).
Focused jrag suite (6 files): 86 passed. ruff clean. Lazy-import
invariant holds (build_parser loads no heavy modules).

Co-Authored-By: Claude <noreply@anthropic.com>
…rnings

Address review Approved-with-fixes (6 fixes on top of 4f803e8):

Fix 1 (Important): _render_traversal now honors the `direction` and `stage`
fields it was already passed. Edges carrying `direction` (hierarchy) render
under `↑ supertypes:` / `↓ subtypes:` headers; edges carrying `stage`
(decompose) render under `stage 0 (seed):` / `stage N (role):` headers.
Other traversals stay flat (grouping is conditional on the field being
present). Factored out _format_edge_line to deduplicate the per-row format.

Fix 2 (Important): tests 5 (hierarchy) and 14 (decompose) now assert the
RENDERED structure in --format text (the ↑/↓ and `stage N:` headers must
appear in stdout), not just JSON data presence. Previously they'd pass even
if the renderer never emitted a tree/waterfall.

Fix 3 (Important): --service/--module on hierarchy/overrides/overridden-by/
flow now emit a warnings[] entry explaining the flag isn't applied (structural
edges / no microservice predicate), via shared _warn_unapplied_scope helper.
Plan principle "inapplicable flags never silently ignored" satisfied.

Fix 4 (Important): decompose --limit (when set to non-default) emits a warning
("--limit does not apply to decompose; use --max-stage to cap per-stage
breadth") instead of silently dropping it. Default --limit stays silent.

Fix 5 (Important): hierarchy now applies the limit PER DIRECTION (limit up +
limit down) instead of on the combined list, so a full `up` can no longer
starve `down` behind truncated=True.
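
A small sketch of why the per-direction limit matters. Both functions are hypothetical reductions of the hierarchy handler, shown side by side for contrast.

```python
def limit_combined(up: list, down: list, limit: int) -> list:
    """Earlier behavior: one limit over the concatenated list."""
    tagged = [("up", x) for x in up] + [("down", x) for x in down]
    return tagged[:limit]


def limit_per_direction(up: list, down: list, limit: int) -> tuple:
    """Fixed behavior: each direction gets its own limit."""
    truncated = len(up) > limit or len(down) > limit
    return up[:limit], down[:limit], truncated
```

With three supertypes, one subtype, and a limit of 2, the combined form returns only supertypes, while the per-direction form still returns the subtype.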

Fix 6 (Minor): DRY'd the 11x kind-guard into _require_kind(node, *,
expected, kinds, args, hint="") -> int | None. Applied to 9 handlers
(callers keeps its inline Symbol-or-Route dispatch guard).

Added test 18 (test_inapplicable_flags_emit_warnings) covering Fixes 3+4.
Focused jrag suite: 87 passed. ruff clean. Lazy invariant holds.

Co-Authored-By: Claude <noreply@anthropic.com>
…-JRAG-3b)

Add five command surfaces on top of PR-JRAG-3a, all resolve-first (or, for
connection, an explicitly-documented resolve-first EXCEPTION):

  * callees Client/Producer variant — Client root dispatches via
    neighbors_v2([id],"out",["HTTP_CALLS"]) reaching :Route; Producer root
    via neighbors_v2([id],"out",["ASYNC_CALLS"]) reaching the kafka_topic
    :Route (NOT :Producer). Symbol path is unchanged from 3a. --include-external
    is a warned no-op on the Client/Producer path (edges are to :Route, which
    is always in-graph).
  * dependencies — neighbors_v2([id],"out",["INJECTS"]) = types this class
    injects (Symbol -> Symbol). --service/--module warned (structural edge).
  * connection <microservice> — multi-section inbound:/outbound: view. First
    positional is a microservice NAME (resolve_v2 NEVER run on it). Inbound =
    list_clients(target_service=svc) + find_route_callers on this service's
    listener EXPOSES topic Routes; outbound = list_clients(microservice=svc) +
    list_producers(microservice=svc). --inbound/--outbound/--both (default both),
    --http-method, --calls-service. Synthetic microservice root so the
    traversal-shape + section-grouped rendering fire.
  * outline <file> — find_symbols_in_file_range(start_line=1, end_line=2**31-1).
    start_line=1 is mandatory (backend returns [] for start_line<1). Unbounded
    (no --limit cap); missing file is graceful.
  * imports <file> — tree-sitter parse via ast_java.parse_java, walk
    explicit_imports (dict: simple_name -> FQN), resolve each FQN via resolve_v2.
    Static/wildcard imports rendered as unresolved_import rows. File resolved
    via cfg.source_root / file_arg (or absolute path).

Renderer extension: _render_traversal gained a third grouping branch keyed on
edge["section"] (alongside stage and direction), emitting inbound:/outbound:
headers for connection. Chosen over a new shape per the global-context guidance.
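
The conditional grouping can be sketched as follows, assuming edges are plain dicts with a hypothetical `target` field; the real `_render_traversal` formats far more than this.

```python
def render_edges(edges, group_key=None):
    """Render edges flat, or under `<group>:` headers when `group_key` is present."""
    if group_key is None or not any(group_key in e for e in edges):
        return [f"  {e['target']}" for e in edges]
    groups = {}  # dict preserves first-seen group order
    for e in edges:
        groups.setdefault(e.get(group_key, "other"), []).append(f"  {e['target']}")
    lines = []
    for name, rows in groups.items():
        lines.append(f"{name}:")
        lines.extend(rows)
    return lines
```

Traversals whose edges lack the key fall through to the flat branch, so adding a grouping key for one command leaves the others unchanged.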

Backend signatures + edge directions verified against source at PR-JRAG-3b
time: HTTP_CALLS=Client->Route (java_ontology.py:352), ASYNC_CALLS=Producer->
Route (386), INJECTS=Symbol->Symbol (216), EXPOSES=Symbol->Route (294). All
four brief edge claims confirmed correct.

Tests: tests/test_jrag_traversal_compose.py ships 12 named tests (bank-chat
fixture). The full focused jrag suite (99 tests across 7 files) passes serially
in 62s; ruff clean. jrag --help stays fast (22ms) and loads zero heavy modules
(torch/sentence_transformers/mcp_v2/ladybug_queries/cocoindex/resolve_service/
ast_java).

Minor seed correction in tests (not a brief gap): ConfigurableChatAssignment
injects ChatEngineProperties, NOT ChatAssignmentPort. ClientMessageProcessor
is the class that injects ChatAssignmentPort; test 3 uses it as the seed.

Co-Authored-By: Claude <noreply@anthropic.com>
…orts text marker (PR-JRAG-3b)

Address PR-JRAG-3b review (3 Important + bundled Minors). No Critical; the
functional correctness from the initial commit was strong (signatures verified,
section renderer guard explicit/safe). Follow-up on top of 1e5241f.

Fix 1 (Important): connection default direction is now --inbound (brief-faithful).
- Handler: `direction = getattr(args, "direction", None) or "inbound"` (was "both").
- Subparser description rewritten so it's internally consistent (default
  --inbound; --outbound / --both are opt-ins; the previous text contradicted
  itself between "Inbound (default)" and "no direction flag renders both").
- Help text on --inbound/--outbound/--both updated to state the default clearly.
- Test 6 (test_connection_both_default) rewritten to assert default == --inbound
  (default MUST equal explicit --inbound; --both is the explicit opt-in).

Fix 2 (Important): --calls-service outbound loophole tightened.
- Previous predicate `(target_service == calls_service) or not target_service`
  matched unresolved clients (empty target_service, e.g. AuditLogClient);
  silent-wrong-results. Split into two predicates:
  * Clients: STRICT `_calls_service_match_out_client` — `target_service ==
    calls_service` exactly. Unresolved clients are EXCLUDED.
  * Producers: kept (no service target on ASYNC channels) with ONE warnings[]
    entry: "--calls-service does not filter producers (...); N producer(s) kept
    visible". The warning fires only when producers_bypass_calls_service and
    there are producers to keep.
- Help text on --calls-service and --http-method now states they apply to HTTP
  callers only and producers bypass with a warning.
- N+1 producer fetch in the inbound-async loop cached per caller_microservice.
- New test 13 (test_connection_calls_service_outbound_excludes_unresolved_clients)
  asserts: chat-core clients KEPT, AuditLogClient EXCLUDED, producers KEPT
  with the warning.
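
The split predicate from Fix 2 might look roughly like this. The function names and row shape are assumptions, and the warning string paraphrases the one quoted above (its parenthetical is elided in the source and is not reproduced).

```python
def client_matches(client: dict, calls_service: str) -> bool:
    """Strict: an unresolved (empty) target_service never matches."""
    return client.get("target_service") == calls_service


def filter_outbound(clients, producers, calls_service):
    """Filter clients strictly; keep producers but say so in warnings."""
    kept = [c for c in clients if client_matches(c, calls_service)]
    warnings = []
    if producers:
        warnings.append(
            "--calls-service does not filter producers; "
            f"{len(producers)} producer(s) kept visible"
        )
    return kept + list(producers), warnings
```

The old `or not target_service` clause is what let unresolved clients through; dropping it makes an empty target a non-match rather than a wildcard.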

Fix 3 (Important): text-mode imports distinguishes resolved vs unresolved.
- _render_listing now appends " (unresolved)" to nodes with kind ==
  "unresolved_import". Without this marker, resolved Symbols and unresolved
  placeholders rendered identically in text mode (only JSON distinguished).
- New test 14 (test_imports_text_mode_marks_unresolved) asserts the marker
  appears on Spring imports (unresolved) and NOT on JoinOperatorRequest
  (resolved graph Symbol).

Fix 4 (Minors, bundled):
- Test 3 docstring corrected to match the seed (ClientMessageProcessor injects
  ChatAssignmentPort, NOT ConfigurableChatAssignment).
- imports --limit warning mirrored from outline (warns when --limit is set
  away from the default; the full import block is always returned).
- dependencies --include-external flag ADDED for surface symmetry with
  callers/callees; the handler emits a warning (INJECTS is structural Symbol
  -> Symbol with no external-exclusion analog).
- Help text rewritten to document the --calls-service / --http-method HTTP-only
  scope and the producer-bypass warning.

Tests: 14/14 in test_jrag_traversal_compose.py (12 + 2 new); focused jrag suite
101/101 (was 99 + 2 new) passes serially in 65s; ruff clean; build_parser
loads zero heavy modules.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…-JRAG-4)

Co-Authored-By: Claude <noreply@anthropic.com>
…RAG-5)

Add Surface (Literal['mcp','cli']) branching to the installer so
'java-codebase-rag install --surface cli' deploys the jrag CLI skill +
subagent instead of the MCP entry. Fixes the update regression where a
CLI-only install was invisible (no MCP entry to scan → fatal exit 2).

Installer:
  - Surface + ConfiguredHost NamedTuple (host, scope, surface)
  - ARTIFACT_MANIFEST single source iterated by deploy_artifacts and
    refresh_artifacts; 'surface="mcp"' keyword-only default preserves the
    6 direct-call sites in test_installer.py
  - .java-codebase-rag.hosts marker file written at install; read by
    detect_configured_hosts (returns list[ConfiguredHost]); legacy MCP-entry
    scan + surface='mcp' fallback for pre-marker installs
  - run_update unpacks (host, scope, surface) and passes surface= to refresh
  - resolve_mcp_command surface-conditional: cli resolves 'jrag' via
    shutil.which and skips the MCP-binary SystemExit(2); prompt + display
    name parameterized via _surface_binary
  - select_surface wizard step + --surface flag; handle_rerun prefill from
    marker via _prior_surface_from_marker
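
A sketch of the 3-field `ConfiguredHost` and the surface='mcp' fallback for older entries. The on-disk format of .java-codebase-rag.hosts is not described in this PR, so the JSON layout and the `parse_hosts_marker` helper below are purely assumptions for illustration.

```python
import json
from typing import Literal, NamedTuple

Surface = Literal["mcp", "cli"]


class ConfiguredHost(NamedTuple):
    host: str
    scope: str
    surface: Surface


def parse_hosts_marker(text: str) -> list:
    """Assumed format: a JSON list of {host, scope, surface?} objects.

    Entries without a surface fall back to "mcp".
    """
    return [
        ConfiguredHost(e["host"], e["scope"], e.get("surface", "mcp"))
        for e in json.loads(text)
    ]
```

Because it is a NamedTuple, callers can unpack `(host, scope, surface)` positionally, which matches how `run_update` is described as consuming it.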

CLI skill + subagent (canonical at dev root, synced via PR-JRAG-0a):
  - skills/explore-codebase-cli/SKILL.md (jrag shell vocabulary)
  - agents/explorer-rag-cli.md (Claude Code subagent driving jrag)
  - scripts/sync_agent_artifacts.py SYNC_MAP extended with the new skill
  - install_data copies byte-equal to dev source (drift gate green)

Tests + docs:
  - tests/test_installer_surface.py: 10 named tests + 3 supplementary
  - tests/test_installer.py: 3 TestDetectConfiguredHosts cases migrated
    from 2-tuple unpack to the 3-field ConfiguredHost NamedTuple
  - tests/test_agent_skills_static.py: EXPECTED_SKILL_DIRS adds
    explore-codebase-cli; TestCliSkillFrontmatter validates the new skill's
    frontmatter (MCP-vocabulary tests stay scoped to explore-codebase)
  - tests/test_install_data_sync.py: synthetic temp workspaces seed the
    new skills/explore-codebase-cli source dir
  - AGENTS.md, skills/README.md, README.md: document the two surfaces

Verification:
  - ruff check . : All checks passed
  - sync_agent_artifacts.py --check : All agent artifacts in sync
  - focused tests (5 files) : 123 passed, 2 skipped (heavy integration)

Co-Authored-By: Claude <noreply@anthropic.com>
…nstall --verbose mock (final review)

- Remove four unimplemented flags (--brief, --fields, --count, --exists) from the
  common parent parser in jrag.py. These flags were registered but never read by
  any handler, violating the "inapplicable flags never silently ignored" principle.
- Remove references to these flags from skills/explore-codebase-cli/SKILL.md and
  agents/explorer-rag-cli.md documentation.
- Fix test_cmd_install_forwards_verbose_flag mock signature to accept surface="mcp"
  (matching run_install's new signature from PR-5), and assert the default forwarding.
- Run sync_agent_artifacts.py to propagate documentation changes to install_data.

Fixes #END_OF_PLAN_FULL_SUITE
Three bugs surfaced while testing jrag, plus two findings from the fresh
4-reviewer fan-out folded in.

1. `java-codebase-rag install` never prompted for CLI-vs-MCP surface: the
   `--surface` flag defaulted to "mcp", so argparse always populated
   args.surface and select_surface returned immediately, bypassing the
   interactive wizard (and ignoring the marker prefill on re-run). Default
   is now None; interactive prompts, non-interactive still falls back to
   "mcp" inside select_surface.

2. `jrag` usage text leaked internal PR-tracking tags ("Status command
   (PR-JRAG-1a)", "PR-JRAG-3b adds Client/Producer variants") and listed
   only `status` under a stale heading. Rewritten with a grouped command
   list (health / locate / listings / traversal / orientation / search) and
   no internal tags; the callees epilog now describes semantics, not backend
   calls.

3. `jrag routes` (and clients/producers/topics) rendered blank names —
   routes carry `path`/`method`, not `fqn`, so simple_name returned "".
   New display_name() picks the identifying field per node kind
   (METHOD path / member -> topic / member -> target / fqn); used by the
   listing, tiered_name (traversal targets), and ambiguous renderers.
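
A rough illustration of the per-kind name selection from item 3. The `kind` values and the dict-shaped node are assumptions made for this sketch; only the output shapes (METHOD path / member -> topic / member -> target / fqn) come from the commit message.

```python
def display_name(node: dict) -> str:
    """Pick the identifying field for a node; the kind labels are hypothetical."""
    kind = node.get("kind", "")
    if kind == "route":
        # Routes carry path/method rather than an fqn.
        return f"{node.get('method', '')} {node.get('path', '')}".strip()
    if kind == "producer":
        return f"{node.get('member', '')} -> {node.get('topic', '')}"
    if kind == "client":
        return f"{node.get('member', '')} -> {node.get('target', '')}"
    # Everything else is identified by its fully qualified name.
    return node.get("fqn", "")
```

With this shape a route row renders as `GET /orders` instead of a blank name, because the function never looks for an `fqn` on a node kind that does not have one.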

Reviewer findings:
- installer _write_hosts_marker uses os.replace (not os.rename) so the re-run
  overwrite path works on Windows too (PR #371 fixed this pattern elsewhere).
- test_installer_surface + test_cmd_install_forwards_verbose_flag updated
  for the --surface default=None contract.
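
The `--surface` default=None contract can be sketched as follows. This is an illustration, not the installer's code: the `select_surface` parameters (`flag`, `prior`, `interactive`, `ask`) are hypothetical, and ignoring the marker prefill in non-interactive mode is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="java-codebase-rag install")
    # default=None keeps "flag omitted" distinguishable from "--surface mcp";
    # with default="mcp" the wizard step could never tell them apart.
    parser.add_argument("--surface", choices=["mcp", "cli"], default=None)
    return parser


def select_surface(flag, prior, interactive, ask=input):
    """Resolve the surface: explicit flag, then prompt, then the "mcp" fallback."""
    if flag is not None:
        return flag  # the user asked for a surface explicitly
    if not interactive:
        return "mcp"  # non-interactive fallback named in the commit message
    default = prior or "mcp"  # prefill from the marker on re-run
    answer = ask(f"Surface [mcp/cli] ({default}): ").strip().lower()
    return answer if answer in ("mcp", "cli") else default
```

Passing `ask` as a parameter keeps the prompt testable without patching `input`.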

Co-Authored-By: Claude <noreply@anthropic.com>
HumanBean17 and others added 4 commits July 4, 2026 21:43
…est debt)

Folded review fan-out (review-fresh-1..4) findings into one commit.

Code fixes (silent-wrong-output / spec-compliance):
- A: resolve_query returned `ambiguous` with empty candidates when a
  post-filter rejected every `many` candidate; now `not_found` with the
  filter-failure message (an empty ambiguous list had no narrowing value).
- B: _cmd_connection wrapped list_by_capability(MESSAGE_LISTENER) in a bare
  `except: listener_hits = []` -> silent wrong-results; now warns-and-continues
  so an empty async inbound section is distinguishable from "no listeners".
- C: find query-mode truncation was decided on the POST-filtered list, so a
  post-filter that dropped rows silently cleared `truncated`; now decided on
  the raw name/FQN fetch (limit+1), with a warning when post-filters apply
  after a capped fetch. find filter-mode didn't slice out.results -> displayed
  limit+1 rows at the boundary; now slices to limit and sets truncated from
  has_more_results OR len>limit.
- D: overrides/overridden-by set direction="up"/"down" on edge rows, tripping
  the renderer's has_direction guard -> mislabeled flat lists as
  `↑ supertypes:` / `↓ subtypes:`. Dropped the direction key (flat is correct).
- E: jrag_hints result_edges fallback emitted self-hints (after `callers` it
  suggested `jrag callers` again). Added current_command param (plumbed via
  next_actions_hook + _emit_traversal) to skip the self-hint; the inverse
  direction (`callees` after `callers`) is kept — that's the useful signal.
- F: warnings[] were JSON-only (the renderer never emitted them in text), so
  the "inapplicable flags never silently ignored" spec was unenforced for text
  consumers. The renderer now appends `warning:` lines; status/microservices/
  map/conventions/topics/impact now warn on inherited flags they don't apply.
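
Fix C can be pictured with a small sketch, assuming a hypothetical `fetch(n)` callable that returns up to `n` raw rows and a hypothetical `find_page` wrapper; the warning text is illustrative. The point is that `truncated` is computed from the raw limit+1 fetch, before any post-filter runs.

```python
def find_page(fetch, limit, post_filter=None):
    """Return one page of results with truncation decided on the raw fetch."""
    raw = fetch(limit + 1)        # +1-fetch: one extra row signals "more exist"
    truncated = len(raw) > limit  # decided BEFORE post-filtering
    rows = raw[:limit]            # never display the sentinel row
    warnings = []
    if post_filter is not None:
        rows = [row for row in rows if post_filter(row)]
        if truncated:
            # Matches beyond the cap were never fetched, so say so.
            warnings.append("post-filter applied after a capped fetch")
    return {"results": rows, "truncated": truncated, "warnings": warnings}
```

Deciding on the filtered list instead would report `truncated: False` whenever the filter happened to drop a row, even though unfetched matches may remain.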

Tests (G + pinning):
- Pin A/D/E/F with focused regression tests.
- Strengthen vacuous assertions: test_resolve_service (7/10 skipped -> use the
  ladybug_db_path fixture; tautological `status in (one,many,none)` -> per-branch
  contracts), test_jrag_listing (truncated now actually verified; client_kind
  enum must not accept un-normalized "feign"; routes must have `path`),
  test_jrag_locate (find_by_capability/annotation prove narrowing vs.
  unfiltered; inspect-ambiguous asserts each branch instead of `elif ok: pass`),
  test_jrag_orientation test 7 (overview --as vacuous `if status==ok` guard ->
  unconditional dispatch assertion).

Verified: 138 focused jrag/resolve tests green; full suite 1006 passed / 16
failed, all 16 pre-existing or fixture-pollution (4 TestPR4IndexProgress are
order-dependent pollution from test_java_codebase_rag_cli init/erase tests
running against the real corpus_root — tracked as a separate follow-up, not a
regression from this commit).

Co-Authored-By: Claude <noreply@anthropic.com>
Address every non-design finding (A-L).

- search: populate _score from _distance in the non-hybrid path (was 0.0
  for all hits; sort was already correct). Suppress search stderr noise
  (tqdm, resource_tracker warning, fail-loud diagnostic).
- resolve: auto-pick class over constructor when survivors are a class plus
  its own ctors; argless method FQNs now resolve via prefix match.
- normalize_enum: framework/java_kind -> lowercase (role/capability stay
  UPPER) so the routes --framework and search --java-kind filters work. Bad
  enum filters now return a clean status:error envelope instead of a
  traceback.
- routes: listing gains [http]/[kafka] type tags, backfills microservice
  from filename, and falls back to filename so no blank lines.
- connection: default is --both (services no longer look connectionless).
- map: gains --by {microservice,module}.
- decompose: auto-promotes a method seed to its owning type.
- inspect: renderer recurses nested dicts / list-of-dicts.
- overview: bare-arg gives a helpful error.
- tests: update the 3 tests whose assertions encoded the old buggy
  contracts.
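
A hedged sketch of the enum normalization and the clean error envelope. The signature of `normalize_enum`, the `run_with_filter` helper, and the allowed value sets in the usage line are hypothetical; the casing rules (framework/java_kind lowercase, role/capability upper) and the `status: error` outcome come from the commit message.

```python
LOWERCASE_FIELDS = {"framework", "java_kind"}
UPPERCASE_FIELDS = {"role", "capability"}


def normalize_enum(field: str, value: str, allowed: set) -> str:
    """Canonicalize the casing of an enum filter and validate it."""
    value = value.strip()
    if field in LOWERCASE_FIELDS:
        value = value.lower()
    elif field in UPPERCASE_FIELDS:
        value = value.upper()
    if value not in allowed:
        raise ValueError(f"invalid {field} {value!r}; expected one of {sorted(allowed)}")
    return value


def run_with_filter(field: str, value: str, allowed: set) -> dict:
    """Turn a bad filter into an error envelope rather than a traceback."""
    try:
        return {"status": "ok", "filters": {field: normalize_enum(field, value, allowed)}}
    except ValueError as exc:
        return {"status": "error", "message": str(exc)}
```

For example, `run_with_filter("java_kind", "CLASS", {"class", "interface"})` yields an `ok` envelope with the lowercase value, while an unknown value yields `status: error` with a message listing the accepted values.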

Co-Authored-By: Claude <noreply@anthropic.com>
…PR-JRAG-6)

Split the concern conflated by --format {text,json} into two orthogonal axes:
--format picks representation, --detail picks how much of each node/edge. Both
modes honor the same detail level through one projection seam (project_envelope),
invoked once in render() before JSON and text dispatch.

- jrag_envelope: project_node/project_edge/project_envelope + _drop_empty +
  _compose_file; category key-sets (_BRIEF/_NORMAL/full). to_dict/to_json stay
  verbatim — projection is a separate transform.
- jrag_render: render() projects once; listing shows inline module/role/file/
  score at normal, per-row block at full; edges append mechanism/all attrs.
- jrag: --detail flag (default normal); inspect + orientation subparsers
  set_defaults(detail="full"); all 45 render calls thread detail=args.detail;
  _symbol_hit_to_dict carries the full SymbolHit (file at normal, signature/
  annotations/capabilities at full).
- Fixes "text too terse" (normal shows file/score) and "json dumps 50-line
  snippet + 10 empty fields" (_drop_empty inside node dicts; snippet is full-only).
- Tests: 8 projector + 6 orthogonality/text-level tests; existing render tests
  updated for the new default. 142 jrag tests + 86 skill/operator tests green;
  ruff clean; token-budget re-pinned under normal.
- skill/agent docs document --detail; install_data synced; drift test green.
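
The single projection seam might look roughly like this. It is a sketch under assumptions: the level name "brief", the contents of the key sets, the `results` key, and the text line format are hypothetical; the function names, the `_drop_empty` step, and "project once before JSON and text dispatch" follow the commit message.

```python
import json

_BRIEF = {"name", "kind"}  # assumed key set
_NORMAL = _BRIEF | {"module", "role", "file", "score"}


def _drop_empty(node: dict) -> dict:
    """Remove None, empty strings and empty containers; keep 0 and False."""
    return {k: v for k, v in node.items() if v not in (None, "", [], {})}


def project_node(node: dict, detail: str) -> dict:
    kept = _drop_empty(node)
    if detail == "full":
        return kept  # snippet, signature, annotations are full-only
    keys = _BRIEF if detail == "brief" else _NORMAL
    return {k: v for k, v in kept.items() if k in keys}


def project_envelope(envelope: dict, detail: str) -> dict:
    projected = dict(envelope)  # shallow copy: the verbatim envelope is untouched
    projected["results"] = [project_node(n, detail) for n in envelope.get("results", [])]
    return projected


def render(envelope: dict, fmt: str = "text", detail: str = "normal") -> str:
    view = project_envelope(envelope, detail)  # projected once, before dispatch
    if fmt == "json":
        return json.dumps(view, sort_keys=True)
    return "\n".join(
        " ".join(f"{k}={v}" for k, v in sorted(n.items())) for n in view["results"]
    )
```

Because both formats read the same projected view, `--format` and `--detail` stay orthogonal: changing one never alters what the other selects.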

Co-Authored-By: Claude <noreply@anthropic.com>