Commit da99f74

HumanBean17 and claude authored
Structured MCP filter schemas, hardened value enums, tool-description rewrite (#315)
* expose structured filter schemas and rewrite mcp tool descriptions

search/find/neighbors `filter`/`edge_filter` params are now typed NodeFilter/EdgeFilter (structured, extra-forbidden) instead of opaque dict|str, so MCP clients see field-level enums and descriptions. Value taxonomies role/framework/source_layer/client_kind/producer_kind are Literal enums. Rewrite the five tool descriptions and _INSTRUCTIONS to drop internal-implementation leaks, cross-reference sibling tools, and document strict-frame failure modes. apply_auto_scope operates on NodeFilter.

Co-Authored-By: Claude <noreply@anthropic.com>

* validate client/producer kind at index time; document OTHER role split

The in-source @CodebaseHttpClient(clientKind=) / @CodebaseProducer(producerKind=) enum paths did str(val) with no validation, unlike the YAML path. Validate against VALID_CLIENT_KINDS/VALID_PRODUCER_KINDS (deferred import to avoid the java_ontology->ast_java cycle) and warn-and-ignore on unknown — this closes the last open producers, making client_kind/producer_kind a closed set safe to surface as enums.

Also document in java_ontology.py that VALID_ROLES deliberately excludes OTHER (inference-only fallback; the read-side Role enum includes it).

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: align consumer docs with closed filter enums (#315)

The structured filter schemas PR makes role/framework/source_layer/client_kind/producer_kind closed Literal enums with extra="forbid", so stale or omitted values are now hard schema errors instead of silent no-ops. Align the three consumer-facing operating manuals with the now-closed sets:

- Framework glossary: drop the non-existent `codebase_async_route` (only ever a function name, never a stored value), add the missing `feign`, drop the misleading `...` (the set is now closed/exhaustive).
- NodeFilter applicability tables: add `source_layer` to the client and producer rows (applicable per _NODEFILTER_APPLICABLE_FIELDS, previously undocumented).
- Strict-frame note: invalid enum values (e.g. wrong case) are rejected earlier at the schema layer with the valid set listed - distinct from the success=false applicability path.
- Glossary: add Source layers (client/producer) closed value set.

Doc-only; no code or test impact.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent f4ebf46 commit da99f74

9 files changed

Lines changed: 202 additions & 94 deletions

agents/explorer-rag-enhanced.md

Lines changed: 3 additions & 3 deletions
@@ -151,15 +151,15 @@ Simple types in parentheses; generics erased. No spaces after commas. No-arg: `(

 ### Shared NodeFilter

-For `find`, `filter` is required — `{}` means no predicates. **Strict frame:** unknown keys or inapplicable populated fields → `success=false`.
+For `find`, `filter` is required — `{}` means no predicates. **Strict frame:** unknown keys or inapplicable populated fields → `success=false`; invalid enum values (e.g. wrong case) are rejected earlier at the schema layer with the valid set listed.

 | Keys | Applies to |
 | ---- | ---------- |
 | `microservice`, `module` | All kinds |
 | `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` | **symbol** |
 | `http_method`, `path_prefix`, `framework` | **route** |
-| `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** |
-| `producer_kind`, `topic_prefix` | **producer** |
+| `source_layer`, `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** |
+| `source_layer`, `producer_kind`, `topic_prefix` | **producer** |

 No wildcards in prefix fields — use `search(query=…)` for fuzzy text.
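The applicability table in this hunk can be read as a lookup from node kind to permitted filter keys. The following is a minimal sketch of that strict-frame check, assuming a hypothetical `check_filter` helper and a hypothetical `APPLICABLE_FIELDS` mapping transcribed from the table; the project's own `_NODEFILTER_APPLICABLE_FIELDS` is not shown in this diff and may be shaped differently. Enum-value validation is out of scope here, since the docs say it is rejected earlier at the schema layer.

```python
from typing import Any

# Hypothetical mapping, transcribed from the NodeFilter applicability table.
_COMMON = {"microservice", "module"}
APPLICABLE_FIELDS: dict[str, set[str]] = {
    "symbol": _COMMON | {"role", "exclude_roles", "annotation", "capability",
                         "fqn_prefix", "symbol_kind", "symbol_kinds"},
    "route": _COMMON | {"http_method", "path_prefix", "framework"},
    "client": _COMMON | {"source_layer", "client_kind", "target_service",
                         "target_path_prefix", "http_method"},
    "producer": _COMMON | {"source_layer", "producer_kind", "topic_prefix"},
}
ALL_FIELDS = set().union(*APPLICABLE_FIELDS.values())


def check_filter(kind: str, node_filter: dict[str, Any]) -> dict[str, Any]:
    """Return a success/message envelope for a filter applied to one node kind."""
    # Unknown keys fail regardless of their value.
    unknown = sorted(set(node_filter) - ALL_FIELDS)
    if unknown:
        return {"success": False, "message": f"unknown filter keys: {unknown}"}
    # Known keys fail only when populated and not applicable to this kind.
    populated = {key for key, value in node_filter.items() if value is not None}
    inapplicable = sorted(populated - APPLICABLE_FIELDS[kind])
    if inapplicable:
        return {
            "success": False,
            "message": f"{inapplicable} do not apply to kind {kind!r}; "
                       f"applicable: {sorted(APPLICABLE_FIELDS[kind])}",
        }
    return {"success": True, "message": ""}
```

With this mapping, `source_layer` passes for **client** and **producer** filters and fails for **route**, which matches the two rows this commit changes.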

ast_java.py

Lines changed: 23 additions & 6 deletions
@@ -13,6 +13,7 @@
 from __future__ import annotations

 import posixpath
+import sys
 from dataclasses import dataclass, field
 from functools import lru_cache
 from typing import Iterable
@@ -1642,9 +1643,17 @@ def _parse_codebase_http_client_annotation(
     pairs, _ = _annotation_kv_nodes(ann, src)
     client_kind = ""
     if "clientKind" in pairs:
-        val, _kind = _annotation_value(pairs["clientKind"], src)
-        if val and _kind == "enum":
-            client_kind = str(val)
+        val, vkind = _annotation_value(pairs["clientKind"], src)
+        if val and vkind == "enum":
+            kind_val = str(val)
+            from java_ontology import VALID_CLIENT_KINDS  # deferred: java_ontology imports ast_java
+            if kind_val in VALID_CLIENT_KINDS:
+                client_kind = kind_val
+            else:
+                print(
+                    f"[lancedb-mcp] CodebaseHttpClient: invalid clientKind {kind_val!r} — ignored",
+                    file=sys.stderr,
+                )
     target_service = ""
     if "targetService" in pairs:
         atoms = _string_value_atoms(pairs["targetService"], src, ctx)
@@ -1714,9 +1723,17 @@ def _parse_codebase_producer_annotation(
     client_kind = "kafka_send"
     kind_node = pairs.get("producerKind") or pairs.get("clientKind")
     if kind_node is not None:
-        val, _kind = _annotation_value(kind_node, src)
-        if val and _kind == "enum":
-            client_kind = str(val)
+        val, vkind = _annotation_value(kind_node, src)
+        if val and vkind == "enum":
+            kind_val = str(val)
+            from java_ontology import VALID_PRODUCER_KINDS  # deferred: java_ontology imports ast_java
+            if kind_val in VALID_PRODUCER_KINDS:
+                client_kind = kind_val
+            else:
+                print(
+                    f"[lancedb-mcp] CodebaseProducer: invalid producerKind {kind_val!r} — ignored",
+                    file=sys.stderr,
+                )
     topic = ""
     if "topic" in pairs:
         atoms = _string_value_atoms(pairs["topic"], src, ctx)
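Both `ast_java.py` hunks repeat the same warn-and-ignore pattern: accept a value only if it is in a closed set, otherwise log to stderr and keep the default. A minimal sketch of that pattern as one helper follows. The name `validated_kind` is hypothetical, and the set contents are an assumption taken from the client and producer kinds listed in the docs; the real sets live in `java_ontology.py` and are not shown in this diff.

```python
import sys

# Assumed contents, mirroring the kinds listed in docs/AGENT-GUIDE.md.
VALID_CLIENT_KINDS = frozenset({"feign_method", "rest_template", "web_client"})
VALID_PRODUCER_KINDS = frozenset({"kafka_send", "stream_bridge_send"})


def validated_kind(raw: str, valid: frozenset, label: str, default: str = "") -> str:
    """Return raw if it belongs to the closed set; otherwise warn and return default."""
    if raw in valid:
        return raw
    # Warn on stderr and fall back, so one bad annotation does not abort indexing.
    print(f"[sketch] invalid {label} {raw!r}, ignored", file=sys.stderr)
    return default
```

For example, `validated_kind("smtp", VALID_PRODUCER_KINDS, "producerKind", default="kafka_send")` keeps the `"kafka_send"` default, which is what the producer hunk does when the annotation value is not recognised.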

docs/AGENT-GUIDE.md

Lines changed: 6 additions & 4 deletions
@@ -138,12 +138,12 @@ For **`find`**, `filter` is required — `{}` means no predicates (all nodes of
 | `microservice`, `module` | All kinds |
 | `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` | **symbol** |
 | `http_method`, `path_prefix`, `framework` | **route** |
-| `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** |
-| `producer_kind`, `topic_prefix` | **producer** |
+| `source_layer`, `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** |
+| `source_layer`, `producer_kind`, `topic_prefix` | **producer** |

 `http_method` filters HTTP verbs on **routes** (declared method) and on **clients** (outbound call method). Not applicable to **symbol** rows.

-**Strict frame:** one populated field → one stored attribute for that kind. Unknown keys or inapplicable populated fields → `success=false` with a teaching `message`. No wildcards in `fqn_prefix`, `path_prefix`, or `target_path_prefix` (`*` / `?` rejected) — use `search(query=…)` for ranked text instead. `search.query` is opaque text, not a DSL.
+**Strict frame:** one populated field → one stored attribute for that kind. Unknown keys or inapplicable populated fields → `success=false` with a teaching `message`. Invalid enum values (e.g. wrong case) are rejected earlier at the schema layer with the valid set listed. No wildcards in `fqn_prefix`, `path_prefix`, or `target_path_prefix` (`*` / `?` rejected) — use `search(query=…)` for ranked text instead. `search.query` is opaque text, not a DSL.

 ### Identifier resolution (`resolve`)

@@ -245,12 +245,14 @@ Returns **edges** with `attrs` (`confidence`, `strategy`, `match`, … on cross-

 **Symbol kinds (`symbol_kind` / `symbol_kinds`):** `class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`.

-**Route `framework` (examples on stored routes):** `spring_mvc`, `webflux`, `kafka`, `rabbitmq`, `jms`, `stream`, `codebase_async_route`, …
+**Route `framework` (closed set on stored routes):** `spring_mvc`, `webflux`, `kafka`, `rabbitmq`, `jms`, `stream`, `feign`.

 **Client kinds:** `feign_method`, `rest_template`, `web_client`.

 **Producer kinds:** `kafka_send`, `stream_bridge_send`.

+**Source layers (client/producer):** `builtin`, `layer_a_meta`, `layer_b_ann`, `layer_b_fqn`, `layer_c_source`.
+
 **HTTP call `attrs.match` / async `attrs.match`:** `cross_service`, `intra_service`, `ambiguous`, `phantom`, `unresolved`.

 ### Recovery playbook

java_ontology.py

Lines changed: 4 additions & 1 deletion
@@ -15,7 +15,10 @@
     _TYPE_ANN_TO_CAPABILITY,
 )

-# Roles: Spring stereotype values plus DTO from `infer_role_for_type`.
+# Roles assignable by indexing: Spring stereotype values plus DTO. ``OTHER`` is the
+# built-in inference fallback (ast_java.infer_role when nothing matches) and is
+# deliberately excluded here — it is a read-side value (the mcp_v2 ``Role`` enum
+# includes it) but not a role a user may set via @CodebaseRole / role_overrides.
 VALID_ROLES: frozenset[str] = frozenset((*ROLE_ANNOTATIONS.values(), "DTO"))

 VALID_CAPABILITIES: frozenset[str] = frozenset(
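The new comment describes a deliberate split: the index-side `VALID_ROLES` excludes `OTHER`, while the read-side `Role` literal in `mcp_v2.py` includes it. A drift check along the following lines could guard that relationship. This is only a sketch: `role_drift` is a hypothetical helper, and the `VALID_ROLES` contents below are an assumption (the `Role` literal minus `OTHER`), since `ROLE_ANNOTATIONS` is not shown in this diff.

```python
from typing import Literal, get_args

# Read-side literal as added to mcp_v2.py in this commit.
Role = Literal[
    "CONTROLLER", "SERVICE", "REPOSITORY", "COMPONENT", "CONFIG",
    "ENTITY", "CLIENT", "MAPPER", "DTO", "OTHER",
]

# Assumed index-side set: every Role value except the "OTHER" fallback.
VALID_ROLES = frozenset({
    "CONTROLLER", "SERVICE", "REPOSITORY", "COMPONENT", "CONFIG",
    "ENTITY", "CLIENT", "MAPPER", "DTO",
})


def role_drift(valid_roles, role_literal):
    """Return (missing_from_literal, unexpected_in_literal) for the two taxonomies."""
    read_side = set(get_args(role_literal))
    expected = set(valid_roles) | {"OTHER"}
    return expected - read_side, read_side - expected
```

Both sets come back empty while the taxonomies agree; a role added on only one side shows up in the corresponding set.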

mcp_v2.py

Lines changed: 35 additions & 10 deletions
@@ -48,6 +48,22 @@ def _hints_or_skip(tool: str, payload: dict) -> tuple[list, list]:

 DeclarationSymbolKind = Literal["class", "interface", "enum", "record", "annotation", "method", "constructor"]

+# Closed value taxonomies surfaced to MCP consumers as enums. Sources of truth:
+# Role — VALID_ROLES in java_ontology.py + the "OTHER" inference fallback (ast_java.infer_role)
+# Framework — hardcoded literals across ast_java.py / build_ast_graph.py
+# SourceLayer — exhaustive classifier build_ast_graph._client_source_layer / _producer_source_layer
+# ClientKind — VALID_CLIENT_KINDS in java_ontology.py (every producer validated at index time)
+# ProducerKind — VALID_PRODUCER_KINDS in java_ontology.py (every producer validated at index time)
+# Keep these in sync with the indexing-side taxonomies if they change.
+Role = Literal[
+    "CONTROLLER", "SERVICE", "REPOSITORY", "COMPONENT", "CONFIG",
+    "ENTITY", "CLIENT", "MAPPER", "DTO", "OTHER",
+]
+Framework = Literal["spring_mvc", "webflux", "kafka", "rabbitmq", "jms", "stream", "feign", ""]
+SourceLayer = Literal["builtin", "layer_a_meta", "layer_b_ann", "layer_b_fqn", "layer_c_source"]
+ClientKind = Literal["feign_method", "rest_template", "web_client"]
+ProducerKind = Literal["kafka_send", "stream_bridge_send"]
+
 # Stored graph edge labels for one-hop neighbors. Composed DECLARES.* and OVERRIDDEN_BY.*
 # dot-keys are separate ComposedEdgeType literals (2-hop traversal). Stored OVERRIDES is an EdgeType.
 EdgeType = Literal[
@@ -133,21 +149,30 @@ class NodeFilter(BaseModel):

     microservice: str | None = None
     module: str | None = None
-    source_layer: str | None = None
-    role: str | None = None
-    exclude_roles: list[str] | None = None
+    source_layer: SourceLayer | None = None
+    role: Role | None = None
+    exclude_roles: list[Role] | None = None
     annotation: str | None = None
     capability: str | None = None
     fqn_prefix: str | None = None
     symbol_kind: DeclarationSymbolKind | None = None
     symbol_kinds: list[DeclarationSymbolKind] | None = None
-    http_method: str | None = None
+    http_method: str | None = Field(
+        default=None,
+        description="HTTP verb (commonly GET/POST/PUT/DELETE/PATCH; user route annotations may yield others).",
+    )
     path_prefix: str | None = None
-    framework: str | None = None
-    client_kind: str | None = None
+    framework: Framework | None = None
+    client_kind: ClientKind | None = Field(
+        default=None,
+        description="Outbound HTTP client kind: feign_method, rest_template, or web_client.",
+    )
     target_service: str | None = None
     target_path_prefix: str | None = None
-    producer_kind: str | None = None
+    producer_kind: ProducerKind | None = Field(
+        default=None,
+        description="Outbound async producer kind: kafka_send or stream_bridge_send.",
+    )
     topic_prefix: str | None = None


@@ -157,9 +182,9 @@ class EdgeFilter(BaseModel):
     min_confidence: float | None = None
     exclude_strategies: list[str] | None = None
     include_strategies: list[str] | None = None
-    callee_declaring_role: str | None = None
-    callee_declaring_roles: list[str] | None = None
-    exclude_callee_declaring_roles: list[str] | None = None
+    callee_declaring_role: Role | None = None
+    callee_declaring_roles: list[Role] | None = None
+    exclude_callee_declaring_roles: list[Role] | None = None

     @model_validator(mode="after")
     def _strategy_axes_mutually_exclusive(self) -> EdgeFilter:
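Typing these fields as `Literal` turns a mistyped value such as `WEB_CLIENT` into a schema error that lists the valid set, instead of a filter that silently matches nothing. The diff relies on pydantic models for this. The sketch below is a stdlib-only stand-in that shows the same two behaviours (closed enums, unknown keys forbidden) on a small subset of fields; `FIELD_ENUMS` and `validate_filter` are hypothetical names, not part of `mcp_v2.py`.

```python
from typing import Any, Literal, Optional, get_args

ClientKind = Literal["feign_method", "rest_template", "web_client"]
ProducerKind = Literal["kafka_send", "stream_bridge_send"]

# Known fields (illustrative subset). Enum-typed fields map to their closed
# value set; free-form string fields map to None.
FIELD_ENUMS: dict[str, Optional[tuple]] = {
    "microservice": None,
    "topic_prefix": None,
    "client_kind": get_args(ClientKind),
    "producer_kind": get_args(ProducerKind),
}


def validate_filter(payload: dict[str, Any]) -> dict[str, Any]:
    """Reject unknown keys and out-of-set enum values; return the payload unchanged."""
    for key, value in payload.items():
        if key not in FIELD_ENUMS:
            # Stands in for the extra="forbid" behaviour named in the commit message.
            raise ValueError(f"unknown field {key!r}")
        allowed = FIELD_ENUMS[key]
        if allowed is not None and value is not None and value not in allowed:
            # Listing the valid set lets a client correct the value immediately.
            raise ValueError(f"{key}={value!r} is not one of {list(allowed)}")
    return payload
```

Calling `validate_filter({"client_kind": "WEB_CLIENT"})` raises a `ValueError` whose message includes the three accepted client kinds.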
