Commit c363727

aksOps and claude authored
feat: port codeiq from Java/Spring Boot to Go single-binary (Phases 1-4) (#130)
* chore(go): scaffold Go module at go/

- Add go/go.mod with module github.com/randomcodespace/codeiq/go (Go 1.26.2 directive)
- Add go/.gitignore for build artifacts (binaries, coverage, dist)
- Add .claude/ to root .gitignore for ralph-loop state files

This is Phase 1 Task 1 of the Java → Go port (spec §10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* checkpoint: pre-yolo 2026-05-12T01:15:25

* feat(buildinfo): version/commit/date/dirty strings + Platform/Features

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cache): SHA-256 file hasher matching Java FileHasher (64 hex chars)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): NodeKind enum with all 34 kinds + JSON round-trip

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): EdgeKind enum with all 28 kinds + JSON round-trip

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): Confidence three-tier enum with score + parse

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): Layer enum (frontend/backend/infra/shared/unknown)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): CodeNode struct + constructor + JSON tags

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(model): CodeEdge struct + constructor + JSON tags

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(fixture): minimal 3-file fixture for parity testing

Phase 1 Task 31 (spec §10). UserController.java + User.java + models.py exercise every phase-1 detector (spring_rest, jpa_entity, django_models, flask_routes, generic_imports). No build files yet — ServiceDetector lands in phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(parser): Language enum + Parse facade + Tree wrapper

Adds the language identifier (Java/Python/Unknown), the extension-based mapping, the Tree wrapper around tree-sitter's parsed root, and the Parse facade. The tsLanguage dispatcher is intentionally left undefined here — Task 13 wires in the Java + Python grammars and provides it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): Detector interface + Context + Result

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): static registry with deterministic ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(parser): tree-sitter bindings for Java + Python

Wires up the Java and Python grammars from github.com/smacker/go-tree-sitter and adds the tsLanguage dispatcher that Parse() uses. End-to-end test parses a trivial Java and Python hello-world and asserts the root node type matches each grammar's conventional root ("program" for Java, "module" for Python).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/base): RegexDetector helpers (FindLineNumber + LEXICAL floor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/base): TreeSitterDetector helpers (Walk + Find* + SYNTACTIC floor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): Cobra root command + global flags (build pending version.go)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): version subcommand with text + JSON output

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* checkpoint: pre-yolo 2026-05-12T01:24:20

* feat(cache): SQLite Open/Put/Get/Has/Version + IterateAll

Implements Task 11 of the Go-port plan: a SQLite-backed analysis cache keyed by content hash.
Each Put atomically wipes and re-inserts files + nodes + edges for a hash in one transaction; Get rehydrates the Entry, returning ErrNotFound for misses. CacheVersion is stamped into cache_meta at Open. IterateAll yields entries in deterministic (path, content_hash) order for phase-2 enrich. Round-trip + version + miss tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(analyzer): FileDiscovery via git ls-files with dir-walk fallback

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(go): vet + race test + staticcheck + gosec + govulncheck

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(cli): contract test asserts every subcommand has Use/Short/Long/Example/RunE

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): port Spring REST controller detector (regex path, phase 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cmd): codeiq binary entry point

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): port JPA entity detector (regex path, phase 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(parity): cross-binary parity check on fixture-minimal

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(parity): SQLite → sorted JSON normalizer (path + kind + id ordering)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(readme): development section for the Go port phase 1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): port Django model detector (regex path, phase 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(analyzer): Run orchestrator wiring discovery, parser, registry, cache

Test currently blocked by missing detector/generic subpackage (in flight from parallel agent).
analyzer.go and analyzer_test.go build/compile cleanly in isolation; tests will pass once detector/generic lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): port Flask route detector (regex path, phase 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector): port generic-imports detector (java + python, phase 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add index subcommand wiring analyzer.Run

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(parity): cross-binary parity harness for fixture-minimal

Phase 1 Task 33 (spec §11). parity_test.go runs Go-only snapshot mode when TEST_JAVA_NORMALIZED is unset; CI workflow go-parity.yml provides the Java-side input. open_ro.go is a stable read-only seam for phase 2. expected-divergence.json is empty — phase 1 ports target byte-equivalent output on fixture-minimal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(phase-1): exit gate passes — Go port phase 1 complete

Phase 1 Task 37 (spec §10). All 37 phase-1 tasks committed.

Exit gate results:
- go vet ./...: clean
- go test ./... (12 packages): all PASS
- go test -tags=parity ./parity/...: PASS
- go build ./cmd/codeiq: success (5.7MB binary)
- codeiq --version: produces text + --json formats per spec §7.1
- codeiq index fixture-minimal: 3 files -> 25 nodes + 15 edges -> SQLite

Total commits on port/go-port: 37 (Phase 1) + 2 checkpoints = 39 ahead of origin/main.

Known divergences from plan (documented for phase-4 reconciliation):
- Spring REST: regex window bound; renamed shadowing locals
- Flask route: one endpoint per route, http_methods property carries extras
- Task 10 (cache schema) + Task 24 (graph_builder) commits swept up into parallel-agent commits due to git add -A collision; content is correct, commit attribution drifted

Next: Phase 2 (Kuzu graph store + intelligence/lexical + intelligence/extractor + read-side stats/query/find).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(go): pin go-kuzu binding and update buildinfo features

Pins github.com/kuzudb/go-kuzu@v0.7.1 as the Kuzu CGO binding for the phase 2 graph layer, and adds "kuzu" to buildinfo.Features() so --version records the new compile-time feature flag. The binding is currently an indirect dependency; it becomes direct once internal/graph/store.go imports it in Task 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): DocCommentExtractor for Java/TS/Py/Go/Rust

Ports DocCommentExtractor.java to internal/intelligence/lexical/doc_comment.go. Handles block comments (Java/TS/JSDoc/C++ Doxygen) with annotation walk-back and blank-line gap detection, contiguous // and /// line comments (Go/Rust), and Python triple-quoted docstrings (single- and multi-line, both " and ').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/analyzer): LayerClassifier kind+framework+path heuristics

Port of LayerClassifier.java: stamps Layer on every CodeNode using first-match-wins rules (kind → language → file path → framework → shared kinds → fallback path/package heuristics → Java src/main convention). Pure deterministic function; compiled regexes are package-level vars; tests cover one positive case per priority rule plus determinism.
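
For readers unfamiliar with the first-match-wins shape described above, here is a minimal sketch of an ordered rule chain. It is not the ported LayerClassifier: the `node` type, the `rules` slice, the `classify` function, and every condition inside the rules (including the path regex) are hypothetical, and only four of the priority steps are shown.

```go
package main

import (
	"fmt"
	"regexp"
)

// Layer mirrors the five layer values named in this commit log.
type Layer string

const (
	Frontend Layer = "frontend"
	Backend  Layer = "backend"
	Infra    Layer = "infra"
	Shared   Layer = "shared"
	Unknown  Layer = "unknown"
)

// node is a hypothetical, trimmed-down stand-in for CodeNode.
type node struct {
	Kind, Language, FilePath, Framework string
}

// Compiled once as a package-level var. The pattern is illustrative only.
var frontendPath = regexp.MustCompile(`(^|/)(components|pages)/`)

// rule reports a layer and true when it applies to the node.
type rule func(n node) (Layer, bool)

// rules is ordered by priority: kind, language, file path, framework.
// The conditions inside each rule are assumptions made for this sketch.
var rules = []rule{
	func(n node) (Layer, bool) { return Backend, n.Kind == "endpoint" || n.Kind == "entity" },
	func(n node) (Layer, bool) { return Infra, n.Language == "terraform" },
	func(n node) (Layer, bool) { return Frontend, frontendPath.MatchString(n.FilePath) },
	func(n node) (Layer, bool) { return Frontend, n.Framework == "react" },
}

// classify walks the rules in order and returns the first match.
func classify(n node) Layer {
	for _, r := range rules {
		if layer, ok := r(n); ok {
			return layer
		}
	}
	return Unknown
}

func main() {
	// The kind rule outranks the path rule, so this endpoint stays backend.
	fmt.Println(classify(node{Kind: "endpoint", FilePath: "web/components/Api.ts"}))
	fmt.Println(classify(node{Kind: "class", FilePath: "web/components/Cart.tsx"}))
	fmt.Println(classify(node{Kind: "class"}))
}
```

Keeping the rules in a slice makes the priority order explicit and lets a test assert one positive case per rule, which is the testing shape the commit describes.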

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): add Store facade for embedded Kuzu

Introduces internal/graph/Store as the embedded-Kuzu facade for the phase 2 graph layer. The Store owns one Kuzu database + one long-lived Connection, serializes access through an internal mutex, and exposes Open/Close/Path/Conn/Lock/Unlock. The default open path is .codeiq/graph/codeiq.kuzu/ on disk; Open ensures the parent directory exists and lets Kuzu create the database directory itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): SnippetStore with bounded extraction

Ports SnippetStore.java to internal/intelligence/lexical/snippet_store.go. Extracts a CodeSnippet centred on a node's line range (default ±3 context lines), caps each snippet at 50 lines by recentring on the symbol midpoint when the span overflows, and rejects file paths that resolve outside the analysis root. InferLanguage maps canonical source extensions to the language identifiers used elsewhere in the graph.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/analyzer): TopicLinker pairs producer to consumer

Mirrors src/main/java/io/github/randomcodespace/iq/analyzer/linker/TopicLinker.java. Walks TOPIC/QUEUE/EVENT/MESSAGE_QUEUE nodes and matches PRODUCES/SENDS_TO/PUBLISHES edges with CONSUMES/RECEIVES_FROM/LISTENS edges sharing the same topic label, then emits direct CALLS edges from each producer to each non-self consumer. Iteration is sorted by label then producer then consumer for determinism.

Also introduces the `linker.Linker` interface and shared `linker.Result` type used by all subsequent linkers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): add Cypher execution facade

Adds Store.Cypher(query, args...) — runs a Cypher statement against the embedded Kuzu connection and returns rows as []map[string]any keyed by result column name.
No-args invocations route through Connection.Query; parameterized invocations route through Prepare+Execute and bind the caller-supplied map. DDL and void queries return an empty slice.

DefaultQueryTimeout = 30s mirrors the Java side's DBMS-level transaction cap (Neo4jConfig.transaction_timeout).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): LexicalEnricher populates lex_comment + lex_config_keys

Ports LexicalEnricher.java to internal/intelligence/lexical/enricher.go. Stamps `lex_comment` on doc-comment candidate kinds (CLASS, ABSTRACT_CLASS, INTERFACE, ENUM, ANNOTATION_TYPE, METHOD, ENDPOINT, ENTITY, SERVICE, REPOSITORY, COMPONENT, GUARD, MIDDLEWARE) and `lex_config_keys` on CONFIG_KEY / CONFIG_FILE / CONFIG_DEFINITION (FQN preferred, label fallback). Groups doc-comment candidates by filePath so each source file is read at most once; iterates file groups in sorted order for determinism; refuses path-escape inputs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/analyzer): EntityLinker repository to entity QUERIES

Mirrors src/main/java/io/github/randomcodespace/iq/analyzer/linker/EntityLinker.java. Strips the longest matching suffix from REPOSITORY labels (Repository, Repo, Dao, DAO — first match wins) and emits a QUERIES edge to the case-insensitively named ENTITY. Skips when an explicit QUERIES edge already exists between the same source and target, so detector output isn't duplicated. Falls back to the simple name parsed from the entity's FQN when label-only match fails.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/analyzer): ModuleContainmentLinker emits MODULE + CONTAINS

Mirrors src/main/java/io/github/randomcodespace/iq/analyzer/linker/ModuleContainmentLinker.java. Groups non-MODULE nodes by their Module field and emits a MODULE node (reusing an existing one by ID if present) plus a CONTAINS edge per member.
Skips nodes whose Module field is empty, MODULE-kind nodes themselves (so a module can't contain itself), and any (source, target) pair already covered by an explicit CONTAINS edge. Modules iterate alphabetically and members within a module iterate by ID, making output stable across runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): LexicalQueryService bridges fulltext index

Ports LexicalQueryService.java to internal/intelligence/lexical/query_service.go. Three query entry points — FindByIdentifier, FindByDocComment, FindByConfigKey — route to a FullTextStore interface (satisfied by *graph.Store once Task 7's SearchByLabel / SearchLexical helpers land) and tag each Result with its Source attribution. Limits are clamped to [50, 200] per the Java parity. When a SnippetStore + root are wired in, doc-comment results carry a bounded source snippet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): schema DDL for CodeNode + per-EdgeKind rels

Single CodeNode node table backs all 34 NodeKinds (kind is a column, not a label, matching the label-free model on the Java/SDN side). One REL table per EdgeKind, all with FROM/TO CodeNode. JSON-serialised props column + a handful of first-class columns reserved for indexing / projection (label_lower, fqn_lower, prop_lex_comment, prop_lex_config_keys). All DDL is CREATE ... IF NOT EXISTS so ApplySchema is safe to re-run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/analyzer): ServiceDetector walks filesystem for 30+ build systems

Ports ServiceDetector.java to Go. Walks projectRoot directly (not the node list) because not all build files produce CodeNodes during index, so node paths alone miss modules. Emits SERVICE nodes (one per build-file-bearing directory) plus CONTAINS edges from each service to the nodes whose filePath falls under it, deepest-match wins.
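
A minimal sketch of the deepest-match-wins assignment, assuming slash-separated paths relative to the project root with `""` standing for the root directory. The `ownerOf` helper and its signature are hypothetical and are not taken from the ported ServiceDetector.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ownerOf returns the deepest service directory that contains filePath.
// Paths are slash-separated and relative to the project root; "" is the root.
// The second return value is false when no service directory contains the file.
func ownerOf(filePath string, serviceDirs []string) (string, bool) {
	// Copy before sorting so the caller's slice is left untouched.
	dirs := append([]string(nil), serviceDirs...)
	// Longest first: every directory that contains the file is a prefix of the
	// others that do, so the longest matching prefix is also the deepest one.
	// Ties break alphabetically to keep the result deterministic.
	sort.Slice(dirs, func(i, j int) bool {
		if len(dirs[i]) != len(dirs[j]) {
			return len(dirs[i]) > len(dirs[j])
		}
		return dirs[i] < dirs[j]
	})
	for _, d := range dirs {
		// The trailing "/" stops "services/web-ui" from claiming
		// "services/web-ui-extra/...".
		if d == "" || strings.HasPrefix(filePath, d+"/") {
			return d, true
		}
	}
	return "", false
}

func main() {
	dirs := []string{"", "services/checkout-svc", "services/web-ui"}
	fmt.Println(ownerOf("services/checkout-svc/src/main/java/User.java", dirs))
	fmt.Println(ownerOf("README.md", dirs))
}
```

With a root-level build file present, files outside every nested service fall through to the root service rather than being left unassigned.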

Supports 30+ build systems via two maps: exact filename (pom.xml, package.json, go.mod, Cargo.toml, pyproject.toml, build.gradle, etc.) and suffix match (*.csproj, *.fsproj, *.vbproj, *.gemspec, *.cabal, *.nimble). Priority rules mirror the Java side: supplemental tools (Docker, nx, lerna, turbo, rush) don't override real build tools; python files follow pyproject.toml > setup.py > requirements.txt > manage.py; gradle settings.* doesn't override build.gradle.

Prunes node_modules, .git, target, build, dist, .gradle, .idea, .vscode, __pycache__, .tox, .eggs, venv, .venv, vendor, .bundle, _build, deps from the walk so vendored deps don't masquerade as separate services.

Extracts the canonical name from build file contents when possible (artifactId, npm name, go module last segment, cargo name, pyproject name, gradle rootProject.name, sbt name, composer name, mix app, pubspec name) and falls back to the directory name (or projectDir for the root) when no extractor matches.

10 new tests cover the priority rules, dir/projectDir fallback, .csproj suffix path, skip-list enforcement, child assignment + CONTAINS edges + endpoint/entity counts, the no-build-files synthetic "unknown" service, and a determinism check across two runs of the same tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): bulk-load nodes via CSV + COPY FROM

Per-node CREATE doesn't scale to the enrich-phase volumes we target (44k files / 100k+ nodes). BulkLoadNodes stages rows in a temp CSV (cleaned up on return) and ships them through Kuzu's COPY FROM with an explicit column list aligned to schema.go's CodeNode DDL.

Empty input is a no-op rather than erroring on an empty CSV. INT64 columns (line_start, line_end) are emitted as empty strings when zero so Kuzu treats them as NULL on non-source nodes (SERVICE, MODULE etc.).
framework + language are pulled out of the properties map into the first-class columns for direct projection; the full property map still round-trips through the JSON-serialised props column.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): bulk-load edges grouped by rel table

BulkLoadEdges partitions a mixed-kind batch internally and issues one COPY <REL_TABLE> FROM per kind. Iteration goes through AllEdgeKinds() in canonical order so the COPY sequence stays deterministic for parity diffing against the Java/SDN side.

Each rel-table staging CSV starts with FROM/TO node primary keys (Kuzu's rel COPY convention) followed by id, confidence, source, and the JSON-serialised property map. Empty input is a no-op like the node path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): search helpers for search_graph + lexical query

Mirrors the Java GraphStore.createIndexes() surface: SearchByLabel covers label_lower + fqn_lower (powers /api/search + search_graph MCP tool), SearchLexical covers prop_lex_comment + prop_lex_config_keys (powers LexicalQueryService doc-comment / config-key search).

Kuzu version caveat: the official FTS extension ships pre-bundled only from Kuzu v0.11.3 onward. go-kuzu v0.7.1 links Kuzu 0.7.x, where FTS requires a network INSTALL — incompatible with the air-gapped build policy. The CreateIndexes / SearchByLabel / SearchLexical surface stays identical; behind the scenes we run case-insensitive CONTAINS predicates. When Kuzu pins move past 0.11.3 the implementation swaps to CALL CREATE_FTS_INDEX / QUERY_FTS_INDEX without touching callers.

Cypher quirks discovered on Kuzu 0.7.1:
- LIMIT/SKIP do NOT accept parameter binding; values must be inlined
- The lower-case function is SQL-style `lower(x)`, not `toLower(x)`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/graph): read helpers for query services

Eight thin Cypher wrappers backing Java QueryService / StatsService / GraphController:
- Count, CountEdges: totals for /api/stats
- CountNodesByKind, CountNodesByLayer: rich-stats breakdowns
- FindByID: single-node lookup; (nil, nil) when absent
- FindByKindPaginated: ordered+paged list for /api/kinds/{kind}
- FindIncomingNeighbors, FindOutgoingNeighbors: for /api/nodes/{id}/neighbors

All projections route through rowsToNodes (defined in indexes.go) so the neighbour / search / by-kind helpers stay consistent.

Kuzu 0.7.1 binder quirks discovered:
- MATCH ()-[r]->() unions every rel table (used for CountEdges)
- DISTINCT collapses the rel-pattern scope; ORDER BY must reference the projected alias (`id`), never the bound variable (`a.id`)
- count(*) returns int64 — asInt64() helper guards against drift

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/query): StatsService 7 categorized statistics

Port StatsService.java to Go — pure functions over (nodes, edges) that produce the seven-category breakdown (graph / languages / frameworks / infra / connections / auth / architecture) the Java side exposes via /api/stats. OrderedMap preserves Java's LinkedHashMap insertion order so parity diffs match byte-for-byte once rendered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): LanguageEnricher orchestrator + interface

Adds the LanguageExtractor interface, EmptyResult helper, and an Enricher that fans out per-language extractors over a node list.
Files are read at most once across all nodes sharing a path; per-file work runs on a goroutine per file with results merged in sorted-file order for deterministic output.

Also adds model.CapabilityLevel (EXACT, PARTIAL, LEXICAL_ONLY, UNSUPPORTED) to mirror Java's CapabilityLevel — distinct from the per-edge Confidence ladder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): Java language extractor (calls + type hints)

Tree-sitter-driven port of JavaLanguageExtractor.java. For METHOD nodes, walks the matching method_declaration subtree and emits one CALLS edge per method_invocation that resolves to a unique METHOD node in the registry (ambiguous label = dropped, same guard as Java's lookupByLabel). For CLASS/ABSTRACT_CLASS/INTERFACE nodes, extracts extends_type and implements_types type-hints from the superclass / interfaces fields.

Adds parser.Walk, parser.ChildFieldText, parser.ParseByName helpers so extractors can drive the existing tree-sitter parser via string keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/query): QueryService consumers/producers/callers/cycles/dead

Port QueryService.java to Go — high-level read service backed by graph.Store. Implements FindConsumers / FindProducers / FindCallers / FindDependencies / FindDependents / FindShortestPath / FindCycles / FindDeadCode. The Java side wraps Neo4j's single RELATES_TO edge; on Kuzu we filter by LABEL(r) against the per-EdgeKind rel tables.

Kuzu 0.7 feature gaps worked around in this commit:
- List comprehension [n IN nodes(p) | n.id] is rejected by the binder ("Variable n not in scope"); use properties(nodes(p), 'id') instead.
- Parameters declared at the outer WHERE are not visible inside an EXISTS subquery, so the semantic-edge filter is inlined as a rel alternation pattern rather than `LABEL(r) IN $param`.
- Kuzu's Go binding accepts []any only, not []string, for LIST parameters — stringsToAny widens.
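
The last bullet comes down to a Go language rule: a `[]string` is not assignable to `[]any`, so the slice has to be copied element by element. A minimal sketch of such a widener follows. The name `stringsToAny` comes from the commit message, but the body, the `params` map, and the `kinds` key are assumptions made for illustration.

```go
package main

import "fmt"

// stringsToAny copies a []string into a freshly allocated []any.
// Go has no implicit conversion between slices of different element types,
// so each element is boxed individually.
func stringsToAny(in []string) []any {
	out := make([]any, len(in))
	for i, s := range in {
		out[i] = s
	}
	return out
}

func main() {
	kinds := []string{"CALLS", "CONSUMES"}

	// Hypothetical parameter map of the kind passed to a parameterized query.
	params := map[string]any{"kinds": stringsToAny(kinds)}

	fmt.Printf("%T\n", params["kinds"])
}
```

Allocating the result with `make([]any, len(in))` also turns a nil input into an empty, non-nil slice, which may or may not matter to the binding.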

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): TypeScript language extractor

Wires the tree-sitter TypeScript grammar into the parser package (also covers JavaScript — the grammar is a superset) and adds the TS extractor. For METHOD nodes the extractor walks the matching function_declaration / method_definition / arrow_function and emits one CALLS edge per call_expression whose `function` field resolves to a registry node. For MODULE nodes it stamps a `module_exports` type-hint listing the declarations attached to every export_statement in the file.

Grammar import: github.com/smacker/go-tree-sitter/typescript/typescript (sub-package of the already-vendored smacker module — no go.mod change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): Python language extractor

Tree-sitter-driven port of PythonLanguageExtractor.java. For METHOD nodes walks the matching function_definition and emits one CALLS edge per call node whose `function` field resolves to a registry node. For CLASS nodes extracts the first superclass (parens stripped from the `superclasses` field text) as the `extends_type` hint. For MODULE nodes regex-matches a top-level `__all__ = [...]` list, strips quotes and whitespace, and stamps the result as an `all_exports` hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/query): TopologyService cross-service topology

Port TopologyService.java to Go — service-map / service-detail / blast-radius / find-path / find-bottlenecks / find-circular / find-dead-services. The Java side ingests the full node + edge lists and walks them in heap; the Go side uses targeted Cypher queries against the structural CONTAINS edges ServiceDetector emits, so peak memory stays flat regardless of graph size.

Implementation choices:
- Pivot through CONTAINS rather than parsing the JSON `service` property at query time (no JSON-extract helper in Kuzu 0.7).
- extractJSONString / extractJSONInt are single-pass scanners reading build_tool / endpoint_count / entity_count out of the props blob — cheaper than full JSON parse for the few fields we surface.
- FindCircular uses an in-Go DFS over the cross-service adjacency, normalising each cycle to start at its lexicographically smallest service for stable output.

Kuzu 0.7 feature gaps worked around in this commit:
- Combining multi-label rel alternation (r:CALLS|...) with the kleene star in a single recursive pattern breaks the binder; BlastRadius uses an anonymous recursive pattern instead.
- ORDER BY after RETURN DISTINCT must reference the projected alias (e.g. `id`), not `b.id` — DISTINCT scope drops the rel pattern's node aliases (same caveat as graph.FindOutgoingNeighbors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/intelligence): Go language extractor + tree-sitter-go wiring

Wires the tree-sitter Go grammar into the parser package (smacker's `golang` sub-package — `go` is a reserved keyword in import paths) and adds the Go extractor under internal/intelligence/extractor/golang (matching package naming). For METHOD nodes the extractor walks the matching function_declaration / method_declaration and emits one CALLS edge per call_expression whose `function` field resolves to a registry node — qualified callees like `log.Println` are stripped to their bare name before lookup. For CLASS nodes it regex-matches the `var _ Iface = (*Foo)(nil)` interface-assertion idiom and stamps `implements_types` with the interface qualifier.

Grammar import: github.com/smacker/go-tree-sitter/golang (sub-package of the already-vendored smacker module — no go.mod change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/cli): enrich command orchestrates linkers + intelligence + bulk load

Implements the enrich pipeline orchestrator and Cobra subcommand mirroring the Java side's index -> enrich -> serve workflow.
Enrich rehydrates the SQLite cache, runs the three linkers (TopicLinker, EntityLinker, ModuleContainmentLinker), layer classifier, lexical enricher, language extractors (Java, TypeScript, Python, Go), and the filesystem-driven service detector, then bulk-loads Kuzu and creates the FTS-equivalent indexes.

Adds `resolvePath` / `printOrdered` CLI helpers and an `OrderedMap` MarshalJSON shim so query results serialise with deterministic key order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/cli): stats command with category + json modes

Adds `codeiq stats [path]` backed by graph.Store.LoadAllNodes / LoadAllEdges. Hydrates the JSON `props` column back into CodeNode / CodeEdge so the existing in-memory StatsService aggregations (languages, frameworks, infra, connections, auth, architecture) get the same view they had during enrich.

Introduces a StoreStatsService wrapper in query/ that lazy-loads the full node + edge lists on first ComputeStats / ComputeCategory call — the same snapshot-cache bridge the Java side uses while the targeted-Cypher rewrite is pending (see CLAUDE.md gotcha).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/cli): query subcommands (consumers, producers, callers, deps, dependents)

Adds the `codeiq query` parent command and the five preset finder subcommands backed by query.Service. Each subcommand resolves the graph directory the same way (--graph-dir override, otherwise default under the project root), opens the Kuzu store read-only, and prints tab-separated `id\tkind\tlabel` rows sorted by id for deterministic output.

Extends docs_test.go to recurse into nested subcommands so the §7.1 documentation contract is enforced for `codeiq query <sub>` as well as top-level commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(go): fixture-multi-lang for enrich + intelligence parity

Multi-language fixture (Java + TS + Python + Maven multi-module) exercising the full phase-2 pipeline: linkers, layer classifier, lexical enricher, language extractors, and ServiceDetector (30+ build systems).

Layout:

  testdata/fixture-multi-lang/
    pom.xml                                  root multi-module
    services/checkout-svc/pom.xml            checkout module
      src/main/java/com/example/checkout/
        CheckoutController.java              @RestController + @GetMapping
        User.java                            @Entity + @Table
        UserRepository.java                  JpaRepository<User, Long>
    services/web-ui/package.json             npm service
      src/components/CartView.tsx            default-exported React component
    services/notifier/pyproject.toml         python service
      notifier/views.py                      @app.route("/notify")
      notifier/models.py                     Subscriber(models.Model)

Empirically-validated counts after index + enrich (CGO_ENABLED=1):

  Files: 6  Nodes: 20  Edges: 11             (index, pre-enrich)
  Nodes: 24  Edges: 31  Services: 4          (post-enrich)
  Node kinds   module=16 service=4 endpoint=2 entity=2
  Edge kinds   imports=11 contains=20
  Layers       backend=22 unknown=2
  Frameworks   django=1 flask=1 jpa=1 spring_boot=1

expected-stats.json mirrors StatsService.ComputeStats output. expected-divergence.json reserves slots for Java-vs-Go RESOLVED→SYNTACTIC drops and lex_comment whitespace deltas (Java does trim+space, Go does the same; no daylight expected today).

.gitignore: whitelist go/testdata/**/pyproject.toml so the fixture's python build files ship with the repo (the global pyproject.toml ignore otherwise hides it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/cli): find finders for endpoints/guards/entities/etc

Adds `codeiq find <what>` with eight preset finder subcommands (endpoints, guards, entities, topics, queues, services, databases, components). Each subcommand is a thin wrapper over graph.Store.FindByKindPaginated with --limit / --offset paging.
Each finder produces tab-separated `id\tlabel` rows ordered by id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(go/cli): topology command + sub-views

Adds `codeiq topology` with five sub-views — service-detail, blast-radius, bottlenecks, circular, dead — plus a `path` finder for shortest-path BFS between two services. Bare `topology` renders the service map (services plus cross-service runtime connections + count aggregates).

Each sub-view is a thin Cobra wrapper around query.Topology methods, returning JSON via the OrderedMap.MarshalJSON path so output is deterministic and easily diffed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(go/parity): phase-2 mode compares Kuzu vs Neo4j dump

Adds the phase-2 leg of the parity harness: dump the post-enrich Kuzu store to a deterministic JSON envelope and diff it against the Java side's Neo4j dump, filtered through expected-divergence.json.

Components:
- parity/kuzu_dump.go: DumpKuzu(dir) opens the Kuzu store, projects every CodeNode (id/kind/label/fqn/file_path/layer/framework/language plus the prop_lex_comment / prop_lex_config_keys lexical fields) and every edge (id/kind/source/target), Go-side sorts by id, and serializes with empty-slice coercion so the JSON output stays stable across empty stores.
- parity/kuzu_dump_test.go: TestDumpKuzuEmptyStore + TestDumpKuzuIsDeterministic. Schema-only, no enrich required — runs without the parity build tag.
- parity/parity_test.go: TestFixtureMultiLangParityPhase2 builds the Go binary, runs index + enrich on testdata/fixture-multi-lang, calls DumpKuzu, and either snapshots (TEST_JAVA_KUZU_DUMP unset) or diffs against the Java side. Mismatches write the Go dump to t.TempDir() and print the path for CI artifact upload. diffJSON now renders via pmezard/go-difflib's unified format, with the allowlist applied line-by-line so allowed deltas are absorbed and only unexplained drift fails the build.
- go.mod: promote github.com/pmezard/go-difflib v1.0.0 from indirect to direct (transitive in phase 1 via stretchr/testify, now used by the parity harness directly). - testdata/fixture-multi-lang/expected-divergence.json: switch property_ drift to string tags so the divergenceFile schema stays []string for fixture-minimal compatibility. Verification (CGO_ENABLED=1): go test ./parity/... -> 3 passed go test -tags=parity ./parity/... -> 5 passed go vet ./... -> clean go test ./... -count=1 -> 282 passed Kuzu-specific notes captured inline: rel-type accessor is label(r) not type(r) in Kuzu 0.7.1; LIMIT/SKIP can't be parameter-bound; ORDER BY scope drops after DISTINCT — Go-side sort is the safety net. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(phase-2): exit gate passes — Go port phase 2 complete Phase 2 Task 33 (spec §10). 32 prior Phase 2 commits land: Tasks 1-3 (Kuzu store + Cypher facade), 4-8 (schema/bulk/indexes/reads), 9-11 (3 linkers), 12 (LayerClassifier), 13 (ServiceDetector), 14-17 (intelligence/lexical: doc comment + snippet + enricher + query), 18-22 (intelligence/extractor: orchestrator + Java/TS/Python/Go), 23-25 (Stats/Query/Topology services), 26-30 (enrich + 4 read-side CLIs), 31 (fixture-multi-lang), 32 (parity Phase-2 mode) Exit gate results: - go vet ./...: clean - go test ./... 
(20 packages): 279 PASS / 0 FAIL - codeiq build: 5.7MB binary - codeiq index fixture-multi-lang: 6 files -> 20 nodes + 11 edges - codeiq enrich fixture-multi-lang: 24 nodes + 31 edges + 4 services in Kuzu - codeiq stats fixture-multi-lang: 7 categories render correctly Known Kuzu v0.7.1 limitations (deferred until pin moves past v0.11.3): - FTS extension not bundled; SearchByLabel/SearchLexical use CONTAINS fallback - LIMIT/SKIP not parameterizable - lower() not toLower() (SQL style) - RETURN DISTINCT scope tighter than openCypher - List comprehension binder rejects out-of-scope vars - EXISTS subquery doesn't see outer-scope params - []string→[]any widener required for IN $param - Multi-label rel + kleene* in single recursive pattern breaks binder Plus 2 plan divergences (Spring REST regex + Flask methods, documented for phase 4 reconciliation) and 2 git-add collision incidents (commit attribution drift only, no lost work). Total commits on port/go-port: 70 ahead of origin/main. Next: Phase 3 (mcp stdio + 34 tools + intelligence/evidence + intelligence/query planner + cypher/flow/graph/cache/plugins/mcp CLIs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(go): phase-2 exit gate green Phase 2 acceptance commands all pass (CGO_ENABLED=1): go vet ./... PASS go test ./... -count=1 282 passed go test -tags=parity ./parity/... -count=1 5 passed go test ./internal/cli/ -run TestEverySubcommandIsDocumented PASS (8 subcommands documented) go build -o /tmp/codeiq ./cmd/codeiq Success /tmp/codeiq --version prints version + features codeiq index testdata/fixture-multi-lang Files: 6 Nodes: 20 Edges: 11 codeiq enrich testdata/fixture-multi-lang 24 nodes, 31 edges, 4 services codeiq stats testdata/fixture-multi-lang matches expected-stats.json byte-for-byte codeiq stats ... --json | jq '.graph.nodes' -> 24 codeiq find endpoints testdata/fixture-multi-lang checkout/getUser + notifier/notify govulncheck ./... 
0 reachable vulnerabilities The single repo change in this commit fixes the hand-authored expected-stats.json: architecture.modules=16 was missing because StatsService.computeArchitecture emits a `modules` key whenever the graph contains NodeModule entries (16 here: 11 ext: import targets + 3 java file-modules + 2 py file-modules). With the fix the Go output matches the fixture's expected JSON exactly. Lexical FTS end-to-end check (validated inline via a tag-gated test during exit-gate verification, then removed): SearchLexical("checkout") hits CheckoutController.getUser Javadoc + User.java @Entity comment; SearchLexical("notification") hits Subscriber model docstring. Confirms LexicalEnricher populates prop_lex_comment from real Javadoc / docstring sources and the lexical_index FTS retrieves them. Phase 2 deliverable complete: graph store, schema, bulk load, linkers, layer classifier, service detector, lexical enricher, language extractors (Java/TS/Python/Go), query services, full CLI surface (index/enrich/stats/query/find/topology), and the multi-lang parity harness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/intelligence/query): QueryRoute + QueryType enums Mirrors src/.../intelligence/query/{QueryRoute,QueryType}.java with the same identifiers so JSON envelopes match byte-for-byte and degradation notes downstream read identically across the two ports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/intelligence/query): Plan + CapabilityMatrix Adds CapabilityDimension, CapabilityLevel, CapabilityMatrix type alias, Plan struct with UsesGraph/UsesLexical helpers, and the per-language capability tables (java, typescript, javascript, python, go, csharp, rust, cpp, lexical-only fallback for kotlin/scala/ruby/php/shell/etc.). CapabilityMatrixFor and AllCapabilities mirror the Java side's CapabilityMatrix.forLanguage / asSerializableMap.
Returned matrices are defensive copies so callers can mutate without touching package state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/intelligence/query): Planner with deterministic rules Planner maps (QueryType, language) → Plan with priority: DEGRADED > LEXICAL_FIRST > MERGED > GRAPH_FIRST. SEARCH_TEXT is special-cased to always route via LEXICAL_FIRST regardless of language. Degradation-note text mirrors the Java side byte-for-byte so cross-port regression diffs stay clean. Capability provider is injected as a closure so tests can swap fixed matrices without touching the package-level tables. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): stdio server, registry, tool plumbing Pin github.com/modelcontextprotocol/go-sdk v1.6.0 (latest stable; the plan was drafted against v0.4.0 which is now superseded — same `mcp` package API surface in v1.x). Wraps the SDK so callers register mcp.Tool {Name, Description, Schema, Handler} and Serve(ctx, transport) delegates to Server.Run. The Handler signature is the simple json.RawMessage shape from the plan; the SDK's ToolHandler unmarshalling happens inside Tool.asSDKTool. SDK API anomaly vs. plan: v1.x has no NewStdioTransport(in, out) and no ServerTool aggregate. Tests use NewInMemoryTransports; CLI will pass &StdioTransport{} (zero value, hard-bound to os.Stdin/os.Stdout). The wrapper hides this — Serve takes any mcp.Transport. 5 tests cover handshake, tools/list, tools/call round-trip, and the registry's duplicate/empty/nil-handler rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/intelligence/evidence): EvidencePack + Request types EvidencePack mirrors Java EvidencePack.java 1:1: matched_symbols, related_files, references, snippets, provenance, degradation_notes, artifact_metadata, capability_level. 
EmptyPack guarantees non-nil slice fields so JSON serializes as [] rather than null per the MCP envelope contract. Request mirrors EvidencePackRequest with IsEmpty helper. ArtifactMetadata is forward-declared in this package until the dedicated intelligence/provenance port lands; Capabilities is a free-form map[string]any to avoid pulling intelligence/query into the pack's public surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): structured error envelope + result/depth cap helpers ErrorEnvelope carries {code, message, request_id, error} with the legacy `error` field mirroring `message` for backward compat with MCP clients that read `error` directly. Codes match the Java McpTools.errorEnvelope: INTERNAL_ERROR / INVALID_INPUT / FILE_READ_FAILED / SERIALIZATION_FAILED. CapResults / CapDepth clamp caller-supplied values to [1, hardCap] with DefaultMaxResults=500 / DefaultMaxDepth=10 fallbacks matching the Java ConfigDefaults built-ins. Caps are enforced in each tool's iteration loop, never injected as LIMIT N into Cypher (spec §8 gotcha). WithRequestID / RequestID round-trip a per-call UUIDv4 through ctx so tool handlers can stamp `request_id` on returned envelopes without a separate dependency. google/uuid promoted to direct require. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/intelligence/evidence): assembler + helpers Stateless EvidencePackAssembler mirrors Java EvidencePackAssembler: - routes via QueryPlanner with QueryFindSymbol intent; - pulls lexical matches by symbol or by file path; - extracts bounded source snippets via SnippetStore; - collects unique sorted file paths; - traverses CALLS + DEPENDS_ON edges for references when requested; - bundles per-node provenance (file_path / line_start / line_end / kind + prov_* properties); - derives CapabilityLevel from the planner's route. 
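The CapResults / CapDepth clamping described in the error-envelope commit above could look roughly like the sketch below. It is a minimal illustration, not the port's actual code: the two-argument signatures, the shared `clamp` helper, and the rule that a non-positive request falls back to the cap are all assumptions; only the names CapResults / CapDepth and the 500 / 10 defaults come from the commit message.

```go
package main

import "fmt"

// Defaults named in the commit message.
const (
	DefaultMaxResults = 500
	DefaultMaxDepth   = 10
)

// clamp limits requested to the range [1, hardCap].
// Assumptions for this sketch: a non-positive hardCap means "use def",
// and a non-positive requested value means "caller did not set a limit".
func clamp(requested, hardCap, def int) int {
	if hardCap <= 0 {
		hardCap = def
	}
	if requested <= 0 || requested > hardCap {
		return hardCap
	}
	return requested
}

// CapResults and CapDepth use hypothetical signatures; the real helpers may differ.
func CapResults(requested, hardCap int) int { return clamp(requested, hardCap, DefaultMaxResults) }
func CapDepth(requested, hardCap int) int   { return clamp(requested, hardCap, DefaultMaxDepth) }

func main() {
	fmt.Println(CapResults(0, 0), CapResults(50, 0), CapResults(9999, 200), CapDepth(3, 0))
}
```

A handler would call the helper once and then stop its own iteration loop at the returned value, which keeps the cap out of the Cypher text as the commit message requires.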
LexFinder and GraphReader are small local interfaces so the package stays CGo-free for unit tests and the MCP/serve wiring can plug in graph-backed implementations without forcing this package to depend on the kuzu-backed graph.Store directly. No config.Config dependency yet (package not yet ported) — rootPath + maxSnippetLines are passed explicitly at construction time. Helpers (resolveMaxLines, boundSnippet, inferLanguage, uniqueSortedFiles, provenanceFor, deriveCapability) each have a dedicated unit test covering the edge cases the Java side guards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/graph): read-only Kuzu open + mutation gate + row-cap iter graph.OpenReadOnly(path, timeout) opens an existing Kuzu store with the SDK read-only flag set and applies a per-query wall-clock timeout via Connection.SetTimeout (Kuzu uses milliseconds). Timeout=0 disables the cap; defaults to 30s to match Neo4jConfig.transaction_timeout. graph.MutationKeyword(q) scans for blocked write keywords (CREATE, DELETE, DETACH, SET, REMOVE, MERGE, DROP, FOREACH, LOAD CSV, COPY) and gates CALL on a read-only-procedure allow-list (db.*, show_*, table_*, current_setting, table_info). Block-comment + line-comment strip happens before keyword detection so commented-out writes don't trip the gate. Earliest match wins so "DETACH DELETE" surfaces "DETACH". Go RE2 has no lookahead, so the CALL gate is a two-stage match (find all CALL sites, then allow-list each procedure name) rather than a single negative-lookahead regex. Store.Cypher now rejects mutation queries when readOnly=true. Store.CypherRows iterates up to maxRows then peeks one tuple to set truncated=true — cap enforced in the loop, NOT injected as LIMIT N into the user-supplied query (spec §8 gotcha). 
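The two-stage mutation gate described above (strip comments, find the earliest write keyword, then allow-list each CALL target because RE2 has no lookahead) can be sketched as follows. This is an illustration under stated assumptions, not the body of graph.MutationKeyword: the function name `mutationKeyword`, the regexes, and the naive comment stripping (which ignores string literals) are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	blockComment = regexp.MustCompile(`(?s)/\*.*?\*/`)
	lineComment  = regexp.MustCompile(`//[^\n]*`)
	// Write keywords listed in the commit message.
	writeKeyword = regexp.MustCompile(`(?i)\b(CREATE|DELETE|DETACH|SET|REMOVE|MERGE|DROP|FOREACH|LOAD\s+CSV|COPY)\b`)
	// Stage one of the CALL gate: capture every procedure name after CALL.
	callSite = regexp.MustCompile(`(?i)\bCALL\s+([A-Za-z_][A-Za-z0-9_.]*)`)
)

// allowedProcedure is stage two: the read-only allow-list
// (db.*, show_*, table_* which also covers table_info, current_setting).
func allowedProcedure(name string) bool {
	n := strings.ToLower(name)
	return strings.HasPrefix(n, "db.") || strings.HasPrefix(n, "show_") ||
		strings.HasPrefix(n, "table_") || n == "current_setting"
}

// mutationKeyword returns the earliest blocked keyword in q, or "" if none.
func mutationKeyword(q string) string {
	// Strip comments first so commented-out writes do not trip the gate.
	q = blockComment.ReplaceAllString(q, " ")
	q = lineComment.ReplaceAllString(q, " ")

	best, bestPos := "", -1
	if loc := writeKeyword.FindStringIndex(q); loc != nil {
		best, bestPos = strings.ToUpper(q[loc[0]:loc[1]]), loc[0]
	}
	for _, m := range callSite.FindAllStringSubmatchIndex(q, -1) {
		name := q[m[2]:m[3]]
		if !allowedProcedure(name) && (bestPos < 0 || m[0] < bestPos) {
			best, bestPos = "CALL", m[0]
		}
	}
	return best
}

func main() {
	for _, q := range []string{
		"MATCH (n) RETURN n",
		"MATCH (n) DETACH DELETE n",
		"// CREATE (n)\nMATCH (n) RETURN n",
		"CALL show_tables() RETURN *",
	} {
		fmt.Printf("%q -> %q\n", q, mutationKeyword(q))
	}
}
```

The sketch errs on the side of rejecting: a keyword inside a string literal would still be flagged, which is the safer failure mode for a read-only gate.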
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/flow): View enum, model structs Ports src/main/java/.../flow/FlowModels.java to Go: - View (overview, ci, deploy, runtime, auth) + IsKnownView/AllViews - Node, Edge, Subgraph, Diagram structs with AllNodes/ValidEdges helpers - yaml.v3 promoted from indirect to direct dependency (renderer needs it) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): read_file with path-traversal + MIME-allowlist guards ReadRepoFile mirrors Java SafeFileReader + GraphController.readFile: - Reject absolute paths, empty inputs, directories early. - filepath.EvalSymlinks on both root and candidate; second-stage containment check after symlink resolution catches an in-repo symlink that points at /etc/passwd. - http.DetectContentType sniffs the first 512 bytes; rejected unless the type matches text/*, application/json, application/xml, application/x-yaml, or application/javascript (matches Java side). - Byte cap enforced in the read loop with `Truncated: true` surfaced on the response. Whole-file and line-range modes both honor the cap. - Default MaxBytes = 1 MiB when caller passes 0, so the function is safe to call without a ConfigDefaults wired up. 10 tests cover happy path, symlink escape, ../../ traversal, absolute-path rejection, binary MIME rejection, oversize truncation, line range, missing file, directory rejection, and empty-input rejection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/flow): Engine + per-view builders Ports src/main/java/.../flow/FlowEngine.java + FlowViews.java to Go: - Engine.Generate(view) loads a snapshot once and dispatches to a builder - Snapshot is the materialised (nodes, edges) view shared across builders - buildOverview, buildCI, buildDeploy, buildRuntime, buildAuth — five view builders matching the Java side 1:1 (subgraph IDs, node IDs, labels) so a parity diff on the same fixture lines up Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/flow): JSON + Mermaid + DOT + YAML renderers Ports src/main/java/.../flow/FlowRenderer.java and adds DOT/YAML the Java side does not yet ship: - Render(d, format) dispatches to RenderJSON/Mermaid/DOT/YAML - Mermaid: graph LR/TD with classDef styles + per-kind brackets - DOT: digraph with cluster_* subgraphs for Graphviz grouping - YAML: structured output via gopkg.in/yaml.v3 - Deterministic output: nodes within each subgraph sorted by ID, edges sorted by (source, target) before emission Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): 20 graph tools — stats, query, neighbors, dead-code, cypher, read_file RegisterGraph(srv, deps) wires every graph-facing MCP tool: - get_stats / get_detailed_stats → query.StoreStatsService - query_nodes / query_edges → graph.Store + CapResults - get_node_neighbors / get_ego_graph → graph.Store + CapDepth - find_cycles / find_shortest_path → query.Service - find_{consumers,producers,callers, dependencies,dependents} → query.Service (shared shape via consumerLikeTool) - find_dead_code → query.Service (entry-point + semantic-edge filter) - find_component_by_file → two-pass Cypher (Kuzu 0.7 binder rejects OPTIONAL MATCH (s)-[:CONTAINS]->(n) shape; split into node lookup + per-node service annotation) - trace_impact → query.Topology.BlastRadius with CapDepth - find_related_endpoints → service-container Cypher 
joining ep + target - search_graph → graph.Store.SearchByLabel - run_cypher → graph.MutationKeyword gate + CypherRows row-cap; truncated:true / max_results surfaced in the envelope shape from the Java side - read_file → ReadRepoFile from §3 with FILE_READ_FAILED on errors Deps carries Store/Query/Stats/Topology + MaxResults/MaxDepth/RootPath. All caps clamp via CapResults / CapDepth in the handler loop — never injected as LIMIT N into the user's Cypher (spec §8 gotcha). 24 tool tests under the in-memory transport pair: registration roster, each tool's happy path, missing-param INVALID_INPUT envelopes, the read-only mutation block, row-cap truncation, depth-cap clamp. Kuzu 0.7 binder quirk: variable-length pattern endpoints can't be projected as bare `(n)` — supported shape is `nodes(p)` over the named path. get_ego_graph rewritten accordingly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): cypher (actually implemented, not a stub) The Java side `cypher` command has been a stub since commit 81b645c. The Go port wires this through to graph.CypherRows() against a read-only Kuzu store. Mutation keywords are rejected at the gate via the graph.MutationKeyword helper. Rows are capped at --max-results and the response carries a truncated flag when the cap is hit. JSON output by default; --table renders a column-aligned text table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): flow command with 5 views + 4 formats `codeiq flow <view>` generates an architecture flow diagram for one of the five canonical views (overview/ci/deploy/runtime/auth) and emits in JSON (default), Mermaid, DOT, or YAML. Output can be redirected to a file via --out instead of stdout. The Kuzu store is opened read-only via OpenReadOnly so the flow command can be invoked alongside a serving instance without lock contention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): graph export (json/yaml/mermaid/dot) `codeiq graph -f <format>` exports every node and edge from the analyzed graph in JSON, YAML, Mermaid, or DOT. JSON / YAML emit the full hydrated payload with properties; Mermaid / DOT project the graph into a flow.Diagram and reuse the flow renderer. Large graphs (>500 nodes) truncate the mermaid/dot output to keep it readable; use JSON / YAML for the complete view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): cache info/list/inspect/clear `codeiq cache` exposes four subcommands over the SQLite analysis cache: - info row counts, version, on-disk size - list paginated file entries (table or --json) - inspect the full deserialised entry for a hash or path - clear destructive wipe; requires explicit --yes confirmation Cache helpers in internal/cache/inspect.go (Stats, List, Clear, LookupByHashOrPath) carry the SQL behind these subcommands so future callers can reuse the same primitives without going through the CLI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): plugins list + inspect `codeiq plugins list` prints one row per registered detector with NAME / CATEGORY / LANGUAGES / CONFIDENCE columns; --language filters by supported language; --json switches to a machine-readable array. `codeiq plugins inspect <name>` prints the per-detector metadata block (name, category, languages, default confidence, Go type). Category is derived from the detector's Go package path (e.g. .../detector/jvm/java -> jvm/java) so detectors don't need an explicit Category() method.
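Deriving a detector's category from its Go package path, as the plugins commit describes, might be done with reflection along these lines. This is a sketch under assumptions: the helper names `categoryFromPkgPath` and `categoryOf`, the `/detector/` marker, and the example import paths are hypothetical; only the mapping `.../detector/jvm/java -> jvm/java` comes from the commit message.

```go
package main

import (
	"fmt"
	"path"
	"reflect"
	"strings"
)

// categoryFromPkgPath returns everything after the last "/detector/" segment,
// falling back to the final path element when the marker is absent.
func categoryFromPkgPath(pkgPath string) string {
	if pkgPath == "" {
		return ""
	}
	const marker = "/detector/"
	if i := strings.LastIndex(pkgPath, marker); i >= 0 {
		return pkgPath[i+len(marker):]
	}
	return path.Base(pkgPath)
}

// categoryOf reads the package path of d's concrete type via reflection,
// unwrapping pointers first because PkgPath is empty for pointer types.
func categoryOf(d any) string {
	t := reflect.TypeOf(d)
	for t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil {
		return ""
	}
	return categoryFromPkgPath(t.PkgPath())
}

// demoDetector stands in for a real detector type in this sketch.
type demoDetector struct{}

func main() {
	fmt.Println(categoryFromPkgPath("example.com/codeiq/go/internal/detector/jvm/java"))
	fmt.Println(categoryOf(&demoDetector{}))
}
```

Keeping the string logic separate from the reflection call makes the mapping testable without importing any real detector package.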
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/cli): mcp command wires stdio server `codeiq mcp` opens the Kuzu graph read-only, builds the Deps bundle (graph store, query service, store-backed stats, topology), constructs the MCP Server, registers the available tool families (RegisterGraph today via internal/mcp), and runs the stdio JSON-RPC protocol loop via the official MCP Go SDK (modelcontextprotocol/go-sdk v1.6.0). Stderr is the log channel — stdout is reserved for JSON-RPC frames so the protocol cannot be corrupted by ambient logs. optionalRegisterHooks is the parallel-agent-friendly extension point — topology/flow/intelligence Register* hooks land via separate files as their sections of phase 3 complete; this CLI starts whether or not those hooks are populated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): 9 topology tools via targeted Cypher (no heap snapshot) Wires get_topology, service_detail, service_dependencies, service_dependents, blast_radius, find_path, find_bottlenecks, find_circular_deps, find_dead_services on top of internal/query.Topology. Each handler runs targeted Cypher per call rather than loading the full graph into memory — the Java side's McpTools.getCachedData() 60s heap snapshot is intentionally NOT replicated (see spec §8 gotcha). Adds ServiceDependencies + ServiceDependents methods to query.Topology to project the cross-service runtime connections originating from / terminating at a named service. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): generate_flow tool wrapping flow.Engine Single MCP tool over the internal/flow engine — five views (overview/ci/deploy/runtime/auth) and four renderer formats (json/mermaid/dot/yaml). Mirrors Java McpTools.generateFlow but routes JSON/YAML/Mermaid/DOT through the new Go renderers (Java only ships JSON+Mermaid today).
Adds Flow / Evidence / QueryPlanner / ArtifactMeta fields to Deps so the remaining intelligence tools can plug in. Updates the SDK tool wrapper to pass plain-string returns through verbatim — generate_flow emits already-rendered text that must not be JSON-double-encoded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(go/mcp): 4 intelligence tools + full 34-tool server wiring find_node: planner-routed fuzzy lookup. Always runs the structural search (label/fqn substring) — the planner's route is advisory and surfaces via degradation_note. LEXICAL_FIRST / MERGED routes augment with a lexical pass (doc-comment + config-key match). get_evidence_pack: wraps internal/intelligence/evidence.Assembler. Falls back to the legacy `{"error":"...unavailable. Run 'enrich' first."}` envelope when Evidence is not wired — matches the Java contract so existing MCP clients reading `error` keep working. get_artifact_metadata: returns the most recent provenance snapshot from Deps.ArtifactMeta with the same legacy-error envelope when nil. get_capabilities: returns the per-language CapabilityMatrix — either the full matrix or a single language row, matching Java McpTools.getCapabilities key-for-key. Wires every tool family in cli/mcp.go: graph (20) + topology (9) + flow (1) + intelligence (4) = 34 tools total. tools/list now lands all 34 entries; the optionalRegisterHooks slice is reserved for future tool-family extensions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(phase-3): exit gate passes — Go port phase 3 complete Phase 3 commits land 34 MCP tools, intelligence/{evidence,query} packages, all remaining CLI subcommands, and the stdio MCP server.
Exit gate results: - All 34 MCP tools wired: * 20 graph tools (82644f6) * 9 topology tools (eae2141) * 1 flow tool (661b71c) * 4 intelligence tools + server wiring (e2a2c75) - intelligence/evidence: pack + assembler + helpers - intelligence/query: route + plan + planner + capabilities - All 14 CLI subcommands functional (codeiq help shows 14 commands) - go test ./internal/mcp/... PASS - codeiq binary builds, --version + help work Kuzu v0.7.1 limitations (deferred, documented): - FTS extension not bundled (CONTAINS fallback) - LIMIT/SKIP not parameterizable - lower() not toLower() - RETURN DISTINCT scope tighter - List comprehension binder limits - EXISTS subquery param scope - []string→[]any widener - Multi-label rel + kleene* binder Next: Phase 4 (94 remaining detectors across 10 batches). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(go/mcp): end-to-end integration test via real stdio binary Spawns the freshly-built codeiq binary, exchanges JSON-RPC frames over real stdin/stdout pipes, and asserts: - initialize handshake completes with serverInfo.name == "CODE MCP" - tools/list returns exactly 34 tools (graph 20 + topology 9 + flow 1 + intelligence 4) - one tool from each family is in the list - tools/call get_capabilities returns a body containing `matrix` Build tag `integration` keeps this out of the default `go test ./...` loop. Run via `go test -tags integration ./internal/mcp/...`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/typescript): port ExpressRouteDetector Express.js route detector — regex-only path matches the Java ExpressRouteDetector.detectWithRegex output 1:1. AST refinement deferred to phase 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/go): port GoStructuresDetector Detects Go packages, structs, interfaces, methods, and functions. Mirrors Java GoStructuresDetector (regex-only — Java side defaults to regex too). 
Package named "golang" rather than "go" to match the convention already in use under intelligence/extractor/golang/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/go): port GoOrmDetector Detects GORM models, queries, and migrations; sqlx connections and queries; database/sql connections and queries. Discriminator gating on import detection so query patterns don't false-fire when the respective driver isn't imported. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/go): port GoWebDetector Detects Go web endpoints across Gin, Echo, Chi (lowercase), gorilla/mux (with and without .Methods()), and net/http (Handle/HandleFunc). Emits MIDDLEWARE nodes for .Use(...) calls. Framework discriminator checks the canonical constructor call (gin.Default/New, echo.New, chi.NewRouter, mux.NewRouter) so endpoints get tagged correctly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/base): structured + frontend helpers for phase 4 batches 1+2 Foundations for the structured detectors (YAML/JSON/TOML/INI/properties) and frontend component detectors (Angular/React/Vue/Svelte). Mirrors the Java AbstractStructuredDetector + FrontendDetectorHelper. 
- parser.Language: add Yaml/JSON/TOML/INI/Properties/SQL/Batch/Vue/Svelte - parser.ParseStructured: minimal YAML (yaml.v3) / JSON (stdlib) / TOML / INI / properties parsers, all returning the Java envelope shape with type+data keys - analyzer: parse structured content into Context.ParsedData per file - base.StructuredDetector helpers (AsMap/GetMap/GetList/GetString, BuildFileNode, AddKeyNode) — confidence floor = SYNTACTIC - base.CreateComponentNode / LineAt — frontend component helper - Parse() now returns (nil, nil) for structured-only languages so they silently pass through the tree-sitter path without erroring; preserves the LanguageUnknown error contract - TOML / INI parsers are shallow on purpose — section + key shape is all the structured detectors look at, matching Java's SnakeYAML subset Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/jvm/kotlin): port KotlinStructuresDetector + shared JVM helpers Phase 4 batch 3 (1/5): port Java KotlinStructuresDetector to Go regex tier. Add jvmhelpers package mirroring StructuresDetectorHelper + AbstractJavaMessagingDetector helpers — these will be reused by Scala structures and all Java messaging detectors in the next commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/typescript): port NestJSControllerDetector @Controller + @Get/@Post/etc. routes with EXPOSES edges. Guard requires @nestjs/* import to avoid generic decorator false-positives. RE2 possessive quantifiers (*+) collapsed to greedy (*). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/typescript): port FastifyRouteDetector Shorthand routes, route-objects, register-plugin edges, and addHook middleware. Discriminator guard requires `fastify` import to prevent collisions with Express patterns. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/structured): port YamlStructure + JsonStructure detectors Mirror Java behaviour: emit a CONFIG_FILE node per file + a CONFIG_KEY node and CONTAINS edge per top-level key. Tests use Java's pre-parsed envelope shape ({"type":"yaml","data":{...}} / yaml_multi+documents). Top-level keys are sorted before emission for determinism. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/typescript): port NestJSGuardsDetector @UseGuards, @Roles, canActivate, AuthGuard('strategy') — emits GUARD nodes with role lists. Requires @nestjs/ import guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/jvm/scala): port ScalaStructuresDetector Phase 4 batch 3 (2/5): port Java ScalaStructuresDetector to Go regex tier. Mirrors `extends Base with Mixin1 with Mixin2` → 1 EXTENDS + N IMPLEMENTS edges via the same shared helpers used by Kotlin structures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/typescript): port PassportJwtDetector passport.use(Strategy), passport.authenticate, jwt.verify, and express-jwt imports — emits GUARD + MIDDLEWARE nodes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/structured): port TOML / INI / Properties detectors - TomlStructureDetector: top-level keys; map-valued keys get section=true - IniStructureDetector: section nodes + key nodes nested under each section, plus CONTAINS edges file→section→key - PropertiesDetector: URL-shaped JDBC keys become DATABASE_CONNECTION nodes labeled by DB type (MySQL/PostgreSQL/...); everything else is CONFIG_KEY. Caps at 200 keys per file like Java. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/jvm/kotlin): port KtorRouteDetector Phase 4 batch 3 (3/5): port Java KtorRouteDetector to Go regex tier. 
Includes the route() brace-depth tracker so nested `route("/api") { get("/users") { } }` emits `/api/users` rather than `/users`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/jvm/java): port QuarkusDetector with io.quarkus discriminator Phase 4 batch 3 (4/5): port Java QuarkusDetector to Go regex tier. Requires io.quarkus / io.smallrye / @QuarkusTest discriminator before running pattern matches — avoids false positives on Spring code that shares @Transactional, @Scheduled, @Singleton, etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/csharp): port CSharpEfcoreDetector Detects Entity Framework Core DbContexts (REPOSITORY), DbSet<T> entities, Migration subclasses, and CreateTable() calls. Emits QUERIES edges from each context to each entity. Deduplicates entities reached via both DbSet and CreateTable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/csharp): port CSharpMinimalApisDetector Detects ASP.NET Core Minimal API endpoints (.MapGet/.MapPost/...) plus Use/AddAuthentication/Authorization GUARDs. WebApplication.CreateBuilder gates the MODULE node so we don't false-positive on plain C# files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/csharp): port CSharpStructuresDetector Detects C# namespaces, classes, interfaces, enums, using imports, and MVC controller endpoints ([Route] + [HttpGet/Post/...]). Preserves Java parity behaviour: a 60-char window before the class match decides "abstract"; a 5-line forward scan above each method picks the first HttpXxx attribute (both are known Java parity bugs noted in tests). 
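The route() brace-depth tracker mentioned in the Ktor commit above can be illustrated with the sketch below: each `route("…") {` pushes a prefix together with the brace depth at which it opened, and the prefix is popped once a closing brace drops below that depth. This is a hypothetical reconstruction, not the ported detector: the regex, the `frame` type, and the function name `ktorRoutes` are assumptions, and braces inside string literals or comments are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// routeCall matches `route("/x") {` and HTTP verb calls such as `get("/y") {`.
var routeCall = regexp.MustCompile(`\b(route|get|post|put|delete|patch)\s*\(\s*"([^"]*)"\s*\)\s*\{`)

// frame records an active route() prefix and the brace depth inside its block.
type frame struct {
	prefix string
	depth  int
}

// ktorRoutes returns "METHOD /full/path" for every verb call in src,
// prefixing each path with the enclosing route() blocks.
func ktorRoutes(src string) []string {
	matches := routeCall.FindAllStringSubmatchIndex(src, -1)
	var out []string
	var stack []frame
	depth, next := 0, 0
	for i := 0; i < len(src); {
		if next < len(matches) && matches[next][0] == i {
			m := matches[next]
			next++
			name, path := src[m[2]:m[3]], src[m[4]:m[5]]
			prefix := ""
			if len(stack) > 0 {
				prefix = stack[len(stack)-1].prefix
			}
			depth++ // the match consumes the opening brace
			if name == "route" {
				stack = append(stack, frame{prefix + path, depth})
			} else {
				out = append(out, strings.ToUpper(name)+" "+prefix+path)
			}
			i = m[1]
			continue
		}
		switch src[i] {
		case '{':
			depth++
		case '}':
			depth--
			// Leave every route() block whose body has just closed.
			for len(stack) > 0 && stack[len(stack)-1].depth > depth {
				stack = stack[:len(stack)-1]
			}
		}
		i++
	}
	return out
}

func main() {
	src := `route("/api") { get("/users") { } } get("/health") { }`
	fmt.Println(ktorRoutes(src))
}
```

Without the depth tracking, the nested `get("/users")` would be reported as `/users`, and the sibling `get("/health")` after the block would wrongly inherit the `/api` prefix.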
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/structured): port SqlStructure + BatchStructure detectors - SqlStructureDetector: regex-based scan for CREATE TABLE / VIEW / INDEX / PROCEDURE; FK edges from REFERENCES clauses to the most-recent table - BatchStructureDetector: MODULE per .bat file, METHOD per :LABEL, SET vars as CONFIG_DEFINITION, plus CONTAINS + CALLS edges; skips @ECHO OFF / REM / :: comment lines Both regex-based, AbstractRegexDetector confidence floor (LEXICAL). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(detector/jvm/java): port Micronaut…
1 parent f7e792e commit c363727

427 files changed

Lines changed: 50884 additions & 8 deletions


.github/workflows/ci-java.yml

Lines changed: 28 additions & 8 deletions
@@ -1,15 +1,35 @@
 name: Java CI
+
+# Lean Java CI — fast compile + unit-test gate on the Java reference side.
+# Pairs with go-parity.yml: this workflow proves the Java jar still builds
+# on every PR; go-parity.yml then uses the same build to diff against the
+# Go port.
+#
+# Heavier checks (jacoco coverage, SpotBugs, OWASP dependency-check) live
+# under workflow_dispatch via release-java.yml — they're not in the per-PR
+# loop because they slow the Go port's PRs without adding signal.
+#
+# Disappears in Phase 6 cutover along with the rest of the Java tree.
+
 on:
   push:
     branches: [main]
-    paths: ['src/**', 'pom.xml']
+    paths:
+      - 'src/**'
+      - 'pom.xml'
+      - '.github/workflows/ci-java.yml'
   pull_request:
     branches: [main]
+    paths:
+      - 'src/**'
+      - 'pom.xml'
+      - '.github/workflows/ci-java.yml'
 
 permissions: read-all
 
 jobs:
   build:
+    name: build
     runs-on: ubuntu-latest
     permissions:
       contents: read
@@ -22,14 +42,14 @@ jobs:
           distribution: 'temurin'
           java-version: '25'
           cache: 'maven'
-      - name: Build + verify (jacoco 85% + SpotBugs)
-        run: mvn -B -ntp clean verify
+      - name: Compile + unit tests (skip frontend)
+        # -Dfrontend.skip=true so the npm step doesn't run — CI image
+        # doesn't carry node 20 by default and the frontend is owned by
+        # a separate workflow. -B (batch) + -ntp (no transfer progress)
+        # for quiet logs.
+        run: mvn -B -ntp -Dfrontend.skip=true verify
       - uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v4.6.2
         if: always()
         with:
-          name: test-results
+          name: java-test-results
           path: target/surefire-reports/
-      - uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v4.6.2
-        with:
-          name: coverage-report
-          path: target/site/jacoco/

.github/workflows/go-ci.yml

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
1+
name: go-ci
2+
3+
on:
4+
push:
5+
branches: [main]
6+
paths: ['go/**', '.github/workflows/go-ci.yml']
7+
pull_request:
8+
branches: [main]
9+
paths: ['go/**', '.github/workflows/go-ci.yml']
10+
11+
permissions:
12+
contents: read
13+
14+
jobs:
15+
go:
16+
name: vet / test / staticcheck / gosec / govulncheck
17+
runs-on: ubuntu-latest
18+
env:
19+
CGO_ENABLED: "1"
20+
defaults:
21+
run:
22+
working-directory: go
23+
steps:
24+
- uses: actions/checkout@v4
25+
- uses: actions/setup-go@v5
26+
with:
27+
# Pin to 1.25.x — 1.26+ isn't on enough developer machines yet.
28+
# 1.25.10 includes the fix for GO-2026-4918 (HTTP/2 SETTINGS
29+
# infinite loop) which is reachable via review.Client.Review.
30+
go-version: '1.25.10'
31+
cache: true
32+
cache-dependency-path: go/go.sum
33+
- name: Install C toolchain
34+
run: sudo apt-get update -y && sudo apt-get install -y build-essential
35+
- name: go vet
36+
run: go vet ./...
37+
- name: go test (race)
38+
run: go test ./... -race -count=1
39+
- name: staticcheck
40+
run: |
41+
# staticcheck must understand the Go toolchain version that built
42+
# the binaries above. 2024.1.1 errors with "internal error in
43+
# importing internal/byteorder (unsupported version: 2)" against
44+
# Go 1.25's stdlib. 2025.1.1 is the first release that handles it.
45+
go install honnef.co/go/tools/cmd/staticcheck@2025.1.1
46+
"$(go env GOPATH)/bin/staticcheck" ./...
47+
- name: gosec
48+
run: |
49+
# v2.21.4 won't compile under Go 1.25 — its pinned
50+
# golang.org/x/tools v0.25.0 hits an int64 constant-overflow
51+
# bug in tokeninternal.go. v2.22.0 ships an x/tools bump that
52+
# builds clean on 1.25.x.
53+
go install github.com/securego/gosec/v2/cmd/gosec@v2.22.0
54+
# Suppressed rule rationale (all reviewed manually):
55+
# G104 — idiomatic deferred Close()/Rollback() error drops
56+
# G115 — uint64→int64 on counter rows from Kuzu, bounded
57+
# G202 — analysis-cache LIMIT/OFFSET; ints, not user input
58+
# G204 — git ls-files / mvn shellouts, no user input
59+
# G301/G306 — codeiq cache files are dev-local, 0o755/0o644 ok
60+
# G304 — fixture and cache files under controlled dirs
61+
# G401/G404/G501 — non-crypto hashing (MD5 for ID dedup, etc.)
62+
"$(go env GOPATH)/bin/gosec" -quiet -exclude=G104,G115,G202,G204,G301,G304,G306,G401,G404,G501 ./...
63+
- name: govulncheck
64+
run: |
65+
go install golang.org/x/vuln/cmd/govulncheck@latest
66+
"$(go env GOPATH)/bin/govulncheck" ./...

.github/workflows/go-parity.yml

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
1+
name: go-parity
2+
3+
# Java vs Go parity test for fixture-minimal. Validates that the Go port
4+
# produces the same canonical graph shape as the Java reference until
5+
# Phase 6 cutover deletes the Java tree. Runs on PRs that touch the Go
6+
# tree, the Java tree, the parity harness, or this workflow.
7+
#
8+
# The Java side ships a JSON graph via `codeiq graph -f json` from the
9+
# `serving` profile (Neo4j-backed). A small jq filter
10+
# (go/parity/java-normalize.jq) rewrites that into the same per-file
11+
# canonical shape that the Go-side parity.Normalize emits. The two
12+
# normalized JSON blobs are then diff'd by the `parity` build tag in
13+
# go/parity/parity_test.go, with expected-divergence.json holding the
14+
# allow-list of intentional drift.
15+
16+
on:
17+
pull_request:
18+
branches: [main]
19+
paths:
20+
- 'go/**'
21+
- 'src/**'
22+
- 'pom.xml'
23+
- '.github/workflows/go-parity.yml'
24+
workflow_dispatch:
25+
26+
permissions:
27+
contents: read
28+
29+
jobs:
30+
parity:
31+
name: Java vs Go parity (fixture-minimal)
32+
runs-on: ubuntu-latest
33+
env:
34+
CGO_ENABLED: "1"
35+
steps:
36+
- uses: actions/checkout@v4
37+
- uses: actions/setup-java@v4
38+
with:
39+
distribution: temurin
40+
java-version: '25'
41+
cache: maven
42+
- uses: actions/setup-go@v5
43+
with:
44+
# Pin to 1.25.x — 1.26+ isn't on enough developer machines yet.
45+
go-version: '1.25.10'
46+
cache: true
47+
cache-dependency-path: go/go.sum
48+
- name: Install C toolchain
49+
run: sudo apt-get update -y && sudo apt-get install -y build-essential jq
50+
51+
# ---- Java side ----------------------------------------------------
52+
- name: Build Java jar (skip frontend)
53+
run: mvn -B -q -DskipTests -Dfrontend.skip=true package
54+
- name: Stage Java fixture (separate copy so caches don't collide)
55+
run: cp -r go/testdata/fixture-minimal /tmp/fm-java
56+
- name: Java index → H2 cache
57+
run: java -jar target/code-iq-*-cli.jar index /tmp/fm-java
58+
- name: Java enrich → Neo4j (serving profile)
59+
# `graph -f json` reads from Neo4j under the serving profile, not
60+
# H2. Need to enrich first or the export prints "No graph data
61+
# found. Run 'codeiq analyze' first."
62+
run: |
63+
java -Dspring.profiles.active=serving \
64+
-jar target/code-iq-*-cli.jar enrich /tmp/fm-java
65+
- name: Java graph → normalized JSON
66+
# Run from inside the fixture so Neo4j path resolution finds the
67+
# store enrich wrote. java-normalize.jq pivots the Java
68+
# {nodes:[...]} shape into the per-file array shape
69+
# parity.Normalize uses on the Go side.
70+
#
71+
# The Java CLI prints Logback JSON log lines to stdout BEFORE the
72+
# graph JSON, so we capture everything then awk to the first line
73+
# that is exactly "{" — that's the pretty-printed graph object.
74+
run: |
75+
cd /tmp/fm-java
76+
java -Dspring.profiles.active=serving \
77+
-jar "$GITHUB_WORKSPACE"/target/code-iq-*-cli.jar graph . -f json \
78+
> /tmp/java-raw-with-logs.json
79+
awk '/^\{$/ {f=1} f' /tmp/java-raw-with-logs.json > /tmp/java-raw.json
80+
jq -f "$GITHUB_WORKSPACE"/go/parity/java-normalize.jq /tmp/java-raw.json \
81+
> /tmp/java-normalized.json
82+
83+
# ---- Go side ------------------------------------------------------
84+
- name: Build Go binary
85+
working-directory: go
86+
run: go build -o codeiq ./cmd/codeiq
87+
- name: Go parity test (diff vs normalized Java output)
88+
working-directory: go
89+
env:
90+
TEST_JAVA_NORMALIZED: /tmp/java-normalized.json
91+
run: go test -tags=parity ./parity/... -v
92+
93+
# ---- Failure artifact --------------------------------------------
94+
- name: Upload normalized JSON on failure
95+
if: failure()
96+
uses: actions/upload-artifact@v4
97+
with:
98+
name: parity-diff
99+
path: |
100+
/tmp/java-normalized.json
101+
/tmp/java-raw.json

.gitignore

Lines changed: 8 additions & 0 deletions
@@ -16,6 +16,9 @@ target/
 *.swo
 *~
 
+# Claude Code local state (progress trackers, settings, ralph-loop state)
+.claude/
+
 # OS
 .DS_Store
 Thumbs.db
@@ -94,6 +97,7 @@ dist/
 build/
 *.whl
 pyproject.toml
+!go/testdata/**/pyproject.toml
 uv.lock
 .venv/
 venv/
@@ -116,3 +120,7 @@ graph.db/
 # Phase A baseline
 .seeds/
 docs/superpowers/baselines/**/raw/**
+
+# Agent-generated plans / scratch (not project deliverables)
+go-port-phase4-plan.md
+phase*-plan.md

CHANGELOG.md

Lines changed: 18 additions & 0 deletions
@@ -16,6 +16,24 @@ for that specific tag for the per-commit details.
 
 ### Added
 
+- **Go port (Phases 1-4 of the rewrite)** — codeiq is being ported from
+  Java/Spring Boot to a single static Go binary on the `port/go-port`
+  branch. PR #130. 100 detectors at 1:1 parity with the Java side; 34 MCP
+  tools (deprecated) + 6 consolidated mode-driven tools (new); `codeiq
+  review` CLI + `review_changes` MCP tool for LLM-driven PR review via
+  Ollama (Cloud or local). Java tree untouched until Phase 6 cutover.
+- **Graph dedup + determinism** (Go side) — `GraphBuilder` deduplicates
+  nodes by ID with confidence-aware merging, edges by canonical
+  `(source, target, kind)` tuple. Linker output sorted at the boundary.
+  `codeiq index` surfaces "Deduped: N nodes, M edges Dropped: K phantom
+  edges" so graph hygiene is visible.
+- **`codeiq review`** — LLM-driven review of `git diff base..head` against
+  the indexed graph. Defaults to local Ollama (`gpt-oss:20b`); set
+  `OLLAMA_API_KEY` to flip to Ollama Cloud. `--format=markdown|json`,
+  `--out`, `--focus`. Graph evidence (nodes-in-file + 1-hop blast radius)
+  attached per changed file when the Kuzu store is enriched.
+- **`review_changes` MCP tool** — same review flow exposed over MCP for
+  agent-driven invocation. Strictly read-only against the graph.
 - OpenSSF supply-chain wiring — Best Practices project
   [12650](https://www.bestpractices.dev/projects/12650), live Scorecard at
   [securityscorecards.dev](https://api.securityscorecards.dev/projects/github.com/RandomCodeSpace/codeiq),
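
The "Graph dedup + determinism" entry in the CHANGELOG hunk above describes deduplicating nodes by ID and edges by a `(source, target, kind)` tuple, with sorted output. A rough Go sketch of that idea follows. It is not the commit's `GraphBuilder`: the `Node`, `Edge`, and `Confidence` types, the tier names, and the rule "higher confidence wins, first seen wins on ties" are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Confidence is a hypothetical three-tier ordering; higher means stronger.
type Confidence int

const (
	Lexical Confidence = iota
	Syntactic
	Resolved
)

// Node and Edge are simplified stand-ins for the real graph model.
type Node struct {
	ID         string
	Confidence Confidence
}

type Edge struct{ Source, Target, Kind string }

// dedupNodes keeps one node per ID, preferring the higher confidence
// (assumed merge rule; the first one seen wins a tie), and returns the
// survivors sorted by ID so the output does not depend on map order.
func dedupNodes(in []Node) []Node {
	best := make(map[string]Node, len(in))
	for _, n := range in {
		if cur, ok := best[n.ID]; !ok || n.Confidence > cur.Confidence {
			best[n.ID] = n
		}
	}
	out := make([]Node, 0, len(best))
	for _, n := range best {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

// dedupEdges drops repeated (source, target, kind) tuples and sorts the
// rest by that same tuple.
func dedupEdges(in []Edge) []Edge {
	seen := make(map[Edge]struct{}, len(in))
	out := make([]Edge, 0, len(in))
	for _, e := range in {
		if _, dup := seen[e]; dup {
			continue
		}
		seen[e] = struct{}{}
		out = append(out, e)
	}
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Source != b.Source {
			return a.Source < b.Source
		}
		if a.Target != b.Target {
			return a.Target < b.Target
		}
		return a.Kind < b.Kind
	})
	return out
}

func main() {
	nodes := dedupNodes([]Node{{"b", Lexical}, {"a", Syntactic}, {"b", Resolved}})
	edges := dedupEdges([]Edge{{"b", "a", "CALLS"}, {"a", "b", "CALLS"}, {"b", "a", "CALLS"}})
	fmt.Println(len(nodes), "nodes,", len(edges), "edges")
}
```

Sorting after deduplication is what makes repeated runs comparable byte for byte; Go map iteration order is randomized, so emitting straight from the map would not be deterministic.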

README.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,27 @@
1919

2020
---
2121

22+
## Development — Go Port (Phase 1)
23+
24+
An in-progress Go port lives in [`go/`](./go/). Phase 1 ships `codeiq index`
25+
over 5 detectors with byte-level parity against the Java side on
26+
`go/testdata/fixture-minimal`. Phases 2-6 land enrich, MCP, the remaining 94
27+
detectors, release infra, and Java cutover (see
28+
[`docs/superpowers/specs/2026-05-11-codeiq-go-port-design.md`](docs/superpowers/specs/2026-05-11-codeiq-go-port-design.md)).
29+
30+
Build and run:
31+
32+
```bash
33+
cd go
34+
CGO_ENABLED=1 go build -o codeiq ./cmd/codeiq
35+
./codeiq index .
36+
./codeiq --version
37+
```
38+
39+
The Go binary writes to the same `.codeiq/cache/` location the Java side
40+
uses, but `CACHE_VERSION` is bumped to 6 so the first run triggers a clean
41+
rebuild. Phase 1 is parity-only — use the Java side for production runs.
42+
2243
## Quick Start
2344

2445
```bash

go/.gitignore

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
/codeiq
2+
/codeiq.exe
3+
/coverage.out
4+
/coverage.html
5+
/dist/
6+
/.cache/

go/cmd/codeiq/main.go

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
// Binary codeiq is the codeiq CLI entry point. All logic lives in
2+
// internal/cli; this file is just the os.Exit shim.
3+
package main
4+
5+
import (
6+
"os"
7+
8+
"github.com/randomcodespace/codeiq/go/internal/cli"
9+
)
10+
11+
func main() {
12+
os.Exit(cli.Execute())
13+
}

go/cmd/extcheck/main.go

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
package main
2+
3+
import (
4+
"fmt"
5+
6+
"github.com/randomcodespace/codeiq/go/internal/detector"
7+
// Same blank imports as the CLI uses
8+
_ "github.com/randomcodespace/codeiq/go/internal/detector/auth"
9+
_ "github.com/randomcodespace/codeiq/go/internal/detector/csharp"
10+
_ "github.com/randomcodespace/codeiq/go/internal/detector/frontend"
11+
_ "github.com/randomcodespace/codeiq/go/internal/detector/generic"
12+
_ "github.com/randomcodespace/codeiq/go/internal/detector/golang"
13+
_ "github.com/randomcodespace/codeiq/go/internal/detector/iac"
14+
_ "github.com/randomcodespace/codeiq/go/internal/detector/jvm/java"
15+
_ "github.com/randomcodespace/codeiq/go/internal/detector/jvm/kotlin"
16+
_ "github.com/randomcodespace/codeiq/go/internal/detector/jvm/scala"
17+
_ "github.com/randomcodespace/codeiq/go/internal/detector/markup"
18+
_ "github.com/randomcodespace/codeiq/go/internal/detector/proto"
19+
_ "github.com/randomcodespace/codeiq/go/internal/detector/python"
20+
_ "github.com/randomcodespace/codeiq/go/internal/detector/script/shell"
21+
_ "github.com/randomcodespace/codeiq/go/internal/detector/sql"
22+
_ "github.com/randomcodespace/codeiq/go/internal/detector/structured"
23+
_ "github.com/randomcodespace/codeiq/go/internal/detector/systems/cpp"
24+
_ "github.com/randomcodespace/codeiq/go/internal/detector/systems/rust"
25+
_ "github.com/randomcodespace/codeiq/go/internal/detector/typescript"
26+
)
27+
28+
func main() {
29+
for _, lang := range []string{"terraform", "csharp", "kotlin", "vue", "bash", "rust", "powershell"} {
30+
dets := detector.Default.For(lang)
31+
fmt.Printf("%-12s: %d detectors\n", lang, len(dets))
32+
}
33+
}
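
The blank imports in `extcheck` above only make sense if each detector package registers itself as an import side effect. A single-file sketch of that self-registration pattern is below. Only `detector.Default.For(lang)` appears in the diff; the `Detector` interface, `Register`, the `stub` type, and the detector names here are hypothetical, and in a real multi-package layout the `init` function would live in each detector package rather than next to `main`.

```go
package main

import (
	"fmt"
	"sort"
)

// Detector is a hypothetical minimal interface for this sketch.
type Detector interface {
	Name() string
	Languages() []string
}

// Registry indexes detectors by the languages they declare.
type Registry struct {
	byLang map[string][]Detector
}

// Register files d under each of its languages.
func (r *Registry) Register(d Detector) {
	if r.byLang == nil {
		r.byLang = make(map[string][]Detector)
	}
	for _, lang := range d.Languages() {
		r.byLang[lang] = append(r.byLang[lang], d)
	}
}

// For returns a name-sorted copy of the detectors for lang, so the result
// does not depend on the order in which packages were initialized.
func (r *Registry) For(lang string) []Detector {
	out := append([]Detector(nil), r.byLang[lang]...)
	sort.Slice(out, func(i, j int) bool { return out[i].Name() < out[j].Name() })
	return out
}

// Default plays the role of the package-level registry.
var Default = &Registry{}

// stub is a placeholder detector used only to populate the sketch.
type stub struct {
	name  string
	langs []string
}

func (s stub) Name() string        { return s.name }
func (s stub) Languages() []string { return s.langs }

// init stands in for the per-package init functions that a blank import
// (`_ "…/detector/csharp"`) would trigger.
func init() {
	Default.Register(stub{"terraform_resources", []string{"terraform"}})
	Default.Register(stub{"csharp_structures", []string{"csharp"}})
	Default.Register(stub{"csharp_efcore", []string{"csharp"}})
}

func main() {
	for _, lang := range []string{"terraform", "csharp", "kotlin"} {
		fmt.Printf("%-12s: %d detectors\n", lang, len(Default.For(lang)))
	}
}
```

A tool like `extcheck` is a cheap guard for this pattern: a language reporting zero detectors usually means a blank import was forgotten, which the compiler cannot catch.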

go/go.mod

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
module github.com/randomcodespace/codeiq/go
2+
3+
// Minimum Go version that can compile this module — clamped at 1.25.0
4+
// because github.com/modelcontextprotocol/go-sdk v1.6 (Phase 3, MCP
5+
// server) declares `go 1.25.0`. `go mod tidy` rewrites anything lower
6+
// back to 1.25.0. Bumping out of 1.25 should wait until a release of
7+
// that SDK that targets 1.26+.
8+
go 1.25.0
9+
10+
// Actual build toolchain. Pinned to 1.25.7 — 1.26+ isn't on enough
11+
// developer machines yet. CI pins the same version (.github/workflows/
12+
// go-ci.yml + go-parity.yml).
13+
toolchain go1.25.10
14+
15+
require github.com/mattn/go-sqlite3 v1.14.22
16+
17+
require (
18+
github.com/google/uuid v1.6.0
19+
github.com/kuzudb/go-kuzu v0.7.1
20+
github.com/modelcontextprotocol/go-sdk v1.6.0
21+
github.com/pmezard/go-difflib v1.0.0
22+
github.com/smacker/go-tree-sitter v0.0.0-20240827094217-dd81d9e9be82
23+
github.com/spf13/cobra v1.8.0
24+
github.com/spf13/pflag v1.0.5
25+
gopkg.in/yaml.v3 v3.0.1
26+
)
27+
28+
require (
29+
github.com/google/jsonschema-go v0.4.3 // indirect
30+
github.com/inconshreveable/mousetrap v1.1.0 // indirect
31+
github.com/segmentio/asm v1.1.3 // indirect
32+
github.com/segmentio/encoding v0.5.4 // indirect
33+
github.com/shopspring/decimal v1.4.0 // indirect
34+
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
35+
golang.org/x/oauth2 v0.35.0 // indirect
36+
golang.org/x/sys v0.41.0 // indirect
37+
)
