fix: preserve dependent nodes in incremental graph rebuild (#305)#311
Merged
Conversation
_delete_file_scope deleted Symbol nodes for both changed and dependent files during incremental rebuild. Dependent nodes can have incoming CALLS edges from out-of-scope callers: the orchestrator expands dependents from changed nodes only, so callers OF dependent nodes are never pulled into scope, and source_file on an edge is the caller's file, so Phase 1's outgoing-edge delete left those incoming edges in place. The plain DELETE then crashed ('Node ... has connected edges in table CALLS in the bwd direction') and the rebuild fell back to full every time.
Delete Symbol nodes only for changed files. Dependent nodes are kept and re-MERGEd in place on their deterministic symbol_id, so their incoming edges from out-of-scope callers survive (no crash, no silent edge loss). Phase 3 uses DETACH DELETE for changed nodes as a safety net. No schema or ontology change, so no re-index burden.
Fixes #305
Co-Authored-By: Claude <noreply@anthropic.com>
Address PR #311 review feedback (non-blocking follow-ups): - Extract the `_SYMBOL_TO_SYMBOL_EDGE_TYPES` module constant and derive `_find_dependents` from it. The #305 fix's safety rests on `_find_dependents` enumerating EVERY Symbol->Symbol edge type (so every real caller of a changed node enters scope and Phase 1 removes its edge before the node delete). The named constant plus cross-references in the Phase 1 and Phase 3 comments make that invariant enforced/documented by construction, instead of two parallel hand-maintained lists (6 vs 12 types) silently needing to stay in sync. - Phase 2: comment why deleting UnresolvedCallSite children of preserved dependents is safe (scope files, dependents included, are reprocessed and re-emit them in `_scoped_write`). No behavior change; tests unchanged and still green. Co-Authored-By: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Fixes #305 —
java-codebase-rag incrementcrashed during scoped deletion and fell back to a full rebuild every time:Root cause
incremental_rebuild()→_delete_file_scope()deleted Symbol nodes for both changed and dependent files with a plainDELETE. But:source_fileon every Symbol→Symbol edge is the caller's file (pinned bytest_source_file_value_matches_symbol_filename), so Phase 1'se.source_file IN $filenamesdeletes only outgoing edges. Incoming edges from out-of-scope callers survive.changed_node_ids). So a dependent file's node — pulled into scope because it calls a changed node — can still have an incoming CALLS edge from an out-of-scope caller (a caller of the dependent, which is never brought into scope). That edge survives Phase 1, and the plainDELETEon the dependent node trips LadybugDB's "connected edges in the bwd direction" guard → exception → full-rebuild fallback.A naive fix (DETACH DELETE, or an extra
dst.filenameincoming-edge pass) silences the crash but permanently drops those out-of-scope edges (the caller is never reprocessed) → silent graph corruption.Fix
Delete Symbol nodes only for
changed_files; preserve dependent-file nodes. Dependent files are reprocessed only to re-resolve their outgoing edges against changed nodes — their node definitions didn't change, and node ids are deterministic (symbol_id), so they re-MERGE in place on the same id. Their incoming edges from out-of-scope callers survive untouched.MATCH (s:Symbol) WHERE s.filename IN $changed_files DETACH DELETE s— changed files only; DETACH DELETE as a safety net for the rare phantom/surviving edge.Complementary to #310 (which wires
incrementto actually invokeincremental_rebuild); no file overlap.Tests (
tests/test_incremental_graph.py)test_incremental_preserves_incoming_edges_to_dependent— end-to-end repro of Increment graph error #305 (C→B→A, change A). Wasmode == "full_fallback"before the fix;mode == "incremental"after, with the out-of-scopeC→BCALLS edge and the dependentBnode preserved.test_delete_file_scope_preserves_dependent_nodes— direct unit test of the new(changed_files, dependent_files)signature.test_delete_file_scope_removes_only_matchingto the new signature.Validation
.venv/bin/ruff check .— clean..venv/bin/python -m pytest tests/test_incremental_graph.py -v— 24 passed.testssuite: the only failures are pre-existing/environmental and live in modules that do not import the changed code (test_mcp_tools,test_mcp_v2,test_lance_optimize,test_cli_progress_stdout_invariant— missing async pytest plugin in this venv, per theUnknown config option: asyncio_modewarning). Verified unrelated to this change.Notes
propose/doc (clearly-bounded one-file-plus-tests fix).🤖 Generated with Claude Code