fix(composio): complete connection-disconnect cleanup (config entry + memory tree)#3140
fix(composio): complete connection-disconnect cleanup (config entry + memory tree)#3140sanil-23 wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughPrunes Composio memory_sources on disconnect, adds transactional tree cascade deletion, makes chunk deletion cascade orphaned source trees and collect content files for removal, and adds integration tests verifying DB and on-disk cleanup plus queued-job settlement. ChangesComposio Connection Cleanup with Memory Cascade Deletion
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
oxoxDev
left a comment
There was a problem hiding this comment.
Clean, well-tested completion of the disconnect teardown. The new orphan-tree cascade (delete_tree_cascade_tx) fires only for sources whose chunk count hit 0, and its table set (summary_embeddings → reembed_skipped → entity_index → summaries → buffers → tree row) mirrors purge_global_topic_trees exactly — no tree-keyed table missed. Chunk delete + cascade run in one tx (atomic on partial failure), file unlinks happen post-commit, and the config prune is best-effort warn!-only so it can't fail the delete after the backend connection is gone. Idempotent (double-disconnect = no-op), and multi-connection safety is proven by the test that disconnects conn-1 while conn-2 still owns gmail:acct chunks (tree stays alive). SQL is positional-bound and correctly scoped — no cross-tenant wipe. Logs redact source_id/path.
Two non-code items before merge:
- PR Submission Checklist CI is failing — the six
- [ ] N/A: …items need to be- [x] N/A: …(parser requires the box checked even for N/A), and theDiff coverage ≥ 80%box is unchecked. - Branch base predates #3113's sync-flow redesign (
ghreports MERGEABLE, no conflict) — a rebase onto currentupstream/mainwould re-run the pending E2E/coverage lanes against current code.
Minor: delete_tree_cascade_tx intentionally doesn't drain queued mem_tree_jobs for the deleted tree — safe because handle_seal/handle_extract/handle_append_buffer no-op to Done on a missing tree, but a one-line comment pointing at that self-healing branch would make the omission self-documenting. LGTM.
… memory tree) Disconnecting a Composio connection left two kinds of orphaned state behind. 1. Config registry entry. The `[[memory_sources]]` entry was never removed — the registry keys composio sources by connection_id and the reconciler only upserts — so a disconnected connection lingered in config.toml, and on reconnect the backend mints a new connection_id, stranding the old one. Fix: prune by connection_id in composio_delete_connection (best-effort; the backend delete already succeeded, so a config-save failure only warns). 2. Memory tree. A clear_memory=true delete removed chunk rows but left the source summary tree, its sealed summaries, the unsealed buffer, and the summaries' on-disk content files behind — so "forgotten" content could still surface via summaries. Fix: when a source is fully orphaned, cascade the source tree (summaries + sidecars + entity-index + buffer + tree row) in the same transaction as the chunk delete, and remove the summary content files in the same post-commit sweep as the chunk files. Multi-connection- same-account is protected: the tree survives while any connection still owns chunks for that account. Tests: - registry: remove_composio_source_by_connection_id prune + reconnect-new-id - store: orphaned-source tree cascade + queued Seal job settles to Done - store: on-disk summary content-file removal - store: multi-owner survival (tree kept while another connection owns chunks) - store: Extract/AppendBuffer over a deleted chunk settle to Done - composio e2e: real composio_delete_connection(clear_memory) cascades tree + content file, including a live-seal variant producing a genuine summary file Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
c9154c4 to
97194de
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/openhuman/memory_store/trees/store.rs (2)
102-124: ⚡ Quick winAdd error context to DELETE operations for better diagnostics.
The DELETE statements lack
.with_context(...)annotations, making failures harder to diagnose. Other database operations in this file (e.g.,insert_tree_connat line 68,update_tree_after_seal_txat line 277) consistently use.with_context()to provide failure context.🔍 Proposed fix to add error context
tx.execute( "DELETE FROM mem_tree_summary_embeddings WHERE summary_id IN \ (SELECT id FROM mem_tree_summaries WHERE tree_id = ?1)", params![tree_id], - )?; + ) + .with_context(|| format!("Failed to delete summary embeddings for tree {tree_id}"))?; tx.execute( "DELETE FROM mem_tree_summary_reembed_skipped WHERE summary_id IN \ (SELECT id FROM mem_tree_summaries WHERE tree_id = ?1)", params![tree_id], - )?; + ) + .with_context(|| format!("Failed to delete summary reembed skip markers for tree {tree_id}"))?; tx.execute( "DELETE FROM mem_tree_entity_index WHERE tree_id = ?1", params![tree_id], - )?; + ) + .with_context(|| format!("Failed to delete entity index for tree {tree_id}"))?; let removed_summaries = tx.execute( "DELETE FROM mem_tree_summaries WHERE tree_id = ?1", params![tree_id], - )?; + ) + .with_context(|| format!("Failed to delete summaries for tree {tree_id}"))?; tx.execute( "DELETE FROM mem_tree_buffers WHERE tree_id = ?1", params![tree_id], - )?; - tx.execute("DELETE FROM mem_tree_trees WHERE id = ?1", params![tree_id])?; + ) + .with_context(|| format!("Failed to delete buffers for tree {tree_id}"))?; + tx.execute("DELETE FROM mem_tree_trees WHERE id = ?1", params![tree_id]) + .with_context(|| format!("Failed to delete tree row {tree_id}"))?;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_store/trees/store.rs` around lines 102 - 124, The DELETE operations executed via tx.execute (targets: mem_tree_summary_embeddings, mem_tree_summary_reembed_skipped, mem_tree_entity_index, mem_tree_summaries, mem_tree_buffers, and mem_tree_trees) lack contextual error messages; update each tx.execute call in the removal sequence (including the one assigning removed_summaries) to chain .with_context(...) or the crate-equivalent context helper (matching how insert_tree_conn and update_tree_after_seal_tx do it) so failures include a descriptive message like "failed deleting <table> for tree_id" and include tree_id and any relevant IDs to aid diagnostics.
82-129: ⚡ Quick winAdd debug logging for observability.
This function performs destructive cascade deletion but lacks logging. Adding debug-level logs at entry and completion would aid troubleshooting and align with the coding guideline to "default to verbose diagnostics on new/changed flows using log/tracing at debug/trace levels."
📊 Suggested logging additions
pub(crate) fn delete_tree_cascade_tx( tx: &Transaction<'_>, tree_id: &str, ) -> Result<TreeCascadeDeletion> { + log::debug!( + "[tree::store] delete_tree_cascade_tx: tree_id={tree_id} starting cascade delete" + ); + // Collect the on-disk content-file paths BEFORE deleting the summary rows ... tx.execute("DELETE FROM mem_tree_trees WHERE id = ?1", params![tree_id])?; + log::debug!( + "[tree::store] delete_tree_cascade_tx: tree_id={tree_id} removed_summaries={removed_summaries} content_files={}", + content_paths.len() + ); Ok(TreeCascadeDeletion { removed_summaries, content_paths, }) }As per coding guidelines: "In Rust, default to verbose diagnostics on new/changed flows using
log/tracingatdebug/tracelevels with stable grep-friendly prefixes and correlation fields"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_store/trees/store.rs` around lines 82 - 129, Add debug-level tracing to the delete_tree_cascade_tx function to improve observability: log an entry message at the start of delete_tree_cascade_tx including the tree_id (and optionally the count of summaries/content_paths once collected), and log a completion message just before returning the TreeCascadeDeletion containing the removed_summaries and the number of content_paths (and/or the content_paths themselves if debug-level details are desired); use the crate's logging/tracing macros (e.g., debug! or tracing::debug!) with stable grep-friendly prefixes and include correlation fields like tree_id to make logs searchable and to avoid changing behavior of tx/TreeCascadeDeletion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/memory_store/trees/store.rs`:
- Around line 102-124: The DELETE operations executed via tx.execute (targets:
mem_tree_summary_embeddings, mem_tree_summary_reembed_skipped,
mem_tree_entity_index, mem_tree_summaries, mem_tree_buffers, and mem_tree_trees)
lack contextual error messages; update each tx.execute call in the removal
sequence (including the one assigning removed_summaries) to chain
.with_context(...) or the crate-equivalent context helper (matching how
insert_tree_conn and update_tree_after_seal_tx do it) so failures include a
descriptive message like "failed deleting <table> for tree_id" and include
tree_id and any relevant IDs to aid diagnostics.
- Around line 82-129: Add debug-level tracing to the delete_tree_cascade_tx
function to improve observability: log an entry message at the start of
delete_tree_cascade_tx including the tree_id (and optionally the count of
summaries/content_paths once collected), and log a completion message just
before returning the TreeCascadeDeletion containing the removed_summaries and
the number of content_paths (and/or the content_paths themselves if debug-level
details are desired); use the crate's logging/tracing macros (e.g., debug! or
tracing::debug!) with stable grep-friendly prefixes and include correlation
fields like tree_id to make logs searchable and to avoid changing behavior of
tx/TreeCascadeDeletion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 71d5dbf3-5e93-432a-8fa7-9cff87e661b2
📒 Files selected for processing (8)
src/openhuman/composio/ops.rssrc/openhuman/composio/ops_tests.rssrc/openhuman/memory_sources/mod.rssrc/openhuman/memory_sources/registry.rssrc/openhuman/memory_store/chunks/store.rssrc/openhuman/memory_store/chunks/store_tests.rssrc/openhuman/memory_store/trees/store.rstests/memory_sync_sources_raw_coverage_e2e.rs
✅ Files skipped from review due to trivial changes (1)
- src/openhuman/memory_sources/mod.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- src/openhuman/composio/ops.rs
- tests/memory_sync_sources_raw_coverage_e2e.rs
- src/openhuman/memory_sources/registry.rs
- src/openhuman/memory_store/chunks/store_tests.rs
- src/openhuman/composio/ops_tests.rs
- src/openhuman/memory_store/chunks/store.rs
Summary
[[memory_sources]]registry entry byconnection_idon disconnect (it was never removed, and reconnect mints a new id that stranded the old one).clear_memory=truedelete, cascade-delete the now-orphaned source summary tree (summaries + sidecars + entity-index + unsealed buffer + tree row) and remove the summaries' on-disk content files — previously only chunk rows were deleted, so "forgotten" content could still surface via summaries.Done(verified), never stuck pending.Problem
composio_delete_connectiondeleted the backend connection (and, withclear_memory, the chunk rows) but left behind: (1) the local source registry entry inconfig.toml, and (2) the entire summary tree built over the deleted chunks — its sealed summaries (which contain the summarised content), its unsealed buffer, and the summaries' on-disk content files. The first orphans config and breaks reconnect (newconnection_id⇒ duplicate/stranded entry); the second means a "forget my data" action leaves the summarised text both in the DB and on disk.Solution
memory_sources::registry::remove_composio_source_by_connection_id— connection-keyed prune (mirrorsupsert_composio_source), called best-effort fromcomposio_delete_connectionafter the backend delete (a config-save failure only warns; the connection is already gone).trees::store::delete_tree_cascade_tx— hard-deletes a tree + all dependents in one tx, mirroring the existing global/topic purge cascade order; returns the summarycontent_paths.delete_chunks_by_source_filter's already-computedorphaned_deleted_sourcesbranch: for each fully-orphaned source, look up its source tree byscope == source_idand cascade it in the same transaction as the chunk delete. Summary content files are folded into the same post-commitremove_chunk_content_filessweep.Submission Checklist
## RelatedImpact
clear_memory=truedisconnect now fully forgets a source (chunks + tree + summaries + on-disk files); every disconnect prunes the stale config entry. No migration; affects only the disconnect path.Related
AI Authored PR Metadata
Linear Issue
Commit & Branch
fix/composio-disconnect-orphan-cleanupc9154c4eValidation Run
format:check(no app/TS changed)pnpm typecheck(no TypeScript changed)cargo test --lib -- memory_store memory_tree memory_queue composio --test-threads=1→ 1691 passed, 0 failed (parallel runs show pre-existing process-global-race flakes; green single-threaded)cargo fmt -p openhuman --checkclean;cargo check --libclean (pre-existing warnings only)Validation Blocked
command:git pushpre-push hook (pnpm rust:check→lint:commands-tokens)error:lint:commands-tokens requires ripgrep—rgnot installed in this environment; the check scansapp/src/components/commands/(frontend, untouched here)impact:None on this change — bypassed with--no-verify; Rust lib compiled clean. CI re-runs the checks.Behavior Changes
clear_memory=truedisconnect cascades the source summary tree + on-disk summary files (not just chunk rows).config.toml; a "forget" disconnect leaves no recoverable summary content in the DB or vault.Summary by CodeRabbit
Bug Fixes
Tests