Skip to content

feat(memory): redesign sync flow with ingest_summary, graph improvements, audit log#3113

Merged
senamakel merged 38 commits into
tinyhumansai:mainfrom
senamakel:feat/memory-sources-defaults
Jun 1, 2026
Merged

feat(memory): redesign sync flow with ingest_summary, graph improvements, audit log#3113
senamakel merged 38 commits into
tinyhumansai:mainfrom
senamakel:feat/memory-sources-defaults

Conversation

@senamakel
Copy link
Copy Markdown
Member

@senamakel senamakel commented Jun 1, 2026

Summary

Major redesign of the memory sync pipeline and graph visualization:

  • New sync architecture: GitHub sync moved from memory_sources/sync.rs to memory_sync/sources/github.rs. Sources produce summaries directly (bypassing chunks) via memory_tree::ingest::ingest_summary — a single-shape function that stages .md files, inserts SummaryNode rows, manages L1 buffers, and cascades seals.
  • Auto-rebuild from raw: When a sync completes and raw archive files exist but the tree has no summaries, automatically rebuild the tree from disk.
  • Graceful embedding degradation: Extract handler and seal cascade no longer fail jobs on Voyage AI 429 rate limits — they log a warning and continue without vectors.
  • Per-source sync mutex: Pressing the sync button twice no longer runs the same source concurrently.
  • Retry failed jobs on sync: All terminally-failed pipeline jobs are reset to ready when any sync triggers.
  • Graph improvements: Synthetic orange source-root nodes, document leaf nodes for GitHub (commits/issues/PRs), orphan L-nodes link to their source root, higher levels render bigger, 10k node cap, WebGL-only (no SVG fallback), smooth animated transitions on data updates.
  • Sync audit log: JSONL-based audit trail recording every sync run's timestamp, items, tokens, cost, duration. Surfaced in a new "Sync History" panel on the Memory tab.
  • Wikilink fix: Child wikilinks in summary .md files now point to actual raw archive paths so Obsidian resolves them.

Test plan

  • cargo check passes (core + Tauri shell)
  • pnpm compile (tsc --noEmit) passes
  • 13 new unit tests pass (memory_tree::ingest, memory_sync::sources::github, memory_sync::sources::audit, memory_sync::sources::rebuild)
  • Manual testing: triggered Gmail and GitHub syncs from the app UI, verified graph renders with source roots and document leaves
  • N/A: E2E tests — these are memory-pipeline internals not covered by WDIO specs
  • N/A: i18n — new UI strings use t() with English fallbacks; full translations tracked separately

Submission Checklist

  • My code follows the project code style and conventions
  • I have performed a self-review of my code
  • N/A: I have added tests that prove my fix/feature works — 13 new unit tests added; E2E not applicable for backend pipeline changes
  • N/A: New and existing unit tests pass locally — pre-existing test suite not affected; new tests pass
  • N/A: I have updated relevant documentation — architecture docs unchanged; CLAUDE.md rules followed
  • My changes generate no new warnings
  • N/A: I have checked my code for security vulnerabilities — no user input handling changes; internal pipeline only

Summary by CodeRabbit

  • New Features

    • Per-source “Build/Flush” action with progress UI and real-time events
    • Sync History panel showing audit log, aggregates and run details
    • GitHub repo sync pipeline, on-disk raw rebuild, and a summary ingest API
    • Source nodes in the memory graph with distinct styling and hover info
  • Improvements

    • Smoother/incremental graph updates, updated toast styling, and new i18n strings
  • Chores

    • Helper to retry failed background jobs added

senamakel and others added 30 commits May 31, 2026 00:02
…ts, restyle toasts

Composio sources are still auto-registered but now default to
enabled=false so they don't burn tokens until the user explicitly
enables them. GitHub item limits lowered from 2000→1000 per type.
Toast notifications restyled with white card + colored left border
accent for a calmer look.
…, granular progress

Commits now sync via bare git clone + `git log` instead of paginating
the GitHub REST API. Falls back to API on clone failure. Issues/PRs
stay on gh CLI/API as before.

Chunk store paths for GitHub items are now repo-scoped
(`document/github-org-repo/`) instead of per-item
(`document/github-org-repo-commit-<sha>/`), eliminating thousands of
directories in Obsidian.

Sync progress events now emit per-item for GitHub sources (vs every 5)
and include the new-item count in the message.
…h issue/PR

The paginated /issues and /pulls endpoints already return full body,
state, labels, etc. Previously we discarded this at list time and
re-fetched each item individually in read_item — N extra API calls
for no reason. Now list_issues/list_prs cache the full GhIssue/GhPr
in a module-level HashMap, and read_issue/read_pr consume from cache
first (only falling back to API on cache miss). Comments are still
fetched per-item since they're not in the list response.

For 1000 issues this cuts API calls from ~2010 to ~1010.
…s share one tree

Previously each GitHub commit/issue/PR got its own source tree because
the tree scope was the per-item source_id. Now when path_scope is set
(e.g. repo-scoped for GitHub), it's used as the tree scope instead.
All items from one repo land in a single source tree like
`source-github-tinyhumansai-openhuman` instead of thousands of
per-commit trees.
…ket.IO

Adds three Socket.IO bridges from the Rust event bus:
- memory:sync_stage — per-source sync progress (stage, detail, percent)
- memory:tree_progress — tree node seal events (namespace, level)
- memory:tree_completed — tree rebuild done (triggers graph refresh)

Frontend changes:
- socketService: listen for push events, dispatch window CustomEvents
- MemorySourcesRegistry: real-time progress bars with percentages per
  source, driven by push events instead of polling-only
- MemoryWorkspace: auto-refresh the graph when a tree build or source
  sync completes (2-3s debounce for worker drain)
Remove the SUMMARY_FANOUT (10-item) gate from L0 sealing so small-token
items like commit messages (~20-50 tokens) accumulate into large batches
before sealing. A 50K token budget naturally batches hundreds of commits
into one merged summary. The time-based flush catches low-volume sources.
…line

Adds MemoryTreeBuildProgress domain event with phase/step/tree_scope/
level/item_count/detail fields. Emitted at key points:

- extract: scoring, admitted/dropped
- append: buffering (leaf → tree L0)
- seal: summarising (N inputs → L{n+1}), embedding, persisting

Sub-phases inside seal_one_level (summarising, embedding, persisting)
give the frontend visibility into the LLM call, embed call, and disk
write separately.

Bridged to Socket.IO as memory:build_progress and forwarded to the
frontend via openhuman:memory-build-progress CustomEvent.
…tely

Adds memory_tree_flush_source RPC that calls force_flush_tree directly
(bypassing the job queue) with a per-scope mutex so concurrent clicks
are serialized. Each source row in the UI now has a Build button next
to Sync.
…e_ids

Old chunks ingested before the path_scope field was added have
path_scope=None and source_id=github:org/repo:commit:sha. The handler
now strips the item suffix (commit:sha, issue:42, pr:99) and derives
github:org/repo as the tree scope, so even old data routes to one
shared tree per repo on re-extraction.
Issues and PRs now use the paginated list data directly (body, state,
labels, author, dates all come from the /issues?per_page=100 and
/pulls?per_page=100 responses). No more individual fetch per item, no
more per-item comments fetch. For 1000 items this drops API calls from
~2010 to ~20 (just the paginated pages).

Sync loop now processes items with bounded concurrency (10 at a time)
via futures::stream::for_each_concurrent instead of sequentially.
…tirely

GitHub items now bypass the extract_chunk → append_buffer job queue.
Instead, all items are read from the paginated API (already cached),
pushed directly into the L0 buffer in bulk, then force-sealed in one
shot. This means:

- 1000 commits = 10 API pages + 1 LLM summary call
- No per-chunk scoring, no per-chunk embedding, no per-chunk job queue
- Entity extraction and embedding happen at seal time on the summary

The per-item queue path is preserved for non-GitHub sources (Composio,
folder, RSS, etc.) where per-chunk extraction is valuable.
The buffer/seal path requires chunks to exist in mem_tree_chunks for
hydrate_inputs to find them. The bulk path was pushing item_ids into
the buffer that didn't exist as chunks, causing "refused to seal empty
buffer". Fix: build SummaryInputs directly from the read content and
call summarise() in one shot. No buffer, no seal, no hydrate — just
one LLM call with all the items as input.
The bulk path was calling summarise() but never persisting anything.
Now it:
1. Writes raw archive files per item during the read phase
   (raw/github-com-org-repo/{commits,issues,prs}/<ts>_<uid>.md)
2. Writes the summary as a .md file with YAML frontmatter in
   wiki/summaries/<repo-slug>/bulk-summary-<timestamp>.md
…mary

Move GitHub sync logic from memory_sources/sync.rs into
memory_sync/sources/github.rs where sync code belongs. Introduce
memory_tree::ingest::ingest_summary as the single-shape entry point
for landing pre-built summaries into a tree — stages the .md file to
wiki/summaries/source-<slug>/L1/, inserts the SummaryNode row, appends
to the L1 buffer, and cascades seals when the buffer crosses
SUMMARY_FANOUT. Embeddings are temporarily disabled.

Child wikilinks now point to actual raw archive paths so Obsidian can
resolve them.
- Add per-source mutex to sync_source() so pressing the sync button
  twice doesn't run the same source concurrently.
- Add sync audit log (workspace/memory_tree/sync_audit.jsonl) that
  records each sync run: items fetched, batches, input/output tokens,
  estimated cost USD, duration, success/error.
- GitHub sync now reports token usage and cost in the completion event.
When a sync completes, check if the source has raw archive files on
disk but no tree summaries (max_level == 0). If so, automatically
rebuild: read all .md files from raw/<source_slug>/, batch into 50k
token groups, summarise each batch, and ingest into the tree.

Fixes gmail (and other composio sources) where raw emails were written
to disk but the tree was never built due to interrupted sync or
missing worker queue processing.
Add retry_all_failed() to the job queue store — resets all terminally
failed jobs back to 'ready' with attempts=0 so the worker pool
re-processes them.

Called at the start of every sync_source dispatch, so pressing the
sync button in the UI clears the "225 failed jobs" state and lets the
pipeline continue processing through all documents.
Pre-existing modifications to ingest pipeline, queue handlers, github
reader, document canonicalizer, bucket seal, and e2e test coverage
that were part of the memory-sources-defaults feature branch.
…ng job

The extract_chunk handler was propagating embedding failures (Voyage AI
429 rate limit) as job failures, causing 225+ jobs to exhaust retries
and go terminal. Now embedding errors are logged and the job continues
without a vector — the chunk is still scored, admitted, and written to
the tree. Embeddings can be backfilled later when the rate limit clears.
The bucket_seal cascade was aborting the entire seal when the embedder
failed (Voyage 429 rate limit), leaving the buffer stuck and the seal
job retrying forever. Now embedding errors during seal are logged and
the summary is persisted without a vector — same graceful degradation
as the extract handler fix.
Instead of destroying and recreating the Pixi WebGL context on every
data change, diff the node/edge arrays in-place:
- Existing nodes keep their positions and velocity
- New nodes spawn near their parent and animate into place
- The simulation reheats gently (alpha=0.3) so new nodes settle
  without disrupting the existing layout
- Camera (pan/zoom) is preserved across updates

Only a mode flip (tree ↔ contacts) triggers a full remount since the
edge semantics change. Data refreshes from sync events and manual
reloads now produce a smooth animated transition.
- Add synthetic orange "source" root nodes (one per tree scope, e.g.
  "GitHub · tinyhumansai/openhuman", "Gmail · stevent95@gmail.com")
- All parentless L-nodes and orphan chunks attach to their source root,
  giving the graph a proper tree shape instead of floating clusters
- GitHub L1 summaries expand their child_ids into document leaf nodes
  (commit/issue/PR labels) so the full depth is visible:
  source root → L2 → L1 → documents
- Frontend: source nodes render at radius 14 in orange (#F97316) with
  glow, appear in the legend, and show a distinct tooltip
- Summary node radius now grows with level: L1=7.5, L2=10, L3=12.5
  (was shrinking). Source roots are 16px. Chunks stay small at 3px.
- Only orphan L-nodes attach to source roots. Orphan chunks float
  freely as before — they connect to their parent_summary_id when
  sealed, no artificial source-root link.
Add openhuman.memory_sources_sync_audit_log RPC that reads the
sync_audit.jsonl file and returns entries most-recent-first.

New SyncAuditPanel component on the Memory tab shows:
- Summary row: total runs, total tokens in/out, total cost
- Table: when, source, items, tokens, cost, duration, success/fail
- Scrollable, most recent at top
senamakel added 4 commits May 31, 2026 22:19
Move audit logging to the sync_source dispatcher level so every
source kind (Composio, Folder, RSS, WebPage) gets an audit entry
on success or failure. GitHub still writes its own detailed entry
with token breakdowns; other kinds get items + duration + error.
@senamakel senamakel requested a review from a team June 1, 2026 02:44
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 599d042f-6b3a-4110-8f02-dc6e0835f846

📥 Commits

Reviewing files that changed from the base of the PR and between 9299a5f and d188c06.

📒 Files selected for processing (1)
  • app/src/components/intelligence/MemoryGraph.tsx

📝 Walkthrough

Walkthrough

Adds per-source path scoping, sync audit log and UI, GitHub sync/rebuild pipelines, source-root-aware tree export and immediate source flush RPC, event-bus → Socket.IO progress broadcasts, frontend build/progress UI and i18n, graph source-node visualization, and summary ingest with L1 buffering and cascading seals.

Changes

Memory source sync and tree integration

Layer / File(s) Summary
All PR changes (single checkpoint)
src/openhuman/**, app/src/components/intelligence/**, app/src/services/socketService.ts, app/src/utils/tauriCommands/memoryTree.ts, app/src/lib/i18n/*, tests/*
Comprehensive changes implementing per-source path scoping, canonicalisation threading, sync audit JSONL storage + RPC, event-bus → Socket.IO memory:* broadcasts, frontend build/flush UI and progress tracking, SyncAuditPanel, SOURCE node support in tree export and visualization (layout/tooltip/legend), Pixi renderer in-place updates, GitHub listing/commit git-cache and syncing pipeline, rebuild-from-raw, immediate source flush RPC, summary ingest API with L1 buffering and cascade seals, job progress events, and many test/data fixture updates.

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues:

Suggested reviewers:

  • sanil-23
  • graycyrus

"I hopped through logs and code with delight,
Flushed trees, stitched audits, and bathed charts in light.
Sources now orange, progress bars hum—
A rabbit’s small cheer for the work you’ve done. 🐇✨"

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. working A PR that is being worked on by the team. labels Jun 1, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/memory_tree/tree/bucket_seal.rs (1)

302-333: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale documentation: L0 seal gate no longer includes sibling count fallback.

The doc comment at lines 304-309 states L0 buffers gate on "either token_sum >= INPUT_TOKEN_BUDGET ... or sibling count >= SUMMARY_FANOUT", but the implementation at line 329 only checks token budget:

buf.token_sum >= INPUT_TOKEN_BUDGET as i64

The sibling count fallback has been removed (per AI summary and PR objectives), but the documentation wasn't updated accordingly.

📝 Proposed doc fix
 /// Level-aware seal gate.
 ///
-/// L0 buffers gate on **either** `token_sum >= INPUT_TOKEN_BUDGET`
-/// (so the summariser's input stays bounded) **or** sibling count
-/// `>= SUMMARY_FANOUT` (so leaves form predictably for sources whose
-/// chunks are individually small — without the count fallback,
-/// hundreds of tiny emails can sit unsealed waiting to hit 50k
-/// tokens).
+/// L0 buffers gate on `token_sum >= INPUT_TOKEN_BUDGET` only.
+/// Token-only gating lets small-token items (e.g. commit messages at
+/// ~20-50 tokens each) accumulate into large batches so summaries
+/// cover meaningful spans of activity.
 ///
 /// Time-based sealing for low-volume sources is handled separately
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_tree/tree/bucket_seal.rs` around lines 302 - 333, The
doc comment for should_seal is stale: it claims L0 buffers seal on either
token_sum >= INPUT_TOKEN_BUDGET or sibling count >= SUMMARY_FANOUT, but the
implementation in should_seal (and the L0 branch checking buf.token_sum) now
only uses the token budget; update the function-level documentation to reflect
that L0 sealing is solely token-budget based (remove mention of the sibling
count fallback) and keep descriptions for L≥1 sealing by sibling count
referencing buf.item_ids and SUMMARY_FANOUT so comments match the behavior of
should_seal and the Buffer semantics.
🧹 Nitpick comments (3)
src/openhuman/memory_tree/ingest.rs (1)

280-306: 💤 Low value

Misleading test name.

The test is named ingest_summary_is_idempotent_on_buffer, but the assertion at line 305 (assert_eq!(buf.item_ids.len(), 3)) and the comment confirm that each call adds a new entry. This is the opposite of idempotent behavior. The buffer deduplication only prevents duplicate summary_ids, but ingest_summary generates a new ID on each call.

Consider renaming to ingest_summary_adds_unique_entries_to_buffer or similar.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_tree/ingest.rs` around lines 280 - 306, The test
function ingest_summary_is_idempotent_on_buffer is misnamed because
ingest_summary generates a new summary_id each call and the assertion checks
that three unique entries were added (buf.item_ids.len() == 3); rename the test
to something like ingest_summary_adds_unique_entries_to_buffer (and update the
inline comment) to reflect actual behavior, and ensure references to the test
name (the #[tokio::test] function identifier) are updated so it compiles; no
logic changes to ingest_summary or the buf.item_ids assertion are needed.
src/openhuman/memory_sources/sync.rs (1)

309-315: ⚖️ Poor tradeoff

Consider preserving source item's modification time if available.

modified_at is set to chrono::Utc::now() rather than the item's actual modification time. This loses temporal information that could be valuable for time-based queries and chronological ordering. If ItemContent or the reader can provide the original timestamp, consider threading it through.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_sources/sync.rs` around lines 309 - 315, The
DocumentInput is always stamping modified_at with chrono::Utc::now(), losing the
source item's original timestamp; update the construction in sync.rs to use the
item's original modification time if available (e.g., a field on ItemContent or
Item returned by the reader) and fall back to chrono::Utc::now() only when that
timestamp is missing or invalid—look for the DocumentInput creation, the
modified_at field, and the source variables content / item / reader to thread
through the original timestamp (e.g., content.modified_at or item.modified_at ->
use that value converted to chrono::DateTime<Utc> if needed).
src/openhuman/memory_store/content/mod.rs (1)

89-97: ⚡ Quick win

Add a regression test for non-None path_scope path derivation.

Line 89 changed path routing semantics, but current tests only cover path_scope: None fallback. Please add one test that seeds Metadata.path_scope = Some("...") and asserts content_path/file location uses that scope rather than source_id.

Proposed test addition
 #[test]
 fn stage_chunks_is_idempotent() {
@@
 }
+
+#[test]
+fn stage_chunks_uses_path_scope_when_present() {
+    let dir = TempDir::new().unwrap();
+    let mut c = sample_chunk(0);
+    c.metadata.path_scope = Some("github:owner/repo".into());
+    let staged = stage_chunks(dir.path(), &[c.clone()]).unwrap();
+    assert_eq!(staged.len(), 1);
+    let expected = paths::chunk_abs_path(
+        dir.path(),
+        c.metadata.source_kind.as_str(),
+        c.metadata.path_scope.as_deref().unwrap(),
+        &c.id,
+    );
+    assert!(expected.exists(), "file must exist: {}", expected.display());
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_store/content/mod.rs` around lines 89 - 97, Add a
regression test that constructs a Chunk/Metadata where Metadata.path_scope =
Some("custom_scope") (and source_id set to a different value), then exercise the
path derivation logic that uses chunk_rel_path/chunk_abs_path (the same code
path that produces content_path) and assert the resulting rel and absolute paths
include "custom_scope" and not the source_id; place the test alongside existing
tests for content path derivation, instantiate content_root/source_kind
identically to other tests, and ensure the test fails if path_scope fallback to
source_id is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/components/intelligence/MemorySourcesRegistry.tsx`:
- Around line 257-258: The en.ts i18n file defines memorySources.build.* keys
(memorySources.build.title, .building, .successTitle, .sealsMessage,
.failedTitle) used by MemorySourcesRegistry (e.g., props isBuilding/progress and
toast UI) but those keys are missing in every other locale; update each locale
file (ar.ts, bn.ts, de.ts, es.ts, fr.ts, hi.ts, id.ts, it.ts, ko.ts, pl.ts,
pt.ts, ru.ts, zh-CN.ts) to include the same memorySources.build.* keys with
correct translations so the build UI/toasts render in each locale (also ensure
the same keys are added for the other occurrences noted in the diff ranges
around 281–286, 331–355, and 388–402); keep keys identical to en.ts and provide
appropriate translated strings.

In `@app/src/components/intelligence/MemoryWorkspace.tsx`:
- Line 73: The useEffect inside MemoryWorkspace calls setError(null)
synchronously which violates the react-hooks/set-state-in-effect rule; locate
the effect where setError is invoked and change the immediate call to schedule
the state update asynchronously (for example wrap it in queueMicrotask(() =>
setError(null)) or setTimeout(() => setError(null), 0)) so the state update is
not performed synchronously in the effect body.
- Around line 96-112: The effect schedules setTimeouts in onTreeDone and
onSyncDone but never clears them, so save each timeout ID (e.g., treeTimeoutId
and syncTimeoutId) when calling setTimeout inside onTreeDone and onSyncDone, and
call clearTimeout on those IDs in the useEffect cleanup alongside removing the
window listeners; store IDs in variables or refs (use ReturnType<typeof
setTimeout> or number) so they can be cleared and nulled to avoid stale
callbacks that call setGraphVersion after unmount/remount.
- Line 321: The JSX currently passes a hardcoded fallback string to the t(...)
call; remove the literal fallback and replace it with the proper translation key
only (e.g., t('sync.auditTitle') instead of t('sync.auditTitle', 'Sync
History')), then add the corresponding entry to the English locale file
(app/src/lib/i18n/en.ts) under the same key used by the component so the text is
resolved via useT()/t without a hardcoded fallback; update MemoryWorkspace.tsx
where t is invoked and ensure key parity in the locale file.

In `@app/src/components/intelligence/PixiGraph.tsx`:
- Around line 90-97: The effect currently lists nodes and edges in its
dependency array and its cleanup clears handleRef and mountedModeRef (and
destroys pending) on every data update, which prevents the in-place fast path in
updateGraph from working; change the effect so it only tears down the graph when
truly needed (e.g., on unmount or when mode changes) by removing nodes and edges
from the dependency array (use [mode] or similar) and stop nulling
handleRef/mountedModeRef in the cleanup that runs for data updates—only destroy
the pending handle and clear refs when the component unmounts or when mode
switches so updateGraph can detect an existing handleRef and use the fast-path.

In `@app/src/utils/tauriCommands/memoryTree.ts`:
- Around line 567-570: The interface FlushSourceResponse is not exported but
should be for consistency and to allow consumers to type-check
memoryTreeFlushSource returns; update the declaration of FlushSourceResponse to
export it (export interface FlushSourceResponse { tree_scope: string;
seals_fired: number; }) and ensure any imports/usages of FlushSourceResponse in
files that call memoryTreeFlushSource reference the exported symbol.

In `@src/openhuman/memory_sources/sync.rs`:
- Around line 217-221: The removal of source_id_for_panic from ACTIVE_SYNCS uses
if let Ok(...) which skips removal when the mutex is poisoned; change it to
recover a poisoned mutex the same way as earlier (use
ACTIVE_SYNCS.lock().unwrap_or_else(|e| e.into_inner())) and then
remove(&source_id_for_panic) so the entry is always cleared even if the mutex
was poisoned; update the block around ACTIVE_SYNCS and source_id_for_panic to
consistently use the unwrap_or_else(|e| e.into_inner()) pattern.

In `@src/openhuman/memory_tree/ingest.rs`:
- Around line 39-46: The struct exposes parallel arrays child_labels and
child_basenames but there is no length validation, so add a check in the
constructor/ingest function (ingest_summary) that validates child_basenames is
either empty or has the same length as child_labels; if not, return an error (or
Result::Err) with a clear message rather than allowing downstream code (e.g.,
compose_summary_md) to iterate mismatched vectors and panic. Locate the code
that builds this struct in ingest_summary, perform the length check on
child_labels.len() vs child_basenames.len(), and fail-fast with a descriptive
error or normalize by converting mismatched entries to None if you prefer
consistent semantics.

In `@src/openhuman/memory/read_rpc.rs`:
- Around line 1892-1910: The cleanup that removes the scope from ACTIVE must run
even if the spawn/await chain returns early with ?, so create an RAII guard
(e.g., ActiveScopeGuard) just after you acquire scope (before calling
tokio::spawn) which captures a clone of scope and on Drop removes it from
ACTIVE; instantiate that guard (so it will run on all return paths including
errors from tokio::spawn, get_or_create_source_tree, force_flush_tree, or the
map_err ? chain), then keep the rest of the tokio::spawn block and await logic
unchanged — implement ActiveScopeGuard::drop to perform the same removal logic
currently in the block that calls ACTIVE.lock().unwrap_or_else(|e|
e.into_inner()).remove(&scope).

---

Outside diff comments:
In `@src/openhuman/memory_tree/tree/bucket_seal.rs`:
- Around line 302-333: The doc comment for should_seal is stale: it claims L0
buffers seal on either token_sum >= INPUT_TOKEN_BUDGET or sibling count >=
SUMMARY_FANOUT, but the implementation in should_seal (and the L0 branch
checking buf.token_sum) now only uses the token budget; update the
function-level documentation to reflect that L0 sealing is solely token-budget
based (remove mention of the sibling count fallback) and keep descriptions for
L≥1 sealing by sibling count referencing buf.item_ids and SUMMARY_FANOUT so
comments match the behavior of should_seal and the Buffer semantics.

---

Nitpick comments:
In `@src/openhuman/memory_sources/sync.rs`:
- Around line 309-315: The DocumentInput is always stamping modified_at with
chrono::Utc::now(), losing the source item's original timestamp; update the
construction in sync.rs to use the item's original modification time if
available (e.g., a field on ItemContent or Item returned by the reader) and fall
back to chrono::Utc::now() only when that timestamp is missing or invalid—look
for the DocumentInput creation, the modified_at field, and the source variables
content / item / reader to thread through the original timestamp (e.g.,
content.modified_at or item.modified_at -> use that value converted to
chrono::DateTime<Utc> if needed).

In `@src/openhuman/memory_store/content/mod.rs`:
- Around line 89-97: Add a regression test that constructs a Chunk/Metadata
where Metadata.path_scope = Some("custom_scope") (and source_id set to a
different value), then exercise the path derivation logic that uses
chunk_rel_path/chunk_abs_path (the same code path that produces content_path)
and assert the resulting rel and absolute paths include "custom_scope" and not
the source_id; place the test alongside existing tests for content path
derivation, instantiate content_root/source_kind identically to other tests, and
ensure the test fails if path_scope fallback to source_id is used.

In `@src/openhuman/memory_tree/ingest.rs`:
- Around line 280-306: The test function ingest_summary_is_idempotent_on_buffer
is misnamed because ingest_summary generates a new summary_id each call and the
assertion checks that three unique entries were added (buf.item_ids.len() == 3);
rename the test to something like ingest_summary_adds_unique_entries_to_buffer
(and update the inline comment) to reflect actual behavior, and ensure
references to the test name (the #[tokio::test] function identifier) are updated
so it compiles; no logic changes to ingest_summary or the buf.item_ids assertion
are needed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5278b283-d271-4898-84c6-4eeb5882c31a

📥 Commits

Reviewing files that changed from the base of the PR and between 721ebe9 and a499c15.

📒 Files selected for processing (59)
  • app/src/components/intelligence/MemoryGraph.tsx
  • app/src/components/intelligence/MemorySourcesRegistry.tsx
  • app/src/components/intelligence/MemoryWorkspace.tsx
  • app/src/components/intelligence/PixiGraph.tsx
  • app/src/components/intelligence/SyncAuditPanel.tsx
  • app/src/components/intelligence/Toast.tsx
  • app/src/components/intelligence/memoryGraphLayout.ts
  • app/src/components/intelligence/pixiGraphRenderer.ts
  • app/src/lib/i18n/en.ts
  • app/src/services/socketService.ts
  • app/src/utils/tauriCommands/memoryTree.ts
  • src/core/event_bus/events.rs
  • src/core/socketio.rs
  • src/openhuman/channels/controllers/ops_tests.rs
  • src/openhuman/composio/ops_tests.rs
  • src/openhuman/memory/ingest_pipeline.rs
  • src/openhuman/memory/read_rpc.rs
  • src/openhuman/memory/schema.rs
  • src/openhuman/memory_queue/handlers/mod.rs
  • src/openhuman/memory_queue/handlers/mod_tests.rs
  • src/openhuman/memory_queue/store.rs
  • src/openhuman/memory_queue/worker.rs
  • src/openhuman/memory_sources/readers/github.rs
  • src/openhuman/memory_sources/registry.rs
  • src/openhuman/memory_sources/rpc.rs
  • src/openhuman/memory_sources/schemas.rs
  • src/openhuman/memory_sources/sync.rs
  • src/openhuman/memory_sources/types.rs
  • src/openhuman/memory_store/chunks/store.rs
  • src/openhuman/memory_store/chunks/store_tests.rs
  • src/openhuman/memory_store/chunks/types.rs
  • src/openhuman/memory_store/content/compose.rs
  • src/openhuman/memory_store/content/mod.rs
  • src/openhuman/memory_store/content/read.rs
  • src/openhuman/memory_store/content/tags.rs
  • src/openhuman/memory_store/retrieval/mod.rs
  • src/openhuman/memory_store/traits.rs
  • src/openhuman/memory_sync/canonicalize/chat.rs
  • src/openhuman/memory_sync/canonicalize/document.rs
  • src/openhuman/memory_sync/canonicalize/email.rs
  • src/openhuman/memory_sync/mod.rs
  • src/openhuman/memory_sync/sources/audit.rs
  • src/openhuman/memory_sync/sources/github.rs
  • src/openhuman/memory_sync/sources/mod.rs
  • src/openhuman/memory_sync/sources/rebuild.rs
  • src/openhuman/memory_tree/ingest.rs
  • src/openhuman/memory_tree/mod.rs
  • src/openhuman/memory_tree/retrieval/drill_down.rs
  • src/openhuman/memory_tree/retrieval/fetch.rs
  • src/openhuman/memory_tree/retrieval/integration_tests.rs
  • src/openhuman/memory_tree/retrieval/rpc.rs
  • src/openhuman/memory_tree/retrieval/source.rs
  • src/openhuman/memory_tree/tree/bucket_seal.rs
  • src/openhuman/memory_tree/tree/bucket_seal_tests.rs
  • src/openhuman/memory_tree/tree/flush.rs
  • tests/memory_raw_coverage_e2e.rs
  • tests/memory_threads_raw_coverage_e2e.rs
  • tests/memory_tree_sync_deep_raw_coverage_e2e.rs
  • tests/memory_tree_sync_raw_coverage_e2e.rs

Comment thread app/src/components/intelligence/MemorySourcesRegistry.tsx
console.debug('[ui-flow][memory-workspace] graph load: entry mode=%s', mode);
console.debug('[ui-flow][memory-workspace] graph load: entry mode=%s v=%d', mode, graphVersion);
let cancelled = false;
setError(null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid synchronous state updates inside useEffect

Line 73 calls setError(null) directly in the effect body. This pattern is flagged by the repo’s react-hooks/set-state-in-effect rule.

Based on learnings: “do not perform synchronous setState directly inside useEffect bodies… react-hooks/set-state-in-effect is enforced in this codebase.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/intelligence/MemoryWorkspace.tsx` at line 73, The
useEffect inside MemoryWorkspace calls setError(null) synchronously which
violates the react-hooks/set-state-in-effect rule; locate the effect where
setError is invoked and change the immediate call to schedule the state update
asynchronously (for example wrap it in queueMicrotask(() => setError(null)) or
setTimeout(() => setError(null), 0)) so the state update is not performed
synchronously in the effect body.

Comment on lines +96 to +112
useEffect(() => {
const onTreeDone = () => {
setTimeout(() => setGraphVersion(v => v + 1), 2000);
};
const onSyncDone = (e: Event) => {
const data = (e as CustomEvent).detail as { stage?: string } | null;
if (data?.stage === 'completed') {
setTimeout(() => setGraphVersion(v => v + 1), 3000);
}
};
window.addEventListener('openhuman:memory-tree-completed', onTreeDone);
window.addEventListener('openhuman:memory-sync-stage', onSyncDone);
return () => {
window.removeEventListener('openhuman:memory-tree-completed', onTreeDone);
window.removeEventListener('openhuman:memory-sync-stage', onSyncDone);
};
}, []);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clear scheduled refresh timers in effect cleanup

The effect unregisters listeners but doesn’t clear pending setTimeout callbacks. Late callbacks can still fire after unmount/remount and trigger unnecessary graph reloads.

Suggested fix
  useEffect(() => {
+    const timerIds: number[] = [];
     const onTreeDone = () => {
-      setTimeout(() => setGraphVersion(v => v + 1), 2000);
+      const id = window.setTimeout(() => setGraphVersion(v => v + 1), 2000);
+      timerIds.push(id);
     };
     const onSyncDone = (e: Event) => {
       const data = (e as CustomEvent).detail as { stage?: string } | null;
       if (data?.stage === 'completed') {
-        setTimeout(() => setGraphVersion(v => v + 1), 3000);
+        const id = window.setTimeout(() => setGraphVersion(v => v + 1), 3000);
+        timerIds.push(id);
       }
     };
     window.addEventListener('openhuman:memory-tree-completed', onTreeDone);
     window.addEventListener('openhuman:memory-sync-stage', onSyncDone);
     return () => {
       window.removeEventListener('openhuman:memory-tree-completed', onTreeDone);
       window.removeEventListener('openhuman:memory-sync-stage', onSyncDone);
+      timerIds.forEach(id => window.clearTimeout(id));
     };
   }, []);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/intelligence/MemoryWorkspace.tsx` around lines 96 - 112,
The effect schedules setTimeouts in onTreeDone and onSyncDone but never clears
them, so save each timeout ID (e.g., treeTimeoutId and syncTimeoutId) when
calling setTimeout inside onTreeDone and onSyncDone, and call clearTimeout on
those IDs in the useEffect cleanup alongside removing the window listeners;
store IDs in variables or refs (use ReturnType<typeof setTimeout> or number) so
they can be cleared and nulled to avoid stale callbacks that call
setGraphVersion after unmount/remount.


<div className="rounded-lg border border-stone-100 dark:border-neutral-800 bg-white dark:bg-neutral-900 p-4">
<h3 className="mb-2 text-sm font-medium text-stone-700 dark:text-neutral-200">
{t('sync.auditTitle', 'Sync History')}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove fallback literal and use a real i18n key path

Line 321 uses a hardcoded fallback ('Sync History'). In app/src, user-visible copy should resolve from translation keys only, with key parity managed in locale files.

Suggested fix
-          {t('sync.auditTitle', 'Sync History')}
+          {t('sync.auditTitle')}
// app/src/lib/i18n/en.ts
+  'sync.auditTitle': 'Sync History',

As per coding guidelines: “Every user-visible string in app/src/** ... must go through useT() ... and be added to app/src/lib/i18n/en.ts”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/intelligence/MemoryWorkspace.tsx` at line 321, The JSX
currently passes a hardcoded fallback string to the t(...) call; remove the
literal fallback and replace it with the proper translation key only (e.g.,
t('sync.auditTitle') instead of t('sync.auditTitle', 'Sync History')), then add
the corresponding entry to the English locale file (app/src/lib/i18n/en.ts)
under the same key used by the component so the text is resolved via useT()/t
without a hardcoded fallback; update MemoryWorkspace.tsx where t is invoked and
ensure key parity in the locale file.

Comment on lines 90 to 97
return () => {
cancelled = true;
handleRef.current = null;
mountedModeRef.current = null;
void pending.then(handle => handle?.destroy());
};
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [nodes, edges, mode]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Effect cleanup currently defeats the in-place updateGraph path.

Because this effect depends on nodes/edges, the cleanup runs before each rerun and resets handleRef/mountedModeRef (Line 92-94). That makes the fast-path check on Line 58 fail on every data update, forcing full remounts.

Suggested direction
-  useEffect(() => {
+  useEffect(() => {
     const host = hostRef.current;
     if (!host) return;
@@
-    return () => {
-      cancelled = true;
-      handleRef.current = null;
-      mountedModeRef.current = null;
-      void pending.then(handle => handle?.destroy());
-    };
+    return () => {
+      cancelled = true;
+      void pending.then(handle => handle?.destroy());
+    };
   }, [nodes, edges, mode]);
+
+  useEffect(() => {
+    return () => {
+      handleRef.current?.destroy();
+      handleRef.current = null;
+      mountedModeRef.current = null;
+    };
+  }, []);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return () => {
cancelled = true;
handleRef.current = null;
mountedModeRef.current = null;
void pending.then(handle => handle?.destroy());
};
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [nodes, edges, mode]);
return () => {
cancelled = true;
void pending.then(handle => handle?.destroy());
};
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [nodes, edges, mode]);
useEffect(() => {
return () => {
handleRef.current?.destroy();
handleRef.current = null;
mountedModeRef.current = null;
};
}, []);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/intelligence/PixiGraph.tsx` around lines 90 - 97, The
effect currently lists nodes and edges in its dependency array and its cleanup
clears handleRef and mountedModeRef (and destroys pending) on every data update,
which prevents the in-place fast path in updateGraph from working; change the
effect so it only tears down the graph when truly needed (e.g., on unmount or
when mode changes) by removing nodes and edges from the dependency array (use
[mode] or similar) and stop nulling handleRef/mountedModeRef in the cleanup that
runs for data updates—only destroy the pending handle and clear refs when the
component unmounts or when mode switches so updateGraph can detect an existing
handleRef and use the fast-path.

Comment on lines +567 to +570
interface FlushSourceResponse {
tree_scope: string;
seals_fired: number;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Export FlushSourceResponse for consistency.

Every other response interface in this file is exported (DeleteChunkResponse, WipeAllResponse, ResetTreeResponse, FlushNowResponse, ObsidianVaultStatus, etc.), but FlushSourceResponse is not. Export it so consumers can type-check the return value of memoryTreeFlushSource.

📤 Proposed fix
-interface FlushSourceResponse {
+export interface FlushSourceResponse {
   tree_scope: string;
   seals_fired: number;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
interface FlushSourceResponse {
tree_scope: string;
seals_fired: number;
}
export interface FlushSourceResponse {
tree_scope: string;
seals_fired: number;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/utils/tauriCommands/memoryTree.ts` around lines 567 - 570, The
interface FlushSourceResponse is not exported but should be for consistency and
to allow consumers to type-check memoryTreeFlushSource returns; update the
declaration of FlushSourceResponse to export it (export interface
FlushSourceResponse { tree_scope: string; seals_fired: number; }) and ensure any
imports/usages of FlushSourceResponse in files that call memoryTreeFlushSource
reference the exported symbol.

Comment on lines +217 to +221

// Release the per-source lock so future syncs can proceed.
if let Ok(mut active) = ACTIVE_SYNCS.lock() {
active.remove(&source_id_for_panic);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Inconsistent mutex recovery can permanently block syncs for a source.

Line 44 uses unwrap_or_else(|e| e.into_inner()) to recover from a poisoned mutex, but line 219 uses if let Ok(...) which silently skips removal if poisoned. If the inner task panics, the mutex becomes poisoned and this source remains in ACTIVE_SYNCS forever, blocking all future syncs for it until restart.

Proposed fix
         // Release the per-source lock so future syncs can proceed.
-        if let Ok(mut active) = ACTIVE_SYNCS.lock() {
-            active.remove(&source_id_for_panic);
-        }
+        let mut active = ACTIVE_SYNCS.lock().unwrap_or_else(|e| e.into_inner());
+        active.remove(&source_id_for_panic);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Release the per-source lock so future syncs can proceed.
if let Ok(mut active) = ACTIVE_SYNCS.lock() {
active.remove(&source_id_for_panic);
}
// Release the per-source lock so future syncs can proceed.
let mut active = ACTIVE_SYNCS.lock().unwrap_or_else(|e| e.into_inner());
active.remove(&source_id_for_panic);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_sources/sync.rs` around lines 217 - 221, The removal of
source_id_for_panic from ACTIVE_SYNCS uses if let Ok(...) which skips removal
when the mutex is poisoned; change it to recover a poisoned mutex the same way
as earlier (use ACTIVE_SYNCS.lock().unwrap_or_else(|e| e.into_inner())) and then
remove(&source_id_for_panic) so the entry is always cleared even if the mutex
was poisoned; update the block around ACTIVE_SYNCS and source_id_for_panic to
consistently use the unwrap_or_else(|e| e.into_inner()) pattern.

Comment on lines +39 to +46
pub child_labels: Vec<String>,
/// Per-child wikilink overrides (parallel to `child_labels`). When
/// `Some(path)`, the wikilink points at `[[path]]` instead of the
/// sanitised child label. Use this to link to raw archive files so
/// Obsidian can resolve the wikilinks.
#[doc(hidden)]
pub child_basenames: Vec<Option<String>>,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing length validation for parallel arrays.

child_basenames is documented as parallel to child_labels, but there's no validation that they have matching lengths when both are non-empty. If a caller provides mismatched lengths, downstream code iterating them in parallel (e.g., compose_summary_md) could panic or produce incorrect wikilinks.

🛡️ Proposed validation in ingest_summary
 pub async fn ingest_summary(
     config: &Config,
     tree: &Tree,
     input: SummaryIngestInput,
 ) -> Result<SummaryIngestOutcome> {
+    if !input.child_basenames.is_empty()
+        && input.child_basenames.len() != input.child_labels.len()
+    {
+        anyhow::bail!(
+            "child_basenames length ({}) must match child_labels length ({})",
+            input.child_basenames.len(),
+            input.child_labels.len()
+        );
+    }
     let target_level: u32 = 1;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_tree/ingest.rs` around lines 39 - 46, The struct exposes
parallel arrays child_labels and child_basenames but there is no length
validation, so add a check in the constructor/ingest function (ingest_summary)
that validates child_basenames is either empty or has the same length as
child_labels; if not, return an error (or Result::Err) with a clear message
rather than allowing downstream code (e.g., compose_summary_md) to iterate
mismatched vectors and panic. Locate the code that builds this struct in
ingest_summary, perform the length check on child_labels.len() vs
child_basenames.len(), and fail-fast with a descriptive error or normalize by
converting mismatched entries to None if you prefer consistent semantics.

Comment on lines +1892 to +1910
let cfg2 = config.clone();
let scope2 = scope.clone();
let resp = tokio::spawn(async move {
let tree = get_or_create_source_tree(&cfg2, &scope2)?;
let strategy = TreeFactory::from_tree(&tree).label_strategy(&cfg2);
let sealed = force_flush_tree(&cfg2, &tree.id, Some(chrono::Utc::now()), &strategy).await?;
Ok::<_, anyhow::Error>(FlushSourceTreeResponse {
tree_scope: scope2,
seals_fired: sealed.len() as u32,
})
})
.await
.map_err(|e| format!("flush_source_tree join error: {e}"))?
.map_err(|e| format!("flush_source_tree: {e:#}"))?;

{
let mut active = ACTIVE.lock().unwrap_or_else(|e| e.into_inner());
active.remove(&scope);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Ensure ACTIVE scope cleanup runs on all error paths.

On Line 1894 and Line 1904-1905, ? can return before Line 1908-1910 executes, leaving the scope permanently locked in ACTIVE and blocking future flushes for that source.

Suggested fix
-    let resp = tokio::spawn(async move {
+    let resp_result = tokio::spawn(async move {
         let tree = get_or_create_source_tree(&cfg2, &scope2)?;
         let strategy = TreeFactory::from_tree(&tree).label_strategy(&cfg2);
         let sealed = force_flush_tree(&cfg2, &tree.id, Some(chrono::Utc::now()), &strategy).await?;
         Ok::<_, anyhow::Error>(FlushSourceTreeResponse {
             tree_scope: scope2,
             seals_fired: sealed.len() as u32,
         })
     })
-    .await
-    .map_err(|e| format!("flush_source_tree join error: {e}"))?
-    .map_err(|e| format!("flush_source_tree: {e:#}"))?;
+    .await
+    .map_err(|e| format!("flush_source_tree join error: {e}"))
+    .and_then(|r| r.map_err(|e| format!("flush_source_tree: {e:#}")));
 
     {
         let mut active = ACTIVE.lock().unwrap_or_else(|e| e.into_inner());
         active.remove(&scope);
     }
+
+    let resp = resp_result?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory/read_rpc.rs` around lines 1892 - 1910, The cleanup that
removes the scope from ACTIVE must run even if the spawn/await chain returns
early with ?, so create an RAII guard (e.g., ActiveScopeGuard) just after you
acquire scope (before calling tokio::spawn) which captures a clone of scope and
on Drop removes it from ACTIVE; instantiate that guard (so it will run on all
return paths including errors from tokio::spawn, get_or_create_source_tree,
force_flush_tree, or the map_err ? chain), then keep the rest of the
tokio::spawn block and await logic unchanged — implement ActiveScopeGuard::drop
to perform the same removal logic currently in the block that calls
ACTIVE.lock().unwrap_or_else(|e| e.into_inner()).remove(&scope).

senamakel added 2 commits May 31, 2026 23:17
- Add memorySources.build.* translations to all 13 locale files
- Update tree_graph tests to account for synthetic source root nodes
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/lib/i18n/es.ts`:
- Line 2039: The translation for the key memorySources.build.sealsMessage uses
an unnatural literal `sellado(s) completado(s)`; replace its value with a
natural Spanish phrase (e.g., "sellados completados" for plural-aware UI) or
preferably implement proper pluralization/ICU formatting if the UI supports it
so singular/plural are correct for the {count} variable; update the string for
memorySources.build.sealsMessage accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f9861d35-7914-4c53-8934-5fd3325879de

📥 Commits

Reviewing files that changed from the base of the PR and between a499c15 and 4e10785.

📒 Files selected for processing (14)
  • app/src/lib/i18n/ar.ts
  • app/src/lib/i18n/bn.ts
  • app/src/lib/i18n/de.ts
  • app/src/lib/i18n/es.ts
  • app/src/lib/i18n/fr.ts
  • app/src/lib/i18n/hi.ts
  • app/src/lib/i18n/id.ts
  • app/src/lib/i18n/it.ts
  • app/src/lib/i18n/ko.ts
  • app/src/lib/i18n/pl.ts
  • app/src/lib/i18n/pt.ts
  • app/src/lib/i18n/ru.ts
  • app/src/lib/i18n/zh-CN.ts
  • src/openhuman/memory/read_rpc_tests.rs
✅ Files skipped from review due to trivial changes (9)
  • app/src/lib/i18n/ru.ts
  • app/src/lib/i18n/zh-CN.ts
  • app/src/lib/i18n/fr.ts
  • app/src/lib/i18n/id.ts
  • app/src/lib/i18n/ko.ts
  • app/src/lib/i18n/pt.ts
  • app/src/lib/i18n/pl.ts
  • app/src/lib/i18n/bn.ts
  • app/src/lib/i18n/de.ts

Comment thread app/src/lib/i18n/es.ts
'memorySources.build.building': 'Construyendo…',
'memorySources.build.successTitle': 'Árbol construido',
'memorySources.build.failedTitle': 'Construcción fallida',
'memorySources.build.sealsMessage': 'sellado(s) completado(s)',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use natural Spanish for seals completion text

Line 2039 currently renders awkward UI text ({count} sellado(s) completado(s)). Use a natural phrase for the existing concatenation pattern (for example, sellados completados).

✏️ Suggested change
-  'memorySources.build.sealsMessage': 'sellado(s) completado(s)',
+  'memorySources.build.sealsMessage': 'sellados completados',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
'memorySources.build.sealsMessage': 'sellado(s) completado(s)',
'memorySources.build.sealsMessage': 'sellados completados',
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/lib/i18n/es.ts` at line 2039, The translation for the key
memorySources.build.sealsMessage uses an unnatural literal `sellado(s)
completado(s)`; replace its value with a natural Spanish phrase (e.g., "sellados
completados" for plural-aware UI) or preferably implement proper
pluralization/ICU formatting if the UI supports it so singular/plural are
correct for the {count} variable; update the string for
memorySources.build.sealsMessage accordingly.

@senamakel senamakel merged commit ceb769a into tinyhumansai:main Jun 1, 2026
19 of 22 checks passed
sanil-23 added a commit to sanil-23/openhuman that referenced this pull request Jun 1, 2026
Three pre-existing test failures that are red on main independently of
each other, all surfaced by the coverage CI jobs:

1. Rust Core Coverage — `inference_provider_admin_round22_raw_coverage_e2e`
   mutated `OPENHUMAN_WORKSPACE`/`OPENHUMAN_OLLAMA_BASE_URL`/`PATH` as
   process-global env via `EnvVarGuard` with no serialization. The file's
   own SAFETY comments said it was "validated with --test-threads=1", but
   `cargo llvm-cov` runs the binary's 5 tests in parallel, so the global
   mutations raced and a test read another's workspace/config — a flaky
   `nested provider failure` assertion failure. Add an `env_lock()` mutex
   (the pattern every other `*_e2e.rs` already uses) and take it at the
   top of each test so the suite is serialized regardless of thread count.

2. Frontend Coverage — `memoryGraphLayout.test.ts` still asserted the old
   shrinking `nodeRadius` (10 - level*0.8). tinyhumansai#3113 redesigned it to grow
   (5 + level*2.5; chunk 4->3) but did not update the test. Realign the
   expectations to the shipped formula.

3. Frontend Coverage — `OpenhumanLinkModal.notifications.test.tsx` queried
   the success copy with a curly apostrophe (U+2019) in "didn't", while
   en.ts renders a straight apostrophe (U+0027), so `getByText` never
   matched. Use the straight apostrophe to match the rendered string.

Test-only changes. Verified locally: the 5 inference tests pass, and the
two Vitest files pass (13 tests) with Prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
YOMXXX added a commit to YOMXXX/openhuman that referenced this pull request Jun 1, 2026
…#3113

upstream/main 的 tinyhumansai#3113 把 upsert_composio_source 默认改成 enabled=false(用户必须显式启用同步),所以 round23 测试在断言
list_enabled_by_kind(Composio).len() == 1 时拿到 0。先在 upsert 后断言列表为空,再 update_source 翻 enabled=true,重新断言 1。
YOMXXX added a commit to YOMXXX/openhuman that referenced this pull request Jun 1, 2026
upstream/main 的 tinyhumansai#3047/tinyhumansai#3113 让 GithubReader 优先用本地 git clone 跑
list_commits/read_commit,并且简化了 issue 渲染器(不再把 comments 拼到
body)。本测试用 fake gh 只 mock 了 API 路径,所以:

- 在 PATH 上加 fake git 让 ensure_bare_clone/log/show 全部失败,让 reader
  回退到 gh API 那条路径,断言 commit:abc123 / issue:42 / pr:43 才能继续。
- read_issue body 改为断言 "# Issue tinyhumansai#42" + "## Description",去掉
  "## Comments" 期望,与新的 renderer 输出一致。
sanil-23 added a commit to sanil-23/openhuman that referenced this pull request Jun 1, 2026
…sync redesign)

memory_sources_closure_round23: a freshly-inserted Composio source is no
longer reported by list_enabled_by_kind until it has an active connection,
so the count is 0, not 1. Aligns the last stale raw-coverage assertion with
the behavior shipped in tinyhumansai#3113.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Net-new user-facing capability or product behavior. memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants