observability(sentry): attach user id to Rust-core events (#3135)#3136
observability(sentry): attach user id to Rust-core events (#3135)#3136oxoxDev wants to merge 7 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR establishes Sentry scope user-tracking for core-mode errors by introducing helper functions to set and clear the global Sentry user scope at session boundaries, integrating those helpers into session login/logout/boot flows, and adding comprehensive tests to validate the behavior. ChangesSentry user-scoping for direct-mode event attribution
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/core/observability.rs (1)
1772-1793: ⚡ Quick winAdd debug breadcrumbs for Sentry user-scope mutations (PII-safe).
The new flow mutates global Sentry scope but emits no debug/trace signal. Add grep-friendly debug logs while keeping the id redacted.
♻️ Proposed patch
pub fn set_sentry_user_id(user_id: &str) { let trimmed = user_id.trim(); if trimmed.is_empty() { return; } let owned = trimmed.to_string(); sentry::configure_scope(|scope| { scope.set_user(Some(sentry::User { id: Some(owned.clone()), ..Default::default() })); }); + tracing::debug!( + user_id_len = trimmed.len(), + "[observability] sentry user scope set" + ); } pub fn clear_sentry_user() { sentry::configure_scope(|scope| scope.set_user(None)); + tracing::debug!("[observability] sentry user scope cleared"); }As per coding guidelines, "In Rust, default to verbose diagnostics on new/changed flows using
log/tracingatdebug/tracelevels with stable grep-friendly prefixes and correlation fields" and "Never log secrets or full PII — redact sensitive information in logs".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/core/observability.rs` around lines 1772 - 1793, Add verbose, grep-friendly debug breadcrumbs when mutating the Sentry user scope: in set_sentry_user_id, after trimming and before/after calling sentry::configure_scope, emit a tracing::debug (or log::debug) with a stable prefix like "SENTRY_USER: set id=" and include a redacted form of the id (e.g., first 6 chars + "...") rather than the full PII; also call sentry::add_breadcrumb (category "user", level Debug) recording the same redacted id and action. In clear_sentry_user, emit a tracing::debug with prefix "SENTRY_USER: cleared" and add a corresponding sentry::add_breadcrumb noting the clear action. Ensure no full user_id is ever logged or sent in breadcrumbs and reuse the existing set_sentry_user_id and clear_sentry_user symbols for placement.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/core/observability_tests.rs`:
- Around line 3603-3609: The test set_sentry_user_id_ignores_blank_input
currently has no assertion; after calling set_sentry_user_id("") and
set_sentry_user_id(" "), capture the emitted Sentry event (using the test
transport / test helper used elsewhere in tests) and assert that the event has
no user attached (e.g., event.user is None or event.user.id is None) to ensure
blank/whitespace ids are dropped; update the test to retrieve the last emitted
event and perform this assertion referencing
set_sentry_user_id_ignores_blank_input and set_sentry_user_id.
In `@src/openhuman/credentials/ops.rs`:
- Around line 174-181: The Sentry binding is set too early: move the
crate::core::observability::set_sentry_user_id(resolved_user_id) call so it runs
only after session persistence succeeds (i.e., after store_provider_token and
any related session/store operations complete without error); specifically,
remove or skip the current early call guarded by resolved_user_id and re-invoke
set_sentry_user_id only in the success path following the successful return from
store_provider_token (and any subsequent session persistence steps) to avoid
binding Sentry on failed persistence.
---
Nitpick comments:
In `@src/core/observability.rs`:
- Around line 1772-1793: Add verbose, grep-friendly debug breadcrumbs when
mutating the Sentry user scope: in set_sentry_user_id, after trimming and
before/after calling sentry::configure_scope, emit a tracing::debug (or
log::debug) with a stable prefix like "SENTRY_USER: set id=" and include a
redacted form of the id (e.g., first 6 chars + "...") rather than the full PII;
also call sentry::add_breadcrumb (category "user", level Debug) recording the
same redacted id and action. In clear_sentry_user, emit a tracing::debug with
prefix "SENTRY_USER: cleared" and add a corresponding sentry::add_breadcrumb
noting the clear action. Ensure no full user_id is ever logged or sent in
breadcrumbs and reuse the existing set_sentry_user_id and clear_sentry_user
symbols for placement.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8319a2e0-d137-47c8-b589-3434f2872677
📒 Files selected for processing (3)
src/core/observability.rssrc/core/observability_tests.rssrc/openhuman/credentials/ops.rs
|
Pushed 2 follow-up commits closing the boot-restore gap: |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/openhuman/credentials/ops.rs (1)
174-181:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMove Sentry user binding to after session persistence succeeds.
The Sentry scope is set before
store_provider_token(Line 279) can fail. If persistence fails, the global Sentry scope remains bound and can misattribute subsequent events.🐛 Proposed fix
- // Bind the signed-in user's id to the Sentry scope as early as possible - // (on fresh login and on account switch, both of which route through here) - // so events reported by background loops carry `user.id` even before the - // frontend `app_state_snapshot` RPC has warmed the identity cache. Only - // the id is sent — never email/name/IP. See `set_sentry_user_id`. - if let Some(ref uid) = resolved_user_id { - crate::core::observability::set_sentry_user_id(uid); - } - // If we know the user_id, activate the user-scoped directory BEFORE storing @@ let profile = auth .store_provider_token( APP_SESSION_PROVIDER, DEFAULT_AUTH_PROFILE_NAME, trimmed_token, metadata, true, ) .map_err(|e| e.to_string())?; + + // Bind the signed-in user's id to the Sentry scope after persistence + // succeeds so events reported by background loops carry `user.id`. + if let Some(ref uid) = resolved_user_id { + crate::core::observability::set_sentry_user_id(uid); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/credentials/ops.rs` around lines 174 - 181, The Sentry user binding (crate::core::observability::set_sentry_user_id) is currently called as soon as resolved_user_id exists, before persisting the session with store_provider_token; move the call so it only runs after store_provider_token returns success to avoid binding Sentry to a user when persistence fails. Specifically, locate the resolved_user_id check and the set_sentry_user_id call and relocate or duplicate the call to execute after the successful return/Ok path of store_provider_token (or inside the success branch handling function that confirms persistence), ensuring any error path does not call set_sentry_user_id.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/openhuman/credentials/ops.rs`:
- Around line 174-181: The Sentry user binding
(crate::core::observability::set_sentry_user_id) is currently called as soon as
resolved_user_id exists, before persisting the session with
store_provider_token; move the call so it only runs after store_provider_token
returns success to avoid binding Sentry to a user when persistence fails.
Specifically, locate the resolved_user_id check and the set_sentry_user_id call
and relocate or duplicate the call to execute after the successful return/Ok
path of store_provider_token (or inside the success branch handling function
that confirms persistence), ensuring any error path does not call
set_sentry_user_id.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 798dd78f-15d3-46ee-bede-12e4f602a04d
📒 Files selected for processing (3)
src/core/jsonrpc.rssrc/openhuman/credentials/ops.rssrc/openhuman/credentials/ops_tests.rs
There was a problem hiding this comment.
@oxoxDev the code looks good to me — the session-boundary scope binding is the right fix, the blank-id guard is correct, and the boot-restore warmer covers the restored-session gap that the previous lazy cache approach missed.
the two CodeRabbit findings (bind after persistence succeeds, blank-id test assertion) are both addressed in 19a7cf7, and the test coverage is solid: isolated TestTransport hubs, PII-minimal assertions, clear/re-bind lifecycle verified.
one thing i want to call out: warm_sentry_user_from_active_session does double duty — it returns the active user id (which jsonrpc.rs uses to set already_logged_in) and also fires the Sentry scope binding as a side effect. the function name makes the side effect clear, so this is fine for the boot path where both are needed together. just worth documenting if this function ever gets called from somewhere that only wants the id.
the only thing holding this back is CI — Rust Core Coverage and E2E Playwright lanes 1 and 2 are failing. the frontend/playwright failures look unrelated to this Rust-only change, but the Rust Core Coverage failure is blocking the diff-cover gate, so i cannot confirm >= 80% on the diff. once those are green i'll come back and approve.
|
@graycyrus thanks for the review — all addressed: CI reds (the blocker): root-caused the two red Rust lanes. They were not from this change — #3125 switched Double-duty callout: documented in Frontend Coverage + Playwright lane 1/4: unrelated to this Rust-only change — Frontend Coverage fails on Fresh CI is running now — should come back green on the Rust lanes. Re-requesting your review once it settles. |
sanil-23
left a comment
There was a problem hiding this comment.
@oxoxDev the code itself looks solid — setting the Sentry scope user at the session boundary (login/account-switch + boot-restore warmer) instead of leaning on the lazily-warmed before_send cache is the right fix, and it lines up with the backend pattern you referenced. The blank/whitespace guard, the after-persist ordering so a failed store_provider_token can't leave a stale global scope, and the PII-minimal id-only payload (no email/name/IP) are all handled well. Tests cover the set/clear/blank branches and both warmer paths.
The one thing holding this back is CI, which is red right now. Worth noting the failing checks look unrelated to this change:
- Rust Core Coverage fails on tests/inference_provider_admin_round22_raw_coverage_e2e.rs:254 ("nested provider failure" assertion) — that file isn't touched by this PR, so it reads as a pre-existing failure on main, same flavor as the missing-access-token e2e you already realigned in this branch.
- Frontend Coverage (Vitest) and the web E2E lane are frontend lanes, and this is a Rust-only diff.
Can you rebase on latest main (and pick up any fix for the round22 test) so CI goes green? Once the required checks are passing i'll come back and approve — the code is ready. Let me know if you want a hand tracking down the unrelated failures.
|
@graycyrus re-requesting review. State of the CI you flagged:
The three stale-main failures (memoryGraphLayout, chat-harness-subagent quarantine, + the round22 isolation fix) are being consolidated into #3149 ("align stale main tests"). Once #3149 lands I'll rebase this and every lane goes green. This PR's own diff is Rust-only and self-contained; the binding/test changes you reviewed are unchanged since your pass. The double-duty doc note you asked for is in |
|
@sanil-23 thanks — appreciate the close read on the after-persist ordering + PII-minimal payload. You're right that round22 is pre-existing/unrelated, with one refinement: it's not a stale assertion (the Picking up the round22 fix (serialize the env-mutating tests) plus the two frontend stale-main lanes now, then rebasing on latest main. Will ping when green. |
…inyhumansai#3135) store_session only binds the Sentry user on fresh login/account-switch. A restored session (app reopened while signed in) never routes through it, so background-loop errors (e.g. the periodic Composio sync tick) reported with user = None. Add warm_sentry_user_from_active_session() reading active_user.toml + binding the scope, with unit coverage for the present/absent branches.
…ansai#3135) Wire warm_sentry_user_from_active_session() into the existing-session boot branch so the scope is set before login-gated background services (incl. the periodic Composio sync) start.
…t blank-input, add breadcrumbs (tinyhumansai#3135) - store_session: move set_sentry_user_id to after store_provider_token succeeds, so a failed persist no longer leaves the global scope bound. - set_sentry_user_id_ignores_blank_input: assert the captured event carries no user (test previously had no assertion). - set/clear: add PII-safe debug breadcrumbs (id length only, never the id).
6f6945f to
1d5e3b1
Compare
|
Status — this is ready for approval. Rebased on latest The PR is
Net: CI on this branch is at parity with |
sanil-23
left a comment
There was a problem hiding this comment.
@oxoxDev re-reviewed at f3addf5. The observability work itself is in good shape — session-boundary scope binding plus the boot-restore warmer is exactly the right fix for the userCount=0 gap, the post-persist ordering (so a failed store_provider_token can't leave the global scope bound) is correct, and the test coverage is solid: set/clear, blank-input asserting no scope user, and both warmer branches. Privacy posture is good too — only the id reaches the scope, never email/name/IP, consistent with send_default_pii: false, and the debug logs only carry id length.
The boot-path change in jsonrpc.rs preserves the prior .is_some() semantics, so no behavioral drift there. New exports are only consumed by credentials/ops and the boot path — no broken callers.
I can't approve yet because CI isn't green:
- Rust Core Coverage fails on
inference_voice_http_round23_raw_coverage_e2e.rs:124(reasoning-v1model-listing assertion) — unrelated to this diff, looks pre-existing. - E2E web lane 1 fails on
auth-access-control.spec.ts(auth callback not reaching /home) — a timing-sensitive auth e2e, not something a Sentry scope mutation can cause; reads as flaky. - The
chat-harness-subagentquarantine in the last commit is a reasonable call for the ECONNREFUSED cascade, though it's a bundled concern from #3055.
If you rebase on latest main to pick up any fixes for those coverage/e2e tests (or confirm they're flaky and re-run), I'll come back and approve once required checks are green. Nice work.
f3addf5 to
1d5e3b1
Compare
|
Dropped the bundled CI is now at parity with the other open PRs on the same base (e.g. #3150, #3153): the only reds are the pre-existing Rust Core Coverage ( |
sanil-23
left a comment
There was a problem hiding this comment.
@oxoxDev re-reviewed at 1d5e3b1. The code is in good shape — binding the Sentry scope user at the session boundary (login/account-switch) plus the boot-restore warmer is the right fix for the userCount=0 gap, and it's a cleaner approach than leaning on the lazily-warmed before_send cache. The post-persist ordering in store_session is correct (a failed store_provider_token can no longer leave the global scope pointing at a half-completed login), the blank/whitespace guard is right, and the logout clear prevents misattribution. PII handling is solid: only the id reaches the scope, never email/name/IP, and the debug logs carry id length only — consistent with send_default_pii: false. Test coverage is strong: set, clear, blank-input (asserting no user), and both boot-warmer branches are all exercised.
The two earlier review points (bind after persistence succeeds, and the missing assertion in the blank-id test) are both resolved.
The only thing holding this back is CI — the Playwright E2E lanes and the coverage jobs are still pending/red on this commit. Once the required checks are green i'll come back and approve. If any of the failing lanes look like pre-existing flakes unrelated to this Rust-only diff, just flag it and i'll take a look.
sanil-23
left a comment
There was a problem hiding this comment.
@oxoxDev re-reviewed at 732c444 (the upstream/main merge). No observability code changed since the last pass, and it still reads clean — session-boundary scope binding plus the boot-restore warmer, post-persist ordering, blank-input guard, logout clear, and id-only PII posture are all correct, with solid test coverage. Both earlier review points stay resolved and CodeRabbit is approving.
Still can't approve yet — CI is red, but on checks unrelated to this Rust-only diff:
- Rust Core Coverage fails in
tools_approval_channels_raw_coverage_e2e.rs:1689(orchestrator_tool_synthesis_covers_agent_and_integration_delegation_edges) — orchestrator tooling, nothing a Sentry scope mutation touches. - E2E web lane 1/4 fails on
chat-harness-subagent.spec.ts:136, which then cascades ECONNREFUSED into the following specs. This is the #3055 flake; the clean rebase dropped the earlier quarantine, so it resurfaced.
Both look like pre-existing flakes rather than anything from this change. A re-run (or rebasing past whatever lands the fixes on main) should clear them. Once the required checks are green i'll come back and approve — the code side is ready.
Summary
user.id.set_sentry_user_id/clear_sentry_userhelpers insrc/core/observability.rs; wires them intocredentials::ops::store_session(set) andclear_session(clear).send_default_pii: false.Problem
Direct-mode Composio errors raised from the Rust core (
[composio-direct] list_connections failed,Invalid API key, etc.) land in the self-hosted Sentrytauri-rust/core-rustprojects with no user context (userCount=0 on every direct-mode composio issue), so thousands of events can't be traced back to a user.Both the standalone core binary (
src/main.rs) and the Tauri shell already attachevent.userinside the Sentrybefore_sendhook, but they read it fromapp_state::peek_cached_current_user_identity(). That cache is only ever warmed by the frontend-drivenapp_state_snapshotRPC. Errors that fire from background loops — e.g. the periodic Composio sync tick inmemory_sync/composio/periodic.rs, which callsreport_composio_op_error— run before, or independent of, any snapshot RPC, soevent.userresolves toNone. The backend project carriesuser.idbecause it tags every event at the session boundary rather than lazily.Solution
Mirror the backend's session-boundary tagging in the Rust core: set the Sentry scope user eagerly when the user becomes known, instead of relying on the lazily-warmed
before_sendcache.store_session(the fresh-login and account-switch path) resolves the backend's mongo ObjectId intoresolved_user_id; we callset_sentry_user_id(uid)right there. The scope mutation lands on the global hub, so all subsequentreport_error/capture_messagecalls — including the background composio tick — attach it.clear_session(logout) callsclear_sentry_user()so post-logout events aren't misattributed to the prior account.user.id.Submission Checklist
warm_sentry_user_from_active_sessionboot warmer) is covered by added unit tests; the thinjsonrpc.rsboot wiring delegates to the tested helper. CIdiff-covergate is authoritative.## Related— N/A: no feature IDs.Closes #NNNin the## RelatedsectionImpact
tauri-rust/core-rust, matching backend-mode coverage.send_default_pii: false.Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Validation Run
pnpm --filter openhuman-app format:check— no frontend change.pnpm typecheck— no frontend change.cargo test --lib core::observability(156 passed) +cargo test --lib credentials::ops(39 passed) — incl. all new tests.cargo fmt --checkclean;cargo checkclean (pre-existing warnings in unrelatedslack_backfillbin only).Validation Blocked
cargo clippy --lib -- -D warningsmaincheckout (local toolchain 1.93.0 newer than repo target); none in the touched files.rust-toolchain.toml.Behavior Changes
user.idfrom login through logout.Parity Contract
before_sendcache-based attribution is untouched; this adds an eager scope-level set that takes effect even when the cache is cold.Duplicate / Superseded PR Handling
Summary by CodeRabbit
Release Notes
New Features
Tests