feat(telemetry): instrument renderer failure and CTA events #2360

Open
Pritom14 wants to merge 1 commit into AgentWrapper:main from Pritom14:feat/renderer-failure-cta-telemetry

Pritom14 commented Jul 2, 2026

Closes #2362

Summary

Adds PostHog instrumentation for two blind spots in the desktop app: failures (things breaking) and CTAs (what users actually do). All events flow through the existing captureRendererEvent → sanitizeRendererProperties allowlist, so nothing leaves the renderer unsanitized.

Failure signals

  • ao.renderer.daemon_failure — a machine-readable code is now stamped on DaemonStatus at every main-process failure site (not_configured, daemon_unreachable, binary_missing, spawn_failed, exited, port_unconfirmed, not_ready, identity_mismatch). A small renderer subscriber (daemon-telemetry.ts) rides the existing daemon:status IPC push and reports the coarse fields (daemon_state, code, exit_code, signal) — never the raw message.
  • ao.renderer.api_error — central interceptor in api-client.ts categorizes every failed daemon call (daemon_unavailable / network_error / http_4xx / http_5xx), with the operation normalized to METHOD /api/v1/resource/:id (IDs stripped) and a 30s dedupe window. Caller-initiated aborts are ignored.
  • ao.renderer.terminal_attach_failed — reason enum (pane_error / open_timeout).
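For reviewers, a minimal sketch of how the api_error pieces described above could fit together: operation normalization, error categorization, abort filtering, and the 30s dedupe window. It is illustrative rather than the code in api-client.ts. The ID heuristic, the `categorizeApiError`, `isCallerAbort`, and `ApiErrorDeduper` names, and the `daemonUnavailable` flag are all assumptions.

```typescript
// Illustrative sketch only; not the code in api-client.ts.
type ApiErrorCategory = "daemon_unavailable" | "network_error" | "http_4xx" | "http_5xx";

// Assumption: a path segment is an ID if it is all digits, 8+ hex chars, or a UUID.
const ID_LIKE = /^(\d+|[0-9a-f]{8,}|[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12})$/i;

// Drops the query string and replaces ID-like segments with ":id".
export function normalizeApiOperation(method: string, path: string): string {
  const pathname = path.split(/[?#]/)[0] ?? "";
  const normalized = pathname
    .split("/")
    .map((segment) => (ID_LIKE.test(segment) ? ":id" : segment))
    .join("/");
  return `${method.toUpperCase()} ${normalized}`;
}

// Assumes the caller only passes failed responses (status >= 400) or no status at all.
export function categorizeApiError(
  status: number | undefined,
  daemonUnavailable: boolean,
): ApiErrorCategory {
  if (daemonUnavailable) return "daemon_unavailable";
  if (status === undefined) return "network_error"; // no HTTP response was received
  return status >= 500 ? "http_5xx" : "http_4xx";
}

// Caller-initiated aborts surface as errors named "AbortError" and are not reported.
export function isCallerAbort(err: unknown): boolean {
  return (
    typeof err === "object" && err !== null && (err as { name?: unknown }).name === "AbortError"
  );
}

const DEDUPE_WINDOW_MS = 30 * 1000;

// Suppresses repeats of the same key inside the dedupe window.
export class ApiErrorDeduper {
  private lastReported = new Map<string, number>();

  shouldReport(key: string, now: number = Date.now()): boolean {
    const previous = this.lastReported.get(key);
    if (previous !== undefined && now - previous < DEDUPE_WINDOW_MS) return false;
    this.lastReported.set(key, now);
    return true;
  }
}
```

A dedupe key built from the normalized operation plus the category (for example `GET /api/v1/sessions/:id|http_5xx`) would keep a flapping endpoint from flooding PostHog while still letting distinct failures through.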

CTA triads (requested → succeeded → failed)

  • task_create, session_kill, settings_save
  • orchestrator_spawn with a source enum (board / restore_dialog)
  • notification_opened (target: pr/session) and notification_marked_read (scope: single/all)
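The requested → succeeded → failed pattern can be sketched as a small wrapper around the action. This is a hypothetical helper (`trackCta` and the `Capture` type are not names from this PR), shown only to make the triad contract concrete: `_requested` fires before the action starts, exactly one of `_succeeded` / `_failed` follows, and the error object is never attached to the event.

```typescript
// Hypothetical helper; the PR may wire each triad by hand at the call site.
type Capture = (event: string, props: Record<string, string>) => void;

export async function trackCta<T>(
  capture: Capture,
  name: string,
  props: Record<string, string>,
  action: () => Promise<T>,
): Promise<T> {
  // Emitted synchronously, before the action is started.
  capture(`${name}_requested`, props);
  try {
    const result = await action();
    capture(`${name}_succeeded`, props);
    return result;
  } catch (err) {
    // Only the fact of failure is recorded; the error message stays local.
    capture(`${name}_failed`, props);
    throw err; // callers still see the original error
  }
}
```

Rethrowing keeps the existing error handling at the call site unchanged, so adding telemetry does not alter control flow.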

Privacy

Every new event is added to the sanitizeRendererProperties switch. project_id is hashed to project_id_hash (SHA-256); all free-text (titles, error messages, paths, arbitrary source/target/reason values) is dropped by the enum-only allowlist. Covered by unit tests.
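To illustrate the enum-only rule, here is a minimal sketch of the sanitizing step. The real sanitizeRendererProperties is a switch; this version uses a lookup table for brevity, runs under Node (`node:crypto`) rather than in the renderer, and the `sanitizeSketch` name and the table contents are assumptions drawn from the enums listed in this description.

```typescript
import { createHash } from "node:crypto";

// Hypothetical per-event allowlist: property name -> permitted enum values.
const ENUM_ALLOWLIST: Record<string, Record<string, readonly string[]>> = {
  orchestrator_spawn_requested: { source: ["board", "restore_dialog"] },
  notification_opened: { target: ["pr", "session"] },
  notification_marked_read: { scope: ["single", "all"] },
  terminal_attach_failed: { reason: ["pane_error", "open_timeout"] },
  task_create_failed: {},
};

export function sanitizeSketch(
  event: string,
  props: Record<string, unknown>,
): Record<string, string> {
  const allowed = ENUM_ALLOWLIST[event];
  if (!allowed) return {}; // unknown events carry no properties at all

  const out: Record<string, string> = {};

  // project_id never leaves as-is; only its SHA-256 hex digest does.
  const projectId = props["project_id"];
  if (typeof projectId === "string") {
    out["project_id_hash"] = createHash("sha256").update(projectId).digest("hex");
  }

  // Copy a property only if it is listed for this event AND holds a known enum value.
  for (const [key, values] of Object.entries(allowed)) {
    const value = props[key];
    if (typeof value === "string" && values.includes(value)) out[key] = value;
  }
  return out; // titles, messages, paths, etc. are dropped by omission
}
```

Because the output is built from scratch rather than by deleting known-bad keys, a new free-text field added at a call site is dropped by default until someone allowlists it.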

Telemetry verification (live PostHog, project 475752)

Every failure and CTA path was exercised end-to-end against a running packaged instance and confirmed to land with the sanitized shape. Counts below are from this verification session.

| Event | Fired | Verified properties |
| --- | --- | --- |
| orchestrator_spawn_requested | | source: board, project_id_hash |
| orchestrator_spawn_succeeded | | source: board, project_id_hash |
| task_create_requested | | project_id_hash |
| task_create_succeeded | | project_id_hash |
| task_create_failed | | project_id_hash (raw title/brief dropped) |
| session_kill_requested | | project_id_hash |
| session_kill_failed | | project_id_hash |
| settings_save_requested | | project_id_hash |
| settings_save_succeeded | | project_id_hash |
| settings_save_failed | | project_id_hash |
| api_error | | error_category: daemon_unavailable \| network_error, status: 503, normalized operation |
| terminal_attach_failed | | reason: open_timeout |
| daemon_failure | | coarse fields only (daemon_state / code / signal); raw message dropped |
| notification_opened / notification_marked_read | unit-tested | enum-only allowlist verified in telemetry.test.ts (not driven live this session) |

Sanitization held on every event: project_id → SHA-256 project_id_hash, and no title / brief / error / message / body leaked — only enum values survived the allowlist. Common envelope (surface, build_mode, platform, app_version) is attached by the base capture.
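As a concrete case of "no message leaked", the daemon_failure path can be sketched as a projection from DaemonStatus to coarse fields plus a thin subscriber. The `DaemonStatus` shape, the `state` values, and both function names are assumptions for illustration; only the code enum and the reported field names come from this description.

```typescript
type DaemonFailureCode =
  | "not_configured" | "daemon_unreachable" | "binary_missing" | "spawn_failed"
  | "exited" | "port_unconfirmed" | "not_ready" | "identity_mismatch";

// Assumed shape; the real DaemonStatus may carry more fields.
interface DaemonStatus {
  state: string;
  code?: DaemonFailureCode;
  exitCode?: number | null;
  signal?: string | null;
  message?: string; // free text: never forwarded
}

type CoarseProps = Record<string, string | number>;

// Returns the coarse event properties, or null when the status is not a failure.
export function toDaemonFailureEvent(status: DaemonStatus): CoarseProps | null {
  if (!status.code) return null;
  const props: CoarseProps = { daemon_state: status.state, code: status.code };
  if (typeof status.exitCode === "number") props["exit_code"] = status.exitCode;
  if (typeof status.signal === "string") props["signal"] = status.signal;
  return props; // status.message is intentionally not read
}

// `subscribe` stands in for the daemon:status IPC push; it returns an unsubscribe function.
export function subscribeDaemonTelemetry(
  subscribe: (listener: (status: DaemonStatus) => void) => () => void,
  capture: (event: string, props: CoarseProps) => void,
): () => void {
  return subscribe((status) => {
    const props = toDaemonFailureEvent(status);
    if (props) capture("ao.renderer.daemon_failure", props);
  });
}
```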

Test plan

  • npm run typecheck — clean
  • npm run test — 396 passing (new: daemon-telemetry subscriber, api_error interceptor + normalizeApiOperation, sanitizer allowlist for every new event; updated two daemon-attach exact-match assertions for the new code field)
  • npx prettier --check on all touched files
  • E2E smoke of failure + CTA paths against a live instance, verified in PostHog project 475752 (see table above)

Add PostHog instrumentation for app failures and user CTAs:
- daemon_failure (machine-readable code on DaemonStatus, IPC-forwarded)
- api_error central interceptor (categorized, IDs stripped)
- terminal_attach_failed (pane error + open timeout)
- CTA triads: task_create, session_kill, settings_save,
  orchestrator_spawn (board/restore_dialog), notification open/read

All events sanitized: project_id hashed, enum-only codes, raw error
messages never sent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Telemetry blind spots: no failure or CTA instrumentation in the desktop renderer
