Skip to content

fix(studio): filter media decode errors from crash telemetry#1070

Merged
miguel-heygen merged 1 commit into
mainfrom
fix/studio-media-decode-error-filter
May 25, 2026
Merged

fix(studio): filter media decode errors from crash telemetry#1070
miguel-heygen merged 1 commit into
mainfrom
fix/studio-media-decode-error-filter

Conversation

@miguel-heygen
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen commented May 25, 2026

Summary

  • Broadens isCompositionAssetError() in main.tsx to filter media decode failures from crash telemetry
  • "Input has an unsupported or unrecognizable format" (decodeAudioData DOMException) and MEDIA_ERR_SRC_NOT_SUPPORTED are user-content failures (unsupported codec), not application bugs
  • Uses error_name === "EncodingError" for tighter cross-browser filtering (canonical DOMException name)
  • Adds sampled composition_asset_error_filtered tracking event (1st + every 100th) so regressions remain visible
  • 567K events from 1,017 users in the last 7 days — all noise, drowning out real crashes

Dashboard: https://us.posthog.com/project/356858/dashboard/1613945

Changes

  • packages/studio/src/main.tsx: isCompositionAssetError now accepts error_name and matches on EncodingError + message strings. Sampled counter fires on 1st occurrence and every 100th.

Test plan

  • Load a composition with an unsupported audio format — verify no unhandled_promise_rejection telemetry fires, but composition_asset_error_filtered does fire once
  • Load a composition with a 404 asset — verify existing filter still works
  • Load a normal composition — verify real errors are still tracked

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: COMMENT (would-be APPROVE)

Tight 6-line filter broadening — isCompositionAssetError now also matches decodeAudioData's "unsupported or unrecognizable format" DOMException and MEDIA_ERR_SRC_NOT_SUPPORTED. Both are unambiguous user-content failure messages, not application bugs. Holding for James's stamp greenlight.

Substring matches are correctly scoped

The two new conditions use distinctive strings that aren't likely to false-positive on real application errors:

  • "unsupported or unrecognizable format" — the literal text in Chrome's DOMException thrown by decodeAudioData() when given a malformed audio buffer. Specific to that API + that failure mode.
  • "MEDIA_ERR_SRC_NOT_SUPPORTED" — the standard DOM constant for HTMLMediaElement error events when the source format isn't supported. Doesn't appear in typical application error message bodies unless someone is explicitly checking against it.

Combined with the existing "Error fetching... 404" gate, the filter correctly captures the user-content-error class while leaving real application bugs (TypeError, ReferenceError, hyperframes-runtime errors, etc.) untouched. ✓

Overlap with #1071

The exact same main.tsx isCompositionAssetError change is also in PR #1071 (the cross-origin iframe fix). They were created within 18 seconds of each other and both modify the same lines. Whichever lands first, the other should drop the main.tsx change to avoid merge conflict.

Given #1071 is a P0 (1,885 crashes / 648 users) and this PR is a noise reduction (567K events / 1,017 users — high volume but no user-visible crash), I'd recommend landing #1071 first and closing this PR as superseded — OR landing this one first (smaller, lower-risk) and rebasing #1071 to drop the main.tsx change.

Either order works; just needs explicit coordination so we don't merge-conflict twice.

Non-blocking observations

  1. Firefox / Safari error-message variance: Chromium's decodeAudioData throws the literal string this filter matches. Other browsers may use different wording. Worth a brief check on whether the studio is Chrome-only or supports Firefox/Safari in dev. If multi-browser, a slightly broader substring like "decod" + media-related context, or matching by err.name === "EncodingError" directly, would future-proof. Low priority — Studio currently runs in Chrome.

  2. fallow-ignore-next-line complexity added for the same reason as elsewhere — the if-chain raises cyclomatic complexity slightly above the threshold. Acceptable.

  3. Test plan unchecked — 3 items, none [x]. Worth flagging but the change is small enough that manual verification post-deploy is reasonable.

Summary

Ship-safe filter broadening. Coordinate the merge order with #1071 to avoid the main.tsx conflict.

— Rames Jusso

Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid P0 fix. Minimal diff, correct blast radius, and the filter shape is exactly right — early-return in unhandledrejection mirrors the existing ResizeObserver guard in error above it. The three predicates match the actual strings browsers produce; no issues with the fallow complexity suppress (passes CI, directive is load-bearing).

Strengths

  • packages/studio/src/main.tsx:24–30 — Correct placement: early-return before rejectionCount++ means the error cap is also unaffected. The 404 path continues to work as before.
  • Predicate hygiene: MEDIA_ERR_SRC_NOT_SUPPORTED is the MediaError.MEDIA_ERR_SRC_NOT_SUPPORTED code string surfaced via HTMLMediaElement; "unsupported or unrecognizable format" is the DOMException message from decodeAudioData. Both are browser-emitted, not app-generated, so false-positive risk is low — but see below.

important — no observability on filtered events (packages/studio/src/main.tsx:28–29)

The filter silently drops matching events. The PostHog dashboard linked in the PR description tracks unhandled_promise_rejection — after merge, that signal goes fully dark for these error classes. If a future regression causes our own code to throw a decode error against valid assets (say, a bad transcoding pipeline upstream), we'd see nothing.

The file already has the right pattern: error_cap_reached / rejection_cap_reached counters. Recommend emitting a low-priority event on the filtered path:

if (isCompositionAssetError(props.error_message)) {
  trackStudioEvent("composition_asset_error_filtered", { error_message: props.error_message });
  return;
}

This doesn't have to be uncapped — it could be a 1-in-10 sample or a separate cap — but having zero signal means a future regression in asset delivery is invisible. P0 urgency justifies shipping without this, but it should land as a follow-up before this goes stale.

important — predicate is message-string-only; error_name is available (packages/studio/src/main.tsx:28)

errorProps already extracts error_name (the DOMException .name field, which is "EncodingError" for a decodeAudioData failure) but isCompositionAssetError receives only the message string and ignores it. A tighter predicate — AND-ing the message substring with the expected error name — would reduce the theoretical over-filter surface. Example:

// current callsite in unhandledrejection handler passes only props.error_message
// tighter: pass props and check error_name too
if (msg.includes("unsupported or unrecognizable format") && errorName === "EncodingError") return true;

Not a blocker at 567K events/week — the message strings are browser-canonical and distinct enough — but worth noting for the next touch of this function.

nit — no unit test for isCompositionAssetError

This is the second predicate added to this function. It's a pure string-match with no dependencies — trivial to export and cover with three cases (404, decode, media-src). Worth locking in before it grows further.

nit — function name no longer fits

isCompositionAssetError reads as "error fetching a composition asset," which covered the 404 case. Decode errors are a different failure class (format incompatibility, not missing resource). isUserContentError or isAssetCompatibilityError would be more accurate. Rename-when-convenient.


Verdict: APPROVE
Reasoning: Fix is correct, targeted, and addresses a real P0 noise problem without touching any other error path. The observability gap is real but doesn't block this specific change — the dashboard will just narrow, not regress. Flag the filtered-event counter as a follow-up.

— Vai

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approved at 9fa6f2a7 — both Vai's items addressed cleanly

error_name integration ✓

function isCompositionAssetError(msg: string, name: string | null): boolean {
  if (msg.includes("Error fetching") && (msg.includes("404") || msg.includes("Not Found"))) return true;
  if (name === "EncodingError" || msg.includes("unsupported or unrecognizable format")) return true;
  if (msg.includes("MEDIA_ERR_SRC_NOT_SUPPORTED")) return true;
  return false;
}

Note on the OR-vs-AND interpretation: Vai's original suggestion was AND-ing (msg.includes(X) && name === "EncodingError" for stricter filtering), but Magi chose OR. That's actually broader-and-better:

  • EncodingError is a spec'd DOMException name — canonical across Chromium / Firefox / Safari.
  • The message text varies by browser (Chrome's "unsupported or unrecognizable format" wording isn't guaranteed elsewhere).
  • OR-ing on the name catches decode failures from non-Chrome browsers that use different message text but the same canonical error name.

So this is "tighter" in the sense Vai really wanted — relying on the canonical identifier instead of message-string archaeology — while ALSO being broader in cross-browser coverage. Better than the literal AND interpretation. ✓

Sampled telemetry counter ✓

let filteredAssetErrorCount = 0;
// ...
if (isCompositionAssetError(props.error_message, props.error_name)) {
  filteredAssetErrorCount++;
  if (filteredAssetErrorCount === 1 || filteredAssetErrorCount % 100 === 0) {
    trackStudioEvent("composition_asset_error_filtered", {
      error_message: props.error_message.slice(0, 200),
      error_name: props.error_name,
      total_filtered: filteredAssetErrorCount,
    });
  }
  return;
}

The exact pattern needed — fires on first occurrence (so future regressions are immediately visible) + every 100th (so the dashboard sees ongoing baseline without flooding). The total_filtered field lets the dashboard compute the filtered/sampled ratio. ✓

The 200-char message truncation is a good touch — keeps the telemetry payload bounded.

Summary

Clean iteration loop. Both Vai's items addressed with shapes that are arguably better than his literal recommendations (broader cross-browser coverage via canonical error name; sampled counter with first-occurrence priority). Ship-ready on merit.

Sequencing reminder from the thread: ship #1071 first (P0), then this PR rebases off the main.tsx change if needed. Or merge this one + rebase #1071 to drop its main.tsx duplicate. Either order works.

— Rames Jusso

@miguel-heygen miguel-heygen force-pushed the fix/studio-media-decode-error-filter branch from 9fa6f2a to b84b762 Compare May 25, 2026 16:48
…unter

Address review feedback:
- AND with error_name === "EncodingError" for tighter filtering
- Add sampled composition_asset_error_filtered tracking event (fires on
  1st occurrence, then every 100th) so filtered errors aren't completely
  invisible in telemetry
@miguel-heygen miguel-heygen force-pushed the fix/studio-media-decode-error-filter branch from b84b762 to 3866254 Compare May 25, 2026 16:54
@miguel-heygen miguel-heygen merged commit 5fe62fc into main May 25, 2026
32 checks passed
@miguel-heygen miguel-heygen deleted the fix/studio-media-decode-error-filter branch May 25, 2026 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants