fix(studio): filter media decode errors from crash telemetry#1070
Conversation
jrusso1020
left a comment
There was a problem hiding this comment.
Verdict: COMMENT (would-be APPROVE)
Tight 6-line filter broadening — isCompositionAssetError now also matches decodeAudioData's "unsupported or unrecognizable format" DOMException and MEDIA_ERR_SRC_NOT_SUPPORTED. Both are unambiguous user-content failure messages, not application bugs. Holding for James's stamp greenlight.
Substring matches are correctly scoped
The two new conditions use distinctive strings that aren't likely to false-positive on real application errors:
"unsupported or unrecognizable format"— the literal text in Chrome'sDOMExceptionthrown bydecodeAudioData()when given a malformed audio buffer. Specific to that API + that failure mode."MEDIA_ERR_SRC_NOT_SUPPORTED"— the standard DOM constant forHTMLMediaElementerror events when the source format isn't supported. Doesn't appear in typical application error message bodies unless someone is explicitly checking against it.
Combined with the existing "Error fetching... 404" gate, the filter correctly captures the user-content-error class while leaving real application bugs (TypeError, ReferenceError, hyperframes-runtime errors, etc.) untouched. ✓
Overlap with #1071
The exact same main.tsx isCompositionAssetError change is also in PR #1071 (the cross-origin iframe fix). They were created within 18 seconds of each other and both modify the same lines. Whichever lands first, the other should drop the main.tsx change to avoid merge conflict.
Given #1071 is a P0 (1,885 crashes / 648 users) and this PR is a noise reduction (567K events / 1,017 users — high volume but no user-visible crash), I'd recommend landing #1071 first and closing this PR as superseded — OR landing this one first (smaller, lower-risk) and rebasing #1071 to drop the main.tsx change.
Either order works; just needs explicit coordination so we don't merge-conflict twice.
Non-blocking observations
-
Firefox / Safari error-message variance: Chromium's
decodeAudioDatathrows the literal string this filter matches. Other browsers may use different wording. Worth a brief check on whether the studio is Chrome-only or supports Firefox/Safari in dev. If multi-browser, a slightly broader substring like"decod"+ media-related context, or matching byerr.name === "EncodingError"directly, would future-proof. Low priority — Studio currently runs in Chrome. -
fallow-ignore-next-line complexityadded for the same reason as elsewhere — the if-chain raises cyclomatic complexity slightly above the threshold. Acceptable. -
Test plan unchecked — 3 items, none
[x]. Worth flagging but the change is small enough that manual verification post-deploy is reasonable.
Summary
Ship-safe filter broadening. Coordinate the merge order with #1071 to avoid the main.tsx conflict.
— Rames Jusso
vanceingalls
left a comment
There was a problem hiding this comment.
Solid P0 fix. Minimal diff, correct blast radius, and the filter shape is exactly right — early-return in unhandledrejection mirrors the existing ResizeObserver guard in error above it. The three predicates match the actual strings browsers produce; no issues with the fallow complexity suppress (passes CI, directive is load-bearing).
Strengths
packages/studio/src/main.tsx:24–30— Correct placement: early-return beforerejectionCount++means the error cap is also unaffected. The 404 path continues to work as before.- Predicate hygiene:
MEDIA_ERR_SRC_NOT_SUPPORTEDis theMediaError.MEDIA_ERR_SRC_NOT_SUPPORTEDcode string surfaced via HTMLMediaElement;"unsupported or unrecognizable format"is theDOMExceptionmessage fromdecodeAudioData. Both are browser-emitted, not app-generated, so false-positive risk is low — but see below.
important — no observability on filtered events (packages/studio/src/main.tsx:28–29)
The filter silently drops matching events. The PostHog dashboard linked in the PR description tracks unhandled_promise_rejection — after merge, that signal goes fully dark for these error classes. If a future regression causes our own code to throw a decode error against valid assets (say, a bad transcoding pipeline upstream), we'd see nothing.
The file already has the right pattern: error_cap_reached / rejection_cap_reached counters. Recommend emitting a low-priority event on the filtered path:
if (isCompositionAssetError(props.error_message)) {
trackStudioEvent("composition_asset_error_filtered", { error_message: props.error_message });
return;
}This doesn't have to be uncapped — it could be a 1-in-10 sample or a separate cap — but having zero signal means a future regression in asset delivery is invisible. P0 urgency justifies shipping without this, but it should land as a follow-up before this goes stale.
important — predicate is message-string-only; error_name is available (packages/studio/src/main.tsx:28)
errorProps already extracts error_name (the DOMException .name field, which is "EncodingError" for a decodeAudioData failure) but isCompositionAssetError receives only the message string and ignores it. A tighter predicate — AND-ing the message substring with the expected error name — would reduce the theoretical over-filter surface. Example:
// current callsite in unhandledrejection handler passes only props.error_message
// tighter: pass props and check error_name too
if (msg.includes("unsupported or unrecognizable format") && errorName === "EncodingError") return true;Not a blocker at 567K events/week — the message strings are browser-canonical and distinct enough — but worth noting for the next touch of this function.
nit — no unit test for isCompositionAssetError
This is the second predicate added to this function. It's a pure string-match with no dependencies — trivial to export and cover with three cases (404, decode, media-src). Worth locking in before it grows further.
nit — function name no longer fits
isCompositionAssetError reads as "error fetching a composition asset," which covered the 404 case. Decode errors are a different failure class (format incompatibility, not missing resource). isUserContentError or isAssetCompatibilityError would be more accurate. Rename-when-convenient.
Verdict: APPROVE
Reasoning: Fix is correct, targeted, and addresses a real P0 noise problem without touching any other error path. The observability gap is real but doesn't block this specific change — the dashboard will just narrow, not regress. Flag the filtered-event counter as a follow-up.
— Vai
jrusso1020
left a comment
There was a problem hiding this comment.
Re-approved at 9fa6f2a7 — both Vai's items addressed cleanly
error_name integration ✓
function isCompositionAssetError(msg: string, name: string | null): boolean {
if (msg.includes("Error fetching") && (msg.includes("404") || msg.includes("Not Found"))) return true;
if (name === "EncodingError" || msg.includes("unsupported or unrecognizable format")) return true;
if (msg.includes("MEDIA_ERR_SRC_NOT_SUPPORTED")) return true;
return false;
}Note on the OR-vs-AND interpretation: Vai's original suggestion was AND-ing (msg.includes(X) && name === "EncodingError" for stricter filtering), but Magi chose OR. That's actually broader-and-better:
EncodingErroris a spec'dDOMExceptionname — canonical across Chromium / Firefox / Safari.- The message text varies by browser (Chrome's "unsupported or unrecognizable format" wording isn't guaranteed elsewhere).
- OR-ing on the name catches decode failures from non-Chrome browsers that use different message text but the same canonical error name.
So this is "tighter" in the sense Vai really wanted — relying on the canonical identifier instead of message-string archaeology — while ALSO being broader in cross-browser coverage. Better than the literal AND interpretation. ✓
Sampled telemetry counter ✓
let filteredAssetErrorCount = 0;
// ...
if (isCompositionAssetError(props.error_message, props.error_name)) {
filteredAssetErrorCount++;
if (filteredAssetErrorCount === 1 || filteredAssetErrorCount % 100 === 0) {
trackStudioEvent("composition_asset_error_filtered", {
error_message: props.error_message.slice(0, 200),
error_name: props.error_name,
total_filtered: filteredAssetErrorCount,
});
}
return;
}The exact pattern needed — fires on first occurrence (so future regressions are immediately visible) + every 100th (so the dashboard sees ongoing baseline without flooding). The total_filtered field lets the dashboard compute the filtered/sampled ratio. ✓
The 200-char message truncation is a good touch — keeps the telemetry payload bounded.
Summary
Clean iteration loop. Both Vai's items addressed with shapes that are arguably better than his literal recommendations (broader cross-browser coverage via canonical error name; sampled counter with first-occurrence priority). Ship-ready on merit.
Sequencing reminder from the thread: ship #1071 first (P0), then this PR rebases off the main.tsx change if needed. Or merge this one + rebase #1071 to drop its main.tsx duplicate. Either order works.
— Rames Jusso
9fa6f2a to
b84b762
Compare
…unter Address review feedback: - AND with error_name === "EncodingError" for tighter filtering - Add sampled composition_asset_error_filtered tracking event (fires on 1st occurrence, then every 100th) so filtered errors aren't completely invisible in telemetry
b84b762 to
3866254
Compare
Summary
isCompositionAssetError()inmain.tsxto filter media decode failures from crash telemetrydecodeAudioDataDOMException) andMEDIA_ERR_SRC_NOT_SUPPORTEDare user-content failures (unsupported codec), not application bugserror_name === "EncodingError"for tighter cross-browser filtering (canonical DOMException name)composition_asset_error_filteredtracking event (1st + every 100th) so regressions remain visibleDashboard: https://us.posthog.com/project/356858/dashboard/1613945
Changes
packages/studio/src/main.tsx:isCompositionAssetErrornow acceptserror_nameand matches onEncodingError+ message strings. Sampled counter fires on 1st occurrence and every 100th.Test plan
unhandled_promise_rejectiontelemetry fires, butcomposition_asset_error_filtereddoes fire once