test: add Sentry coverage-equivalence harness (envelope parse/normalize/diff)#43820
MajorLift wants to merge 4 commits into
Conversation
Layer-2 verification for the Sentry v8→v10 migration (#42867): capture the full set of envelopes sent to a mocked DSN for a fixed flow, normalize away volatile ids/timestamps, and diff a v10 run against a v8 baseline to surface any added/removed item, dropped correlation tag, or volume change. Pure parse/normalize/diff utility is unit-tested; the capture spec generates the baseline on v8 and compares on v10. Tracked by #43819.
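As a rough illustration of the capture-and-parse step, the sketch below splits one newline-delimited envelope body into typed items. It is not the PR's `parseSentryEnvelopes`: the names `ParsedItem` and `parseEnvelopeSketch` are hypothetical, and it assumes every item payload is single-line JSON (real envelopes can carry multi-line or binary payloads sized by a `length` item header, which this ignores).

```typescript
// Hypothetical item shape; the PR's actual type may differ.
interface ParsedItem {
  type: string;
  payload: unknown;
}

// Assumption: line 0 is the envelope header, followed by
// (item header, item payload) pairs, each on a single JSON line.
function parseEnvelopeSketch(body: string): ParsedItem[] {
  const lines = body.split('\n').filter((line) => line.trim() !== '');
  const items: ParsedItem[] = [];
  for (let i = 1; i + 1 < lines.length; i += 2) {
    const header = JSON.parse(lines[i]) as { type?: string };
    items.push({
      type: header.type ?? 'unknown',
      payload: JSON.parse(lines[i + 1]),
    });
  }
  return items;
}

// Illustrative fixture, not taken from the PR.
const sampleEnvelope = [
  JSON.stringify({ event_id: 'abc', sent_at: '2024-01-01T00:00:00.000Z' }),
  JSON.stringify({ type: 'event' }),
  JSON.stringify({ message: 'boom' }),
  JSON.stringify({ type: 'session' }),
  JSON.stringify({ sid: '1', status: 'ok' }),
].join('\n');

const parsedTypes = parseEnvelopeSketch(sampleEnvelope).map((item) => item.type);
```

Keeping the parser free of Sentry imports, as the description says the real helper is, means it can be exercised with plain string fixtures like the one above.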
CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.
Builds ready [9cd096f] [reused from c4b43e9]
⚡ Performance Benchmarks (Total: 🟢 14 pass · 🟡 8 warn · 🔴 3 fail)
    const removed = [...baseTags].filter((tag) => !currTags.has(tag)).sort();
    if (added.length > 0 || removed.length > 0) {
      tagChanges.push({ signature, added, removed });
    }
Tag diff ignores duplicate signatures
Medium Severity
In `diffCoverage`, tag coverage for a signature is taken only from the first grouped `NormalizedItem` (`[0]`), not from every item sharing that signature. When per-type counts match, `equivalent` can stay true even if another duplicate item (e.g. a repeated pageload transaction) lost a correlation tag such as `otelTraceId`.
Reviewed by Cursor Bugbot for commit 9cd096f.
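One possible way to address the duplicate-signature concern is to count tag keys across every item in a group rather than reading only the first one. The sketch below is an assumption about how that could look, not the PR's fix; `ItemLike`, `countTagKeys`, and `diffGroupTags` are hypothetical names.

```typescript
// Hypothetical minimal item shape: a signature plus the tag keys it carries.
interface ItemLike {
  signature: string;
  tagKeys: string[];
}

// Count how many items in the group carry each tag key.
function countTagKeys(items: ItemLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const item of items) {
    for (const key of item.tagKeys) {
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}

// A key is "removed" if fewer items carry it than in the baseline group,
// and "added" if more items carry it.
function diffGroupTags(
  base: ItemLike[],
  curr: ItemLike[],
): { added: string[]; removed: string[] } {
  const baseCounts = countTagKeys(base);
  const currCounts = countTagKeys(curr);
  const keys = Array.from(
    new Set(Array.from(baseCounts.keys()).concat(Array.from(currCounts.keys()))),
  ).sort();
  const added: string[] = [];
  const removed: string[] = [];
  for (const key of keys) {
    const delta = (currCounts.get(key) ?? 0) - (baseCounts.get(key) ?? 0);
    if (delta > 0) added.push(key);
    if (delta < 0) removed.push(key);
  }
  return { added, removed };
}

// Two pageload transactions share a signature; the second loses otelTraceId.
const baseGroup: ItemLike[] = [
  { signature: 'transaction:pageload', tagKeys: ['environment', 'otelTraceId'] },
  { signature: 'transaction:pageload', tagKeys: ['environment', 'otelTraceId'] },
];
const currGroup: ItemLike[] = [
  { signature: 'transaction:pageload', tagKeys: ['environment', 'otelTraceId'] },
  { signature: 'transaction:pageload', tagKeys: ['environment'] },
];
const groupTagDiff = diffGroupTags(baseGroup, currGroup);
```

In this fixture a first-item-only comparison sees identical tag sets, while the count-based comparison reports `otelTraceId` as removed.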
`lint:tsc` flagged the `getSeenRequests()` map param as implicit-any (TS7006) — annotate it `CompletedRequest`. And the spec is a manual cross-build harness with no committed baseline, so it threw (failing normal CI); skip it by default unless a baseline exists or `UPDATE_SENTRY_COVERAGE_BASELINE=true`.
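The skip-by-default rule described above reduces to a small decision function. The following is a sketch under stated assumptions: `HarnessMode` and `resolveHarnessMode` are hypothetical names, and the caller is assumed to pass in the environment and whether a baseline file exists (for example from `process.env` and a file-existence check).

```typescript
type HarnessMode = 'write-baseline' | 'compare' | 'skip';

// The env flag wins; otherwise compare only when a baseline is present,
// and skip instead of throwing when it is not.
function resolveHarnessMode(
  env: Record<string, string | undefined>,
  baselineExists: boolean,
): HarnessMode {
  if (env['UPDATE_SENTRY_COVERAGE_BASELINE'] === 'true') {
    return 'write-baseline';
  }
  return baselineExists ? 'compare' : 'skip';
}

const defaultCiMode = resolveHarnessMode({}, false);
```

Returning `'skip'` for the no-flag, no-baseline case is what keeps a normal CI run from failing on the missing baseline.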
Builds ready [f7b9970] [reused from c4b43e9]
⚡ Performance Benchmarks (Total: 🟢 14 pass · 🟡 8 warn · 🔴 3 fail)
Builds ready [67f2d57] [reused from a0fd448]
⚡ Performance Benchmarks (Total: 🟢 15 pass · 🟡 7 warn · 🔴 3 fail)
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 67f2d57.
        parseSentryEnvelopes(bodies).map((item) => item.type),
      );
      return REQUIRED_TYPES.every((type) => types.has(type));
    }, MAX_WAIT_FOR_ENVELOPES_MS)
Incomplete capture after wait fails
Medium Severity
The envelope wait uses `.catch(() => undefined)`, so a timeout or other wait failure is ignored and the test still snapshots and diffs (or writes a baseline). There is no check that `session`, `event`, and `transaction` were actually seen, so incomplete captures can pass comparison or produce a wrong v8 baseline.
Reviewed by Cursor Bugbot for commit 67f2d57.
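A guard along the following lines could run after the wait, whether or not the wait itself timed out, so an incomplete capture fails loudly before anything is diffed or written. This is a sketch, not the PR's code: `assertCaptureComplete` is a hypothetical helper, and the `REQUIRED_TYPES` values are taken from the review comment's wording.

```typescript
// Values assumed from the review comment: session, event, transaction.
const REQUIRED_TYPES = ['session', 'event', 'transaction'] as const;

// Throws if any required envelope item type was never captured.
function assertCaptureComplete(seenTypes: string[]): void {
  const seen = new Set(seenTypes);
  const missing = REQUIRED_TYPES.filter((type) => !seen.has(type));
  if (missing.length > 0) {
    throw new Error(
      `Incomplete Sentry capture; missing item types: ${missing.join(', ')}`,
    );
  }
}

// Example: a capture that never saw a transaction is rejected.
let captureErrorMessage: string | null = null;
try {
  assertCaptureComplete(['session', 'event']);
} catch (error) {
  captureErrorMessage = (error as Error).message;
}
```

Calling the guard before both the baseline write and the comparison would cover the two failure modes named in the comment: a false pass on v10 and a wrong v8 baseline.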
Builds ready [1349304] [reused from a0fd448]
⚡ Performance Benchmarks (Total: 🟢 15 pass · 🟡 7 warn · 🔴 3 fail)
Description
Layer-2 verification tooling for the Sentry v8→v10 migration (#42867), implementing the harness spec'd in #43819.
Captures the full set of envelopes the extension sends to a mocked Sentry DSN during a fixed flow, normalizes away volatile ids/timestamps, and diffs a v10 run against a v8 baseline, surfacing any added/removed envelope item, dropped correlation tag (e.g. `otelTraceId`), or volume change that a green suite would miss. The point is to verify behavioral equivalence of telemetry, not just that tests pass.
What's here:
- `test/e2e/helpers/sentry-coverage.ts`: pure parse/normalize/diff utility, comprising a Sentry newline-delimited envelope parser, volatile-key stripping, stable per-item signatures, and a structural diff with per-type counts. No runtime/Sentry imports.
- `test/e2e/helpers/sentry-coverage.test.ts`: 10 unit tests over realistic error/transaction/session envelope fixtures, covering parsing, normalization, and diff-detection of removed items, dropped tags, and volume increases.
- `test/e2e/tests/metrics/sentry-coverage.spec.ts`: the e2e capture spec. A manual cross-build harness (not an always-on check): it skips by default and runs only with `UPDATE_SENTRY_COVERAGE_BASELINE=true` (write the baseline on v8) or when a baseline is present (compare on v10).
How to use (the v8-vs-v10 protocol): on `main` (v8), run the spec with `UPDATE_SENTRY_COVERAGE_BASELINE=true` to write `state-snapshots/sentry-coverage-baseline.json`; then on the v10 branch (#42867), run it without the env var to diff against the baseline and fail on any structural delta, with each delta triaged as benign (timing) or a regression. The baseline JSON is intentionally not committed; it must be generated from a v8 build.
Related issues
Closes #43819
Related:
`@sentry/browser` from `8.33.1` to `10.38.0` #42867: the Sentry v8→v10 upgrade this verifies.
Manual testing steps
`yarn jest test/e2e/helpers/sentry-coverage.test.ts` → 10 passing (the pure parse/normalize/diff utility).
Screenshots/Recordings
N/A — E2E verification tooling only; no user-facing change.
Before
No way to verify the v8→v10 migration preserves equivalent telemetry beyond the existing per-field snapshot assertions.
After
A reusable harness diffs the full v8 vs v10 envelope set (items, tags, volume), with the pure core unit-tested.
Pre-merge author checklist
Pre-merge reviewer checklist
Note
Low Risk
Changes are confined to E2E test helpers and a skipped-by-default spec; no production telemetry or runtime behavior is modified.
Overview
Adds test-only tooling to compare Sentry telemetry across SDK builds (e.g. v8 vs v10) by diffing mocked envelope POSTs from a fixed E2E flow, not by asserting individual fields in existing tests.
New `sentry-coverage` helpers parse newline-delimited envelope bodies, strip volatile ids/timestamps, build stable per-item signatures, and `diffCoverage` reports added/removed items, tag-key changes (e.g. `otelTraceId`), and per-type volume deltas. Unit tests cover parsing, normalization, and diff scenarios.
The `sentry-coverage.spec.ts` harness unlocks and triggers a developer-options test error, captures all Sentry POSTs via a high-priority mockttp handler, waits for `session`/`event`/`transaction`, then either writes a baseline (`UPDATE_SENTRY_COVERAGE_BASELINE=true`) or asserts equivalence against a local baseline file. The spec skips in CI when no baseline exists; the baseline JSON is not committed.
Reviewed by Cursor Bugbot for commit 1349304. Bugbot is set up for automated code reviews on this repo.
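To make the normalize-and-diff idea in the Overview concrete, here is a minimal sketch of volatile-key stripping, a stable per-item signature, and per-type volume deltas. It is not the PR's implementation: the `VOLATILE_KEYS` list is an assumed example set, and `stripVolatile`, `signatureOf`, and `countDeltas` are hypothetical names.

```typescript
// Assumed example set of volatile keys; the PR's actual list may differ.
const VOLATILE_KEYS = new Set([
  'event_id',
  'timestamp',
  'start_timestamp',
  'trace_id',
  'span_id',
  'sent_at',
]);

// Recursively drop volatile keys and rebuild objects with sorted keys so
// that serialization does not depend on original key order.
function stripVolatile(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(stripVolatile);
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      if (!VOLATILE_KEYS.has(key)) {
        out[key] = stripVolatile(source[key]);
      }
    }
    return out;
  }
  return value;
}

// Signature = item type plus the normalized payload.
function signatureOf(type: string, payload: unknown): string {
  return `${type}:${JSON.stringify(stripVolatile(payload))}`;
}

// Per-type volume delta: current count minus baseline count.
function countDeltas(base: string[], curr: string[]): Record<string, number> {
  const deltas: Record<string, number> = {};
  for (const type of base) deltas[type] = (deltas[type] ?? 0) - 1;
  for (const type of curr) deltas[type] = (deltas[type] ?? 0) + 1;
  return deltas;
}

// Two events that differ only in volatile fields share a signature.
const signatureA = signatureOf('event', {
  event_id: 'a1',
  timestamp: 1,
  message: 'boom',
  tags: { otelTraceId: 'x' },
});
const signatureB = signatureOf('event', {
  tags: { otelTraceId: 'x' },
  message: 'boom',
  timestamp: 2,
  event_id: 'b2',
});
const volumeDeltas = countDeltas(
  ['event', 'transaction'],
  ['event', 'transaction', 'transaction'],
);
```

Note that this sketch keeps tag values in the signature; a harness that only cares about tag keys would need to normalize values as well.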