fix(ci): parallelize nightly E2E across browsers to prevent timeout #1870
The regression run was consistently hitting the 60-minute job limit
because 60 tests (20 per browser × 3 browsers) were executing serially
on a single worker. With retries=2 and transfer tests timing out at
2 minutes each, the full suite needs ~3 hours to complete.
Fix: split into a matrix of 3 parallel jobs (chromium, firefox, webkit).
Each job runs only its 20 tests and completes well within the 90-minute
limit. A separate notify job downloads all three artifacts and sends a
single combined Slack message with per-browser results.
Also installs only the required browser per job
(playwright install --with-deps ${{ matrix.browser }}) instead of all
browsers, saving ~2 minutes of setup time per job.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reviews (1): Last reviewed commit: "fix(ci): parallelize nightly E2E across ..."
…ults
- Replace cramped fields 2-column layout with mrkdwn text block so all three browser summaries render with clear spacing
- Fix test name resolution: use spec.title (the actual test name) instead of test.title which is empty in Playwright's JSON output
- Build suite path from ancestor describe blocks using processSuite recursion so failures show full context (e.g. "auth › login › when email is valid")
- Strip ANSI escape codes from error messages before posting to Slack
- Show first non-empty error line per failure, truncated at 250 chars
- Add divider between summary and run/commit metadata blocks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
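The error-cleanup steps described in this commit (strip ANSI codes, take the first non-empty line, cap at 250 characters) could look roughly like the sketch below. The function name `summarizeError`, the regular expression, and the hard cut without an ellipsis are assumptions for illustration, not the code in the actual script.

```javascript
// Matches CSI escape sequences such as "\u001b[31m" (color) or "\u001b[2K" (erase line).
// Assumption: this covers the sequences that show up in Playwright error output.
const ANSI_PATTERN = /\u001b\[[0-9;]*[A-Za-z]/g;
const MAX_ERROR_LENGTH = 250;

// Hypothetical helper: reduce a raw error message to one short, plain-text line.
function summarizeError(message) {
  const clean = String(message ?? "").replace(ANSI_PATTERN, "");
  const firstLine = clean
    .split("\n")
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  if (!firstLine) return "No error message";
  // Hard cut at the limit; whether the real script appends an ellipsis is not known.
  return firstLine.slice(0, MAX_ERROR_LENGTH);
}

console.log(summarizeError("\u001b[31m\n  Error: timeout\u001b[39m\n    at foo"));
// prints "Error: timeout"
```

Keeping only the first non-empty line trades detail for readability: stack traces stay in the uploaded artifacts, and the Slack message stays scannable.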
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
.github/workflows/e2e-regression-tests.yml:212-217
When a browser artifact download fails (silently, due to `continue-on-error: true`) but the matrix job itself succeeded, `result` will be `null` for that browser. The loop only sets `overallFailed = true` when `result && result.failedTests > 0`, so a missing artifact from an otherwise-healthy matrix run would leave `overallFailed` as `false`. The Slack message would then show "✅ Passed" alongside a "No results" line — a misleading green notification.
```suggestion
for (const browser of browsers) {
  const filePath = path.join('test-results', browser, 'playwright-results.json');
  const result = parseResults(filePath);
  browserResults[browser] = result;
  if (!result) overallFailed = true;
  if (result && result.failedTests > 0) overallFailed = true;
}
```
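To make the failure mode concrete, here is a minimal, self-contained sketch of the aggregation with a missing artifact treated as a failure. The `aggregate` function and the injected `loadResult` callback are hypothetical stand-ins for the workflow's loop and its `parseResults(filePath)` helper, which is assumed to return `null` when the file is absent.

```javascript
// Hypothetical wrapper around the loop in the suggestion above.
// `loadResult(browser)` stands in for parseResults(filePath) and is assumed
// to return null when the browser's artifact could not be read.
function aggregate(browsers, loadResult) {
  const browserResults = {};
  let overallFailed = false;
  for (const browser of browsers) {
    const result = loadResult(browser);
    browserResults[browser] = result;
    // A missing artifact or any failed test marks the whole run as failed.
    if (!result || result.failedTests > 0) overallFailed = true;
  }
  return { browserResults, overallFailed };
}

// Example: webkit's artifact is missing even though the other two are clean.
const sample = { chromium: { failedTests: 0 }, firefox: { failedTests: 0 } };
const outcome = aggregate(["chromium", "firefox", "webkit"], (b) => sample[b] ?? null);
console.log(outcome.overallFailed); // prints true
```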
Reviews (2): Last reviewed commit: "fix(ci): improve Slack notification form..."
Move the inline Node.js script from e2e-regression-tests.yml into a standalone file at apps/wallets/quickstart-devkit/scripts/notify-slack.js, alongside the existing generate-test-report.js script. The workflow now calls it with a single `node` invocation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Done — extracted to |
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
apps/wallets/quickstart-devkit/scripts/notify-slack.js:78-82
When tests are stored directly on `suite.tests` (rather than in `suite.specs`), the failure record uses `suite: parentPath || ""` — which is the path *before* the current suite, so the current suite's own title is dropped from the failure context shown in Slack. For any test nested directly under a describe block, the suite field will be truncated or empty. Using `currentPath` here aligns with how specs are handled.
```suggestion
failures.push({
  title: test.title || suite.title || "Unknown Test",
  suite: currentPath,
  error: result?.error?.message || "No error message",
});
```
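The difference between `parentPath` and `currentPath` is easier to see in a full traversal. The sketch below is an illustration only: the function name `collectFailures`, the assumed report shape (`suites`, `specs`, `tests`, `results`), and the rule that a test counts as failed when its final attempt is `failed` or `timedOut` are assumptions, not the script's actual implementation.

```javascript
// Hypothetical recursive walk over a Playwright-style JSON report.
// Assumed shape: a suite has `title`, optional `suites`, optional `specs`
// (each with `title` and `tests`), and possibly tests directly on `tests`.
function collectFailures(suite, parentPath = "", failures = []) {
  // currentPath includes this suite's own title; parentPath does not.
  const currentPath = [parentPath, suite.title].filter(Boolean).join(" › ");

  const record = (title, test) => {
    const results = test.results ?? [];
    const last = results[results.length - 1];
    // Assumption: only the final attempt decides whether the test failed.
    if (last && (last.status === "failed" || last.status === "timedOut")) {
      failures.push({
        title,
        suite: currentPath,
        error: last.error?.message || "No error message",
      });
    }
  };

  for (const spec of suite.specs ?? []) {
    for (const test of spec.tests ?? []) record(spec.title || "Unknown Test", test);
  }
  for (const test of suite.tests ?? []) {
    record(test.title || suite.title || "Unknown Test", test);
  }
  for (const child of suite.suites ?? []) collectFailures(child, currentPath, failures);
  return failures;
}

const report = {
  title: "auth",
  suites: [{
    title: "login",
    specs: [{
      title: "when email is valid",
      tests: [{ results: [{ status: "failed", error: { message: "boom" } }] }],
    }],
  }],
};
console.log(collectFailures(report)[0].suite); // prints "auth › login"
```

Because both the spec branch and the direct-test branch use `currentPath`, a failure nested directly under a describe block keeps that block's title in its context.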
Reviews (3): Last reviewed commit: "refactor(ci): extract Slack notify scrip..."
🔥 Smoke Test Results
❌ Status: Failed
This is a non-blocking smoke test. Full regression tests run separately.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reviews (4): Last reviewed commit: "fix(ci): apply Biome formatting to notif..."
Problem
The nightly E2E job has been hitting the 60-minute job limit every day and getting cancelled before the suite finishes.
Root cause: 60 tests (20 per browser × 3 browsers) run serially on `workers: 1`. With `retries: 2` and transfer tests that individually time out at 2 minutes, the full suite needs ~3 hours, 3× the job limit.
Timeline from today's cancelled run (26626516698):
- `08:26:19` - "Running 60 tests using 1 worker"
- `08:26:55` - first test fails, triggers retry cycle
- `08:33:51` - solana transfer timeout (2 min) consumes retry budget
- `09:21:32` - job cancelled after exactly 60 min with ~17/60 tests completed
Fix
Split the single job into a matrix of 3 parallel browser jobs (chromium, firefox, webkit). Each job runs only its 20 tests and has a 90-minute limit, which it comfortably fits within.
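As a rough sanity check on these numbers, the pace of the cancelled run can be extrapolated to both the old serial suite and a single 20-test matrix job. This is a back-of-the-envelope sketch, not a measurement: it assumes the per-test pace from the timeline above (about 17 tests between `08:26:19` and `09:21:32`) holds for the rest of the suite, and it ignores setup time before the tests start.

```javascript
// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(hms) {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Figures taken from the cancelled run's timeline; "17" is approximate.
const elapsedSeconds = toSeconds("09:21:32") - toSeconds("08:26:19"); // 3313
const completedTests = 17;
const secondsPerTest = elapsedSeconds / completedTests;

// Estimated wall-clock minutes for `testCount` tests at the observed pace.
function estimateMinutes(testCount) {
  return (testCount * secondsPerTest) / 60;
}

console.log(estimateMinutes(60).toFixed(0)); // prints "195" (serial suite, roughly 3 hours)
console.log(estimateMinutes(20).toFixed(0)); // prints "65" (one browser's job)
```

At that pace, the serial suite lands near the ~3 hours quoted above, while a single browser's 20 tests stay under the 90-minute limit even if the same failure and retry pattern repeats.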
Other improvements
- Install only the required browser per job (`playwright install --with-deps ${{ matrix.browser }}`) instead of all three, which saves ~2 min of setup per job
- `fail-fast: false` so a webkit failure doesn't cancel the chromium and firefox jobs
- `notify` job aggregates results from all 3 artifacts into a single Slack message
Test plan
- Trigger via `workflow_dispatch` and confirm 3 parallel jobs appear