Skip to content

ci: dedupe quality gates and add fast-fail pre-check#345

Merged
deanq merged 10 commits into
mainfrom
deanq/ae-3161-optimize-ci-cd
Jun 4, 2026
Merged

ci: dedupe quality gates and add fast-fail pre-check#345
deanq merged 10 commits into
mainfrom
deanq/ae-3161-optimize-ci-cd

Conversation

@deanq
Copy link
Copy Markdown
Member

@deanq deanq commented May 28, 2026

Summary

Eliminates redundant CI/CD work across the full release lifecycle while keeping defense-in-depth for actual code changes. Linear: AE-3161.

Adopts the single-required-check aggregator pattern from runpod-python (ci.yml:105) plus a content-classifying guard job that decides each run's mode by inspecting the actual diff -- not just branch names.

Lifecycle redundancy map (after this PR)

Stage Trigger Work Notes
1. PR commit pull_request push guard → pre-check → quality-gates(4) → build(conditional) → validation Required: build only runs when packaging-relevant files change
2. Feature PR → main push: main (non-metadata diff) Full matrix on merge commit Defense-in-depth, kept
3. Bot updates its PR pull_request on release-please--* guard(bypass) + validation only, ~7s Was ~28 min
4. Release commit on main push: main (metadata-only diff) guard(bypass) + validation only, ~7s Was ~7 min

What's redundant today (before this PR)

  1. Both ci.yml and release-please.yml run the full 4-version Quality Gates matrix on push: main -- ~24 min of duplicated compute per merge. The release workflow runs after CI on the same SHA.
  2. release-please bot PRs trigger the full CI matrix on every bot push (~28 min) even though they only touch CHANGELOG.md + .release-please-manifest.json.
  3. The release-please release commit on main triggers the full matrix again (~7 min) even though it changes nothing but metadata.
  4. e2e.yml has its own unit-tests job that re-implements ci.yml's quality-gates with a drifting setup-uv version.
  5. make ci-quality-github writes both pytest passes to the same --junitxml=pytest-results.xml -- the second pass silently overwrites the first report.
  6. Makefile:dev runs uv sync --all-groups then uv pip install -e . -- the editable install is redundant in workspace mode.
  7. quality-check and ci-quality-github are two Makefile targets running the same checks.
  8. No concurrency group -- pushing a fixup commit to a PR leaves the previous run racing.
  9. No fast-fail pre-check -- a ruff format failure wastes the full make dev install across 4 matrix legs before bailing.

Changes

.github/workflows/ci.yml

Added: guard job (content-based classifier) -- single source of truth for what mode each run is in. Inspects the actual diff and outputs:

  • mode: release-please-bypass (branch prefix matches AND diff is exactly CHANGELOG + manifest) or full
  • should_build: true if packaging-relevant files (pyproject.toml, Makefile, MANIFEST.in, scripts/validate-wheel.sh) changed, or new non-Python file added under src/

For push: main, the same classification applies: a release-please release commit (metadata-only) classifies as bypass; anything else as full.

Added: validation job (aggregator) -- runs unconditionally (if: always()), inspects needs.guard.result, needs.pre-check.result, needs.quality-gates.result, needs.build.result. Passes when every upstream is success or skipped, fails on anything else. This is the single required check -- adding/removing future jobs or matrix legs no longer requires branch-protection updates.

Other:

  • Concurrency group: cancel in-flight PR runs on new push
  • pre-check job: uv sync --only-group dev --frozen + uv run ruff (pins to lockfile, no uvx version drift)
  • quality-gates matrix: 3.10 / 3.11 / 3.12 / 3.13
  • build job: gated by guard.outputs.should_build
  • Bump setup-uv@v2 → v5
  • Upload artifact: pytest-results-*.xml (now two files: parallel + serial)

.github/workflows/release-please.yml

  • Remove quality-gates job entirely. Branch protection already enforces CI green before code lands. Re-running on the same SHA validates nothing new.
  • release-please becomes the entry point; pypi-publish runs if a release was created.
  • Bump setup-uv@v2 → v5.

.github/workflows/e2e.yml

  • Drop unit-tests job (duplicated ci.yml quality-gates).
  • Fold summary step into the e2e job; remove the separate summary job and its artifact round-trip.
  • Add a leading comment explaining why unit-tests was dropped, so future maintainers don't re-add it.

Makefile

  • dev: drop redundant uv pip install -e . (uv sync handles editable install in workspace mode). Verified locally: rm -rf .venv && make dev && .venv/bin/flash --version prints correctly.
  • quality-check: alias ci-quality-github (single source of truth).
  • ci-quality-github: write distinct pytest-results-parallel.xml and pytest-results-serial.xml (fixes silent junit overwrite).

Why the pytest double-pass stayed

Initial draft collapsed the two pytest invocations into a single -n auto --dist=loadgroup run, relying on conftest.py's xdist_group("serial") hook for isolation. Local make quality-check exposed test pollution: test_call_routes_to_execute_for_live_endpoint failed when sharing a worker with non-serial tests under the single-pass arrangement, but passes in isolation and in the original two-pass layout. The double pass provides real process isolation between parallel and serial tests; the silent junit overwrite is fixed instead.

Expected impact

Scenario Before After
Per PR push (typical) ~7 min × 4 matrix + 1 min build = ~29 min compute ~7 min × 1 leg parallel + 15s pre-check; ~75% less compute (more if pre-check fails fast)
Per merge to main ~7 min × 4 (CI) + ~7 min × 4 (release-please) + 1 min build = ~57 min ~7 min × 4 + 1 min build = ~29 min
Per release-please bot PR push ~7 min × 4 = ~28 min ~7s (guard + validation)
Per release commit on main ~7 min × 4 = ~28 min ~7s (guard + validation) -- pypi-publish builds its own wheel

⚠️ Required before merge -- branch protection swap

Branch protection on main needs to be reconfigured:

  • Remove required checks: CI / Quality Gates (3.10), (3.11), (3.12), (3.13)
  • Add required check: CI / Validation

The Validation job aggregates all upstream results, so this becomes a stable required-check name regardless of future matrix or job changes.

Test plan

  • make quality-check passes locally with the new Makefile layout
  • make dev produces a working install with flash entry point registered (no uv pip install -e .)
  • make ci-quality-github writes both pytest-results-parallel.xml and pytest-results-serial.xml
  • pre-check uses lockfile-pinned ruff (0.14.14), matching quality-gates
  • guard classification works on PR diffs (regular PR → full, spoofed release-please branch with code changes → full, real release-please metadata-only → bypass)
  • validation aggregator visible in CI runs as the single green check
  • Branch protection updated by admin: remove four Quality Gates (X) checks, add Validation
  • After merge, confirm release-please release commit on main triggers only guard + validation (~7s)
  • Next release-please bot PR push triggers only guard + validation (~7s)

Copy link
Copy Markdown
Contributor

@runpod-Henrik runpod-Henrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Henrik's AI-Powered Bug Finder — PR #345 Review

Verdict: PASS WITH NITS (with one repo-admin action required before merge)

CI-only refactor. Local SDK behavior is unchanged. The optimization claims hold up (75% compute reduction for the typical PR-fixup loop; full matrix dedupe on merge-to-main). CI on this PR proves the new pipeline works: pre-check 10s, all four quality-gates legs 4-4:30 min, build skipped per the new if: guard.

The PR description already flags the branch-protection compatibility issue, which is the real release-blocker.


1. Issue: branch-protection compatibility on release-please bot PRs

The PR description calls this out explicitly. Recapping the QA risk in user terms:

  • After this PR, a release-please bot PR that touches only CHANGELOG.md + .release-please-manifest.json causes ci.yml to be skipped via paths-ignore — workflow doesn't run at all. GitHub branch protection treats paths-ignore skips as "check satisfied," so the bot PR can merge cleanly.
  • But if branch protection on main requires a check named CI / Quality Gates (4 statuses, one per matrix leg), and the workflow never runs, those status checks are simply absent. Whether branch protection blocks "missing required check" or accepts "not run because path-ignored" depends on a setting on the rule ("Require status checks to pass before merging" with vs. without "Strict — Require branches to be up to date"). Repo admin needs to verify this on main before merging.

If the rule is strict and missing checks block, the bot PR will be permanently un-mergeable.

A safer path is the PR description's option 2: add a no-op success job named to match the required check (e.g., name: Quality Gates) that fires only on release-please--* branches.

This is not a code defect — it's a deployment-coordination requirement. Flagging it as a hard gate so it doesn't get missed.


2. Issue: PR-time wheel build is dropped

build job now runs only on push: main (`if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}`). That means make build + validate-wheel.sh no longer run on PR.

User scenario: a contributor changes pyproject.toml (e.g., adjusts [tool.setuptools.package-data], adds a non-Python file that needs include_package_data). The PR's CI passes (quality-gates is pure pytest), the wheel-validation regression lands on main, and the next PR's build job — or worse, release-please's PyPI publish — is the first signal.

The PR's rationale: "a PR-time wheel build would add ~1 min for no extra signal beyond quality-gates." That's mostly true — but validate-wheel.sh is wheel-packaging-specific signal that quality-gates doesn't catch. Examples that would slip past:

  • Stale package-data glob after a directory rename.
  • runpod_flash/rules/AGENTS.md (just added in 1.17.0) accidentally excluded from the wheel.
  • A MANIFEST.in regression.

Suggest gating the build job on the actual files that affect packaging:

```yaml
if: ${{ (github.event_name == 'push' && github.ref == 'refs/heads/main') ||
(github.event_name == 'pull_request' && contains(github.event.pull_request.changed_files, 'pyproject.toml')) }}
```

Or even simpler: keep build on PRs, but only when pyproject.toml, Makefile, or scripts/validate-wheel.sh are touched. Cheap signal preserved.


3. Question: redundant skip-guards for release-please bot PRs

ci.yml skips release-please bot PRs in two places:

  1. Workflow-level paths-ignore: [CHANGELOG.md, .release-please-manifest.json] — workflow doesn't run if all changed files match.
  2. Job-level if: ${{ !startsWith(github.head_ref, 'release-please--') }} on pre-check.

These overlap but aren't equivalent:

  • If the bot PR ever touches a non-ignored file (e.g., the bot decides to update a workflow file), paths-ignore no longer applies → workflow runs.
  • Then pre-check skips via the if:, and quality-gates needs: [pre-check] will also skip per default needs semantics → no quality signal on a release-please PR that's broader than CHANGELOG.

Is the intent "always skip on release-please branches even if it touches code," or "skip only when the change is metadata-only"? If the former, this is consistent. If the latter, drop the if: on pre-check and rely on paths-ignore alone.

Confirming the intent now is cheap; debugging which guard skipped which check after the fact is not.


4. Question: does make dev (without uv pip install -e .) make the flash CLI entry point importable?

User scenario: a contributor clones the repo, runs make dev, then flash --version.

Before: uv sync --all-groups + uv pip install -e . — the editable install registered the flash = runpod_flash.cli.main:app entry point in the venv.

After: uv sync --all-groups alone.

Whether uv sync registers the entry point for the project depends on whether the project has [tool.uv.sources] / workspace config that includes itself. CI passes because the matrix runs make ci-quality-github which uses uv run pytest … — that activates the project context regardless.

But a contributor running flash outside uv run (i.e., just flash --version from an activated .venv) might find it missing. This is a developer-onboarding hazard, not a production user issue. Worth a sanity check: from a clean clone, make dev followed by .venv/bin/flash --version — does it print 1.17.0, or "command not found"?

If "command not found," restore the uv pip install -e . and call it explicitly redundant-but-required for entry-point registration.


5. Nit: local make quality-check UX now noisier

quality-check aliases ci-quality-github, which means local devs running it see GitHub Actions annotation markers (::group::, ::endgroup::) sprinkled in their output. Functionally fine — the markers are inert outside Actions — but it's a minor UX regression for the most common local quality-check invocation.

If keeping single-source-of-truth is the priority (it should be), at least add a brief note in the help text that markers in the output are intentional. Or have quality-check set an env var that conditionally suppresses the @echo "::group::..." lines.


Nits

  • Makefile:75 test-coverage comment update mentions "serial pass for state isolation" — good clarification, matches the PR description's rationale for keeping the two-pass.
  • e2e.yml loses the unit-tests job entirely. That's fine because the workflow is workflow_dispatch only (manually triggered E2E runs), and the manual user always runs ci.yml first via PR. But worth a one-line comment in the workflow file saying so, otherwise a future maintainer might add it back assuming it was dropped by accident.
  • pre-check runs uvx ruff … which downloads ruff into a temporary venv every time. Cached via setup-uv's cache-dependency-glob: pyproject.toml, so warm runs are fast — but on a first-of-day cold cache, ~30s vs ~10s. Acceptable tradeoff.
  • Trailing newline restored in release-please.yml final line. Good.

🤖 Reviewed by Henrik's AI-Powered Bug Finder

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Removes redundant CI work that was costing ~24 min of compute per merge and ~28 min per release-please bot PR. Centralizes the quality gate as a single canonical Makefile target, fixes a silent junit overwrite, and adds a fast-fail format/lint pre-check plus PR-cancel concurrency to ci.yml. Drops the duplicated 4-version matrix from release-please.yml and the duplicated unit-tests job (plus its artifact round-trip summary) from e2e.yml.

Changes:

  • Makefile: drop redundant editable install in dev; alias quality-check to ci-quality-github; split pytest passes into distinct pytest-results-parallel.xml / pytest-results-serial.xml.
  • ci.yml: add concurrency group, paths-ignore for release-please bot files, fast-fail pre-check job, gate build to push: main, bump setup-uv@v2 → v5, upload pytest-results-*.xml.
  • release-please.yml / e2e.yml: remove duplicated quality-gates matrix and unit-tests/summary jobs; bump setup-uv@v2 → v5; fold summary into the e2e job.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
Makefile Simplifies dev, makes ci-quality-github the canonical gate, fixes junit filename collision.
.github/workflows/ci.yml Adds concurrency, paths-ignore, pre-check job; conditions build on push:main; v5 setup-uv.
.github/workflows/release-please.yml Removes duplicate quality-gates matrix; release-please is the entry point; v5 setup-uv.
.github/workflows/e2e.yml Drops duplicate unit-tests and summary jobs; folds summary into e2e job.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Makefile Outdated
Comment thread .github/workflows/ci.yml Outdated
deanq added a commit that referenced this pull request Jun 1, 2026
…release-please bot

Addresses @runpod-Henrik (PASS WITH NITS) and Copilot review on PR #345.

Henrik #3 / Copilot inline ci.yml:50 — skip cascade
- Previously `pre-check.if` skipped on release-please bot PRs, and
  `quality-gates needs: [pre-check]` cascaded to skip too, leaving any
  release-please PR that touched non-metadata files with zero CI signal.
- Drop the workflow-level `paths-ignore` (it interacted badly with branch
  protection's required-check semantics anyway) and instead put explicit
  `if: !startsWith(github.head_ref, 'release-please--')` on pre-check,
  quality-gates, packaging-guard, and build. Symmetric, auditable.

Henrik #1 + #6 — branch-protection compatibility (release-please bot PRs)
- Add `release-please-sentinel` job with matching matrix + name "Quality
  Gates" so it produces the same "Quality Gates (3.10)"…"(3.13)" check
  names that branch protection requires, but auto-passes. Branch
  protection on main can now require those checks without admins having
  to either loosen the rule or hand-craft an external sentinel workflow.

Henrik #2 — PR-time wheel build is dropped
- Reintroduce build on PRs but gated by `packaging-guard` job that
  inspects `git diff` against the PR base. Build runs when:
    - pyproject.toml / Makefile / MANIFEST.in / scripts/validate-wheel.sh
      changed, OR
    - any new non-Python file was added under src/ (data files need
      explicit package-data inclusion or they're silently dropped).
- Push to main always builds.

Henrik #4 — `make dev` entry-point sanity check
- Verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version`
  prints "Runpod Flash CLI v1.16.0". `uv sync` registers the entry point
  for the workspace project; the dropped `uv pip install -e .` was
  redundant. No code change needed.

Henrik #5 / Copilot Makefile:108 — misleading "plain output" comment
- Rewrite the comment block above `quality-check`. The aliases do NOT
  produce plain output -- local invocations see GitHub Actions
  annotation markers (::group::, --output-format=github) which are
  cosmetically noisy but functionally inert. Comment now reflects this.

Henrik misc nit — e2e.yml `unit-tests` job removal
- Add a leading comment on the `e2e` job explaining why the duplicate
  unit-tests job was dropped, so a future maintainer doesn't re-add it
  thinking it was lost by accident.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 1, 2026

Addressed all review feedback in 1136d5c.

@runpod-Henrik items:

  • Dean's refactors - phase 1 #1 + Add matrix operations example #6 (branch protection): Added release-please-sentinel job in ci.yml that mirrors the Quality Gates matrix (3.10/3.11/3.12/3.13) for release-please--* branches and auto-passes. Branch protection's required checks are satisfied without admin coordination or a separate workflow file.
  • ux update to terminal #2 (PR-time wheel build dropped): Reintroduced build on PRs but gated by a packaging-guard job. Build runs when pyproject.toml, Makefile, MANIFEST.in, scripts/validate-wheel.sh change, or when any new non-Python file is added under src/ (catches the dropped-data-file scenario you flagged). Push to main always builds.
  • add trt runtime #3 (skip cascade) / Copilot inline ci.yml:50: Dropped workflow-level paths-ignore and put explicit if: !startsWith(github.head_ref, 'release-please--') on pre-check, quality-gates, packaging-guard, and build. No more needs: cascade masking the intent. Symmetric with the sentinel.
  • add serialized function cache [In Memory] #4 (make dev entry-point): Verified locally — rm -rf .venv && make dev && .venv/bin/flash --version prints Runpod Flash CLI v1.16.0. uv sync registers the workspace entry point; the dropped uv pip install -e . was genuinely redundant. No code change.
  • Update README.md #5 / Copilot inline Makefile:108: Rewrote the misleading comment block — the aliases do NOT produce plain output. Now explicitly notes that local invocations see ::group:: / --output-format=github markers in their terminal as inert noise, with a one-line rationale for keeping single source of truth.
  • Misc nit: Added a leading comment on the e2e job in e2e.yml explaining why the duplicate unit-tests job was dropped, so future maintainers don't re-add it.

Skipped (already acceptable per your review):

  • pre-check cold-cache 30s — you flagged this as acceptable tradeoff.
  • Trailing newline in release-please.yml — already correct.

Both Copilot inline threads resolved. Ready for re-review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

Comment thread .github/workflows/ci.yml
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml
Comment thread .github/workflows/ci.yml Outdated
@deanq deanq changed the title ci: dedupe quality gates and add fast-fail pre-check (AE-3161) ci: dedupe quality gates and add fast-fail pre-check Jun 2, 2026
deanq added a commit that referenced this pull request Jun 2, 2026
Addresses Copilot's second review on PR #345.

Copilot ci.yml:42 — sentinel-bypass vulnerability
- Previous sentinel gated only on `startsWith(head_ref, 'release-please--')`.
  A contributor could open a PR from any branch named with that prefix and
  inherit the sentinel's auto-passes for "Quality Gates (3.10..3.13)" while
  pre-check / quality-gates / build all skipped via the same `if:` guard.
  Net: CI fully bypassed, branch protection satisfied, code unreviewed.
- Replace the three independent `if:` gates with a single `guard` job that
  inspects the actual PR diff and classifies the run as either
  `release-please-bypass` (branch prefix AND diff is exactly CHANGELOG.md +
  .release-please-manifest.json) or `full`. `release-please-sentinel` only
  fires when classification is bypass; pre-check / quality-gates / build
  fire only when classification is full. A contributor naming a branch
  release-please-- but changing code falls through to full CI.
- Also folds the previous packaging-guard logic into the same job; build
  reads `should_build` from the unified guard output.

Copilot ci.yml:59 — ruff version drift in pre-check
- `uvx ruff …` installed the latest ruff at runtime, which can drift from
  the locked version (uv.lock pins ruff 0.14.14) used by
  `make ci-quality-github`. That produces flaky "passes locally / fails CI"
  (or vice versa) verdicts when ruff cuts a release between runs.
- Pre-check now runs `uv sync --only-group dev --frozen` then `uv run ruff
  …`. Pulls the exact pinned version; warm-cache install is ~1s, cold is
  a few seconds. Fast-fail premise preserved; determinism gained.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 2, 2026

Addressed Copilot's second review in 1a16897.

Sentinel-bypass vulnerability (3 dup threads at ci.yml:42) — fixed. Replaced the three independent if: !startsWith(head_ref, 'release-please--') gates with a single guard job that inspects the actual PR diff. Classification is release-please-bypass only when both the branch prefix matches AND the diff is exactly CHANGELOG.md + .release-please-manifest.json; otherwise it falls through to full mode and runs the real matrix. A contributor naming a branch release-please--anything but pushing code changes can no longer inherit the sentinel's auto-passes. Also folded the previous packaging-guard into the same job so build/skip classification is centralized.

Ruff version drift in pre-check (3 dup threads at ci.yml:59) — fixed. Switched from uvx ruff (latest at runtime) to uv sync --only-group dev --frozen + uv run ruff. Pre-check now uses the exact ruff version pinned in uv.lock (currently 0.14.14), matching what make ci-quality-github runs in quality-gates. No more "passes locally / fails CI" divergence when ruff cuts a release between runs. Warm-cache install is ~1s; cold is a few seconds. Fast-fail premise preserved.

All 6 threads resolved (3 dups each). Ready for re-review.

deanq added 3 commits June 2, 2026 09:16
Eliminates redundant CI/CD work that was costing ~24 min of compute per
merge and ~28 min per release-please bot PR push.

Workflow changes:
- ci.yml: add concurrency group (cancel in-flight PR runs), paths-ignore
  for release-please bot PRs, new pre-check job (uvx ruff format+check,
  ~10s no-install fast-fail), bump setup-uv v2 -> v5, build job only on
  push:main since release-please.yml builds + publishes the same SHA.
- release-please.yml: remove duplicate quality-gates job. Branch protection
  on main already enforces ci.yml green before code lands, so the second
  matrix validated nothing. Saves a full 4-leg matrix per merge.
- e2e.yml: drop unit-tests job (duplicated ci.yml quality-gates), fold the
  summary step into the e2e job to remove the third workflow definition of
  the same install setup.

Makefile changes:
- dev: drop redundant 'uv pip install -e .' (uv sync handles editable
  install in workspace mode).
- quality-check: alias ci-quality-github so there is one canonical CI
  quality gate, no drift between local and CI.
- ci-quality-github / test-coverage: fix the latent junit overwrite bug
  where both pytest invocations wrote pytest-results.xml -- now writes
  pytest-results-parallel.xml and pytest-results-serial.xml. The double
  pytest pass is retained because it provides process isolation between
  parallel and serial tests; collapsing to a single loadgroup run
  surfaced state pollution in test_load_balancer_sls_stub.

Note: removing quality-gates from release-please.yml + paths-ignore for
the bot's files means the release-please PR will have no required CI
check. Branch protection on main will need to allow missing checks for
the release-please--* branch pattern, or add a no-op sentinel job named
to match the protection rule.

https://linear.app/runpod/issue/AE-3161
…release-please bot

Addresses @runpod-Henrik (PASS WITH NITS) and Copilot review on PR #345.

Henrik #3 / Copilot inline ci.yml:50 — skip cascade
- Previously `pre-check.if` skipped on release-please bot PRs, and
  `quality-gates needs: [pre-check]` cascaded to skip too, leaving any
  release-please PR that touched non-metadata files with zero CI signal.
- Drop the workflow-level `paths-ignore` (it interacted badly with branch
  protection's required-check semantics anyway) and instead put explicit
  `if: !startsWith(github.head_ref, 'release-please--')` on pre-check,
  quality-gates, packaging-guard, and build. Symmetric, auditable.

Henrik #1 + #6 — branch-protection compatibility (release-please bot PRs)
- Add `release-please-sentinel` job with matching matrix + name "Quality
  Gates" so it produces the same "Quality Gates (3.10)"…"(3.13)" check
  names that branch protection requires, but auto-passes. Branch
  protection on main can now require those checks without admins having
  to either loosen the rule or hand-craft an external sentinel workflow.

Henrik #2 — PR-time wheel build is dropped
- Reintroduce build on PRs but gated by `packaging-guard` job that
  inspects `git diff` against the PR base. Build runs when:
    - pyproject.toml / Makefile / MANIFEST.in / scripts/validate-wheel.sh
      changed, OR
    - any new non-Python file was added under src/ (data files need
      explicit package-data inclusion or they're silently dropped).
- Push to main always builds.

Henrik #4 — `make dev` entry-point sanity check
- Verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version`
  prints "Runpod Flash CLI v1.16.0". `uv sync` registers the entry point
  for the workspace project; the dropped `uv pip install -e .` was
  redundant. No code change needed.

Henrik #5 / Copilot Makefile:108 — misleading "plain output" comment
- Rewrite the comment block above `quality-check`. The aliases do NOT
  produce plain output -- local invocations see GitHub Actions
  annotation markers (::group::, --output-format=github) which are
  cosmetically noisy but functionally inert. Comment now reflects this.

Henrik misc nit — e2e.yml `unit-tests` job removal
- Add a leading comment on the `e2e` job explaining why the duplicate
  unit-tests job was dropped, so a future maintainer doesn't re-add it
  thinking it was lost by accident.

https://linear.app/runpod/issue/AE-3161
Addresses Copilot's second review on PR #345.

Copilot ci.yml:42 — sentinel-bypass vulnerability
- Previous sentinel gated only on `startsWith(head_ref, 'release-please--')`.
  A contributor could open a PR from any branch named with that prefix and
  inherit the sentinel's auto-passes for "Quality Gates (3.10..3.13)" while
  pre-check / quality-gates / build all skipped via the same `if:` guard.
  Net: CI fully bypassed, branch protection satisfied, code unreviewed.
- Replace the three independent `if:` gates with a single `guard` job that
  inspects the actual PR diff and classifies the run as either
  `release-please-bypass` (branch prefix AND diff is exactly CHANGELOG.md +
  .release-please-manifest.json) or `full`. `release-please-sentinel` only
  fires when classification is bypass; pre-check / quality-gates / build
  fire only when classification is full. A contributor naming a branch
  release-please-- but changing code falls through to full CI.
- Also folds the previous packaging-guard logic into the same job; build
  reads `should_build` from the unified guard output.

Copilot ci.yml:59 — ruff version drift in pre-check
- `uvx ruff …` installed the latest ruff at runtime, which can drift from
  the locked version (uv.lock pins ruff 0.14.14) used by
  `make ci-quality-github`. That produces flaky "passes locally / fails CI"
  (or vice versa) verdicts when ruff cuts a release between runs.
- Pre-check now runs `uv sync --only-group dev --frozen` then `uv run ruff
  …`. Pulls the exact pinned version; warm-cache install is ~1s, cold is
  a few seconds. Fast-fail premise preserved; determinism gained.

https://linear.app/runpod/issue/AE-3161
@deanq deanq force-pushed the deanq/ae-3161-optimize-ci-cd branch from f748fae to 64bc5fb Compare June 2, 2026 16:16
Borrowed from runpod-python (ci.yml:105) -- one required check that
inspects upstream `needs.*.result` and treats "success" or "skipped" as
green. Much cleaner than the previous sentinel-with-matching-matrix-names
hack:

- Drops `release-please-sentinel` job entirely. No more duplicate
  `Quality Gates` display name. No more "skipped Quality Gates" cosmetic
  row alongside the real four matrix legs on every regular PR.
- For release-please bot bypass runs: guard classifies bypass ->
  pre-check / quality-gates / build all skip -> `validation` aggregates
  ("skipped" treated as ok) -> green.
- For regular PRs: everything runs -> `validation` aggregates -> green.
- For any genuinely failed or cancelled job: `validation` exits 1 ->
  branch protection blocks merge.

Future matrix changes (add 3.14, drop 3.10) or new jobs (security scans,
etc.) no longer require touching branch protection -- only the `needs:`
list and the results array in this aggregator.

Requires branch protection on `main` to be reconfigured before merge:
- Remove required checks: `CI / Quality Gates (3.10)`, `(3.11)`, `(3.12)`,
  `(3.13)`
- Add required check: `CI / Validation`

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 2, 2026

Applied the single-required-check aggregator pattern from runpod-python's ci.yml:105. Commit c548e52.

Dropped: release-please-sentinel job. No more duplicate `Quality Gates` display name. The skipped row Daisy spotted on every regular PR is gone.

Added: validation job — runs unconditionally (if: always()), inspects needs.guard.result, needs.pre-check.result, needs.quality-gates.result, needs.build.result. Passes if every upstream is success or skipped. Fails on anything else.

How each path resolves now:

  • Regular PR: guard runs (success) → pre-check (success) → quality-gates 4-leg (success) → build (success or skipped via packaging-guard) → validation green.
  • Release-please bot PR (metadata only): guard runs (success, classifies bypass) → pre-check/quality-gates/build all skip → validation aggregates ("skipped" treated as ok) → green.
  • Spoofed release-please-- branch with code changes: guard classifies full → pre-check/quality-gates/build all run → validation aggregates real results.
  • Any genuine failure: validation exits 1.

⚠ Required before merge — branch protection on main needs:

  • Remove: `CI / Quality Gates (3.10)`, `(3.11)`, `(3.12)`, `(3.13)`
  • Add: `CI / Validation`

After this one-time change, future matrix or job additions don't require any branch-protection updates. The aggregator handles it automatically.

Closes the last remaining redundancy in the lifecycle. When release-please's
release PR is merged, the resulting commit on main only changes CHANGELOG.md
and .release-please-manifest.json -- the same code was already validated
when the underlying feature PRs merged. Previously ci.yml ran the full
4-version matrix + build on those merge commits (~7 min compute per release).

Now `guard` classifies push:main by diff:
- Metadata-only diff -> release-please-bypass -> pre-check / quality-gates /
  build all skip, validation aggregates "skipped" as success.
- Any non-metadata change (feature merges, admin direct pushes) -> full.

release-please.yml's pypi-publish builds its own wheel before publishing,
so skipping ci.yml's build on release commits doesn't blind us to wheel
regressions.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 2, 2026

Applied stage-4 optimization in 1a6a0b6.

guard now classifies push:main by diff:

  • Metadata-only diff (CHANGELOG.md + .release-please-manifest.json) → release-please-bypass → pre-check / quality-gates / build all skip → validation aggregates skipped as success → green in ~7s.
  • Anything else (feature merges, admin direct pushes) → full → matrix + build run as before.

Safe to skip the build here because release-please.yml's pypi-publish job builds its own wheel before publishing — no blind spot.

Final lifecycle redundancy map

Stage Trigger Work Redundant?
1. PR commit pull_request push guard + pre-check + quality-gates(4) + build(conditional) + validation No
2. Feature PR merged → main push: main (non-metadata diff) Full matrix re-runs on merge commit Yes (defense-in-depth, kept)
3. Bot updates its PR pull_request push on release-please-- guard(bypass) + validation only — ~7s No
4. Release commit on main push: main (metadata-only diff) guard(bypass) + validation only — ~7s; release-please.yml publishes No (was redundant, now fixed)

Only remaining redundancy is Stage 2 — the merge commit re-validation. That one's deliberate defense-in-depth (squash/rebase changes SHAs; admin override exists). Removing it would mean trusting PR-only validation, which most shops don't.

@deanq deanq requested a review from Copilot June 3, 2026 02:02
@deanq deanq requested review from KAJdev and jhcipar June 3, 2026 02:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml
Comment thread .github/workflows/ci.yml Outdated
Comment thread Makefile Outdated
Comment thread .github/workflows/release-please.yml
Comment thread .github/workflows/e2e.yml
…or cache

Addresses Copilot's third review on PR #345.

Copilot ci.yml:66 + ci.yml:88 — bypass missing release-please-managed files
- release-please-config.json declares release-type=python (auto-bumps
  pyproject.toml) and extra-files=src/runpod_flash/__init__.py. Real bot
  PRs touch 4 files, not 2. Previous ALLOWED regex missed pyproject.toml
  and __init__.py, so every real release-please bot PR was misclassified
  as `full`. Entire bypass optimization was broken for the actual case
  it was designed for.
- Extend ALLOWED to all four files. Add a comment pointing back to
  release-please-config.json so the regex stays in sync.
- Add defense-in-depth: PR classifier now also requires
  github.event.pull_request.user.login == 'runpod-release-please-bot[bot]'
  (the GitHub App account). Branch prefix + bot author + file content all
  required. A contributor spoofing the branch name fails the author check;
  a contributor with write access to the bot would already be trusted.
- Push:main classifier keeps content-only check (admin can squash-merge
  the bot's PR by hand, so actor isn't a reliable signal on push).

Copilot ci.yml:164 + ci.yml:193 + release-please.yml:61 + e2e.yml:39
— setup-uv cache keyed on pyproject.toml
- uv sync is lockfile-driven; pyproject.toml-only cache key means uv.lock
  changes (most common dependency update) don't invalidate the cache,
  causing stale-deps cache hits.
- Switch cache-dependency-glob to uv.lock everywhere. pre-check already
  used uv.lock; this aligns the rest.

Copilot Makefile:107 — misleading "aliases" comment
- quality-check-strict is NOT an alias for ci-quality-github -- it runs a
  different set (adds typecheck, plain output, no junit). Comment block
  rewritten to enumerate each target individually so maintainers don't
  conflate them.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 3, 2026

Addressed Copilot's third review in 97920a6.

Bypass classifier was broken for the actual case it was designed for (2 threads, ci.yml:66 + ci.yml:88) — caught this verified against release-please-config.json and real bot PR #328. release-please-config has release-type=python (auto-bumps pyproject.toml) and extra-files=src/runpod_flash/__init__.py, so real bot PRs touch 4 files, not 2. My previous regex only allowed 2, meaning every real release-please bot PR was being misclassified as full — the entire bypass optimization was broken for its primary use case.

Fixed:

  • Extended ALLOWED to CHANGELOG.md | .release-please-manifest.json | pyproject.toml | src/runpod_flash/__init__.py with a comment pointing back to release-please-config.json so they stay in sync.
  • Added defense-in-depth: PR classifier now also requires github.event.pull_request.user.login == 'runpod-release-please-bot[bot]' (verified against chore: release 1.16.0 #328's actual author). Branch prefix + bot author + file content all three required for bypass.
  • Push:main classifier kept content-only (admins can squash-merge the bot's PR by hand, so actor isn't a reliable signal on push events).

setup-uv cache keyed on pyproject.toml instead of uv.lock (4 threads across ci.yml:164, ci.yml:193, release-please.yml:61, e2e.yml:39) — fixed. uv sync is lockfile-driven, so dependency updates that touch only uv.lock (the common case) weren't invalidating the cache. Switched cache-dependency-glob to uv.lock everywhere. pre-check already used uv.lock; the rest now match.

Makefile:107 misleading "aliases" comment (1 thread) — quality-check-strict is NOT an alias for ci-quality-github; it runs a different recipe (adds typecheck, plain output, no junit). Rewrote the comment block as a bullet-per-target list so maintainers don't conflate them.

All 7 threads resolved.

Copy link
Copy Markdown
Contributor

@runpod-Henrik runpod-Henrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Henrik's AI-Powered Bug Finder — PR #345 Review

Delta review — since 2026-05-28

Significant rework since the prior review. The author replaced the original paths-ignore + job-level if: approach with a content-based guard job and a single Validation aggregator. CI is green across all checks (Guard, Pre-check, 4× Quality Gates, Build, Validation).


Prior findings — status

1. ✅ Resolved — branch-protection compatibility

Original concern: release-please bot PRs skipped via paths-ignore would leave required-check status absent, possibly blocking merge.
Now: Validation aggregator job runs if: always() and treats skipped upstream as success. Branch protection only needs to require CI / Validation — works for both full runs and release-please-bypass runs. Cleaner than the no-op-sentinel approach I suggested.

2. ✅ Resolved — PR-time wheel build dropped

Original concern: validate-wheel.sh not running on PRs would let packaging regressions land on main.
Now: guard.outputs.should_build is true when pyproject.toml, Makefile, MANIFEST.in, or scripts/validate-wheel.sh change, OR when a new non-Python file is added under src/. Build runs on relevant PRs; skipped on pure-Python changes. Targeted exactly at the packaging-regression risk I flagged.

3. ✅ Resolved — skip-guard redundancy

Original concern: two overlapping skip mechanisms (paths-ignore + if: !startsWith(...)) with unclear semantics.
Now: single guard job is the only classifier. Downstream pre-check, quality-gates, build gate on needs.guard.outputs.mode == 'full'. One source of truth.

4. ⏸ Still open — make dev entry-point sanity

Question wasn't blocking; CI is green which suggests uv sync registers entry points correctly in workspace mode. Worth a one-off make dev && .venv/bin/flash --version on a clean clone before merging, but not a release-blocker.

5. ⏸ Still open — local make quality-check UX

Annotation markers (::group::) still surface in local output. Minor nit, unchanged.


New findings (from commits since 2026-05-28)

6. Issue: ALLOWED regex must stay in sync with release-please-config.json by hand

The guard hardcodes the files release-please touches:

```
ALLOWED='^(CHANGELOG.md|.release-please-manifest.json|pyproject.toml|src/runpod_flash/init.py)$'
```

The comment correctly notes "Keep in sync with release-please-config.json" but there's no automated check. The asymmetric failure modes are favorable (out-of-sync → bypass doesn't fire → full CI runs → safe), so this is hardening, not a blocker. But the next person who adds an extra-files entry to release-please-config.json will spend a confused 20 minutes wondering why the bypass stopped firing on bot PRs.

Two cheap mitigations: (a) add a unit test in tests/ that reads release-please-config.json and asserts the ALLOWED regex matches; (b) read the config in the guard script and derive the regex dynamically.

7. Question: pre-check installs dev group via uv sync --only-group dev --frozen, then runs uv run ruff

uv sync --only-group dev --frozen installs dev dependencies without the project itself. Then the next steps run uv run ruff …. Default uv run behaviour syncs the project on every invocation — so the "no project install" intent of the previous step may be undone by uv run.

If the goal is to keep pre-check at ~10s, consider uv run --no-project ruff … for both ruff commands. Otherwise the cold-cache pre-check pays the full project install cost twice. Not a correctness issue — CI still passes — just possibly leaves the optimization unrealized.

8. Question: should_build heuristic doesn't trigger on modified non-Python src/ files

The guard checks --diff-filter=A for added files under src/:

```bash
git diff --name-only --diff-filter=A "origin/${BASE_REF}...HEAD" | grep -E '^src/.+' | grep -qv '.py$'
```

A PR that modifies an existing non-Python data file (e.g., a future edit to runpod_flash/rules/AGENTS.md) won't trigger should_build. The wheel will still include the file because it's already in the package-data manifest, so a modification doesn't change packaging behaviour — but the modified content won't be smoke-tested via validate-wheel.sh. Probably acceptable; worth confirming the intent.

If catching content regressions in packaged non-Python files matters, change --diff-filter=A to --diff-filter=AM (Added or Modified).

9. Nit: fetch-depth: 0 on guard checkout

Pulls full history just to run git diff HEAD~1...HEAD or git diff origin/\${BASE_REF}...HEAD. Full fetch is cheap on this repo size (no monorepo concerns) but fetch-depth: 2 would suffice for the push case and fetch-depth: 0 is needed only for the PR comparison-against-base case (and even then, only enough history to find the common ancestor). Not a real cost issue, just over-broad.


Pre-merge action still required

Branch protection on main needs to change the required status check from Quality Gates (4 contexts) to Validation (1 context). Without this rule update, merging still fails because old required-check names are absent on bypass runs. The PR's no-op-sentinel option from the original description has been replaced with the aggregator; the admin-side step is now "switch required check to Validation."


Verdict

PASS WITH NITS. The guard + Validation pattern is a clean solution for the two original issues that needed real design. New observations are minor — config-sync brittleness, pre-check optimization, modified-file heuristic — none block merge. CI proves the new pipeline works on this PR (all four quality-gates legs pass, build runs because the diff touched Makefile/.github/workflows/* which trips should_build).

Pre-merge action: branch protection rule must be updated to require Validation instead of Quality Gates. Without that admin change, the bypass path still leaves merges blocked.


🤖 Reviewed by Henrik's AI-Powered Bug Finder

…sync; catch modified data files

Henrik's delta review on PR #345.

Henrik #6 — ALLOWED was hand-synced with release-please-config.json
- Previous regex hardcoded the four files release-please touches today.
  Asymmetric failure mode is safe (out-of-sync -> bypass doesn't fire ->
  full CI runs) but the next person to add an `extra-files` entry to
  release-please-config.json would silently break the bypass for bot PRs.
- Derive the file list dynamically via jq from release-please-config.json's
  `.packages["."]["extra-files"]`, append the three implicit files
  (CHANGELOG.md, manifest, pyproject.toml from release-type=python). Use
  `grep -Fxvf <(allowed_files)` for fixed-string exact-line matching --
  removes the regex-escaping concern entirely.

Henrik #7 — `uv run` undid the dev-only sync
- `uv run` defaults to syncing the environment before each invocation, so
  the steps after `uv sync --only-group dev --frozen` would pull in the
  full project on each ruff call -- defeating the "no project install"
  premise of pre-check.
- Add `--no-sync` to both ruff invocations. ruff runs from the existing
  dev-only venv directly; pre-check stays fast.

Henrik #8 — `--diff-filter=A` missed modified non-Python files
- Edits to existing packaged data files (e.g. future tweaks to
  runpod_flash/rules/AGENTS.md) wouldn't trigger should_build, so
  validate-wheel.sh wouldn't smoke-test the modified content. Packaging
  metadata wouldn't change, but the wheel content would.
- Switch to `--diff-filter=AM` (Added or Modified). Cheap and correct.

Henrik #9 (fetch-depth: 0 over-broad) intentionally not addressed -- Henrik
flagged it as "not a real cost issue."

Henrik #4 (make dev entry-point) re-verified locally: `rm -rf .venv &&
make dev && .venv/bin/flash --version` -> Runpod Flash CLI v1.16.0.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 4, 2026

Addressed your delta review in e467bc3.

#6ALLOWED hand-synced with release-please-config.json → went with option (b) from your suggestion. The bypass file list is now derived dynamically in the guard script via jq from release-please-config.json's .packages["."]["extra-files"], prepended with the three implicit files (CHANGELOG.md, manifest, pyproject.toml from release-type=python). Switched matching from a hand-built regex to grep -Fxvf <(allowed_files) (fixed-string, exact-line) — no more escaping concerns either. Smoke-tested the bash locally: real bot PR pattern (4 files) classifies clean; spoofed PR with code changes correctly flags the unexpected file.

#7uv run re-syncing after the dev-only install → correct catch, uv run does default to syncing. Both ruff invocations in pre-check now use uv run --no-sync ruff …, so they run from the existing dev-only venv without pulling in the project.

#8--diff-filter=A misses modified data files → switched to --diff-filter=AM. Edits to existing non-Python files under src/ now also trip should_build, so validate-wheel.sh exercises the modified content.

#9fetch-depth: 0 → intentionally not changed, per your "not a real cost issue" framing.

#4make dev entry-point → re-verified on a clean-clone-equivalent: rm -rf .venv && make dev && .venv/bin/flash --version prints Runpod Flash CLI v1.16.0. uv sync registers the entry point correctly in workspace mode.

#5 — local quality-check UX → the Makefile comment was rewritten in an earlier commit to honestly describe the annotation markers and call out that quality-check-strict is not an alias. Cosmetic-only; functionally fine.

Pre-merge admin step still required: switch the required check on main from Quality Gates (3.10..3.13) (4 contexts) to Validation (1 context). Same as the previous comment.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Comment thread .github/workflows/ci.yml
Comment thread .github/workflows/ci.yml Outdated
Comment thread Makefile
Comment thread .github/workflows/e2e.yml Outdated
…y table fix

Copilot fourth review on PR #345.

Copilot ci.yml:98 — pyproject.toml in ALLOWED could let non-version edits
slip through the bypass
- A push:main touching only the metadata filenames could smuggle
  non-version edits to pyproject.toml (deps, ruff config, scripts) since
  the previous check was filename-only. Push:main can't rely on actor
  identity (admins can squash-merge by hand), so the file-list check was
  the only gate.
- Add `verify_version_shape` helper: enforces that any pyproject.toml
  change is a single `version = "..."` line, and any
  src/runpod_flash/__init__.py change is a single `__version__ = "..."`
  line. Applies to both push:main and PR bypass paths. Tested locally
  against simulated bot and malicious diffs.

Copilot ci.yml:103 — `should_build=true` unconditional on push:main
- Previous code only computed should_build content-based on PR full-mode;
  push:main non-bypass always built. Contradicted the documented intent
  and cost ~1 min per feature merge.
- Factor the content heuristic into `compute_should_build`; use it for
  both push:main full and PR full paths. Same allow-list (pyproject.toml,
  Makefile, MANIFEST.in, validate-wheel.sh) plus AM-filter on non-Python
  files under src/.

Copilot e2e.yml:129 — malformed markdown table when JUnit file missing
- Python ternary `... if duration is not None else "- |"` had the wrong
  scope: when duration was None, the *entire* print expression became
  the string `"- |"`, breaking the table.
- Extract `_num()` and `_dur()` helpers that placehold "-" per field;
  build the row in one f-string. Table stays valid when the suite
  didn't run.

Copilot Makefile:116 (NOT changed) — claim was that aliasing quality-check
to ci-quality-github drops `--cov-fail-under=65` enforcement. Verified
locally: the serial pass in ci-quality-github inherits --cov-fail-under=65
from pyproject.toml's addopts (only the parallel pass overrides to 0).
Local `make quality-check` still hits the 65% gate at the end of the
serial pass; observed "Required test coverage of 65% reached. Total
coverage: 85.96%" on this branch. Will reply on the thread, not change code.

https://linear.app/runpod/issue/AE-3161
@deanq
Copy link
Copy Markdown
Member Author

deanq commented Jun 4, 2026

Addressed Copilot's fourth review in 722b38c.

ci.yml:103 — should_build=true unconditional on push:main → factored the build-trigger heuristic into a compute_should_build shell function shared between push:main full mode and PR full mode. Both paths now build only when packaging-relevant files (pyproject.toml, Makefile, MANIFEST.in, scripts/validate-wheel.sh) change or a non-Python file under src/ is added/modified. ~1 min saved on every feature-PR merge.

ci.yml:98 — pyproject.toml in ALLOWED could let non-version bypass → added verify_version_shape helper that enforces pyproject.toml diffs only touch version = "..." lines and __init__.py diffs only touch __version__ = "..." lines. Closes the residual exploitation path on push:main (where actor isn't a reliable signal since admins can squash-merge by hand). Smoke-tested locally against a simulated real release diff (passes) and a malicious diff with an injected dependencies line (correctly rejected). Bypass now requires four conditions on PR (branch prefix + bot author + file list + version shape) and three on push:main (file list + version shape + diff non-empty).

e2e.yml:129 — malformed table when JUnit missing → the Python ternary ... if duration is not None else "- |" had wrong scope: when duration was None the entire print expression became "- |", breaking the markdown table. Extracted _num() and _dur() helpers that placehold per-field; table stays valid when the suite didn't run.

Makefile:116 — left in an open reply on that thread with verification output. The --cov-fail-under=0 only applies to the parallel pass; the serial pass inherits --cov-fail-under=65 from pyproject.toml's addopts and the combined coverage trips the 65% gate at the end. Local make quality-check still enforces the threshold (verified just now: "Required test coverage of 65% reached. Total coverage: 85.96%"). Not changing code; thread left open for re-review.

3 of 4 threads resolved; 1 left open with rationale for re-review.

@deanq deanq requested a review from jhcipar June 4, 2026 14:16
@deanq deanq merged commit fd5ac9a into main Jun 4, 2026
8 checks passed
@deanq deanq deleted the deanq/ae-3161-optimize-ci-cd branch June 4, 2026 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants