chore(release): 0.7.4 — deferred-work sweep no longer defers every bundle it finished#17
Conversation
…it finished After the migration to the generic upstream bmad-dev-auto primitive, bundle dev sessions completed the work but verify_dev_bundle rejected them with "result.json dw_ids [] do not match the bundle's […]", retried to budget, deferred, and rolled the work back — so a sweep could never close a bundled entry. The retired dev fork echoed the dw ids; the generic skill does not, and the adapter synthesized result.json without them. The orchestrator already owns the bundle→dw-id binding (_post_dev_state_sync), so verify_dev_bundle now enforces the dw_ids cross-check only when the session actually claims ids — an absent claim is the normal generic path and passes. As hardening, the run exports BMAD_AUTO_DW_IDS and the generic adapter stamps them onto the synthesized result so the cross-check stays live. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 10 minutes and 34 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughUpdates release metadata to v0.7.4, propagates deferred-work DW-id bindings into synthesized results, and changes bundle verification to accept missing DW-id claims. ChangesDeferred-work DW-id flow and release metadata
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🤖 Augment PR SummarySummary: Patch release Changes:
🤖 Was this summary useful? React with 👍 or 👎 |
| # Bundle dev sessions: the orchestrator exports the bundle's | ||
| # owned dw ids (the generic skill never authors them). Stamp | ||
| # them onto the result so verify_dev_bundle's cross-check passes. | ||
| dw_ids = [i for i in spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") if i] |
There was a problem hiding this comment.
BMAD_AUTO_DW_IDS is parsed via split(",") without trimming whitespace, so a value like "DW-1, DW-2" would produce " DW-2" and could cause verify_dev_bundle to retry due to an apparent mismatch. Consider normalizing each token (e.g., trimming) to make this env-based contract more robust to formatting differences (plugins/hooks, manual runs).
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Good catch — addressed in 4ae716f. The orchestrator sets this env via ",".join(task.dw_ids) (no spaces), so the normal path was already safe, but a plugin session hook (_emit_session_gate) or a hand-set value could supply "DW-1, DW-2" and trip verify_dev_bundle's dw_ids cross-check into the exact spurious defer this PR fixes. The parse now .strip()s each token, and test_generic_dev_bundle_stamps_dw_ids_from_env covers the whitespace-tolerant case.
The orchestrator joins the env value without spaces, but a hook-rewritten or hand-set "DW-1, DW-2" would parse to " DW-2" and trip verify_dev_bundle's dw_ids cross-check into a spurious defer. Strip each token. Adds a generic- adapter test covering the whitespace-tolerant parse. Addresses PR #17 review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
augment review |
|
✅ Action performedReview finished.
|
| # Bundle dev sessions: the orchestrator exports the bundle's | ||
| # owned dw ids (the generic skill never authors them). Stamp | ||
| # them onto the result so verify_dev_bundle's cross-check passes. | ||
| raw_dw_ids = spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") |
There was a problem hiding this comment.
spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") will raise if a plugin/session hook sets BMAD_AUTO_DW_IDS to a non-string (e.g., None) rather than deleting it; that would crash result synthesis and could incorrectly mark a completed session as stalled.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Valid — fixed in e277c34. spec.env is typed dict[str, str], so a None value is a contract violation, but a misbehaving hook setting it (rather than deleting the key) would crash synthesis and false-stall a completed session. The parse now coerces with (spec.env.get("BMAD_AUTO_DW_IDS") or ""), matching the story_key guard on the line above; test_generic_dev_dw_ids_none_env_does_not_crash covers it.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_verify.py (1)
278-289: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd coverage for explicit empty
dw_ids.The regression called out in this PR was an empty
dw_idsarray, but this test only covers a missing key. A companion case with{"dw_ids": []}would lock down the exact payload that previously caused defer/retry behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_verify.py` around lines 278 - 289, Add a companion test case in test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert the result still passes and the spec_file is bound as expected. This will lock down the empty-array path in addition to the missing-key case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_verify.py`:
- Around line 278-289: Add a companion test case in
test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids
payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but
set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert
the result still passes and the spec_file is bound as expected. This will lock
down the empty-array path in addition to the missing-key case.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fb3d9de4-f220-41cb-ba42-1d6899b613f4
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (11)
.claude-plugin/marketplace.jsonCHANGELOG.mdmodule.yamlpyproject.tomlsrc/automator/__init__.pysrc/automator/adapters/generic.pysrc/automator/data/skills/bmad-auto-setup/assets/module.yamlsrc/automator/engine.pysrc/automator/verify.pytests/test_generic_tmux.pytests/test_verify.py
… value
A plugin/session hook that sets BMAD_AUTO_DW_IDS to None (rather than deleting
it) would make `.split(",")` raise and crash result synthesis, false-stalling a
completed session. Coerce to "" the same way the adjacent story_key line guards.
Adds tests: a None-valued env no longer crashes synthesis, and an explicit empty
dw_ids list (the literal payload that defered in production) passes verify.
Addresses PR #17 review feedback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@coderabbitai addressed in e277c34: |
|
Tip For best results, initiate chat on the files or code changes.
🐇✨ If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
|
augment review |
|
|
||
| claimed_ids = {str(i) for i in rj.get("dw_ids", [])} | ||
| if claimed_ids != set(task.dw_ids): | ||
| if claimed_ids and claimed_ids != set(task.dw_ids): |
There was a problem hiding this comment.
src/automator/verify.py:747: claimed_ids = {str(i) for i in rj.get("dw_ids", [])} will raise if a result payload includes "dw_ids": null (or any non-iterable), which can crash bundle verification even though empty/absent claims are now treated as valid.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Valid — fixed in 3b55fa8. Our synthesize_result never emits a null dw_ids (it's gated if dw_ids:), but verify_dev_bundle is a standalone dict consumer with no launch-time protection, so rj.get("dw_ids", []) returning None on an explicit "dw_ids": null would crash the set comprehension. Now coerced with or []; the parametrized test_verify_dev_bundle_absent_dw_ids_passes gains a null case.
| # Bundle dev sessions: the orchestrator exports the bundle's | ||
| # owned dw ids (the generic skill never authors them). Stamp | ||
| # them onto the result so verify_dev_bundle's cross-check passes. | ||
| raw_dw_ids = (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",") |
There was a problem hiding this comment.
src/automator/adapters/generic.py:331: (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",") still assumes a string; if a hook sets a truthy non-string (e.g., list/int), this will raise and can incorrectly mark a completed session as stalled.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Declining this one, by design. spec.env is typed dict[str, str] and is consumed as the process environment ({**self.profile.env, **spec.env} at generic.py:118/142), which only accepts string values — a list/int there would fail at session launch, so the session could never reach the Stop event that calls _result_json. The guarded state is unreachable. And str()-coercing would be actively wrong: a list ["DW-1","DW-2"] stringifies to "['DW-1', 'DW-2']", which split(",") turns into garbage tokens → a spurious dw_ids mismatch, the exact false-defer this PR fixes. A crash on a hard type-contract violation is preferable to silently-wrong behavior; the shipped or "" guard already covers None, the only idiomatic unset sentinel.
rj.get("dw_ids", []) returns None on a "dw_ids": null payload (the default only
covers an absent key), so the set comprehension raised TypeError and crashed
bundle verification. Coerce with `or []`. Extends the parametrized test with the
null case.
Addresses PR #17 review feedback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Release 0.7.4
Patch release fixing a regression in the deferred-work sweep introduced by the
bmad-dev-automigration.Fixed
Deferred-work sweep no longer defers every bundle it just finished. After the migration to the generic upstream
bmad-dev-autoprimitive, bundle dev sessions completed the work but were rejected byverify_dev_bundlewithresult.json dw_ids [] do not match the bundle's […], retried to budget, deferred, and rolled the work back — so a sweep could never close a bundled entry. The retired dev fork used to echo the dw ids; the generic skill doesn't.The orchestrator already owns the bundle→dw-id binding (
SweepEngine._post_dev_state_sync), soverify_dev_bundlenow enforces the dw_ids cross-check only when the session actually claims ids — an absent claim is the normal generic path and passes. As hardening, the run exportsBMAD_AUTO_DW_IDSand the generic adapter stamps them onto the synthesizedresult.jsonso the cross-check stays live.Impact
Observed on a real Lights-Out sweep (run
20260625-203150-80f6): every bundle ran a full dev session (tens of millions of tokens each), did the work, then deferred and discarded it. Deterministic for every bundle on the prior code. No manual ledger repair needed — the rollbacks already reverted the prematuresweep-bundle-closedmarks, so a fresh sweep closes the entries correctly.Commits
fix(bmad-auto): the verify relaxation +BMAD_AUTO_DW_IDSplumbing + regression testchore(release): 0.7.4: version stamp + CHANGELOGVerification
test_verify_dev_bundle_absent_dw_ids_passes(the generic no-dw_ids path the original tests never covered); mismatch test still asserts an explicit wrong claim failstrunk check(full, no filter): clean🤖 Generated with Claude Code
Summary by CodeRabbit
Fixed
Chores
Documentation