chore(release): 0.7.4 — deferred-work sweep no longer defers every bundle it finished by pbean · Pull Request #17 · bmad-code-org/bmad-auto

pbean · 2026-06-26T18:43:04Z

Release 0.7.4

Patch release fixing a regression in the deferred-work sweep introduced by the bmad-dev-auto migration.

Fixed

Deferred-work sweep no longer defers every bundle it just finished. After the migration to the generic upstream bmad-dev-auto primitive, bundle dev sessions completed the work but were rejected by verify_dev_bundle with result.json dw_ids [] do not match the bundle's […], retried to budget, deferred, and rolled the work back — so a sweep could never close a bundled entry. The retired dev fork used to echo the dw ids; the generic skill doesn't.

The orchestrator already owns the bundle→dw-id binding (SweepEngine._post_dev_state_sync), so verify_dev_bundle now enforces the dw_ids cross-check only when the session actually claims ids — an absent claim is the normal generic path and passes. As hardening, the run exports BMAD_AUTO_DW_IDS and the generic adapter stamps them onto the synthesized result.json so the cross-check stays live.

Impact

Observed on a real Lights-Out sweep (run 20260625-203150-80f6): every bundle ran a full dev session (tens of millions of tokens each), did the work, then deferred and discarded it. Deterministic for every bundle on the prior code. No manual ledger repair needed — the rollbacks already reverted the premature sweep-bundle-closed marks, so a fresh sweep closes the entries correctly.

Commits

fix(bmad-auto): the verify relaxation + BMAD_AUTO_DW_IDS plumbing + regression test
chore(release): 0.7.4: version stamp + CHANGELOG

Verification

New regression test test_verify_dev_bundle_absent_dw_ids_passes (the generic no-dw_ids path the original tests never covered); mismatch test still asserts an explicit wrong claim fails
Full suite: 1072 passed
trunk check (full, no filter): clean

🤖 Generated with Claude Code

Summary by CodeRabbit

Fixed
- Resolved an issue where deferred-work bundles could fail to close properly during verification.
- Bundled results now retain the correct deferred-work identifiers, improving validation for related workflows.
Chores
- Updated the release/version number to 0.7.4 across package and plugin metadata.
Documentation
- Added a new changelog entry for v0.7.4.

…it finished After the migration to the generic upstream bmad-dev-auto primitive, bundle dev sessions completed the work but verify_dev_bundle rejected them with "result.json dw_ids [] do not match the bundle's […]", retried to budget, deferred, and rolled the work back — so a sweep could never close a bundled entry. The retired dev fork echoed the dw ids; the generic skill does not, and the adapter synthesized result.json without them. The orchestrator already owns the bundle→dw-id binding (_post_dev_state_sync), so verify_dev_bundle now enforces the dw_ids cross-check only when the session actually claims ids — an absent claim is the normal generic path and passes. As hardening, the run exports BMAD_AUTO_DW_IDS and the generic adapter stamps them onto the synthesized result so the cross-check stays live. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ndle it just fi…

coderabbitai · 2026-06-26T18:43:16Z

Warning

Review limit reached

@pbean, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 10 minutes and 34 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f1e8b51-5d54-4889-9a7f-551268da2e93

📥 Commits

Reviewing files that changed from the base of the PR and between 4ae716f and 3b55fa8.

📒 Files selected for processing (4)

src/automator/adapters/generic.py
src/automator/verify.py
tests/test_generic_tmux.py
tests/test_verify.py

Walkthrough

Updates release metadata to v0.7.4, propagates deferred-work DW-id bindings into synthesized results, and changes bundle verification to accept missing DW-id claims.

Changes

Deferred-work DW-id flow and release metadata

Layer / File(s)	Summary
Release metadata `.claude-plugin/marketplace.json`, `CHANGELOG.md`, `module.yaml`, `pyproject.toml`, `src/automator/__init__.py`, `src/automator/data/skills/bmad-auto-setup/assets/module.yaml`	Version fields and the changelog entry are updated to v0.7.4.
DW-id export to results `src/automator/engine.py`, `src/automator/adapters/generic.py`, `tests/test_generic_tmux.py`	`BMAD_AUTO_DW_IDS` is added to session env when deferred-work IDs exist, read back in `GenericDevAdapter._result_json`, and covered by adapter result tests.
DW-id verification `src/automator/verify.py`, `tests/test_verify.py`	`verify_dev_bundle` now skips the mismatch check when `result.json` omits `dw_ids`, and a regression test covers that path.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

bmad-code-org/bmad-auto#15: Also changes src/automator/adapters/generic.py and tests/test_generic_tmux.py around GenericDevAdapter._result_json and result discovery.

Poem

I hop with bundles, neat and bright,
and tuck DW-IDs in just right.
My result.json now knows the trail,
v0.7.4 sails off by rail. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the release and the deferred-work sweep regression fix described in the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch release/0.7.4

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

augmentcode · 2026-06-26T18:45:33Z

🤖 Augment PR Summary

Summary: Patch release 0.7.4 fixes a deferred-work sweep regression after migrating to the generic bmad-dev-auto dev primitive.

Changes:

Export bundle-owned deferred-work IDs via BMAD_AUTO_DW_IDS and stamp them into the synthesized dev result.json for generic sessions.
Relax verify_dev_bundle so the dw_ids cross-check is enforced only when the session actually claims IDs.
Add regression tests for the “missing/empty dw_ids” generic path and env parsing hardening.
Bump version metadata and update the changelog for 0.7.4.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. 1 suggestion posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-06-26T18:45:34Z

+                    # Bundle dev sessions: the orchestrator exports the bundle's
+                    # owned dw ids (the generic skill never authors them). Stamp
+                    # them onto the result so verify_dev_bundle's cross-check passes.
+                    dw_ids = [i for i in spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") if i]


BMAD_AUTO_DW_IDS is parsed via split(",") without trimming whitespace, so a value like "DW-1, DW-2" would produce " DW-2" and could cause verify_dev_bundle to retry due to an apparent mismatch. Consider normalizing each token (e.g., trimming) to make this env-based contract more robust to formatting differences (plugins/hooks, manual runs).

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

Good catch — addressed in 4ae716f. The orchestrator sets this env via ",".join(task.dw_ids) (no spaces), so the normal path was already safe, but a plugin session hook (_emit_session_gate) or a hand-set value could supply "DW-1, DW-2" and trip verify_dev_bundle's dw_ids cross-check into the exact spurious defer this PR fixes. The parse now .strip()s each token, and test_generic_dev_bundle_stamps_dw_ids_from_env covers the whitespace-tolerant case.

The orchestrator joins the env value without spaces, but a hook-rewritten or hand-set "DW-1, DW-2" would parse to " DW-2" and trip verify_dev_bundle's dw_ids cross-check into a spurious defer. Strip each token. Adds a generic- adapter test covering the whitespace-tolerant parse. Addresses PR #17 review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pbean · 2026-06-26T18:58:57Z

augment review
@CodeRabbit review

coderabbitai · 2026-06-26T19:00:36Z

@pbean: I’ll take another pass and augment the review.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

augmentcode

Review completed. 1 suggestion posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-06-26T19:01:49Z

+                    # Bundle dev sessions: the orchestrator exports the bundle's
+                    # owned dw ids (the generic skill never authors them). Stamp
+                    # them onto the result so verify_dev_bundle's cross-check passes.
+                    raw_dw_ids = spec.env.get("BMAD_AUTO_DW_IDS", "").split(",")


spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") will raise if a plugin/session hook sets BMAD_AUTO_DW_IDS to a non-string (e.g., None) rather than deleting it; that would crash result synthesis and could incorrectly mark a completed session as stalled.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

Valid — fixed in e277c34. spec.env is typed dict[str, str], so a None value is a contract violation, but a misbehaving hook setting it (rather than deleting the key) would crash synthesis and false-stall a completed session. The parse now coerces with (spec.env.get("BMAD_AUTO_DW_IDS") or ""), matching the story_key guard on the line above; test_generic_dev_dw_ids_none_env_does_not_crash covers it.

coderabbitai

🧹 Nitpick comments (1)

tests/test_verify.py (1)
278-289: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for explicit empty dw_ids.

The regression called out in this PR was an empty dw_ids array, but this test only covers a missing key. A companion case with {"dw_ids": []} would lock down the exact payload that previously caused defer/retry behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_verify.py` around lines 278 - 289, Add a companion test case in
test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids
payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but
set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert
the result still passes and the spec_file is bound as expected. This will lock
down the empty-array path in addition to the missing-key case.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_verify.py`:
- Around line 278-289: Add a companion test case in
test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids
payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but
set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert
the result still passes and the spec_file is bound as expected. This will lock
down the empty-array path in addition to the missing-key case.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fb3d9de4-f220-41cb-ba42-1d6899b613f4

📥 Commits

Reviewing files that changed from the base of the PR and between 64b8dc2 and 4ae716f.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (11)

.claude-plugin/marketplace.json
CHANGELOG.md
module.yaml
pyproject.toml
src/automator/__init__.py
src/automator/adapters/generic.py
src/automator/data/skills/bmad-auto-setup/assets/module.yaml
src/automator/engine.py
src/automator/verify.py
tests/test_generic_tmux.py
tests/test_verify.py

… value A plugin/session hook that sets BMAD_AUTO_DW_IDS to None (rather than deleting it) would make `.split(",")` raise and crash result synthesis, false-stalling a completed session. Coerce to "" the same way the adjacent story_key line guards. Adds tests: a None-valued env no longer crashes synthesis, and an explicit empty dw_ids list (the literal payload that defered in production) passes verify. Addresses PR #17 review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pbean · 2026-06-26T19:17:56Z

@coderabbitai addressed in e277c34: test_verify_dev_bundle_absent_dw_ids_passes is now parametrized over {} (missing key) and {"dw_ids": []} (the literal empty-array payload that defered in production), both asserting verify passes and binds spec_file. Thanks for the catch.

coderabbitai · 2026-06-26T19:18:13Z

Tip

For best results, initiate chat on the files or code changes.

@pbean Nice — thanks for tightening that up. Parametrizing the test over both the missing-key case and the literal empty-array payload closes the regression gap I was worried about, and the spec_file assertion keeps the bundle-binding behavior covered as well.

🐇✨

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

pbean · 2026-06-26T19:24:44Z

augment review

augmentcode

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-06-26T19:26:40Z


    claimed_ids = {str(i) for i in rj.get("dw_ids", [])}
-    if claimed_ids != set(task.dw_ids):
+    if claimed_ids and claimed_ids != set(task.dw_ids):


src/automator/verify.py:747: claimed_ids = {str(i) for i in rj.get("dw_ids", [])} will raise if a result payload includes "dw_ids": null (or any non-iterable), which can crash bundle verification even though empty/absent claims are now treated as valid.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

Valid — fixed in 3b55fa8. Our synthesize_result never emits a null dw_ids (it's gated if dw_ids:), but verify_dev_bundle is a standalone dict consumer with no launch-time protection, so rj.get("dw_ids", []) returning None on an explicit "dw_ids": null would crash the set comprehension. Now coerced with or []; the parametrized test_verify_dev_bundle_absent_dw_ids_passes gains a null case.

augmentcode · 2026-06-26T19:26:40Z

+                    # Bundle dev sessions: the orchestrator exports the bundle's
+                    # owned dw ids (the generic skill never authors them). Stamp
+                    # them onto the result so verify_dev_bundle's cross-check passes.
+                    raw_dw_ids = (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",")


src/automator/adapters/generic.py:331: (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",") still assumes a string; if a hook sets a truthy non-string (e.g., list/int), this will raise and can incorrectly mark a completed session as stalled.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

Declining this one, by design. spec.env is typed dict[str, str] and is consumed as the process environment ({**self.profile.env, **spec.env} at generic.py:118/142), which only accepts string values — a list/int there would fail at session launch, so the session could never reach the Stop event that calls _result_json. The guarded state is unreachable. And str()-coercing would be actively wrong: a list ["DW-1","DW-2"] stringifies to "['DW-1', 'DW-2']", which split(",") turns into garbage tokens → a spurious dw_ids mismatch, the exact false-defer this PR fixes. A crash on a hard type-contract violation is preferable to silently-wrong behavior; the shipped or "" guard already covers None, the only idiomatic unset sentinel.

rj.get("dw_ids", []) returns None on a "dw_ids": null payload (the default only covers an absent key), so the set comprehension raised TypeError and crashed bundle verification. Coerce with `or []`. Extends the parametrized test with the null case. Addresses PR #17 review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pbean and others added 2 commits June 26, 2026 11:31

chore(release): 0.7.4 — Deferred-work sweep no longer defers every bu…

4276c67

…ndle it just fi…

augmentcode Bot reviewed Jun 26, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

augmentcode Bot reviewed Jun 26, 2026

View reviewed changes

pbean merged commit df5418e into main Jun 26, 2026
7 checks passed

pbean deleted the release/0.7.4 branch June 26, 2026 19:59

Uh oh!

Conversation

pbean commented Jun 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Release 0.7.4

Fixed

Impact

Commits

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

augmentcode Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

augmentcode Bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode Bot Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pbean Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

pbean commented Jun 26, 2026

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

augmentcode Bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode Bot Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pbean Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pbean commented Jun 26, 2026

Uh oh!

coderabbitai Bot commented Jun 26, 2026

Uh oh!

pbean commented Jun 26, 2026

Uh oh!

augmentcode Bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode Bot Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pbean Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode Bot Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pbean Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

pbean commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

augmentcode Bot commented Jun 26, 2026 •

edited

Loading

augmentcode Bot Jun 26, 2026 •

edited

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

augmentcode Bot Jun 26, 2026 •

edited

Loading

augmentcode Bot Jun 26, 2026 •

edited

Loading

augmentcode Bot Jun 26, 2026 •

edited

Loading