Skip to content

chore(release): 0.7.4 — deferred-work sweep no longer defers every bundle it finished#17

Merged
pbean merged 5 commits into
mainfrom
release/0.7.4
Jun 26, 2026
Merged

chore(release): 0.7.4 — deferred-work sweep no longer defers every bundle it finished#17
pbean merged 5 commits into
mainfrom
release/0.7.4

Conversation

@pbean

@pbean pbean commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Release 0.7.4

Patch release fixing a regression in the deferred-work sweep introduced by the bmad-dev-auto migration.

Fixed

Deferred-work sweep no longer defers every bundle it just finished. After the migration to the generic upstream bmad-dev-auto primitive, bundle dev sessions completed the work but were rejected by verify_dev_bundle with result.json dw_ids [] do not match the bundle's […], retried to budget, deferred, and rolled the work back — so a sweep could never close a bundled entry. The retired dev fork used to echo the dw ids; the generic skill doesn't.

The orchestrator already owns the bundle→dw-id binding (SweepEngine._post_dev_state_sync), so verify_dev_bundle now enforces the dw_ids cross-check only when the session actually claims ids — an absent claim is the normal generic path and passes. As hardening, the run exports BMAD_AUTO_DW_IDS and the generic adapter stamps them onto the synthesized result.json so the cross-check stays live.

Impact

Observed on a real Lights-Out sweep (run 20260625-203150-80f6): every bundle ran a full dev session (tens of millions of tokens each), did the work, then deferred and discarded it. Deterministic for every bundle on the prior code. No manual ledger repair needed — the rollbacks already reverted the premature sweep-bundle-closed marks, so a fresh sweep closes the entries correctly.

Commits

  • fix(bmad-auto): the verify relaxation + BMAD_AUTO_DW_IDS plumbing + regression test
  • chore(release): 0.7.4: version stamp + CHANGELOG

Verification

  • New regression test test_verify_dev_bundle_absent_dw_ids_passes (the generic no-dw_ids path the original tests never covered); mismatch test still asserts an explicit wrong claim fails
  • Full suite: 1072 passed
  • trunk check (full, no filter): clean

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Fixed

    • Resolved an issue where deferred-work bundles could fail to close properly during verification.
    • Bundled results now retain the correct deferred-work identifiers, improving validation for related workflows.
  • Chores

    • Updated the release/version number to 0.7.4 across package and plugin metadata.
  • Documentation

    • Added a new changelog entry for v0.7.4.

pbean and others added 2 commits June 26, 2026 11:31
…it finished

After the migration to the generic upstream bmad-dev-auto primitive, bundle dev
sessions completed the work but verify_dev_bundle rejected them with
"result.json dw_ids [] do not match the bundle's […]", retried to budget,
deferred, and rolled the work back — so a sweep could never close a bundled
entry. The retired dev fork echoed the dw ids; the generic skill does not, and
the adapter synthesized result.json without them.

The orchestrator already owns the bundle→dw-id binding (_post_dev_state_sync),
so verify_dev_bundle now enforces the dw_ids cross-check only when the session
actually claims ids — an absent claim is the normal generic path and passes. As
hardening, the run exports BMAD_AUTO_DW_IDS and the generic adapter stamps them
onto the synthesized result so the cross-check stays live.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@pbean, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 10 minutes and 34 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f1e8b51-5d54-4889-9a7f-551268da2e93

📥 Commits

Reviewing files that changed from the base of the PR and between 4ae716f and 3b55fa8.

📒 Files selected for processing (4)
  • src/automator/adapters/generic.py
  • src/automator/verify.py
  • tests/test_generic_tmux.py
  • tests/test_verify.py

Walkthrough

Updates release metadata to v0.7.4, propagates deferred-work DW-id bindings into synthesized results, and changes bundle verification to accept missing DW-id claims.

Changes

Deferred-work DW-id flow and release metadata

Layer / File(s) Summary
Release metadata
.claude-plugin/marketplace.json, CHANGELOG.md, module.yaml, pyproject.toml, src/automator/__init__.py, src/automator/data/skills/bmad-auto-setup/assets/module.yaml
Version fields and the changelog entry are updated to v0.7.4.
DW-id export to results
src/automator/engine.py, src/automator/adapters/generic.py, tests/test_generic_tmux.py
BMAD_AUTO_DW_IDS is added to session env when deferred-work IDs exist, read back in GenericDevAdapter._result_json, and covered by adapter result tests.
DW-id verification
src/automator/verify.py, tests/test_verify.py
verify_dev_bundle now skips the mismatch check when result.json omits dw_ids, and a regression test covers that path.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • bmad-code-org/bmad-auto#15: Also changes src/automator/adapters/generic.py and tests/test_generic_tmux.py around GenericDevAdapter._result_json and result discovery.

Poem

I hop with bundles, neat and bright,
and tuck DW-IDs in just right.
My result.json now knows the trail,
v0.7.4 sails off by rail. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the release and the deferred-work sweep regression fix described in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch release/0.7.4

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@augmentcode

augmentcode Bot commented Jun 26, 2026

Copy link
Copy Markdown
🤖 Augment PR Summary

Summary: Patch release 0.7.4 fixes a deferred-work sweep regression after migrating to the generic bmad-dev-auto dev primitive.

Changes:

  • Export bundle-owned deferred-work IDs via BMAD_AUTO_DW_IDS and stamp them into the synthesized dev result.json for generic sessions.
  • Relax verify_dev_bundle so the dw_ids cross-check is enforced only when the session actually claims IDs.
  • Add regression tests for the “missing/empty dw_ids” generic path and env parsing hardening.
  • Bump version metadata and update the changelog for 0.7.4.

🤖 Was this summary useful? React with 👍 or 👎

@augmentcode augmentcode Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/automator/adapters/generic.py Outdated
# Bundle dev sessions: the orchestrator exports the bundle's
# owned dw ids (the generic skill never authors them). Stamp
# them onto the result so verify_dev_bundle's cross-check passes.
dw_ids = [i for i in spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") if i]

@augmentcode augmentcode Bot Jun 26, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BMAD_AUTO_DW_IDS is parsed via split(",") without trimming whitespace, so a value like "DW-1, DW-2" would produce " DW-2" and could cause verify_dev_bundle to retry due to an apparent mismatch. Consider normalizing each token (e.g., trimming) to make this env-based contract more robust to formatting differences (plugins/hooks, manual runs).

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — addressed in 4ae716f. The orchestrator sets this env via ",".join(task.dw_ids) (no spaces), so the normal path was already safe, but a plugin session hook (_emit_session_gate) or a hand-set value could supply "DW-1, DW-2" and trip verify_dev_bundle's dw_ids cross-check into the exact spurious defer this PR fixes. The parse now .strip()s each token, and test_generic_dev_bundle_stamps_dw_ids_from_env covers the whitespace-tolerant case.

The orchestrator joins the env value without spaces, but a hook-rewritten or
hand-set "DW-1, DW-2" would parse to " DW-2" and trip verify_dev_bundle's
dw_ids cross-check into a spurious defer. Strip each token. Adds a generic-
adapter test covering the whitespace-tolerant parse.

Addresses PR #17 review feedback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pbean

pbean commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

augment review
@CodeRabbit review

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

@pbean: I’ll take another pass and augment the review.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@augmentcode augmentcode Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/automator/adapters/generic.py Outdated
# Bundle dev sessions: the orchestrator exports the bundle's
# owned dw ids (the generic skill never authors them). Stamp
# them onto the result so verify_dev_bundle's cross-check passes.
raw_dw_ids = spec.env.get("BMAD_AUTO_DW_IDS", "").split(",")

@augmentcode augmentcode Bot Jun 26, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec.env.get("BMAD_AUTO_DW_IDS", "").split(",") will raise if a plugin/session hook sets BMAD_AUTO_DW_IDS to a non-string (e.g., None) rather than deleting it; that would crash result synthesis and could incorrectly mark a completed session as stalled.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid — fixed in e277c34. spec.env is typed dict[str, str], so a None value is a contract violation, but a misbehaving hook setting it (rather than deleting the key) would crash synthesis and false-stall a completed session. The parse now coerces with (spec.env.get("BMAD_AUTO_DW_IDS") or ""), matching the story_key guard on the line above; test_generic_dev_dw_ids_none_env_does_not_crash covers it.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/test_verify.py (1)

278-289: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for explicit empty dw_ids.

The regression called out in this PR was an empty dw_ids array, but this test only covers a missing key. A companion case with {"dw_ids": []} would lock down the exact payload that previously caused defer/retry behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_verify.py` around lines 278 - 289, Add a companion test case in
test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids
payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but
set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert
the result still passes and the spec_file is bound as expected. This will lock
down the empty-array path in addition to the missing-key case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_verify.py`:
- Around line 278-289: Add a companion test case in
test_verify_dev_bundle_absent_dw_ids_passes to cover an explicit empty dw_ids
payload. Reuse verify.verify_dev_bundle and the existing project/task setup, but
set rj to include "dw_ids": [] alongside "workflow" and "spec_file", then assert
the result still passes and the spec_file is bound as expected. This will lock
down the empty-array path in addition to the missing-key case.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fb3d9de4-f220-41cb-ba42-1d6899b613f4

📥 Commits

Reviewing files that changed from the base of the PR and between 64b8dc2 and 4ae716f.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • .claude-plugin/marketplace.json
  • CHANGELOG.md
  • module.yaml
  • pyproject.toml
  • src/automator/__init__.py
  • src/automator/adapters/generic.py
  • src/automator/data/skills/bmad-auto-setup/assets/module.yaml
  • src/automator/engine.py
  • src/automator/verify.py
  • tests/test_generic_tmux.py
  • tests/test_verify.py

… value

A plugin/session hook that sets BMAD_AUTO_DW_IDS to None (rather than deleting
it) would make `.split(",")` raise and crash result synthesis, false-stalling a
completed session. Coerce to "" the same way the adjacent story_key line guards.
Adds tests: a None-valued env no longer crashes synthesis, and an explicit empty
dw_ids list (the literal payload that defered in production) passes verify.

Addresses PR #17 review feedback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pbean

pbean commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai addressed in e277c34: test_verify_dev_bundle_absent_dw_ids_passes is now parametrized over {} (missing key) and {"dw_ids": []} (the literal empty-array payload that defered in production), both asserting verify passes and binds spec_file. Thanks for the catch.

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

@pbean Nice — thanks for tightening that up. Parametrizing the test over both the missing-key case and the literal empty-array payload closes the regression gap I was worried about, and the spec_file assertion keeps the bundle-binding behavior covered as well.

🐇✨


If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

@pbean

pbean commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

augment review

@augmentcode augmentcode Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/automator/verify.py

claimed_ids = {str(i) for i in rj.get("dw_ids", [])}
if claimed_ids != set(task.dw_ids):
if claimed_ids and claimed_ids != set(task.dw_ids):

@augmentcode augmentcode Bot Jun 26, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/automator/verify.py:747: claimed_ids = {str(i) for i in rj.get("dw_ids", [])} will raise if a result payload includes "dw_ids": null (or any non-iterable), which can crash bundle verification even though empty/absent claims are now treated as valid.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid — fixed in 3b55fa8. Our synthesize_result never emits a null dw_ids (it's gated if dw_ids:), but verify_dev_bundle is a standalone dict consumer with no launch-time protection, so rj.get("dw_ids", []) returning None on an explicit "dw_ids": null would crash the set comprehension. Now coerced with or []; the parametrized test_verify_dev_bundle_absent_dw_ids_passes gains a null case.

# Bundle dev sessions: the orchestrator exports the bundle's
# owned dw ids (the generic skill never authors them). Stamp
# them onto the result so verify_dev_bundle's cross-check passes.
raw_dw_ids = (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",")

@augmentcode augmentcode Bot Jun 26, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/automator/adapters/generic.py:331: (spec.env.get("BMAD_AUTO_DW_IDS") or "").split(",") still assumes a string; if a hook sets a truthy non-string (e.g., list/int), this will raise and can incorrectly mark a completed session as stalled.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declining this one, by design. spec.env is typed dict[str, str] and is consumed as the process environment ({**self.profile.env, **spec.env} at generic.py:118/142), which only accepts string values — a list/int there would fail at session launch, so the session could never reach the Stop event that calls _result_json. The guarded state is unreachable. And str()-coercing would be actively wrong: a list ["DW-1","DW-2"] stringifies to "['DW-1', 'DW-2']", which split(",") turns into garbage tokens → a spurious dw_ids mismatch, the exact false-defer this PR fixes. A crash on a hard type-contract violation is preferable to silently-wrong behavior; the shipped or "" guard already covers None, the only idiomatic unset sentinel.

rj.get("dw_ids", []) returns None on a "dw_ids": null payload (the default only
covers an absent key), so the set comprehension raised TypeError and crashed
bundle verification. Coerce with `or []`. Extends the parametrized test with the
null case.

Addresses PR #17 review feedback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pbean pbean merged commit df5418e into main Jun 26, 2026
7 checks passed
@pbean pbean deleted the release/0.7.4 branch June 26, 2026 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant