release: 0.7.1 — escalation auto-recovery + Log-tab underline fix#13
Conversation
…back The resolve workflow corrects the frozen spec under the BMAD artifact folder (_bmad-output/..., tracked). On resume the resolved re-drive ran attempt_dirty against the frozen baseline_commit, read the spec edit as a failed attempt, and with scm.rollback_on_failure OFF (default) paused for a manual rollback. Because the check diffs the working tree against the frozen baseline, even committing the spec re-paused on the next resume — an endless loop. A latent sibling bug: with the flag ON, safe_rollback's `git reset --hard` would silently revert the spec. A resolved re-drive is human-initiated, so it now always auto-recovers regardless of the flag. The BMAD artifact folders are treated as orchestrator-owned on that path: excluded from the dirty check (attempt_dirty gains `exclude=`) and preserved through the reset (safe_rollback gains `preserve=`, snapshotting tracked content via `git stash create` and restoring the artifact paths after the hard reset). The failed attempt's source changes still revert to baseline. Unattended/stopped attempts are unchanged: OFF pauses, ON does a clean reset. _safe_reset takes opt-in `preserve=` so sweep's ledger recovery (same folder) still reverts a broken ledger. The manual-recovery notice is now stopped-only. Bump to 0.7.1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Log tab replays raw pipe-pane captures through pyte. Modern CLIs (Claude Code et al.) emit XTMODKEYS at startup — `CSI > 4 ; 2 m` (modifyOtherKeys) — a private-marker CSI sequence that is not an SGR. pyte 0.8.2 ignores the `>` marker (only `?` sets private; `>` is `pass`-ed) and dispatches it to SGR anyway, reading the `4` as underline-on. The matching disable (`CSI > 4 ; 0 m`) only fires at app exit, which a live or killed capture never contains, so every rendered line stayed underlined. LogView.read_new now strips private-marker SGR (`CSI [<>=?] ... m`) before feeding pyte. A legitimate SGR never carries such a marker, so the filter can only remove non-display sequences — never printable text or genuine color/bold/ underline styling. A held-back trailing-CSI keeps it correct when a sequence straddles two reads (the file is re-read from the held offset); the file seek/offset bookkeeping is unchanged and the approximate log_pos->line mapping is unaffected. Verified over 110 real run logs through the real LogView: worst underline 38-97% -> 0.0%, reverse 0.0%, color coverage preserved, plain text unchanged. Adds regression, legitimate-underline-preserved, and split-across-reads tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughBumped release metadata to 0.7.1, added release notes, refined rollback handling for resolved escalations, and updated log rendering to strip private-marker CSI sequences before Changes0.7.1 release updates
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🤖 Augment PR SummarySummary: Release 0.7.1 focuses on two user-facing reliability fixes: safer auto-recovery after resolved CRITICAL escalations in in-place mode, and a TUI Log-tab rendering fix that prevented “everything underlined” sessions. Changes:
Technical Notes: The rollback preserve path snapshots via 🤖 Was this summary useful? React with 👍 or 👎 |
| # Best-effort per-path: a dir with no tracked files in the snapshot makes | ||
| # `git checkout` exit non-zero ("pathspec did not match") — not an error. | ||
| for d in preserve: | ||
| _git(repo, "checkout", snapshot, "--", d) |
There was a problem hiding this comment.
safe_rollback() ignores the return code/output from git checkout when restoring preserve paths; if checkout fails for reasons other than a missing pathspec, the corrected spec may be silently lost and the resolved re-drive could regress back into recovery loops.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Validated and fixed in 05135d9. The restore loop tolerated the benign empty-dir case (pathspec ... did not match) but swallowed every other git checkout failure too, so a corrupt/locked snapshot could drop the corrected spec silently and loop the re-drive. It now raises GitError on a non-benign failure and still tolerates the empty-dir case. Covered by test_safe_rollback_raises_on_genuine_restore_failure and test_safe_rollback_tolerates_empty_preserve_dir.
| while len(self._checkpoints) > 1 and self._checkpoints[1][1] <= top.dropped: | ||
| self._checkpoints.pop(0) | ||
| return True | ||
| return consumed > 0 |
There was a problem hiding this comment.
If a log ends with an incomplete CSI tail, held == len(chunk) keeps consumed == 0, so _offset never reaches EOF and read_new() will re-read the same tail on every poll (and never consider the file fully consumed). That can also keep LogIndex checkpointing permanently behind the actual file size.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Validated as reachable but intentionally not fixing — added a documenting comment in 05135d9 instead. When a chunk is entirely an unterminated trailing CSI, consumed == 0 and _offset does hold a few bytes short of EOF. But the impact is negligible and there's no safe fix:
- During active streaming it self-heals the instant the next write completes the sequence.
- A log that permanently ends mid-escape has nothing renderable in those held bytes;
read_new()correctly returnsFalse, the only caller (dashboard.py) just skips an index refresh, andline_for_offsetclamps past-EOF to the last row anyway. It is not a busy loop — it's a few bytes re-read per poll. - We can't advance past the held bytes without risking eating a legitimately split escape sequence — which is the whole reason they're held back.
So forcing _offset to EOF would trade a harmless cosmetic offset for a real correctness regression. Documented the rationale inline.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/automator/engine.py`:
- Around line 680-714: The rollback flow in _rollback_or_pause and _safe_reset
only preserves _protected_relpaths() on the initial resolved cleanup, so a later
retry/defer rollback can still revert the corrected spec. Update the re-drive
state handling in engine.py so the resolved artifact path under _bmad-output
stays protected for the entire re-drive, not just when cause == "resolved"; use
the existing _protected_relpaths(), _finish_inflight(), and _dev_phase() flow to
carry that preservation through subsequent resets.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 92e2e715-44de-4834-baca-ca1ee234e94c
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (14)
.claude-plugin/marketplace.jsonCHANGELOG.mdmodule.yamlpyproject.tomlsrc/automator/__init__.pysrc/automator/data/settings/core.tomlsrc/automator/data/skills/bmad-auto-setup/assets/module.yamlsrc/automator/engine.pysrc/automator/policy.pysrc/automator/tui/data.pysrc/automator/verify.pytests/test_engine.pytests/test_tui_data.pytests/test_verify.py
Two automated review findings validated and fixed; one validated as harmless and documented. - engine/model: carry resolved-spec preservation through the WHOLE resolved re-drive, not just the resume-time reset. A new latched StoryTask.resolved_redrive flag keeps the BMAD artifact folders preserved/excluded on every rollback until the correction commits, so a later mid-re-drive retry/defer reset (with rollback_on_failure=true) can no longer silently revert the human correction and loop the re-drive. (coderabbitai) - verify.safe_rollback: a genuine `git checkout` failure while restoring the preserved folders now raises GitError instead of being swallowed alongside the benign empty-dir "pathspec did not match" case — so a corrected spec can't vanish without a trace. (augmentcode) - tui/data.py: documented why an all-held incomplete-CSI chunk is intentionally not consumed (validated harmless, no safe advance possible). (augmentcode) Tests: model round-trip, engine later-rollback preservation regression, verify genuine-failure-raises + empty-dir-tolerated. Full suite green; trunk clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Release 0.7.1. Two independent fixes, version bumped to
0.7.1, CHANGELOG curated.Fixes
fix(bmad-auto): auto-recover resolved escalations without manual rollbackResolving a CRITICAL escalation looped on a manual-rollback prompt when the resolve edited the frozen spec. The resolve workflow's whole job is to correct the spec under the tracked BMAD artifact folder (
_bmad-output/...), so on resume the orchestrator saw a dirty tree and — with the defaultscm.rollback_on_failure = false— paused for a manual reset; because the dirty check diffs against the frozenbaseline_commit, even committing the spec re-paused, an endless loop.A resolved re-drive is human-initiated, so it now always auto-recovers regardless of the flag: BMAD artifact folders are treated as orchestrator-owned — excluded from the dirty check and preserved through the reset — so the spec correction survives while the failed attempt's source reverts to baseline. Also fixes a latent sibling bug where
rollback_on_failure = truesilently reverted the just-corrected spec. The flag now governs only unattended/stopped attempts; tooltips/docstrings updated to match.fix(tui): strip private-marker SGR so the Log tab isn't all underlinedThe Log tab replays raw
pipe-panecaptures through pyte. Modern CLIs (Claude Code et al.) emit XTMODKEYS at startup —CSI > 4 ; 2 m(modifyOtherKeys) — a private-marker CSI sequence that is not an SGR. pyte 0.8.2 ignores the>marker and dispatches it to SGR anyway, reading the4as underline-on; the matching disable only fires at app exit, which a live/killed capture never contains, so every line stayed underlined.LogView.read_newnow strips private-marker SGR (CSI [<>=?] … m) before feeding pyte. A legitimate SGR never carries such a marker, so the filter can only remove non-display sequences — never printable text or genuine color/bold/properly-closed underline styling. A held-back trailing-CSI keeps it correct when a sequence straddles two reads.Verified over 110 real run logs through the real
LogView: worst underline 38–97% → 0.0%, reverse 0.0%, color coverage preserved, plain text byte-identical. Adds regression, legitimate-underline-preserved, and split-across-reads tests.Release mechanics
0.7.1acrosspyproject.toml,module.yaml,__init__.py,marketplace.json,uv.lock.main, the Release workflow auto-creates thev0.7.1tag + GitHub release from the CHANGELOG.Tests
tests/test_tui_data.py: 47 pass (3 new);tests/test_engine.py+tests/test_verify.pycover the escalation fix.trunk checkclean.🤖 Generated with Claude Code
Summary by CodeRabbit