fix(detection): detect indented spinners and CR-overwritten esc-to-interrupt#108

Merged
tstapler merged 8 commits into main from stapler-squad-detection-bug on Jun 12, 2026

Conversation

@tstapler

Owner

Summary

Fixes two status detection bugs affecting sessions running Claude Code's task manager panel:

Bug 1 — Active session detected as idle when the task manager panel is open:

  • The claude_thinking_verb regex required the spinner character at column 0 (^[spinner]). Spinners inside the task manager are indented 2 spaces ("  ✽ Roosting…"), so they never matched.
  • The task manager also overlays the "esc to interrupt" status bar by writing "↑/↓ to select" with a \r prefix on the same row. collapseCarriageReturns discarded the hidden "esc to interrupt" text, so the session appeared idle.
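
The column-0 failure can be reproduced with a minimal sketch. The two patterns below are simplified stand-ins for claude_thinking_verb (a trimmed spinner set and verb class, both assumptions made for illustration), not the pattern shipped in detector.go:

```go
package main

import (
	"fmt"
	"regexp"
)

// Simplified, hypothetical stand-ins for the claude_thinking_verb pattern.
// The spinner set and verb class are trimmed for illustration only.
var (
	// Old behaviour: the spinner must sit at column 0.
	columnZero = regexp.MustCompile(`(?m)^[✻✽*]\s+[A-Z][a-z]*…`)
	// New behaviour as described in this PR: leading whitespace is allowed.
	indented = regexp.MustCompile(`(?m)^\s*[✻✽*]\s+[A-Z][a-z]*…`)
)

func main() {
	lines := []string{
		"✽ Roosting…",   // top-level spinner
		"  ✽ Roosting…", // spinner inside the task manager panel
	}
	for _, line := range lines {
		fmt.Printf("%q column0=%v indented=%v\n",
			line, columnZero.MatchString(line), indented.MatchString(line))
	}
}
```

Only the second pattern matches the indented line, which is the case the task manager panel produces.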

Bug 2 — InputRequired not flagged when a selection dialog appears below a prior completion line:

  • mapStatusToIdleState lacked explicit cases for StatusInputRequired and StatusSuccess, both falling to default: return IdleStateWaiting. Added explicit cases to document intent.

Changes

session/detection/detector.go

  • Pattern fix: (?m)^[spinner] → (?m)^\s*[spinner] to allow indented spinners
  • DetectFromLines / DetectWithContextFromLines: scan CR-split segments in reverse; the last (visual) segment is authoritative, but earlier segments can promote high-urgency statuses (Active, NeedsApproval, InputRequired, Error) — low-urgency statuses (Success, Processing) in overwritten earlier segments are not promoted
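
A rough sketch of the CR-segment rule described above, under stated assumptions: classify is a toy substring matcher standing in for the real pattern set, and the Status constant names and detectRow helper are hypothetical. It only illustrates the precedence rule (last segment authoritative, earlier segments may promote high-urgency statuses), not the actual DetectFromLines implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// Status names here are assumptions modelled on the statuses the PR lists.
type Status int

const (
	StatusUnknown Status = iota
	StatusReady
	StatusIdle
	StatusSuccess
	StatusProcessing
	StatusActive
	StatusNeedsApproval
	StatusInputRequired
	StatusError
)

// classify is a toy, hypothetical stand-in for the real pattern matcher.
func classify(segment string) Status {
	switch {
	case strings.Contains(segment, "esc to interrupt"):
		return StatusActive
	case strings.Contains(segment, "? for shortcuts"):
		return StatusIdle
	case strings.Contains(segment, "Baked for"):
		return StatusSuccess
	default:
		return StatusUnknown
	}
}

// highUrgency reports whether an overwritten segment may still be promoted.
func highUrgency(s Status) bool {
	switch s {
	case StatusActive, StatusNeedsApproval, StatusInputRequired, StatusError:
		return true
	}
	return false
}

// detectRow classifies one terminal row that may contain \r overwrites.
func detectRow(row string) Status {
	segments := strings.Split(row, "\r")
	last := classify(segments[len(segments)-1])
	if last != StatusUnknown && last != StatusReady {
		return last // the visually-last segment is authoritative
	}
	// Walk earlier segments from last-written to first-written.
	for i := len(segments) - 2; i >= 0; i-- {
		if s := classify(segments[i]); highUrgency(s) {
			return s
		}
	}
	return last
}

func main() {
	fmt.Println(detectRow("esc to interrupt\r↑/↓ to select") == StatusActive)
	fmt.Println(detectRow("esc to interrupt\r? for shortcuts") == StatusIdle)
}
```

In this sketch an overwritten Success segment is never promoted, while an overwritten "esc to interrupt" is promoted only when the visible segment carries no definite status of its own.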

session/detection/idle.go

  • Explicit case StatusInputRequired and case StatusSuccess in mapStatusToIdleState
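
For illustration, the shape of the explicit cases might look like the sketch below. The type definitions and the reduced set of constants are assumptions; only the names mapStatusToIdleState, StatusInputRequired, StatusSuccess, StatusActive, IdleStateActive and IdleStateWaiting come from this PR:

```go
package main

import "fmt"

// Reduced, hypothetical versions of the detection package types.
type DetectedStatus int

const (
	StatusUnknown DetectedStatus = iota
	StatusActive
	StatusInputRequired
	StatusSuccess
)

type IdleState int

const (
	IdleStateWaiting IdleState = iota
	IdleStateActive
)

// mapStatusToIdleState spells out the two cases that previously relied on
// the default branch. Behaviour is unchanged; the intent is now visible.
func mapStatusToIdleState(s DetectedStatus) IdleState {
	switch s {
	case StatusActive:
		return IdleStateActive
	case StatusInputRequired:
		return IdleStateWaiting // explicit: the session is waiting on the user
	case StatusSuccess:
		return IdleStateWaiting // explicit: the task finished
	default:
		return IdleStateWaiting
	}
}

func main() {
	fmt.Println(mapStatusToIdleState(StatusInputRequired) == IdleStateWaiting)
	fmt.Println(mapStatusToIdleState(StatusActive) == IdleStateActive)
}
```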

Tests

  • session/detection/bug_regression_test.go — 11 regression tests covering both bugs and boundary cases (CR-collapse, idle-stays-idle, last-segment-authoritative)
  • session/detection/snapshot_test.go — new snapshot entry for active session with task manager overlay
  • session/detection/testdata/ — two new fixtures

Note: This branch includes 3 commits from local main that are not yet on origin/main (feat(sessions): Phase 1 programmatic control, fix(db): approval_rule JSON Optional, chore(ent): untrack gitignored generated files). The detection fixes are the 3 fix(detection) / test(detection) commits.

🤖 Generated with Claude Code

tstapler and others added 6 commits June 9, 2026 14:21
…pture for OneShot

Two improvements to OneShot (-p) session control:

1. **Fix initial prompt not submitting** (\r instead of \n): The session driver
   was sending initialPrompt + "\n" (LF) to the PTY. Claude Code's interactive
   readline needs \r (CR) to submit — the same signal a physical Enter key sends.
   Sessions received the typed text but never executed it, causing inactivity
   timeouts. Confirmed by log: "sent initial prompt" followed by 10-min idle.
   The startup dialog answers ("1\n") work because those menus handle both,
   but Claude's readline interface only responds to \r.

2. **Capture claude session_id from --output-format json output**: OneShot
   sessions now launch with -p --output-format json. When the session exits,
   the driver parses the JSON output for "session_id" and stores it as
   ConversationUUID on the instance. Subsequent restarts automatically use
   --resume <uuid>, sending Claude back into the same conversation with full
   context instead of re-running the task from scratch.

   Supporting infrastructure:
   - SetClaudeConversationUUID() — thread-safe setter that fires a save callback
   - SetClaudeSessionIDSavedCallback() — wired in service layer to flush to DB
   - wireClaudeSessionIDCallback() — registered on all creation paths including
     loadInstancesWithWiring (startup) and CreateDirectorySession (backlog)
   - Prompt is now appended even when claudeSessionID is set for OneShot
     sessions so continuation prompts are delivered after --resume

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New JSON columns (programs, subcommands, etc.) were NOT NULL with no
SQL-level DEFAULT, causing the SQLite copy-table migration to fail for
existing rows. Adding Optional() makes the columns nullable so old rows
can be copied; new rows still get []string{} from the Go-level Default.

Also adds build backup + auto-rollback to install-service:
- make install-service saves stapler-squad.prev before building
- health check polls /health for 15s after service start
- on failure, auto-restores .prev and restarts the service
- make rollback provides a manual escape hatch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These files are auto-generated by ent and covered by .gitignore.
Force-added in the previous commit by mistake.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…terrupt

Two compounding bugs caused active sessions to appear idle when Claude Code's
task manager panel was open:

1. The `claude_thinking_verb` pattern required the spinner at column 0
   (`(?m)^[spinner]`). When a spinner appears as a sub-item inside the task
   manager panel it is indented by 2 spaces ("  ✽ Roosting…"), so the pattern
   failed to match. Fix: `(?m)^\s*[spinner]` allows any leading whitespace.

2. `DetectFromLines` applied `collapseCarriageReturns` before scanning, which
   silently discarded "esc to interrupt" when the task manager panel overwrote
   it via `\r` on the same terminal row. The scan then continued upward and
   found "✻ Baked for 18m 15s" (StatusSuccess), mapping to IdleStateWaiting
   via the `default` branch instead of IdleStateActive. Fix: scan CR-split
   segments in reverse (last-written → first-written) in both DetectFromLines
   and DetectWithContextFromLines; the last segment still takes precedence
   (preserves "esc to interrupt\r? for shortcuts" → Idle), but earlier segments
   are checked when the last one is only Ready/Unknown.

Also adds explicit `StatusInputRequired` and `StatusSuccess` cases to
`mapStatusToIdleState` (both already resolved to `IdleStateWaiting` via the
`default` branch — making them explicit documents intent and prevents future
omissions from silently passing).

Regression tests in bug_regression_test.go cover:
- Indented spinner variants (2-space, 4-space, tab)
- CR-collapse: esc-to-interrupt preserved when overwritten by task manager
- CR-collapse: idle state still detected when overwritten by "? for shortcuts"
- Full-content DetectFromLines returning Active despite older Success scrollback
- InputRequired dialog detected correctly over older Success completion line
- "Esc to cancel" (capital E) correctly NOT matching esc_to_interrupt (Active)
- mapStatusToIdleState explicit coverage for all DetectedStatus values

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DetectFromLines / DetectWithContextFromLines: earlier CR segments now
only promote high-urgency statuses (Active, NeedsApproval,
InputRequired, Error). Low-urgency statuses (Success, Processing, Idle)
in overwritten segments no longer override the visually-last segment.

TestMapStatusToIdleState_ExplicitCoverage: replace IdleState(-1) sentinel
with a dedicated skipMustNot bool field to eliminate the type-cast hack.

TestBug1_IndentedSpinner_NoRegression: add inline clause-level comments
explaining which regex clause rejects each false-positive case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
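
One way to picture the skipMustNot change from the commit above: with a plain mustNot field, a row that has no forbidden state still carries the zero value of IdleState, which is itself a valid state. The sketch below uses hypothetical types, a reduced mapStatusToIdleState, and made-up row and helper names to show why a separate bool is cleaner than an IdleState(-1) sentinel:

```go
package main

import "fmt"

// Reduced, hypothetical versions of the detection package types.
type DetectedStatus int

const (
	StatusActive DetectedStatus = iota
	StatusSuccess
	StatusUnknown
)

type IdleState int

const (
	IdleStateWaiting IdleState = iota
	IdleStateActive
)

func mapStatusToIdleState(s DetectedStatus) IdleState {
	if s == StatusActive {
		return IdleStateActive
	}
	return IdleStateWaiting
}

// coverageRow is one table-driven case. skipMustNot disables the negative
// check instead of encoding "no check" as an out-of-range IdleState value.
type coverageRow struct {
	name        string
	status      DetectedStatus
	want        IdleState
	mustNot     IdleState
	skipMustNot bool
}

// checkRows returns a description of every failed expectation.
func checkRows(rows []coverageRow) []string {
	var failures []string
	for _, r := range rows {
		got := mapStatusToIdleState(r.status)
		if got != r.want {
			failures = append(failures, fmt.Sprintf("%s: got %d, want %d", r.name, got, r.want))
		}
		if !r.skipMustNot && got == r.mustNot {
			failures = append(failures, fmt.Sprintf("%s: got forbidden state %d", r.name, got))
		}
	}
	return failures
}

func main() {
	rows := []coverageRow{
		{name: "Active", status: StatusActive, want: IdleStateActive, mustNot: IdleStateWaiting},
		{name: "Success", status: StatusSuccess, want: IdleStateWaiting, mustNot: IdleStateActive},
		{name: "Unknown", status: StatusUnknown, want: IdleStateWaiting, skipMustNot: true},
	}
	fmt.Println(len(checkRows(rows)), "failures")
}
```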
…ent test

Add missing DetectedStatus rows (Idle, Ready, TestsFailing, Unknown) to
TestMapStatusToIdleState_ExplicitCoverage so the test actually covers every
status value, including the default fall-through cases.

Add TestCRCollapse_LastSegmentSuccessIsAuthoritative to document and guard the
asymmetry: when the LAST CR segment produces Success it is authoritative (mirrors
TestBug1_CRCollapse_EscToInterrupt_Preserved where the earlier Active segment wins).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 12, 2026 01:12
@github-actions

Contributor

✅ Registry Validation

Registry Validation
===================

Building backend scanner...
Scanning backend features...
Wrote 97 feature files to /tmp/tmp.HLiB4lJhyc/backend
Wrote 14 feature files to /tmp/tmp.HLiB4lJhyc/backend
Wrote 22 feature files to /tmp/tmp.HLiB4lJhyc/backend
Wrote 6 feature files to /tmp/tmp.HLiB4lJhyc/backend

=== Backend Registry Diff ===
Committed: 131  Generated: 130  Divergence: 0.76%
⚠️  Removed RPCs:
  - upload:image
⚠️  90 feature(s) missing // +api: marker (markerFound: false)

✅ Registry validation passed. Divergence: 0.76%

Test Coverage: 93/131 features have testIds (71.0%)

Registry validation is in observation mode until 2026-05-02.
After that date, divergence > 2% will block merges.
Coverage reporting is advisory only.

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

Go Benchmarks (Tier 1)

benchmarks/go/tier1-baseline.txt:8: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:1911: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:3851: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:5818: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:7757: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:9765: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:11762: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:13718: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:15713: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:22046: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:28331: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:34667: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:41112: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:47519: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:54111: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:60936: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:67663: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:72937: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:78608: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:84042: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:89486: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:95390: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:100594: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:105408: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:111071: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:117978: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:125832: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:132503: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:140205: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:146776: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:153558: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:160254: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:167029: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:174225: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:181126: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:188418: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:196178: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:204037: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:211583: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:219315: parsing iteration count: invalid syntax
tier1-bench.txt:8: parsing iteration count: invalid syntax
tier1-bench.txt:1015: parsing iteration count: invalid syntax
tier1-bench.txt:2911: parsing iteration count: invalid syntax
tier1-bench.txt:4917: parsing iteration count: invalid syntax
tier1-bench.txt:6917: parsing iteration count: invalid syntax
tier1-bench.txt:8870: parsing iteration count: invalid syntax
tier1-bench.txt:10881: parsing iteration count: invalid syntax
tier1-bench.txt:12852: parsing iteration count: invalid syntax
tier1-bench.txt:14857: parsing iteration count: invalid syntax
tier1-bench.txt:21677: parsing iteration count: invalid syntax
tier1-bench.txt:28617: parsing iteration count: invalid syntax
tier1-bench.txt:34755: parsing iteration count: invalid syntax
tier1-bench.txt:41201: parsing iteration count: invalid syntax
tier1-bench.txt:48920: parsing iteration count: invalid syntax
tier1-bench.txt:55825: parsing iteration count: invalid syntax
tier1-bench.txt:62985: parsing iteration count: invalid syntax
tier1-bench.txt:69447: parsing iteration count: invalid syntax
tier1-bench.txt:75194: parsing iteration count: invalid syntax
tier1-bench.txt:80399: parsing iteration count: invalid syntax
tier1-bench.txt:85452: parsing iteration count: invalid syntax
tier1-bench.txt:90803: parsing iteration count: invalid syntax
tier1-bench.txt:96392: parsing iteration count: invalid syntax
tier1-bench.txt:102005: parsing iteration count: invalid syntax
tier1-bench.txt:107076: parsing iteration count: invalid syntax
tier1-bench.txt:112511: parsing iteration count: invalid syntax
tier1-bench.txt:119558: parsing iteration count: invalid syntax
tier1-bench.txt:126730: parsing iteration count: invalid syntax
tier1-bench.txt:133624: parsing iteration count: invalid syntax
tier1-bench.txt:140317: parsing iteration count: invalid syntax
tier1-bench.txt:147600: parsing iteration count: invalid syntax
tier1-bench.txt:153942: parsing iteration count: invalid syntax
tier1-bench.txt:160604: parsing iteration count: invalid syntax
tier1-bench.txt:167721: parsing iteration count: invalid syntax
tier1-bench.txt:175058: parsing iteration count: invalid syntax
tier1-bench.txt:182436: parsing iteration count: invalid syntax
tier1-bench.txt:190742: parsing iteration count: invalid syntax
tier1-bench.txt:197927: parsing iteration count: invalid syntax
tier1-bench.txt:205320: parsing iteration count: invalid syntax
tier1-bench.txt:213286: parsing iteration count: invalid syntax
tier1-bench.txt:220552: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/tstapler/stapler-squad/session/detection/ratelimit
cpu: AMD EPYC 9V74 80-Core Processor                
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                              │              sec/op              │   sec/op     vs base              │
StripANSI_PlainText-4                                6.867n ± 3%   6.892n ± 2%       ~ (p=0.442 n=8)
StripANSI_WithEscapes-4                              659.6n ± 0%   661.1n ± 2%       ~ (p=1.000 n=8)
ProcessOutput_InactiveState-4                        6.635n ± 1%   6.607n ± 0%       ~ (p=0.068 n=8)
geomean                                              31.09n        31.11n       +0.06%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             136.0 ± 0%     136.0 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             5.000 ± 0%     5.000 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/queue
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │              sec/op              │    sec/op     vs base              │
ReviewQueue_ConcurrentReads-4                       82.27n ± 14%   83.99n ± 17%       ~ (p=0.234 n=8)
ReviewQueue_Add-4                                   409.1n ±  0%   408.0n ±  1%       ~ (p=0.342 n=8)
geomean                                             183.4n         185.1n        +0.91%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   640.0 ± 0%     640.0 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/scrollback
                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt           │
                                      │              sec/op              │    sec/op     vs base               │
CircularBuffer_ConcurrentReadWrite-4                         3.135µ ± 1%    3.140µ ± 1%        ~ (p=0.382 n=8)
CircularBuffer_BurstAppend-4                                 104.8µ ± 0%    105.2µ ± 4%   +0.46% (p=0.015 n=8)
CircularBuffer_GetLastN_LargeBuffer-4                        16.12µ ± 4%    15.86µ ± 1%        ~ (p=0.065 n=8)
CircularBuffer_GetRange_Sequential-4                         9.207µ ± 5%   10.316µ ± 6%  +12.05% (p=0.000 n=8)
CircularBufferAppend-4                                       103.0n ± 0%    105.8n ± 0%   +2.67% (p=0.000 n=8)
CircularBufferGetLastN-4                                     1.775µ ± 1%    1.924µ ± 3%   +8.39% (p=0.000 n=8)
CircularBufferConcurrentAppend-4                             135.0n ± 0%    137.8n ± 1%   +2.07% (p=0.000 n=8)
geomean                                                      2.754µ         2.847µ        +3.36%

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                      │               B/op               │     B/op      vs base                │
CircularBuffer_ConcurrentReadWrite-4                        6.062Ki ± 0%   6.062Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                62.50Ki ± 0%   62.50Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                       56.00Ki ± 0%   56.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                        28.00Ki ± 0%   28.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        24.00 ± 0%     24.00 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                    6.000Ki ± 0%   6.000Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              32.00 ± 0%     32.00 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                     3.077Ki        3.077Ki       +0.00%
¹ all samples are equal

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt           │
                                      │            allocs/op             │  allocs/op   vs base                │
CircularBuffer_ConcurrentReadWrite-4                          2.000 ± 0%    2.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                 1.000k ± 0%   1.000k ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                         1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                          1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                      1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       2.962         2.962       +0.00%
¹ all samples are equal

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/s                │     B/s       vs base              │
CircularBuffer_BurstAppend-4                       582.6Mi ± 0%   580.0Mi ± 7%  -0.45% (p=0.015 n=8)

pkg: github.com/tstapler/stapler-squad/session/tmux
                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                             │              sec/op              │   sec/op     vs base              │
StripANSICodes_PlainText-4                          6.550n ± 3%   6.619n ± 4%       ~ (p=0.328 n=8)
StripANSICodes_WithEscapes-4                        602.0n ± 1%   601.9n ± 1%       ~ (p=0.935 n=8)
IsBanner_PlainText-4                                452.6n ± 2%   450.8n ± 0%  -0.39% (p=0.014 n=8)
geomean                                             121.3n        121.6n       +0.22%

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/op               │    B/op     vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       56.00 ± 0%     56.00 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │            allocs/op             │ allocs/op   vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/tokens
                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                                   │              sec/op              │   sec/op     vs base              │
TokenParser_ProcessUserEntry-4                            5.547m ± 2%   5.535m ± 1%       ~ (p=0.959 n=8)
DetectCommandsInText/NoSlash-4                            6.696n ± 1%   6.689n ± 3%  -0.09% (p=0.016 n=8)
DetectCommandsInText/WithCommand-4                        1.444µ ± 0%   1.449µ ± 1%       ~ (p=0.222 n=8)
geomean                                                   3.771µ        3.771µ       +0.00%

                                   │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                   │               B/op               │     B/op      vs base                │
TokenParser_ProcessUserEntry-4                         11.02Mi ± 0%     11.02Mi ± 0%       ~ (p=0.057 n=8)
DetectCommandsInText/NoSlash-4                           0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       434.0 ± 0%       433.0 ± 0%  -0.23% (p=0.026 n=8)
geomean                                                             ²                 -0.08%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                                   │            allocs/op             │ allocs/op   vs base                │
TokenParser_ProcessUserEntry-4                           34.00 ± 0%     34.00 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/NoSlash-4                           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       6.000 ± 0%     6.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                             ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Copilot AI left a comment

Pull request overview

This PR improves Claude Code session status detection to correctly identify active and input-required states when the task manager UI is present (indented spinners and carriage-return overwrites), and adds regression coverage to prevent these cases from reappearing. It also includes related repo maintenance changes: making ApprovalRule JSON criteria fields nullable for safer SQLite migrations and adding an install-service health check with automatic rollback support.

Changes:

  • Fix detection patterns and line-scanning logic to handle indented spinners and \r-overwritten status lines (while keeping the “last visual segment is authoritative” behavior).
  • Add regression fixtures + tests (including a new snapshot case) for the reported detection failures and boundary cases.
  • Add service install health-check + rollback workflow and adjust ent schema nullability for JSON criteria fields.

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| session/session_driver.go | Removes a stray blank line in diff context (no functional change). |
| session/instance_claude.go | Removes a stray blank line in diff context (no functional change). |
| session/ent/schema/approvalrule.go | Makes JSON criteria fields optional/nullable with defaults to ease SQLite migrations. |
| session/ent/runtime.go | Removes generated ent runtime file from tracking. |
| session/ent/migrate/schema.go | Removes generated ent migration schema file from tracking. |
| session/ent/hook/hook.go | Removes generated ent hook file from tracking. |
| session/ent/approvalrule/where.go | Removes generated ent predicate file from tracking. |
| session/ent/approvalrule/approvalrule.go | Removes generated ent constants/helpers file from tracking. |
| session/ent/approvalrule.go | Removes generated ent model file from tracking. |
| session/ent/approvalrule_update.go | Removes generated ent update builder file from tracking. |
| session/ent/approvalrule_create.go | Removes generated ent create/upsert builder file from tracking. |
| session/detection/testdata/claude_input_required_with_success_scrollback.txt | Adds fixture for “InputRequired below prior success scrollback” case. |
| session/detection/testdata/claude_active_task_manager.txt | Adds fixture for active task manager overlay (indented spinner + status bar). |
| session/detection/snapshot_test.go | Adds snapshot coverage for the active task manager overlay fixture. |
| session/detection/idle.go | Adds explicit StatusInputRequired and StatusSuccess mappings to document intent. |
| session/detection/detector.go | Updates spinner regex to allow indentation; adds CR-segment reverse scanning in line-based detectors. |
| session/detection/bug_regression_test.go | Adds regression tests for indented spinners, CR overwrite behavior, and mapping coverage. |
| scripts/install-service.sh | Adds health check and auto-rollback behavior after installing/updating the service. |
| Makefile | Adds binary backup/rollback targets and wires them into install-service. |

Comment thread session/detection/detector.go Outdated
Comment on lines +566 to +568
// ^\s* allows leading whitespace so indented spinners (e.g. task manager sub-items)
// are detected: " ✽ Roosting… (9m 52s · ↓ 2.8k tokens)"
Pattern: `(?m)^\s*[·✢✳✶✻✽●*]\s+[A-Z][a-zA-Z'\-éèêàâùûôîïëüöäÿæœ]*(?:…|\.{1,3})`,

Owner Author

Fixed in the latest commit: changed \s* to [ \t]* so only horizontal whitespace (spaces/tabs) is consumed, preventing RE2 from matching across newlines.

Owner Author

Fixed in the latest commit: changed the leading anchor from ^\s* to ^[ \t]* (and the post-spinner whitespace class too) so only horizontal whitespace (spaces/tabs) is consumed — RE2 cannot consume a newline and match a spinner on a different line.
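
A small sketch of why the whitespace class matters. In Go's RE2 syntax \s includes "\n", so a lone spinner-like character on one line followed by a capitalized word on the next can match. The patterns below use a trimmed spinner set and verb class (assumptions for brevity) but keep the \s versus [ \t] difference from the review thread:

```go
package main

import (
	"fmt"
	"regexp"
)

// Simplified variants of the claude_thinking_verb pattern; the spinner set
// and verb class are shortened for illustration.
var (
	// \s can consume a newline, so the match may span two lines.
	loose = regexp.MustCompile(`(?m)^\s*[✻✽*]\s+[A-Z][a-zA-Z']*(?:…|\.{1,3})`)
	// [ \t] restricts both gaps to horizontal whitespace on a single line.
	strict = regexp.MustCompile(`(?m)^[ \t]*[✻✽*][ \t]+[A-Z][a-zA-Z']*(?:…|\.{1,3})`)
)

func main() {
	// A stray "*" on its own line followed by an unrelated capitalized line.
	crossLine := "*\nLoading…"
	// A genuine indented spinner on a single line.
	indented := "  ✽ Roosting…"

	fmt.Println("loose  cross-line:", loose.MatchString(crossLine))
	fmt.Println("strict cross-line:", strict.MatchString(crossLine))
	fmt.Println("strict indented:  ", strict.MatchString(indented))
}
```

The strict variant still accepts the indented spinner while rejecting the two-line false positive.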

Comment on lines +276 to +280
url="http://localhost:8543/health"
printf "==> Waiting for service to be healthy"
while [ "$elapsed" -lt "$max_wait" ]; do
if curl -sf "$url" >/dev/null 2>&1; then
printf "\n"

Owner Author

Fixed: added a command -v curl guard so the health check is skipped entirely (with an informational message) if curl is not available, preventing false rollbacks.

Comment on lines +28 to +31
if got != StatusActive && got != StatusProcessing {
t.Errorf("Detect(%q) = %s, want StatusActive (indented spinner must be detected)",
tc.input, got)
}

Owner Author

Fixed: error message now reads "want StatusActive or StatusProcessing" to match the two-value acceptance condition.

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

E2E RPC Latency

list-sessions-ttfb-mean: 6ms (▼ faster -13.0%; baseline: 7ms)
list-sessions-total-mean: 7ms (▼ faster -20.9%; baseline: 9ms)

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

Frontend Terminal Throughput

terminal-throughput-mean: 16 KB/s ▲ +14.2% (baseline: 14 KB/s)
terminal-throughput-p50: 16 KB/s ▼ -0.9% (baseline: 16 KB/s)

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

🎬 E2E Feature Demos

2 shard(s) recorded feature flows for this PR.

recordings shard 1
recordings shard 2

Demo preview opens directly in browser (single-file HTML). Raw WebM recordings in ZIP. Expires after 30 days.

…est error

- Replace `\s*` with `[ \t]*` and `\s+` with `[ \t]+` in claude_thinking_verb
  pattern so it cannot consume newlines and match spinners across lines
- Guard health-check curl call in install-service.sh with `command -v curl`
  so absence of curl skips the check rather than triggering rollback
- Update TestBug1_IndentedSpinner error message to mention both accepted
  statuses (StatusActive or StatusProcessing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

Contributor

✅ Registry Validation

Registry Validation
===================

Building backend scanner...
Scanning backend features...
Wrote 97 feature files to /tmp/tmp.hl4HWC7avO/backend
Wrote 14 feature files to /tmp/tmp.hl4HWC7avO/backend
Wrote 22 feature files to /tmp/tmp.hl4HWC7avO/backend
Wrote 6 feature files to /tmp/tmp.hl4HWC7avO/backend

=== Backend Registry Diff ===
Committed: 131  Generated: 130  Divergence: 0.76%
⚠️  Removed RPCs:
  - upload:image
⚠️  90 feature(s) missing // +api: marker (markerFound: false)

✅ Registry validation passed. Divergence: 0.76%

Test Coverage: 93/131 features have testIds (71.0%)

Registry validation is in observation mode until 2026-05-02.
After that date, divergence > 2% will block merges.
Coverage reporting is advisory only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

Contributor

✅ Registry Validation

Registry Validation
===================

Building backend scanner...
Scanning backend features...
Wrote 97 feature files to /tmp/tmp.dNAuX3DZOT/backend
Wrote 14 feature files to /tmp/tmp.dNAuX3DZOT/backend
Wrote 22 feature files to /tmp/tmp.dNAuX3DZOT/backend
Wrote 6 feature files to /tmp/tmp.dNAuX3DZOT/backend

=== Backend Registry Diff ===
Committed: 131  Generated: 130  Divergence: 0.76%
⚠️  Removed RPCs:
  - upload:image
⚠️  90 feature(s) missing // +api: marker (markerFound: false)

✅ Registry validation passed. Divergence: 0.76%

Test Coverage: 93/131 features have testIds (71.0%)

Registry validation is in observation mode until 2026-05-02.
After that date, divergence > 2% will block merges.
Coverage reporting is advisory only.

@tstapler tstapler merged commit 54a5f63 into main Jun 12, 2026
20 checks passed
@tstapler tstapler deleted the stapler-squad-detection-bug branch June 12, 2026 02:48

2 participants