feat(autonomous): P0 foundation — AutonomousDriver + StatusChangeListener fan-out by tstapler · Pull Request #105 · tstapler/stapler-squad

tstapler · 2026-06-09T21:22:50Z

Summary

Complete end-to-end implementation for pointing Stapler Squad at a GitHub issue/PR and having it fix the issue autonomously.

session/autonomous_driver.go — New AutonomousDriver goroutine: waits for ClaudeController idle state, calls headless LLM pool to decide NEXT_MESSAGE vs DONE, injects via SendCommandImmediate. Idempotency guard + panic recovery + 20-turn cap. ExtractPRURL scans last 200 lines for newly-created PR URL.
session/claude_controller.go — Fan-out []StatusChangeListener slice replaces single slot so both session driver and AutonomousDriver can coexist. SetStatusChangeListener kept as compatibility shim.
Proto + session_service + instance — bool autonomous_mode = 23 on CreateSessionRequest; InstanceOptions.AutonomousMode threaded through constructor; CreateSession starts the driver when flag is set.
Omnibar "Fix Autonomously" mode — All 7 session-creation touchpoints: sessionType union, OmnibarCreationPanel entry, OmnibarContext sessionTypeMap, useSessionService RPC fields, OmnibarAction auto_fix union variant, dispatch case, dispatch tests.
Backlog "Run Autonomously" button — SpawnSessionFromItemRequest.autonomous proto flag; handler wires AutonomousMode: true, PermissionMode: "auto"; UI button on ready items.
LLM-assisted approval (E5) — FeatureKeyAutonomousApproval in headless features; ApprovalHandler.SetAutonomousChecker + SetHeadlessPool; risky tool calls on autonomous sessions go to headless LLM (APPROVE:/DENY:) before falling back to human review queue. Wired via closure in server.go.
Goal completion + notifications (E6) — onAutonomousDriverComplete transitions backlog item to done/failed, stores PR URL, sends push notification.
GitHub PR backlog plugin (E6) — backlog_plugin_github_prs.go fetches open PRs, tags with pr:review-requested/pr:ci-failing, registered in default plugin registry.
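The PR URL scan mentioned for ExtractPRURL can be pictured with a minimal sketch. The function name `extractPRURL`, the regular expression, and the "most recent match wins" rule are assumptions for illustration; the PR only states that the last 200 lines are scanned for a newly created PR URL.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// prURLPattern matches URLs shaped like https://github.com/owner/repo/pull/123.
// The exact pattern used in the PR is not shown; this one is an assumption.
var prURLPattern = regexp.MustCompile(`https://github\.com/[\w.-]+/[\w.-]+/pull/\d+`)

// extractPRURL (hypothetical stand-in for ExtractPRURL) looks at the last
// maxLines lines of output and returns the most recent PR URL, or "".
func extractPRURL(output string, maxLines int) string {
	lines := strings.Split(output, "\n")
	if len(lines) > maxLines {
		lines = lines[len(lines)-maxLines:]
	}
	// Walk backwards so the newest URL in the tail wins.
	for i := len(lines) - 1; i >= 0; i-- {
		if m := prURLPattern.FindString(lines[i]); m != "" {
			return m
		}
	}
	return ""
}

func main() {
	tail := "pushing branch\nCreated https://github.com/example/repo/pull/42\ndone"
	fmt.Println(extractPRURL(tail, 200)) // prints https://github.com/example/repo/pull/42
}
```

Limiting the scan to the tail keeps the cost bounded on long sessions and makes it less likely that an older URL from earlier output is reported.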

Demo harness

TestAutonomousDriver_* in session/autonomous_driver_test.go uses FakeHeadlessPool + fake controller — zero real credentials needed. Proves the full loop: spawn session → driver detects idle → injects message → detects completion → fires callback.

Test plan

🤖 Generated with Claude Code

…pture for OneShot

Two improvements to OneShot (-p) session control:

1. **Fix initial prompt not submitting** (\r instead of \n): The session driver was sending initialPrompt + "\n" (LF) to the PTY. Claude Code's interactive readline needs \r (CR) to submit, the same signal a physical Enter key sends. Sessions received the typed text but never executed it, causing inactivity timeouts. Confirmed by log: "sent initial prompt" followed by 10-min idle. The startup dialog answers ("1\n") work because those menus handle both, but Claude's readline interface only responds to \r.

2. **Capture claude session_id from --output-format json output**: OneShot sessions now launch with -p --output-format json. When the session exits, the driver parses the JSON output for "session_id" and stores it as ConversationUUID on the instance. Subsequent restarts automatically use --resume <uuid>, sending Claude back into the same conversation with full context instead of re-running the task from scratch.

Supporting infrastructure:
- SetClaudeConversationUUID(): thread-safe setter that fires a save callback
- SetClaudeSessionIDSavedCallback(): wired in service layer to flush to DB
- wireClaudeSessionIDCallback(): registered on all creation paths including loadInstancesWithWiring (startup) and CreateDirectorySession (backlog)
- Prompt is now appended even when claudeSessionID is set for OneShot sessions so continuation prompts are delivered after --resume

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
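As a rough illustration of the two changes in this commit, the sketch below pulls "session_id" out of a JSON result and appends a carriage return to a prompt. It assumes the output is a single JSON object and uses `encoding/json`; the PR's own parseJSONField helper scans the text instead, and the names `parseSessionID`, `submitLine`, and the sample payload are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oneShotResult models only the field this sketch needs from the
// --output-format json output; other fields are ignored.
type oneShotResult struct {
	SessionID string `json:"session_id"`
}

// parseSessionID (hypothetical) returns the session_id if the output is a
// JSON object that carries a non-empty one.
func parseSessionID(output []byte) (string, bool) {
	var r oneShotResult
	if err := json.Unmarshal(output, &r); err != nil || r.SessionID == "" {
		return "", false
	}
	return r.SessionID, true
}

// submitLine (hypothetical) terminates a prompt with CR, the byte the commit
// message says the interactive readline needs in order to submit.
func submitLine(prompt string) string {
	return prompt + "\r"
}

func main() {
	// Sample payload invented for the example.
	out := []byte(`{"result":"ok","session_id":"3f2c9a10-aaaa-bbbb-cccc-1234567890ab"}`)
	if id, ok := parseSessionID(out); ok {
		fmt.Println("resume with:", id)
	}
	fmt.Printf("%q\n", submitLine("fix the failing test"))
}
```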

… --permission-mode

Three programmatic control improvements over Phase 1:

1. **steer_session via subprocess**: When a OneShot session has completed (Stopped + ConversationUUID set), steer_session now runs 'claude -p --resume <uuid> --output-format json <message>' instead of PTY send-keys. Returns structured result text in the MCP response. Send-keys path (interactive sessions) now uses \r correctly and reports method: "send_keys" vs "resume_subprocess".

2. **--allowedTools**: New AllowedTools field on Instance/InstanceOptions/proto CreateSessionRequest (field 21). When set, passed as --allowedTools to the claude CLI at launch. Allows callers to pre-approve specific tools (e.g. "Bash(git *),Read,Edit") without the all-or-nothing --dangerously-skip-permissions flag.

3. **--permission-mode**: New PermissionMode field (proto field 22). Passes --permission-mode to claude at launch. Supports values like "acceptEdits" (auto-approve file writes only) and "auto" (full autonomous classification).

Also:
- parseJSONField() generic helper in session_driver.go; parseClaudeSessionID() now delegates to it instead of duplicating the scan logic
- GetClaudeConversationUUID() thread-safe getter on Instance
- RunWithResume() on Instance: spawns subprocess, parses result, updates UUID

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ener fan-out

Lays the groundwork for pointing Stapler Squad at a GitHub issue/PR and having it fix the issue without human steering.

**E1: StatusChangeListener fan-out** (`claude_controller.go`)
Replace single-listener slot with `[]StatusChangeListener` slice so both the existing session driver and the new AutonomousDriver can subscribe. `SetStatusChangeListener` kept as a backwards-compatible shim.

**E2: AutonomousDriver** (`session/autonomous_driver.go`)
New goroutine that runs when `AutonomousMode=true`:
- Registers as a status-change listener (signals a channel; never blocks the listener callback)
- Waits for `ClaudeController` idle state before each turn
- Calls the headless LLM pool (`FeatureKey("autonomous_fix-<id[:8]>")`) with goal + session tail → decides NEXT_MESSAGE or DONE
- Injects via `SendCommandImmediate` from its own goroutine (avoids PTY race with CommandQueue executor)
- `atomic.Bool` idempotency guard + `defer recover()` + 20-turn max cap
- `ExtractPRURL` scans last 200 lines for newly-created PR URL

**E3: Session creation wiring** (proto + session_service + instance)
- `bool autonomous_mode = 23` added to `CreateSessionRequest`
- `InstanceOptions.AutonomousMode` threaded through `New()` constructor
- `CreateSession` starts the driver when `AutonomousMode=true && headlessPool != nil`

14 tests added: max-turns, done-signal, idempotency, status-channel, panic-recovery, stop, PR URL extraction, prompt parsing.

P1–P3 epics (omnibar UI, backlog button, LLM approval hook, goal completion notifications, GitHub PR plugin) are follow-up PRs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
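The fan-out and the non-blocking channel signal described under E1 and E2 can be sketched as follows. The `controller` type, the string status values, and `signalListener` are simplified assumptions; only the names `StatusChangeListener` and `AddStatusChangeListener` come from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// StatusChangeListener is called on every status transition.
// Statuses are plain strings here to keep the sketch small.
type StatusChangeListener func(oldStatus, newStatus string)

type controller struct {
	mu        sync.RWMutex
	listeners []StatusChangeListener
}

// AddStatusChangeListener appends a subscriber instead of replacing one.
func (c *controller) AddStatusChangeListener(l StatusChangeListener) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.listeners = append(c.listeners, l)
}

// notify copies the slice under the lock and calls listeners outside it,
// so a listener may register further listeners without deadlocking.
func (c *controller) notify(oldStatus, newStatus string) {
	c.mu.RLock()
	snapshot := make([]StatusChangeListener, len(c.listeners))
	copy(snapshot, c.listeners)
	c.mu.RUnlock()
	for _, l := range snapshot {
		l(oldStatus, newStatus)
	}
}

// signalListener returns a listener that pokes ch and never blocks:
// if a wake-up is already pending, the extra signal is dropped.
func signalListener(ch chan<- struct{}) StatusChangeListener {
	return func(_, _ string) {
		select {
		case ch <- struct{}{}:
		default:
		}
	}
}

func main() {
	c := &controller{}
	wake := make(chan struct{}, 1)
	c.AddStatusChangeListener(func(o, n string) { fmt.Println("session driver saw", o, "->", n) })
	c.AddStatusChangeListener(signalListener(wake))
	c.notify("busy", "idle")
	c.notify("idle", "busy")
	fmt.Println("pending wake-ups:", len(wake)) // prints pending wake-ups: 1
}
```

A buffered channel of size one is enough here because the driver only needs to know that something changed, not how many times.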

Copilot

Pull request overview

This PR lays the P0 groundwork for “autonomous” Stapler Squad sessions by introducing an AutonomousDriver loop (orchestrator → inject next message when idle), expanding ClaudeController status-change callbacks to support multiple subscribers, and wiring new session creation options (allowed_tools, permission_mode, autonomous_mode) through proto → service → instance startup.

Changes:

Added session/AutonomousDriver to observe idle transitions and inject orchestrator-selected messages via SendCommandImmediate, with max-turns + stop + panic recovery.
Refactored ClaudeController to fan-out status-change notifications to multiple listeners (adds AddStatusChangeListener, keeps SetStatusChangeListener as a shim), and updated callback wiring/tests accordingly.
Added OneShot session “session_id” capture + --resume subprocess steering path, plus new create-session request fields and instance launch flags (--allowedTools, --permission-mode, --output-format json).
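A stripped-down sketch of the driver loop's safety rails (idempotent start, panic recovery, turn cap) may help when reading the first change above. The `driver` type, `decideFunc`, and the synchronous loop are assumptions; the real driver also waits for idle state and rate-limit clearance between turns, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

const maxTurns = 20 // turn cap stated in the PR description

// decideFunc stands in for the orchestrator call: it returns the next
// message to inject, or done=true to stop.
type decideFunc func(turn int) (msg string, done bool)

type driver struct {
	started atomic.Bool
	decide  decideFunc
	inject  func(msg string)
	done    chan struct{}
	turns   int
}

func newDriver(decide decideFunc, inject func(string)) *driver {
	return &driver{decide: decide, inject: inject, done: make(chan struct{})}
}

// Start launches the loop once; a second call is rejected.
func (d *driver) Start() error {
	if !d.started.CompareAndSwap(false, true) {
		return errors.New("autonomous driver already started")
	}
	go d.run()
	return nil
}

func (d *driver) run() {
	defer close(d.done) // runs last, after the recover below
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("driver recovered from panic:", r)
		}
	}()
	for d.turns < maxTurns {
		msg, done := d.decide(d.turns)
		if done {
			return
		}
		d.inject(msg)
		d.turns++
	}
}

func main() {
	d := newDriver(
		func(turn int) (string, bool) { return "continue", turn == 3 },
		func(msg string) { fmt.Println("inject:", msg) },
	)
	_ = d.Start()
	fmt.Println("second start:", d.Start())
	<-d.done
	fmt.Println("turns used:", d.turns) // prints turns used: 3
}
```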

Reviewed changes

Copilot reviewed 21 out of 33 changed files in this pull request and generated 7 comments.


| File | Description |
| --- | --- |
| session/session_driver.go | Captures Claude `session_id` for OneShot sessions; switches initial prompt send terminator to `\r`; adds lightweight JSON field extraction helper. |
| session/session_driver_test.go | Adds unit tests for `parseClaudeSessionID`. |
| session/instance.go | Adds `AllowedTools`, `PermissionMode`, and `AutonomousMode` to instance/options; adds callback hook for persisting newly discovered `session_id`. |
| session/instance_tmux.go | Passes `--allowedTools`, `--permission-mode`, and `--output-format json` for OneShot launches; adjusts prompt arg behavior. |
| session/instance_controller.go | Switches status wiring to controller fan-out and adds `RegisterStatusChangeCallback`. |
| session/instance_claude.go | Adds thread-safe UUID accessor; adds `RunWithResume` + `SetClaudeConversationUUID`/callback support. |
| session/claude_controller.go | Replaces single listener with fan-out slice + lock; updates status-change loop to notify all listeners. |
| session/claude_controller_test.go | Updates listener tests to use `AddStatusChangeListener`. |
| session/autonomous_driver.go | New autonomous orchestrator/injection loop with rate-limit waiting, max-turn cap, DONE sentinel, PR URL extraction. |
| session/autonomous_driver_test.go | New unit tests for orchestration parsing, max-turn cap, DONE exit, stop/idempotency, etc. |
| server/services/session_service.go | Wires new create-session options; persists discovered `session_id`; starts autonomous driver on `autonomous_mode`. |
| server/mcp/tools_terminal.go | Enhances `steer_session` to use `claude --resume` subprocess for completed OneShot sessions; standardizes PTY send terminator to `\r`. |
| proto/session/v1/session.proto | Adds `allowed_tools`, `permission_mode`, `autonomous_mode` to `CreateSessionRequest`. |
| project_plans/github-autonomous-fix/** | Adds research + implementation plan docs for the autonomous fix initiative. |
| gen/proto/go/session/v1/** | Regenerated Connect/Protobuf outputs reflecting proto changes. |

Files not reviewed (7)

gen/proto/go/session/v1/headless.pb.go: Language not supported
gen/proto/go/session/v1/insights.pb.go: Language not supported
gen/proto/go/session/v1/sessionv1connect/backlog.connect.go: Language not supported
gen/proto/go/session/v1/sessionv1connect/headless.connect.go: Language not supported
gen/proto/go/session/v1/sessionv1connect/insights.connect.go: Language not supported
gen/proto/go/session/v1/sessionv1connect/unfinished.connect.go: Language not supported
gen/proto/go/session/v1/unfinished.pb.go: Language not supported


… GitHub PR plugin

Completes the full autonomous GitHub fix feature on a single branch.

**E4: User-facing entry points**
- Omnibar: "Fix Autonomously" session type (all 7 touchpoints: type union, OmnibarCreationPanel, OmnibarContext sessionTypeMap, useSessionService, OmnibarAction `auto_fix` union variant, dispatch case, dispatch tests)
- Backlog: "Run Autonomously" button on items in `ready` status (SpawnSessionFromItem proto `autonomous=true` flag, backlog service handler)

**E5: LLM-assisted approval**
- `FeatureKeyAutonomousFix` + `FeatureKeyAutonomousApproval` in headless features
- `ApprovalHandler.SetAutonomousChecker` + `SetHeadlessPool` injection points
- When a risky tool call hits an autonomous session: headless LLM pool is queried (APPROVE:/DENY: response) before falling back to human review queue
- Wired via closure in `server.go` (avoids construction-time circular dep)

**E6: Goal completion, artifacts, notifications, GitHub PR plugin**
- `onAutonomousDriverComplete` transitions backlog item to done/failed, stores PR URL, sends push notification
- `backlog_plugin_github_prs.go`: new `github_prs` plugin fetching open PRs, tagging with `pr:review-requested`/`pr:ci-failing`, registered in default plugin registry

**E7: Feature registry**
- `docs/registry/features/autonomous-fix.json` added

All existing tests pass. New dispatch tests (24 total, 3 new).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
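The APPROVE:/DENY: handling in E5 could look roughly like the sketch below. The `decision` type, `parseApproval`, and the rule that any unrecognised reply is escalated to the human review queue are assumptions; the commit message only names the two prefixes and the fallback.

```go
package main

import (
	"fmt"
	"strings"
)

type decision int

const (
	escalateToHuman decision = iota // fallback: hand the call to the review queue
	approve
	deny
)

// parseApproval (hypothetical) reads the first line of the LLM reply.
// Only an explicit APPROVE: or DENY: prefix counts as a decision.
func parseApproval(response string) (decision, string) {
	line := strings.TrimSpace(response)
	if i := strings.IndexByte(line, '\n'); i >= 0 {
		line = strings.TrimSpace(line[:i])
	}
	switch {
	case strings.HasPrefix(line, "APPROVE:"):
		return approve, strings.TrimSpace(strings.TrimPrefix(line, "APPROVE:"))
	case strings.HasPrefix(line, "DENY:"):
		return deny, strings.TrimSpace(strings.TrimPrefix(line, "DENY:"))
	default:
		return escalateToHuman, ""
	}
}

func main() {
	for _, reply := range []string{
		"APPROVE: read-only git command",
		"DENY: deletes files outside the worktree",
		"I am not sure about this one.",
	} {
		d, reason := parseApproval(reply)
		fmt.Printf("decision=%d reason=%q\n", d, reason)
	}
}
```

Defaulting to escalation means a malformed or ambiguous reply can never approve a risky tool call by accident.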

Engineering blockers:
- Goroutine leak: replace context.Background() with server lifecycleCtx in StartAutonomousDriverForInstance and CreateSession driver start; add SetLifecycleContext() wired in wireDepsIntoServer
- Driver registry: add driverRegistry map + RWMutex to SessionService; register drivers on start, stop+deregister in DeleteSession and HibernateSession to eliminate use-after-delete hazard
- Prompt injection: encode toolInput as JSON in buildApprovalQuery so raw command strings cannot embed APPROVE:/DENY: to hijack LLM decision

UX blockers:
- Autonomous badge: add autonomous_mode = 60 to Session proto; populate in InstanceToProto adapter; render "Auto" badge in SessionCard
- Stop control: stopping/hibernating a session now stops the autonomous driver via the registry; existing pause/delete actions are sufficient

PM gap:
- Add measurable success metrics table to requirements.md (completion rate, time-to-PR, LLM approval utilization, goroutine leak rate, backlog conversion count)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
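The driver registry item above can be illustrated with a small map guarded by an RWMutex. The `stopper` interface, the method names, and the `fakeDriver` type are assumptions made for this sketch; the commit only states that drivers are registered on start and stopped and deregistered on delete or hibernate.

```go
package main

import (
	"fmt"
	"sync"
)

// stopper is the only behaviour the registry needs from a driver.
type stopper interface{ Stop() }

type driverRegistry struct {
	mu      sync.RWMutex
	drivers map[string]stopper
}

func newDriverRegistry() *driverRegistry {
	return &driverRegistry{drivers: make(map[string]stopper)}
}

func (r *driverRegistry) register(sessionID string, d stopper) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.drivers[sessionID] = d
}

// stopAndDeregister removes the entry under the lock, then calls Stop
// outside it so a slow Stop cannot block other sessions.
func (r *driverRegistry) stopAndDeregister(sessionID string) bool {
	r.mu.Lock()
	d, ok := r.drivers[sessionID]
	delete(r.drivers, sessionID)
	r.mu.Unlock()
	if ok {
		d.Stop()
	}
	return ok
}

func (r *driverRegistry) size() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.drivers)
}

// fakeDriver counts Stop calls for the demonstration.
type fakeDriver struct{ stops int }

func (f *fakeDriver) Stop() { f.stops++ }

func main() {
	reg := newDriverRegistry()
	d := &fakeDriver{}
	reg.register("session-1", d)
	fmt.Println("stopped:", reg.stopAndDeregister("session-1")) // prints stopped: true
	fmt.Println("stopped again:", reg.stopAndDeregister("session-1")) // prints stopped again: false
	fmt.Println("stops:", d.stops, "size:", reg.size())
}
```

Because removal and lookup happen in one critical section, a delete racing with a hibernate stops the driver at most once.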

Code quality:
- autonomous_driver.go: guard Start() against nil headlessPool (returns clear error instead of nil-deref panic in run())
- autonomous_driver.go: safe UUID slice for featureKey (guard short/empty UUID instead of panicking on sessionID[:8])
- autonomous_driver.go: cap waitForRateLimitClear with 4h maxRateLimitWait so an unrecoverable rate limit doesn't loop the driver indefinitely

Security:
- autonomous_driver.go: wrap goal and session tail in <goal>/</goal> and <session_output> XML delimiters in buildOrchestrationPrompt so that user-controlled PR body cannot spoof a NEXT_MESSAGE/DONE directive

Ops:
- session_service.go: deregister completed drivers in onAutonomousDriverComplete so the registry doesn't grow unbounded for long-lived servers
- session_service.go: document intentional context.Background() in completion callback (must persist result even during concurrent shutdown)

Tests:
- autonomous_driver_test.go: TestAutonomousDriver_NilPool_Start (nil pool guard)
- autonomous_driver_test.go: TestAutonomousDriver_ShortUUID (UUID < 8 chars)
- autonomous_driver_test.go: TestBuildOrchestrationPrompt_GoalWrappedInDelimiters
- approval_handler_integration_test.go: TestBuildApprovalQuery_PromptInjectionResistance

UX:
- SessionCard.tsx: add aria-label to autonomous badge for accessibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ver registry

Covers: lifecycle context wiring, register/deregister semantics, delete-stops-driver, and completion-callback deregistration, the four behaviors identified by the Plan reviewer as missing from server/services coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…g, UX hints, PM doc

Engineering:
- approval_handler.go: fill empty sessionTail in autonomous LLM approval query by fetching session preview via queueChecker; LLM now has context to assess tool safety
- autonomous_driver.go: log warning when session does not become idle after a turn (return value of waitForIdle was silently ignored)

UX:
- OmnibarCreationPanel: clarify autonomous hint to mention delete/hibernate stops the run
- SessionCard: add role="status" to Auto badge for screen reader announcements

PM:
- requirements.md: add Target Users personas, Risky Assumptions, and Observability & SLA

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Label "Fix Autonomously" → "Fix Autonomously (Beta)" to signal experimental status
- Hint text now explicitly states LLM reviewer decides tool permissions and that a completion notification fires

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- session_driver.go: parseJSONField now uses escape-aware rune scan instead of naive strings.Index, correctly handles \" \\ \n \t \r and other escape sequences
- instance_claude.go: GetConversationUUID now holds stateMutex.RLock to prevent data race with concurrent SetClaudeConversationUUID calls
- instance_claude.go: SetClaudeConversationUUID is idempotent: no-op and no callback if the UUID hasn't changed, preventing spurious storage saves on restart

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
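The locking and idempotency behaviour described for the UUID accessors can be sketched like this. The `instance` struct and its fields are a simplified assumption; the method names and the `stateMutex` name follow the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// instance is a reduced stand-in for the session instance.
type instance struct {
	stateMutex       sync.RWMutex
	conversationUUID string
	onSaved          func(uuid string)
}

func (i *instance) SetClaudeSessionIDSavedCallback(cb func(uuid string)) {
	i.stateMutex.Lock()
	defer i.stateMutex.Unlock()
	i.onSaved = cb
}

// GetConversationUUID takes the read lock so it cannot race with the setter.
func (i *instance) GetConversationUUID() string {
	i.stateMutex.RLock()
	defer i.stateMutex.RUnlock()
	return i.conversationUUID
}

// SetClaudeConversationUUID stores the UUID and fires the save callback,
// but only when the value actually changes.
func (i *instance) SetClaudeConversationUUID(uuid string) {
	i.stateMutex.Lock()
	if i.conversationUUID == uuid {
		i.stateMutex.Unlock()
		return
	}
	i.conversationUUID = uuid
	cb := i.onSaved
	i.stateMutex.Unlock()
	// Call back outside the lock so the callback may read the instance.
	if cb != nil {
		cb(uuid)
	}
}

func main() {
	saves := 0
	inst := &instance{}
	inst.SetClaudeSessionIDSavedCallback(func(uuid string) { saves++ })
	inst.SetClaudeConversationUUID("abc-123")
	inst.SetClaudeConversationUUID("abc-123") // unchanged: no callback
	fmt.Println(inst.GetConversationUUID(), "saves:", saves) // prints abc-123 saves: 1
}
```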

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

# Conflicts:
#   docs/registry/features/backend/approval/delete-rule.json
#   docs/registry/features/backend/approval/get-analytics.json
#   docs/registry/features/backend/approval/list-pending.json
#   docs/registry/features/backend/approval/list-rules.json
#   docs/registry/features/backend/approval/resolve.json
#   docs/registry/features/backend/approval/upsert-rule.json
#   gen/proto/go/session/v1/backlog.pb.go
#   gen/proto/go/session/v1/session.pb.go
#   gen/proto/go/session/v1/sessionv1connect/session.connect.go
#   gen/proto/go/session/v1/types.pb.go
#   proto/session/v1/session.proto
#   server/services/session_service.go
#   session/instance.go
#   session/instance_claude.go
#   session/session_driver.go

github-actions · 2026-06-12T01:39:21Z

✅ Registry Validation

Registry Validation
===================

Building backend scanner...
Scanning backend features...
Wrote 97 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 14 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 22 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 6 feature files to /tmp/tmp.maYbPfr4XH/backend

=== Backend Registry Diff ===
Committed: 132  Generated: 130  Divergence: 1.52%
⚠️  Removed RPCs:
  - backlog:spawn-session-autonomous
  - upload:image
⚠️  90 feature(s) missing // +api: marker (markerFound: false)

⚠️  Divergence 1.52% above warning threshold.

Test Coverage: 93/132 features have testIds (70.5%)

Registry validation is in observation mode until 2026-05-02.
After that date, divergence > 2% will block merges.
Coverage reporting is advisory only.

github-actions · 2026-06-12T01:41:17Z

Go Benchmarks (Tier 1)

benchmarks/go/tier1-baseline.txt:8: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:2035: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:4040: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:6040: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:8071: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:10099: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:12134: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:14168: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:16192: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:22884: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:30608: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:37827: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:44589: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:51346: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:58304: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:64780: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:71600: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:77066: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:82029: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:87665: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:93279: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:98752: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:104017: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:109727: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:115757: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:123122: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:130424: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:137211: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:144671: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:152393: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:159266: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:166225: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:173299: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:180826: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:188270: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:196607: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:205005: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:212668: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:220592: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:228363: parsing iteration count: invalid syntax
tier1-bench.txt:8: parsing iteration count: invalid syntax
tier1-bench.txt:1983: parsing iteration count: invalid syntax
tier1-bench.txt:4023: parsing iteration count: invalid syntax
tier1-bench.txt:5969: parsing iteration count: invalid syntax
tier1-bench.txt:7938: parsing iteration count: invalid syntax
tier1-bench.txt:9954: parsing iteration count: invalid syntax
tier1-bench.txt:11938: parsing iteration count: invalid syntax
tier1-bench.txt:13950: parsing iteration count: invalid syntax
tier1-bench.txt:17447: parsing iteration count: invalid syntax
tier1-bench.txt:24030: parsing iteration count: invalid syntax
tier1-bench.txt:31151: parsing iteration count: invalid syntax
tier1-bench.txt:37883: parsing iteration count: invalid syntax
tier1-bench.txt:44104: parsing iteration count: invalid syntax
tier1-bench.txt:50605: parsing iteration count: invalid syntax
tier1-bench.txt:57291: parsing iteration count: invalid syntax
tier1-bench.txt:64217: parsing iteration count: invalid syntax
tier1-bench.txt:70595: parsing iteration count: invalid syntax
tier1-bench.txt:75258: parsing iteration count: invalid syntax
tier1-bench.txt:80735: parsing iteration count: invalid syntax
tier1-bench.txt:86116: parsing iteration count: invalid syntax
tier1-bench.txt:91520: parsing iteration count: invalid syntax
tier1-bench.txt:96813: parsing iteration count: invalid syntax
tier1-bench.txt:102475: parsing iteration count: invalid syntax
tier1-bench.txt:107964: parsing iteration count: invalid syntax
tier1-bench.txt:113229: parsing iteration count: invalid syntax
tier1-bench.txt:119985: parsing iteration count: invalid syntax
tier1-bench.txt:127182: parsing iteration count: invalid syntax
tier1-bench.txt:134717: parsing iteration count: invalid syntax
tier1-bench.txt:141616: parsing iteration count: invalid syntax
tier1-bench.txt:148770: parsing iteration count: invalid syntax
tier1-bench.txt:156827: parsing iteration count: invalid syntax
tier1-bench.txt:164035: parsing iteration count: invalid syntax
tier1-bench.txt:171167: parsing iteration count: invalid syntax
tier1-bench.txt:178981: parsing iteration count: invalid syntax
tier1-bench.txt:186269: parsing iteration count: invalid syntax
tier1-bench.txt:194083: parsing iteration count: invalid syntax
tier1-bench.txt:201558: parsing iteration count: invalid syntax
tier1-bench.txt:209158: parsing iteration count: invalid syntax
tier1-bench.txt:216375: parsing iteration count: invalid syntax
tier1-bench.txt:223581: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/tstapler/stapler-squad/session/detection/ratelimit
cpu: AMD EPYC 9V74 80-Core Processor                
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                              │              sec/op              │   sec/op     vs base              │
StripANSI_PlainText-4                                6.915n ± 4%   6.946n ± 4%       ~ (p=0.442 n=8)
StripANSI_WithEscapes-4                              665.6n ± 1%   663.1n ± 1%       ~ (p=1.000 n=8)
ProcessOutput_InactiveState-4                        6.606n ± 0%   6.664n ± 1%  +0.89% (p=0.010 n=8)
geomean                                              31.21n        31.31n       +0.33%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             136.0 ± 0%     136.0 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             5.000 ± 0%     5.000 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/queue
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │              sec/op              │    sec/op     vs base              │
ReviewQueue_ConcurrentReads-4                       91.80n ± 13%   80.88n ± 15%       ~ (p=0.505 n=8)
ReviewQueue_Add-4                                   406.1n ±  0%   409.9n ±  1%  +0.95% (p=0.000 n=8)
geomean                                             193.1n         182.1n        -5.69%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   640.0 ± 0%     640.0 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/scrollback
                                      │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                                      │              sec/op              │   sec/op     vs base              │
CircularBuffer_ConcurrentReadWrite-4                         3.167µ ± 2%   3.149µ ± 2%       ~ (p=0.266 n=8)
CircularBuffer_BurstAppend-4                                 105.3µ ± 1%   109.4µ ± 3%  +3.84% (p=0.000 n=8)
CircularBuffer_GetLastN_LargeBuffer-4                        15.68µ ± 1%   16.50µ ± 3%  +5.25% (p=0.000 n=8)
CircularBuffer_GetRange_Sequential-4                         9.069µ ± 3%   9.953µ ± 4%  +9.75% (p=0.000 n=8)
CircularBufferAppend-4                                       103.4n ± 0%   105.6n ± 1%  +2.13% (p=0.000 n=8)
CircularBufferGetLastN-4                                     1.752µ ± 1%   1.893µ ± 3%  +8.05% (p=0.000 n=8)
CircularBufferConcurrentAppend-4                             133.9n ± 0%   141.2n ± 4%  +5.41% (p=0.000 n=8)
geomean                                                      2.737µ        2.868µ       +4.79%

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                      │               B/op               │     B/op      vs base                │
CircularBuffer_ConcurrentReadWrite-4                        6.062Ki ± 0%   6.062Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                62.50Ki ± 0%   62.50Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                       56.00Ki ± 0%   56.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                        28.00Ki ± 0%   28.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        24.00 ± 0%     24.00 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                    6.000Ki ± 0%   6.000Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              32.00 ± 0%     32.00 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                     3.077Ki        3.077Ki       +0.00%
¹ all samples are equal

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt           │
                                      │            allocs/op             │  allocs/op   vs base                │
CircularBuffer_ConcurrentReadWrite-4                          2.000 ± 0%    2.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                 1.000k ± 0%   1.000k ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                         1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                          1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                      1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       2.962         2.962       +0.00%
¹ all samples are equal

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/s                │     B/s       vs base              │
CircularBuffer_BurstAppend-4                       579.5Mi ± 1%   558.2Mi ± 3%  -3.69% (p=0.000 n=8)

pkg: github.com/tstapler/stapler-squad/session/tmux
                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                             │              sec/op              │   sec/op     vs base              │
StripANSICodes_PlainText-4                          6.649n ± 5%   6.422n ± 2%       ~ (p=0.234 n=8)
StripANSICodes_WithEscapes-4                        599.5n ± 1%   604.5n ± 1%  +0.84% (p=0.021 n=8)
IsBanner_PlainText-4                                452.4n ± 1%   452.4n ± 1%       ~ (p=1.000 n=8)
geomean                                             121.7n        120.7n       -0.87%

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/op               │    B/op     vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       56.00 ± 0%     56.00 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │            allocs/op             │ allocs/op   vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/tokens
                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                                   │              sec/op              │   sec/op     vs base              │
TokenParser_ProcessUserEntry-4                            5.469m ± 1%   5.500m ± 0%       ~ (p=0.105 n=8)
DetectCommandsInText/NoSlash-4                            6.697n ± 2%   6.692n ± 1%       ~ (p=0.523 n=8)
DetectCommandsInText/WithCommand-4                        1.443µ ± 1%   1.440µ ± 1%       ~ (p=0.589 n=8)
geomean                                                   3.752µ        3.756µ       +0.11%

                                   │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                   │               B/op               │     B/op      vs base                │
TokenParser_ProcessUserEntry-4                         11.02Mi ± 0%     11.02Mi ± 0%       ~ (p=0.993 n=8)
DetectCommandsInText/NoSlash-4                           0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       433.5 ± 0%       433.0 ± 0%       ~ (p=0.608 n=8)
geomean                                                             ²                 -0.04%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                                   │            allocs/op             │ allocs/op   vs base                │
TokenParser_ProcessUserEntry-4                           34.00 ± 0%     34.00 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/NoSlash-4                           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       6.000 ± 0%     6.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                             ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

github-actions · 2026-06-12T01:43:03Z

E2E RPC Latency

list-sessions-ttfb-mean: 4ms (▼ faster -33.3%; baseline: 7ms)
list-sessions-total-mean: 5ms (▼ faster -40.7%; baseline: 8ms)

github-actions · 2026-06-12T01:43:48Z

UX Analysis

Check	Status	Details
✅ Axe Core (WCAG 2.1 AA)	success	Critical/serious violations block merge
⚠️ Lighthouse Performance	Score: unknown	Warning if < 70 (non-blocking)
🤖 Claude UX Analysis	Advisory	See docs/qa/ for findings

Axe Core excludes terminal rendering areas (intentional design).
Lighthouse runs in desktop preset for this developer tool.

github-actions · 2026-06-12T01:43:58Z

Frontend Terminal Throughput

terminal-throughput-mean: 16 KB/s ▲ +14.3% (baseline: 14 KB/s)
terminal-throughput-p50: 16 KB/s ▼ -0.1% (baseline: 16 KB/s)

github-actions · 2026-06-12T01:44:31Z

🎬 E2E Feature Demos

2 shard(s) recorded feature flows for this PR.

recordings shard 1
recordings shard 2

Demo preview opens directly in browser (single-file HTML). Raw WebM recordings in ZIP. Expires after 30 days.

tstapler and others added 3 commits June 5, 2026 08:03

Copilot AI review requested due to automatic review settings June 9, 2026 21:22

Copilot started reviewing on behalf of tstapler June 9, 2026 21:23

Copilot AI reviewed Jun 9, 2026


tstapler and others added 9 commits June 10, 2026 19:31

chore(registry): update feature registry

c3d5378

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

tstapler wants to merge 12 commits into main from stapler-squad-autonomous


2 participants


github-actions Bot commented Jun 12, 2026

✅ Registry Validation
