feat(autonomous): P0 foundation — AutonomousDriver + StatusChangeListener fan-out#105

Open
tstapler wants to merge 12 commits into main from stapler-squad-autonomous

@tstapler tstapler commented Jun 9, 2026

Owner

Summary

Complete end-to-end implementation for pointing Stapler Squad at a GitHub issue/PR and having it fix the issue autonomously.

  • session/autonomous_driver.go — New AutonomousDriver goroutine: waits for ClaudeController idle state, calls headless LLM pool to decide NEXT_MESSAGE vs DONE, injects via SendCommandImmediate. Idempotency guard + panic recovery + 20-turn cap. ExtractPRURL scans last 200 lines for newly-created PR URL.
  • session/claude_controller.go — Fan-out []StatusChangeListener slice replaces single slot so both session driver and AutonomousDriver can coexist. SetStatusChangeListener kept as compatibility shim.
  • Proto + session_service + instance: bool autonomous_mode = 23 on CreateSessionRequest; InstanceOptions.AutonomousMode threaded through constructor; CreateSession starts the driver when the flag is set.
  • Omnibar "Fix Autonomously" mode — All 7 session-creation touchpoints: sessionType union, OmnibarCreationPanel entry, OmnibarContext sessionTypeMap, useSessionService RPC fields, OmnibarAction auto_fix union variant, dispatch case, dispatch tests.
  • Backlog "Run Autonomously" button: SpawnSessionFromItemRequest.autonomous proto flag; handler wires AutonomousMode: true, PermissionMode: "auto"; UI button on ready items.
  • LLM-assisted approval (E5): FeatureKeyAutonomousApproval in headless features; ApprovalHandler.SetAutonomousChecker + SetHeadlessPool; risky tool calls on autonomous sessions go to headless LLM (APPROVE:/DENY:) before falling back to human review queue. Wired via closure in server.go.
  • Goal completion + notifications (E6): onAutonomousDriverComplete transitions backlog item to done/failed, stores PR URL, sends push notification.
  • GitHub PR backlog plugin (E6): backlog_plugin_github_prs.go fetches open PRs, tags with pr:review-requested/pr:ci-failing, registered in default plugin registry.
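
The NEXT_MESSAGE vs DONE decision in the first bullet could be parsed roughly as follows. This is a minimal sketch, not the PR's code: the `Decision` type is hypothetical, and the exact `NEXT_MESSAGE:` prefix is an assumption (only the `DONE:` form appears in the test plan).

```go
package main

import "strings"

// Decision is a hypothetical result type for one orchestration turn.
type Decision struct {
	Done    bool   // true when the reply starts with the DONE sentinel
	Message string // next message to inject, or the completion summary
}

// parseOrchestrationResponse classifies a raw LLM reply.
// Assumption: replies look like "NEXT_MESSAGE: <text>" or "DONE: <summary>".
// The second return value is false when the reply matches neither form.
func parseOrchestrationResponse(raw string) (Decision, bool) {
	s := strings.TrimSpace(raw)
	if strings.HasPrefix(s, "DONE:") {
		summary := strings.TrimSpace(strings.TrimPrefix(s, "DONE:"))
		return Decision{Done: true, Message: summary}, true
	}
	if strings.HasPrefix(s, "NEXT_MESSAGE:") {
		msg := strings.TrimSpace(strings.TrimPrefix(s, "NEXT_MESSAGE:"))
		// An empty follow-up message is treated as unparseable.
		return Decision{Message: msg}, msg != ""
	}
	return Decision{}, false
}
```

Returning an explicit "unparseable" flag lets the caller decide whether to retry the LLM call or stop the run, rather than injecting an empty message into the session.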

Demo harness

TestAutonomousDriver_* in session/autonomous_driver_test.go uses FakeHeadlessPool + fake controller — zero real credentials needed. Proves the full loop: spawn session → driver detects idle → injects message → detects completion → fires callback.

Test plan

  • TestAutonomousDriver_MaxTurnsLimit — exits after maxTurns injections
  • TestAutonomousDriver_DoneSignal — exits immediately on DONE: response
  • TestAutonomousDriver_IdempotencyGuard — second Start() is a no-op
  • TestAutonomousDriver_StatusChannelSignal — listener fires channel without blocking
  • TestAutonomousDriver_PanicRecovery — panicking LLM pool is recovered
  • TestAutonomousDriver_Stop_CancelsLoop: Stop() cancels in-flight run
  • TestExtractPRURL_* — PR URL regex (4 cases)
  • TestParseOrchestrationResponse_* — response parsing (3 cases)
  • TestClaudeController_StatusChangeListener_* — fan-out refactor doesn't break existing tests
  • dispatch.test.ts: 24 tests pass including new auto_fix and create_session (autonomous) cases
  • go build . — clean
  • go test ./session ./server/services/... — all pass
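
For readers who want a feel for what the TestExtractPRURL_* cases exercise, here is a hedged sketch of a PR URL scan over the tail of the session output. The regex and the `extractPRURL` signature are assumptions; the PR only states that the last 200 lines are scanned.

```go
package main

import (
	"regexp"
	"strings"
)

// prURLPattern matches GitHub pull request URLs such as
// https://github.com/owner/repo/pull/123. The pattern is an assumption.
var prURLPattern = regexp.MustCompile(`https://github\.com/[\w.-]+/[\w.-]+/pull/\d+`)

// extractPRURL returns the most recent PR URL found in the final maxLines
// lines of output, or "" when none is present.
func extractPRURL(output string, maxLines int) string {
	lines := strings.Split(output, "\n")
	if len(lines) > maxLines {
		lines = lines[len(lines)-maxLines:]
	}
	// Walk backwards so the newest URL wins.
	for i := len(lines) - 1; i >= 0; i-- {
		if m := prURLPattern.FindString(lines[i]); m != "" {
			return m
		}
	}
	return ""
}
```

Limiting the scan to the tail keeps an old URL printed early in the session (for example, the PR being reviewed) from being mistaken for a newly created one, although a stricter check would also compare against URLs seen before the run started.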

🤖 Generated with Claude Code

tstapler and others added 3 commits June 5, 2026 08:03
…pture for OneShot

Two improvements to OneShot (-p) session control:

1. **Fix initial prompt not submitting** (\r instead of \n): The session driver
   was sending initialPrompt + "\n" (LF) to the PTY. Claude Code's interactive
   readline needs \r (CR) to submit — the same signal a physical Enter key sends.
   Sessions received the typed text but never executed it, causing inactivity
   timeouts. Confirmed by log: "sent initial prompt" followed by 10-min idle.
   The startup dialog answers ("1\n") work because those menus handle both,
   but Claude's readline interface only responds to \r.

2. **Capture claude session_id from --output-format json output**: OneShot
   sessions now launch with -p --output-format json. When the session exits,
   the driver parses the JSON output for "session_id" and stores it as
   ConversationUUID on the instance. Subsequent restarts automatically use
   --resume <uuid>, sending Claude back into the same conversation with full
   context instead of re-running the task from scratch.

   Supporting infrastructure:
   - SetClaudeConversationUUID() — thread-safe setter that fires a save callback
   - SetClaudeSessionIDSavedCallback() — wired in service layer to flush to DB
   - wireClaudeSessionIDCallback() — registered on all creation paths including
     loadInstancesWithWiring (startup) and CreateDirectorySession (backlog)
   - Prompt is now appended even when claudeSessionID is set for OneShot
     sessions so continuation prompts are delivered after --resume
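
The session_id capture described in item 2 can be illustrated with a small sketch. Note the swap: the PR uses its own lightweight field scan (parseJSONField), whereas this example uses the standard `encoding/json` decoder for brevity. The function name and the sample payload shape are hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"strings"
)

// parseSessionID scans output line by line and returns the "session_id" of
// the last line that decodes as a JSON object containing one.
// It returns "" when no such line exists.
func parseSessionID(output string) string {
	var id string
	sc := bufio.NewScanner(strings.NewReader(output))
	// Allow long JSON lines (up to 1 MiB) instead of the 64 KiB default.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "{") {
			continue // not a JSON object, likely terminal noise
		}
		var payload struct {
			SessionID string `json:"session_id"`
		}
		if err := json.Unmarshal([]byte(line), &payload); err == nil && payload.SessionID != "" {
			id = payload.SessionID
		}
	}
	return id
}
```

A caller would store the returned value as the conversation UUID and pass it to `--resume` on the next launch, as the commit message describes.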

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… --permission-mode

Three programmatic control improvements over Phase 1:

1. **steer_session via subprocess**: When a OneShot session has completed
   (Stopped + ConversationUUID set), steer_session now runs
   'claude -p --resume <uuid> --output-format json <message>' instead of
   PTY send-keys. Returns structured result text in the MCP response.
   Send-keys path (interactive sessions) now uses \r correctly and
   reports method: "send_keys" vs "resume_subprocess".

2. **--allowedTools**: New AllowedTools field on Instance/InstanceOptions/
   proto CreateSessionRequest (field 21). When set, passed as
   --allowedTools to the claude CLI at launch. Allows callers to
   pre-approve specific tools (e.g. "Bash(git *),Read,Edit") without
   the all-or-nothing --dangerously-skip-permissions flag.

3. **--permission-mode**: New PermissionMode field (proto field 22).
   Passes --permission-mode to claude at launch. Supports values like
   "acceptEdits" (auto-approve file writes only) and "auto" (full
   autonomous classification).
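
The three launch-time flags above can be pictured as a small argument builder. This is a sketch under stated assumptions: the `LaunchOptions` struct, its `OneShot`, `ResumeUUID`, and `Prompt` fields, and the flag ordering are hypothetical; only the flag names and the `AllowedTools`/`PermissionMode` field names come from the commit message.

```go
package main

// LaunchOptions is a hypothetical container for the launch settings.
type LaunchOptions struct {
	OneShot        bool   // run with -p and JSON output
	ResumeUUID     string // conversation to resume, if any
	AllowedTools   string // e.g. "Bash(git *),Read,Edit"
	PermissionMode string // e.g. "acceptEdits" or "auto"
	Prompt         string // trailing prompt argument
}

// buildClaudeArgs assembles CLI arguments, emitting each flag only when the
// corresponding option is set.
func buildClaudeArgs(o LaunchOptions) []string {
	var args []string
	if o.OneShot {
		args = append(args, "-p", "--output-format", "json")
	}
	if o.ResumeUUID != "" {
		args = append(args, "--resume", o.ResumeUUID)
	}
	if o.AllowedTools != "" {
		args = append(args, "--allowedTools", o.AllowedTools)
	}
	if o.PermissionMode != "" {
		args = append(args, "--permission-mode", o.PermissionMode)
	}
	if o.Prompt != "" {
		args = append(args, o.Prompt)
	}
	return args
}
```

Passing the arguments as a slice (for example to `exec.Command`) rather than a joined shell string avoids quoting problems with values like `Bash(git *)`.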

Also:
- parseJSONField() generic helper in session_driver.go; parseClaudeSessionID()
  now delegates to it instead of duplicating the scan logic
- GetClaudeConversationUUID() thread-safe getter on Instance
- RunWithResume() on Instance: spawns subprocess, parses result, updates UUID

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ener fan-out

Lays the groundwork for pointing Stapler Squad at a GitHub issue/PR and
having it fix the issue without human steering.

**E1 — StatusChangeListener fan-out** (`claude_controller.go`)
Replace single-listener slot with `[]StatusChangeListener` slice so both
the existing session driver and the new AutonomousDriver can subscribe.
`SetStatusChangeListener` kept as a backwards-compatible shim.
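
A minimal sketch of the fan-out idea in E1 follows. The `Status` type, the `Controller` struct, and the `notify` method are hypothetical stand-ins, and the shim's behavior (delegating to the add method) is an assumption about how backwards compatibility might be kept.

```go
package main

import "sync"

// Status is a hypothetical stand-in for the controller's status enum.
type Status int

// StatusChangeListener is called with the old and new status.
type StatusChangeListener func(from, to Status)

// Controller holds a slice of listeners instead of a single slot.
type Controller struct {
	mu        sync.RWMutex
	listeners []StatusChangeListener
}

// AddStatusChangeListener subscribes one more listener.
func (c *Controller) AddStatusChangeListener(l StatusChangeListener) {
	if l == nil {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.listeners = append(c.listeners, l)
}

// SetStatusChangeListener is the compatibility shim. Assumption: it simply
// appends, so older call sites keep working without evicting other listeners.
func (c *Controller) SetStatusChangeListener(l StatusChangeListener) {
	c.AddStatusChangeListener(l)
}

// notify snapshots the listeners under the read lock and invokes them after
// releasing it, so a slow listener cannot block registration.
func (c *Controller) notify(from, to Status) {
	c.mu.RLock()
	snapshot := make([]StatusChangeListener, len(c.listeners))
	copy(snapshot, c.listeners)
	c.mu.RUnlock()
	for _, l := range snapshot {
		l(from, to)
	}
}
```

Calling listeners outside the lock also means a listener may safely register another listener without deadlocking.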

**E2 — AutonomousDriver** (`session/autonomous_driver.go`)
New goroutine that runs when `AutonomousMode=true`:
- Registers as a status-change listener (signals a channel; never blocks
  the listener callback)
- Waits for `ClaudeController` idle state before each turn
- Calls the headless LLM pool (`FeatureKey("autonomous_fix-<id[:8]>")`)
  with goal + session tail → decides NEXT_MESSAGE or DONE
- Injects via `SendCommandImmediate` from its own goroutine (avoids PTY
  race with CommandQueue executor)
- `atomic.Bool` idempotency guard + `defer recover()` + 20-turn max cap
- `ExtractPRURL` scans last 200 lines for newly-created PR URL

**E3 — Session creation wiring** (proto + session_service + instance)
- `bool autonomous_mode = 23` added to `CreateSessionRequest`
- `InstanceOptions.AutonomousMode` threaded through `New()` constructor
- `CreateSession` starts the driver when `AutonomousMode=true &&
  headlessPool != nil`

14 tests added: max-turns, done-signal, idempotency, status-channel,
panic-recovery, stop, PR URL extraction, prompt parsing.

P1–P3 epics (omnibar UI, backlog button, LLM approval hook, goal
completion notifications, GitHub PR plugin) are follow-up PRs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 9, 2026 21:22

Copilot AI left a comment


Pull request overview

This PR lays the P0 groundwork for “autonomous” Stapler Squad sessions by introducing an AutonomousDriver loop (orchestrator → inject next message when idle), expanding ClaudeController status-change callbacks to support multiple subscribers, and wiring new session creation options (allowed_tools, permission_mode, autonomous_mode) through proto → service → instance startup.

Changes:

  • Added session/AutonomousDriver to observe idle transitions and inject orchestrator-selected messages via SendCommandImmediate, with max-turns + stop + panic recovery.
  • Refactored ClaudeController to fan-out status-change notifications to multiple listeners (adds AddStatusChangeListener, keeps SetStatusChangeListener as a shim), and updated callback wiring/tests accordingly.
  • Added OneShot session “session_id” capture + --resume subprocess steering path, plus new create-session request fields and instance launch flags (--allowedTools, --permission-mode, --output-format json).

Reviewed changes

Copilot reviewed 21 out of 33 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| session/session_driver.go | Captures Claude session_id for OneShot sessions; switches initial prompt send terminator to \r; adds lightweight JSON field extraction helper. |
| session/session_driver_test.go | Adds unit tests for parseClaudeSessionID. |
| session/instance.go | Adds AllowedTools, PermissionMode, and AutonomousMode to instance/options; adds callback hook for persisting newly discovered session_id. |
| session/instance_tmux.go | Passes --allowedTools, --permission-mode, and --output-format json for OneShot launches; adjusts prompt arg behavior. |
| session/instance_controller.go | Switches status wiring to controller fan-out and adds RegisterStatusChangeCallback. |
| session/instance_claude.go | Adds thread-safe UUID accessor; adds RunWithResume + SetClaudeConversationUUID/callback support. |
| session/claude_controller.go | Replaces single listener with fan-out slice + lock; updates status-change loop to notify all listeners. |
| session/claude_controller_test.go | Updates listener tests to use AddStatusChangeListener. |
| session/autonomous_driver.go | New autonomous orchestrator/injection loop with rate-limit waiting, max-turn cap, DONE sentinel, PR URL extraction. |
| session/autonomous_driver_test.go | New unit tests for orchestration parsing, max-turn cap, DONE exit, stop/idempotency, etc. |
| server/services/session_service.go | Wires new create-session options; persists discovered session_id; starts autonomous driver on autonomous_mode. |
| server/mcp/tools_terminal.go | Enhances steer_session to use claude --resume subprocess for completed OneShot sessions; standardizes PTY send terminator to \r. |
| proto/session/v1/session.proto | Adds allowed_tools, permission_mode, autonomous_mode to CreateSessionRequest. |
| project_plans/github-autonomous-fix/** | Adds research + implementation plan docs for the autonomous fix initiative. |
| gen/proto/go/session/v1/** | Regenerated Connect/Protobuf outputs reflecting proto changes. |
Files not reviewed (7)
  • gen/proto/go/session/v1/headless.pb.go: Language not supported
  • gen/proto/go/session/v1/insights.pb.go: Language not supported
  • gen/proto/go/session/v1/sessionv1connect/backlog.connect.go: Language not supported
  • gen/proto/go/session/v1/sessionv1connect/headless.connect.go: Language not supported
  • gen/proto/go/session/v1/sessionv1connect/insights.connect.go: Language not supported
  • gen/proto/go/session/v1/sessionv1connect/unfinished.connect.go: Language not supported
  • gen/proto/go/session/v1/unfinished.pb.go: Language not supported

Comment thread session/session_driver.go Outdated
Comment thread session/instance_claude.go
Comment thread session/instance_claude.go
Comment thread session/autonomous_driver.go
Comment thread session/autonomous_driver.go
Comment thread session/autonomous_driver.go
Comment thread server/services/session_service.go
tstapler and others added 9 commits June 10, 2026 19:31
… GitHub PR plugin

Completes the full autonomous GitHub fix feature on a single branch.

**E4 — User-facing entry points**
- Omnibar: "Fix Autonomously" session type (all 7 touchpoints: type union,
  OmnibarCreationPanel, OmnibarContext sessionTypeMap, useSessionService,
  OmnibarAction `auto_fix` union variant, dispatch case, dispatch tests)
- Backlog: "Run Autonomously" button on items in `ready` status
  (SpawnSessionFromItem proto `autonomous=true` flag, backlog service handler)

**E5 — LLM-assisted approval**
- `FeatureKeyAutonomousFix` + `FeatureKeyAutonomousApproval` in headless features
- `ApprovalHandler.SetAutonomousChecker` + `SetHeadlessPool` injection points
- When a risky tool call hits an autonomous session: headless LLM pool is
  queried (APPROVE:/DENY: response) before falling back to human review queue
- Wired via closure in `server.go` (avoids construction-time circular dep)
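
The approve/deny/fall-back flow in E5 could look roughly like this. It is a hedged sketch: `Verdict`, `parseApprovalVerdict`, and `decideToolCall` are hypothetical names, and the `askLLM` callback stands in for the headless pool query.

```go
package main

import "strings"

// Verdict is a hypothetical outcome for a risky tool call.
type Verdict int

const (
	Escalate Verdict = iota // send to the human review queue
	Approve
	Deny
)

// parseApprovalVerdict reads an "APPROVE:" or "DENY:" reply and returns the
// verdict plus the trailing reason. Anything else escalates.
func parseApprovalVerdict(reply string) (Verdict, string) {
	s := strings.TrimSpace(reply)
	switch {
	case strings.HasPrefix(s, "APPROVE:"):
		return Approve, strings.TrimSpace(strings.TrimPrefix(s, "APPROVE:"))
	case strings.HasPrefix(s, "DENY:"):
		return Deny, strings.TrimSpace(strings.TrimPrefix(s, "DENY:"))
	default:
		return Escalate, ""
	}
}

// decideToolCall consults the LLM only for autonomous sessions and escalates
// to human review on any error or ambiguous reply.
func decideToolCall(autonomous bool, askLLM func() (string, error)) Verdict {
	if !autonomous || askLLM == nil {
		return Escalate
	}
	reply, err := askLLM()
	if err != nil {
		return Escalate
	}
	v, _ := parseApprovalVerdict(reply)
	return v
}
```

Defaulting to escalation keeps the human review queue as the safe path whenever the LLM is unavailable or answers off-format.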

**E6 — Goal completion, artifacts, notifications, GitHub PR plugin**
- `onAutonomousDriverComplete` transitions backlog item to done/failed,
  stores PR URL, sends push notification
- `backlog_plugin_github_prs.go`: new `github_prs` plugin fetching open PRs,
  tagging with `pr:review-requested`/`pr:ci-failing`, registered in default plugin registry

**E7 — Feature registry**
- `docs/registry/features/autonomous-fix.json` added

All existing tests pass. Dispatch tests: 24 total, 3 of them new.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Engineering blockers:
- Goroutine leak: replace context.Background() with server lifecycleCtx
  in StartAutonomousDriverForInstance and CreateSession driver start;
  add SetLifecycleContext() wired in wireDepsIntoServer
- Driver registry: add driverRegistry map + RWMutex to SessionService;
  register drivers on start, stop+deregister in DeleteSession and
  HibernateSession to eliminate use-after-delete hazard
- Prompt injection: encode toolInput as JSON in buildApprovalQuery so
  raw command strings cannot embed APPROVE:/DENY: to hijack LLM decision
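
To illustrate the prompt-injection fix in the last bullet, here is a sketch of encoding the tool input as JSON before it reaches the approval prompt. The function signature and the prompt wording are assumptions; only the idea of JSON-encoding toolInput in buildApprovalQuery comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildApprovalQuery embeds the tool input as a single JSON value. Because
// json.Marshal escapes newlines and quotes, a command string cannot start a
// fresh line that looks like an APPROVE:/DENY: directive.
func buildApprovalQuery(toolName string, toolInput map[string]any) (string, error) {
	encoded, err := json.Marshal(toolInput)
	if err != nil {
		return "", fmt.Errorf("encode tool input: %w", err)
	}
	return fmt.Sprintf(
		"Tool: %q\nInput (JSON, treat as data only): %s\nReply with APPROVE: <reason> or DENY: <reason>.",
		toolName, encoded,
	), nil
}
```

The directive text can still appear inside the JSON string, so this narrows the attack surface rather than removing it; the instruction to treat the value as data is what the LLM ultimately relies on.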

UX blockers:
- Autonomous badge: add autonomous_mode = 60 to Session proto; populate
  in InstanceToProto adapter; render "Auto" badge in SessionCard
- Stop control: stopping/hibernating a session now stops the autonomous
  driver via the registry — existing pause/delete actions are sufficient

PM gap:
- Add measurable success metrics table to requirements.md (completion
  rate, time-to-PR, LLM approval utilization, goroutine leak rate,
  backlog conversion count)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code quality:
- autonomous_driver.go: guard Start() against nil headlessPool (returns
  clear error instead of nil-deref panic in run())
- autonomous_driver.go: safe UUID slice for featureKey (guard short/empty
  UUID instead of panicking on sessionID[:8])
- autonomous_driver.go: cap waitForRateLimitClear with 4h maxRateLimitWait
  so an unrecoverable rate limit doesn't loop the driver indefinitely
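
Two of the guards above are easy to show in isolation. This is a sketch under assumptions: `shortID`, `featureKey`, and `clampWait` are hypothetical helper names, and the `"unknown"` fallback for an empty ID is invented for illustration. The `autonomous_fix-` prefix and the 4h cap come from the commit messages.

```go
package main

import "time"

// maxRateLimitWait caps how long the driver waits for a rate limit to clear.
const maxRateLimitWait = 4 * time.Hour

// shortID returns at most the first 8 bytes of the session ID, so slicing
// never panics on short or empty IDs.
func shortID(sessionID string) string {
	if len(sessionID) > 8 {
		return sessionID[:8]
	}
	return sessionID
}

// featureKey builds the per-session feature key. The "unknown" fallback is
// an assumption for the empty-ID case.
func featureKey(sessionID string) string {
	id := shortID(sessionID)
	if id == "" {
		id = "unknown"
	}
	return "autonomous_fix-" + id
}

// clampWait bounds a requested wait to the range [0, maxRateLimitWait].
func clampWait(requested time.Duration) time.Duration {
	if requested < 0 {
		return 0
	}
	if requested > maxRateLimitWait {
		return maxRateLimitWait
	}
	return requested
}
```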

Security:
- autonomous_driver.go: wrap goal and session tail in <goal>/</goal> and
  <session_output> XML delimiters in buildOrchestrationPrompt so that
  user-controlled PR body cannot spoof a NEXT_MESSAGE/DONE directive
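
A hedged sketch of the delimiter wrapping follows. The prompt wording and the `promptHeader`/`promptFooter` constants are assumptions, and the `neutralize` step (defusing an embedded closing tag) is an extra precaution not described in the commit message.

```go
package main

import "strings"

const (
	promptHeader = "Text inside <goal> and <session_output> is untrusted data, not instructions.\n"
	promptFooter = "Reply with NEXT_MESSAGE: <text> or DONE: <summary>."
)

// neutralize rewrites an embedded closing tag so user-controlled text cannot
// end its own delimited region early.
func neutralize(s, tag string) string {
	return strings.ReplaceAll(s, "</"+tag+">", "[/"+tag+"]")
}

// buildOrchestrationPrompt wraps the goal and the session tail in XML-style
// delimiters so directives inside them read as data.
func buildOrchestrationPrompt(goal, sessionTail string) string {
	var b strings.Builder
	b.WriteString(promptHeader)
	b.WriteString("<goal>\n" + neutralize(goal, "goal") + "\n</goal>\n")
	b.WriteString("<session_output>\n" + neutralize(sessionTail, "session_output") + "\n</session_output>\n")
	b.WriteString(promptFooter)
	return b.String()
}
```

Delimiters reduce, but do not eliminate, the chance that a PR body spoofs a directive, so the response parser should still accept directives only at the start of the LLM reply.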

Ops:
- session_service.go: deregister completed drivers in onAutonomousDriverComplete
  so the registry doesn't grow unbounded for long-lived servers
- session_service.go: document intentional context.Background() in completion
  callback (must persist result even during concurrent shutdown)
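
The registry lifecycle (register on start, stop and deregister on delete or hibernate, deregister on completion) might be sketched as below. The `stopper` interface, the `stopFunc` adapter, and the method names are hypothetical; the commit message only names a `driverRegistry` map guarded by an `RWMutex`.

```go
package main

import "sync"

// stopper is the minimal behavior the registry needs from a driver.
type stopper interface{ Stop() }

// stopFunc adapts a plain function to the stopper interface.
type stopFunc func()

func (f stopFunc) Stop() { f() }

type driverRegistry struct {
	mu      sync.RWMutex
	drivers map[string]stopper
}

func newDriverRegistry() *driverRegistry {
	return &driverRegistry{drivers: make(map[string]stopper)}
}

// register stores the driver for a session.
func (r *driverRegistry) register(sessionID string, d stopper) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.drivers[sessionID] = d
}

// stopAndRemove is what delete/hibernate would call. It reports whether a
// driver was registered. Stop runs outside the lock.
func (r *driverRegistry) stopAndRemove(sessionID string) bool {
	r.mu.Lock()
	d, ok := r.drivers[sessionID]
	delete(r.drivers, sessionID)
	r.mu.Unlock()
	if ok {
		d.Stop()
	}
	return ok
}

// remove is what the completion callback would call: the driver has already
// finished, so it only needs to be forgotten.
func (r *driverRegistry) remove(sessionID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.drivers, sessionID)
}

// size reports how many drivers are currently tracked.
func (r *driverRegistry) size() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.drivers)
}
```

Removing the entry before calling `Stop` means a completion callback that fires during shutdown finds nothing to deregister, which keeps the two paths from racing on the same driver.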

Tests:
- autonomous_driver_test.go: TestAutonomousDriver_NilPool_Start (nil pool guard)
- autonomous_driver_test.go: TestAutonomousDriver_ShortUUID (UUID < 8 chars)
- autonomous_driver_test.go: TestBuildOrchestrationPrompt_GoalWrappedInDelimiters
- approval_handler_integration_test.go: TestBuildApprovalQuery_PromptInjectionResistance

UX:
- SessionCard.tsx: add aria-label to autonomous badge for accessibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ver registry

Covers: lifecycle context wiring, register/deregister semantics, delete-stops-driver,
and completion-callback deregistration — the four behaviors identified by the Plan
reviewer as missing from server/services coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g, UX hints, PM doc

Engineering:
- approval_handler.go: fill empty sessionTail in autonomous LLM approval query by
  fetching session preview via queueChecker; LLM now has context to assess tool safety
- autonomous_driver.go: log warning when session does not become idle after a turn
  (return value of waitForIdle was silently ignored)

UX:
- OmnibarCreationPanel: clarify autonomous hint to mention delete/hibernate stops the run
- SessionCard: add role="status" to Auto badge for screen reader announcements

PM:
- requirements.md: add Target Users personas, Risky Assumptions, and Observability & SLA

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Label "Fix Autonomously" → "Fix Autonomously (Beta)" to signal experimental status
- Hint text now explicitly states LLM reviewer decides tool permissions and that a
  completion notification fires

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- session_driver.go: parseJSONField now uses escape-aware rune scan instead of
  naive strings.Index, correctly handles \" \\ \n \t \r and other escape sequences
- instance_claude.go: GetConversationUUID now holds stateMutex.RLock to prevent
  data race with concurrent SetClaudeConversationUUID calls
- instance_claude.go: SetClaudeConversationUUID is idempotent — no-op and no
  callback if the UUID hasn't changed, preventing spurious storage saves on restart
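
The locking and idempotency described in the last two bullets can be sketched as follows. The `Instance` fields and the `onSaved` callback name are hypothetical; only `stateMutex`, the getter/setter behavior, and the "no callback when unchanged" rule come from the commit message.

```go
package main

import "sync"

// Instance is a pared-down stand-in for the real session instance.
type Instance struct {
	stateMutex       sync.RWMutex
	conversationUUID string
	onSaved          func(uuid string) // hypothetical persistence hook
}

// GetClaudeConversationUUID reads the UUID under the read lock so it cannot
// race with a concurrent setter.
func (i *Instance) GetClaudeConversationUUID() string {
	i.stateMutex.RLock()
	defer i.stateMutex.RUnlock()
	return i.conversationUUID
}

// SetClaudeConversationUUID stores the UUID and fires the save callback only
// when the value actually changed. The callback runs outside the lock so it
// may call back into the instance without deadlocking.
func (i *Instance) SetClaudeConversationUUID(uuid string) {
	i.stateMutex.Lock()
	if i.conversationUUID == uuid {
		i.stateMutex.Unlock()
		return
	}
	i.conversationUUID = uuid
	cb := i.onSaved
	i.stateMutex.Unlock()
	if cb != nil {
		cb(uuid)
	}
}
```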

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts:
#	docs/registry/features/backend/approval/delete-rule.json
#	docs/registry/features/backend/approval/get-analytics.json
#	docs/registry/features/backend/approval/list-pending.json
#	docs/registry/features/backend/approval/list-rules.json
#	docs/registry/features/backend/approval/resolve.json
#	docs/registry/features/backend/approval/upsert-rule.json
#	gen/proto/go/session/v1/backlog.pb.go
#	gen/proto/go/session/v1/session.pb.go
#	gen/proto/go/session/v1/sessionv1connect/session.connect.go
#	gen/proto/go/session/v1/types.pb.go
#	proto/session/v1/session.proto
#	server/services/session_service.go
#	session/instance.go
#	session/instance_claude.go
#	session/session_driver.go
@github-actions

Contributor

✅ Registry Validation

Registry Validation
===================

Building backend scanner...
Scanning backend features...
Wrote 97 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 14 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 22 feature files to /tmp/tmp.maYbPfr4XH/backend
Wrote 6 feature files to /tmp/tmp.maYbPfr4XH/backend

=== Backend Registry Diff ===
Committed: 132  Generated: 130  Divergence: 1.52%
⚠️  Removed RPCs:
  - backlog:spawn-session-autonomous
  - upload:image
⚠️  90 feature(s) missing // +api: marker (markerFound: false)

⚠️  Divergence 1.52% above warning threshold.

Test Coverage: 93/132 features have testIds (70.5%)

Registry validation is in observation mode until 2026-05-02.
After that date, divergence > 2% will block merges.
Coverage reporting is advisory only.

@github-actions

Contributor

Go Benchmarks (Tier 1)

benchmarks/go/tier1-baseline.txt:8: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:2035: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:4040: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:6040: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:8071: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:10099: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:12134: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:14168: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:16192: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:22884: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:30608: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:37827: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:44589: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:51346: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:58304: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:64780: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:71600: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:77066: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:82029: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:87665: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:93279: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:98752: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:104017: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:109727: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:115757: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:123122: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:130424: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:137211: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:144671: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:152393: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:159266: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:166225: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:173299: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:180826: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:188270: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:196607: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:205005: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:212668: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:220592: parsing iteration count: invalid syntax
benchmarks/go/tier1-baseline.txt:228363: parsing iteration count: invalid syntax
tier1-bench.txt:8: parsing iteration count: invalid syntax
tier1-bench.txt:1983: parsing iteration count: invalid syntax
tier1-bench.txt:4023: parsing iteration count: invalid syntax
tier1-bench.txt:5969: parsing iteration count: invalid syntax
tier1-bench.txt:7938: parsing iteration count: invalid syntax
tier1-bench.txt:9954: parsing iteration count: invalid syntax
tier1-bench.txt:11938: parsing iteration count: invalid syntax
tier1-bench.txt:13950: parsing iteration count: invalid syntax
tier1-bench.txt:17447: parsing iteration count: invalid syntax
tier1-bench.txt:24030: parsing iteration count: invalid syntax
tier1-bench.txt:31151: parsing iteration count: invalid syntax
tier1-bench.txt:37883: parsing iteration count: invalid syntax
tier1-bench.txt:44104: parsing iteration count: invalid syntax
tier1-bench.txt:50605: parsing iteration count: invalid syntax
tier1-bench.txt:57291: parsing iteration count: invalid syntax
tier1-bench.txt:64217: parsing iteration count: invalid syntax
tier1-bench.txt:70595: parsing iteration count: invalid syntax
tier1-bench.txt:75258: parsing iteration count: invalid syntax
tier1-bench.txt:80735: parsing iteration count: invalid syntax
tier1-bench.txt:86116: parsing iteration count: invalid syntax
tier1-bench.txt:91520: parsing iteration count: invalid syntax
tier1-bench.txt:96813: parsing iteration count: invalid syntax
tier1-bench.txt:102475: parsing iteration count: invalid syntax
tier1-bench.txt:107964: parsing iteration count: invalid syntax
tier1-bench.txt:113229: parsing iteration count: invalid syntax
tier1-bench.txt:119985: parsing iteration count: invalid syntax
tier1-bench.txt:127182: parsing iteration count: invalid syntax
tier1-bench.txt:134717: parsing iteration count: invalid syntax
tier1-bench.txt:141616: parsing iteration count: invalid syntax
tier1-bench.txt:148770: parsing iteration count: invalid syntax
tier1-bench.txt:156827: parsing iteration count: invalid syntax
tier1-bench.txt:164035: parsing iteration count: invalid syntax
tier1-bench.txt:171167: parsing iteration count: invalid syntax
tier1-bench.txt:178981: parsing iteration count: invalid syntax
tier1-bench.txt:186269: parsing iteration count: invalid syntax
tier1-bench.txt:194083: parsing iteration count: invalid syntax
tier1-bench.txt:201558: parsing iteration count: invalid syntax
tier1-bench.txt:209158: parsing iteration count: invalid syntax
tier1-bench.txt:216375: parsing iteration count: invalid syntax
tier1-bench.txt:223581: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/tstapler/stapler-squad/session/detection/ratelimit
cpu: AMD EPYC 9V74 80-Core Processor                
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                              │              sec/op              │   sec/op     vs base              │
StripANSI_PlainText-4                                6.915n ± 4%   6.946n ± 4%       ~ (p=0.442 n=8)
StripANSI_WithEscapes-4                              665.6n ± 1%   663.1n ± 1%       ~ (p=1.000 n=8)
ProcessOutput_InactiveState-4                        6.606n ± 0%   6.664n ± 1%  +0.89% (p=0.010 n=8)
geomean                                              31.21n        31.31n       +0.33%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             136.0 ± 0%     136.0 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
StripANSI_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSI_WithEscapes-4                             5.000 ± 0%     5.000 ± 0%       ~ (p=1.000 n=8) ¹
ProcessOutput_InactiveState-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/queue
                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │              sec/op              │    sec/op     vs base              │
ReviewQueue_ConcurrentReads-4                       91.80n ± 13%   80.88n ± 15%       ~ (p=0.505 n=8)
ReviewQueue_Add-4                                   406.1n ±  0%   409.9n ±  1%  +0.95% (p=0.000 n=8)
geomean                                             193.1n         182.1n        -5.69%

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │               B/op               │    B/op     vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   640.0 ± 0%     640.0 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                              │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                              │            allocs/op             │ allocs/op   vs base                │
ReviewQueue_ConcurrentReads-4                       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
ReviewQueue_Add-4                                   4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                        ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/scrollback
                                      │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                                      │              sec/op              │   sec/op     vs base              │
CircularBuffer_ConcurrentReadWrite-4                         3.167µ ± 2%   3.149µ ± 2%       ~ (p=0.266 n=8)
CircularBuffer_BurstAppend-4                                 105.3µ ± 1%   109.4µ ± 3%  +3.84% (p=0.000 n=8)
CircularBuffer_GetLastN_LargeBuffer-4                        15.68µ ± 1%   16.50µ ± 3%  +5.25% (p=0.000 n=8)
CircularBuffer_GetRange_Sequential-4                         9.069µ ± 3%   9.953µ ± 4%  +9.75% (p=0.000 n=8)
CircularBufferAppend-4                                       103.4n ± 0%   105.6n ± 1%  +2.13% (p=0.000 n=8)
CircularBufferGetLastN-4                                     1.752µ ± 1%   1.893µ ± 3%  +8.05% (p=0.000 n=8)
CircularBufferConcurrentAppend-4                             133.9n ± 0%   141.2n ± 4%  +5.41% (p=0.000 n=8)
geomean                                                      2.737µ        2.868µ       +4.79%

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                      │               B/op               │     B/op      vs base                │
CircularBuffer_ConcurrentReadWrite-4                        6.062Ki ± 0%   6.062Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                62.50Ki ± 0%   62.50Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                       56.00Ki ± 0%   56.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                        28.00Ki ± 0%   28.00Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        24.00 ± 0%     24.00 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                    6.000Ki ± 0%   6.000Ki ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              32.00 ± 0%     32.00 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                     3.077Ki        3.077Ki       +0.00%
¹ all samples are equal

                                      │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt           │
                                      │            allocs/op             │  allocs/op   vs base                │
CircularBuffer_ConcurrentReadWrite-4                          2.000 ± 0%    2.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_BurstAppend-4                                 1.000k ± 0%   1.000k ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetLastN_LargeBuffer-4                         1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBuffer_GetRange_Sequential-4                          1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferAppend-4                                        1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferGetLastN-4                                      1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
CircularBufferConcurrentAppend-4                              1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       2.962         2.962       +0.00%
¹ all samples are equal

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/s                │     B/s       vs base              │
CircularBuffer_BurstAppend-4                       579.5Mi ± 1%   558.2Mi ± 3%  -3.69% (p=0.000 n=8)

pkg: github.com/tstapler/stapler-squad/session/tmux
                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                             │              sec/op              │   sec/op     vs base              │
StripANSICodes_PlainText-4                          6.649n ± 5%   6.422n ± 2%       ~ (p=0.234 n=8)
StripANSICodes_WithEscapes-4                        599.5n ± 1%   604.5n ± 1%  +0.84% (p=0.021 n=8)
IsBanner_PlainText-4                                452.4n ± 1%   452.4n ± 1%       ~ (p=1.000 n=8)
geomean                                             121.7n        120.7n       -0.87%

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │               B/op               │    B/op     vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       56.00 ± 0%     56.00 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                             │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                             │            allocs/op             │ allocs/op   vs base                │
StripANSICodes_PlainText-4                         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
StripANSICodes_WithEscapes-4                       4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=8) ¹
IsBanner_PlainText-4                               0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                       ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/tstapler/stapler-squad/session/tokens
                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt          │
                                   │              sec/op              │   sec/op     vs base              │
TokenParser_ProcessUserEntry-4                            5.469m ± 1%   5.500m ± 0%       ~ (p=0.105 n=8)
DetectCommandsInText/NoSlash-4                            6.697n ± 2%   6.692n ± 1%       ~ (p=0.523 n=8)
DetectCommandsInText/WithCommand-4                        1.443µ ± 1%   1.440µ ± 1%       ~ (p=0.589 n=8)
geomean                                                   3.752µ        3.756µ       +0.11%

                                   │ benchmarks/go/tier1-baseline.txt │           tier1-bench.txt            │
                                   │               B/op               │     B/op      vs base                │
TokenParser_ProcessUserEntry-4                         11.02Mi ± 0%     11.02Mi ± 0%       ~ (p=0.993 n=8)
DetectCommandsInText/NoSlash-4                           0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       433.5 ± 0%       433.0 ± 0%       ~ (p=0.608 n=8)
geomean                                                             ²                 -0.04%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                   │ benchmarks/go/tier1-baseline.txt │          tier1-bench.txt           │
                                   │            allocs/op             │ allocs/op   vs base                │
TokenParser_ProcessUserEntry-4                           34.00 ± 0%     34.00 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/NoSlash-4                           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=8) ¹
DetectCommandsInText/WithCommand-4                       6.000 ± 0%     6.000 ± 0%       ~ (p=1.000 n=8) ¹
geomean                                                             ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

@github-actions

Contributor

E2E RPC Latency

list-sessions-ttfb-mean: 4ms (▼ faster -33.3%; baseline: 7ms)
list-sessions-total-mean: 5ms (▼ faster -40.7%; baseline: 8ms)

@github-actions

Contributor

UX Analysis

| Check | Status | Details |
| --- | --- | --- |
| ✅ Axe Core (WCAG 2.1 AA) | success | Critical/serious violations block merge |
| ⚠️ Lighthouse Performance | Score: unknown | Warning if < 70 (non-blocking) |
| 🤖 Claude UX Analysis | Advisory | See docs/qa/ for findings |

Axe Core excludes terminal rendering areas (intentional design).
Lighthouse runs in desktop preset for this developer tool.

@github-actions

Contributor

Frontend Terminal Throughput

terminal-throughput-mean: 16 KB/s ▲ +14.3% (baseline: 14 KB/s)
terminal-throughput-p50: 16 KB/s ▼ -0.1% (baseline: 16 KB/s)

@github-actions

Contributor

🎬 E2E Feature Demos

2 shard(s) recorded feature flows for this PR.

recordings shard 1
recordings shard 2

Demo preview opens directly in browser (single-file HTML). Raw WebM recordings in ZIP. Expires after 30 days.
