feat(autonomous): P0 foundation — AutonomousDriver + StatusChangeListener fan-out#105
feat(autonomous): P0 foundation — AutonomousDriver + StatusChangeListener fan-out#105tstapler wants to merge 12 commits into
Conversation
…pture for OneShot
Two improvements to OneShot (-p) session control:
1. **Fix initial prompt not submitting** (\r instead of \n): The session driver
was sending initialPrompt + "\n" (LF) to the PTY. Claude Code's interactive
readline needs \r (CR) to submit — the same signal a physical Enter key sends.
Sessions received the typed text but never executed it, causing inactivity
timeouts. Confirmed by log: "sent initial prompt" followed by 10-min idle.
The startup dialog answers ("1\n") work because those menus handle both,
but Claude's readline interface only responds to \r.
2. **Capture claude session_id from --output-format json output**: OneShot
sessions now launch with -p --output-format json. When the session exits,
the driver parses the JSON output for "session_id" and stores it as
ConversationUUID on the instance. Subsequent restarts automatically use
--resume <uuid>, sending Claude back into the same conversation with full
context instead of re-running the task from scratch.
Supporting infrastructure:
- SetClaudeConversationUUID() — thread-safe setter that fires a save callback
- SetClaudeSessionIDSavedCallback() — wired in service layer to flush to DB
- wireClaudeSessionIDCallback() — registered on all creation paths including
loadInstancesWithWiring (startup) and CreateDirectorySession (backlog)
- Prompt is now appended even when claudeSessionID is set for OneShot
sessions so continuation prompts are delivered after --resume
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… --permission-mode Three programmatic control improvements over Phase 1: 1. **steer_session via subprocess**: When a OneShot session has completed (Stopped + ConversationUUID set), steer_session now runs 'claude -p --resume <uuid> --output-format json <message>' instead of PTY send-keys. Returns structured result text in the MCP response. Send-keys path (interactive sessions) now uses \r correctly and reports method: "send_keys" vs "resume_subprocess". 2. **--allowedTools**: New AllowedTools field on Instance/InstanceOptions/ proto CreateSessionRequest (field 21). When set, passed as --allowedTools to the claude CLI at launch. Allows callers to pre-approve specific tools (e.g. "Bash(git *),Read,Edit") without the all-or-nothing --dangerously-skip-permissions flag. 3. **--permission-mode**: New PermissionMode field (proto field 22). Passes --permission-mode to claude at launch. Supports values like "acceptEdits" (auto-approve file writes only) and "auto" (full autonomous classification). Also: - parseJSONField() generic helper in session_driver.go; parseClaudeSessionID() now delegates to it instead of duplicating the scan logic - GetClaudeConversationUUID() thread-safe getter on Instance - RunWithResume() on Instance: spawns subprocess, parses result, updates UUID Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ener fan-out
Lays the groundwork for pointing Stapler Squad at a GitHub issue/PR and
having it fix the issue without human steering.
**E1 — StatusChangeListener fan-out** (`claude_controller.go`)
Replace single-listener slot with `[]StatusChangeListener` slice so both
the existing session driver and the new AutonomousDriver can subscribe.
`SetStatusChangeListener` kept as a backwards-compatible shim.
**E2 — AutonomousDriver** (`session/autonomous_driver.go`)
New goroutine that runs when `AutonomousMode=true`:
- Registers as a status-change listener (signals a channel; never blocks
the listener callback)
- Waits for `ClaudeController` idle state before each turn
- Calls the headless LLM pool (`FeatureKey("autonomous_fix-<id[:8]>")`)
with goal + session tail → decides NEXT_MESSAGE or DONE
- Injects via `SendCommandImmediate` from its own goroutine (avoids PTY
race with CommandQueue executor)
- `atomic.Bool` idempotency guard + `defer recover()` + 20-turn max cap
- `ExtractPRURL` scans last 200 lines for newly-created PR URL
**E3 — Session creation wiring** (proto + session_service + instance)
- `bool autonomous_mode = 23` added to `CreateSessionRequest`
- `InstanceOptions.AutonomousMode` threaded through `New()` constructor
- `CreateSession` starts the driver when `AutonomousMode=true &&
headlessPool != nil`
14 tests added: max-turns, done-signal, idempotency, status-channel,
panic-recovery, stop, PR URL extraction, prompt parsing.
P1–P3 epics (omnibar UI, backlog button, LLM approval hook, goal
completion notifications, GitHub PR plugin) are follow-up PRs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR lays the P0 groundwork for “autonomous” Stapler Squad sessions by introducing an AutonomousDriver loop (orchestrator → inject next message when idle), expanding ClaudeController status-change callbacks to support multiple subscribers, and wiring new session creation options (allowed_tools, permission_mode, autonomous_mode) through proto → service → instance startup.
Changes:
- Added
session/AutonomousDriverto observe idle transitions and inject orchestrator-selected messages viaSendCommandImmediate, with max-turns + stop + panic recovery. - Refactored
ClaudeControllerto fan-out status-change notifications to multiple listeners (addsAddStatusChangeListener, keepsSetStatusChangeListeneras a shim), and updated callback wiring/tests accordingly. - Added OneShot session “session_id” capture +
--resumesubprocess steering path, plus new create-session request fields and instance launch flags (--allowedTools,--permission-mode,--output-format json).
Reviewed changes
Copilot reviewed 21 out of 33 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| session/session_driver.go | Captures Claude session_id for OneShot sessions; switches initial prompt send terminator to \r; adds lightweight JSON field extraction helper. |
| session/session_driver_test.go | Adds unit tests for parseClaudeSessionID. |
| session/instance.go | Adds AllowedTools, PermissionMode, and AutonomousMode to instance/options; adds callback hook for persisting newly discovered session_id. |
| session/instance_tmux.go | Passes --allowedTools, --permission-mode, and --output-format json for OneShot launches; adjusts prompt arg behavior. |
| session/instance_controller.go | Switches status wiring to controller fan-out and adds RegisterStatusChangeCallback. |
| session/instance_claude.go | Adds thread-safe UUID accessor; adds RunWithResume + SetClaudeConversationUUID/callback support. |
| session/claude_controller.go | Replaces single listener with fan-out slice + lock; updates status-change loop to notify all listeners. |
| session/claude_controller_test.go | Updates listener tests to use AddStatusChangeListener. |
| session/autonomous_driver.go | New autonomous orchestrator/injection loop with rate-limit waiting, max-turn cap, DONE sentinel, PR URL extraction. |
| session/autonomous_driver_test.go | New unit tests for orchestration parsing, max-turn cap, DONE exit, stop/idempotency, etc. |
| server/services/session_service.go | Wires new create-session options; persists discovered session_id; starts autonomous driver on autonomous_mode. |
| server/mcp/tools_terminal.go | Enhances steer_session to use claude --resume subprocess for completed OneShot sessions; standardizes PTY send terminator to \r. |
| proto/session/v1/session.proto | Adds allowed_tools, permission_mode, autonomous_mode to CreateSessionRequest. |
| project_plans/github-autonomous-fix/** | Adds research + implementation plan docs for the autonomous fix initiative. |
| gen/proto/go/session/v1/** | Regenerated Connect/Protobuf outputs reflecting proto changes. |
Files not reviewed (7)
- gen/proto/go/session/v1/headless.pb.go: Language not supported
- gen/proto/go/session/v1/insights.pb.go: Language not supported
- gen/proto/go/session/v1/sessionv1connect/backlog.connect.go: Language not supported
- gen/proto/go/session/v1/sessionv1connect/headless.connect.go: Language not supported
- gen/proto/go/session/v1/sessionv1connect/insights.connect.go: Language not supported
- gen/proto/go/session/v1/sessionv1connect/unfinished.connect.go: Language not supported
- gen/proto/go/session/v1/unfinished.pb.go: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… GitHub PR plugin Completes the full autonomous GitHub fix feature on a single branch. **E4 — User-facing entry points** - Omnibar: "Fix Autonomously" session type (all 7 touchpoints: type union, OmnibarCreationPanel, OmnibarContext sessionTypeMap, useSessionService, OmnibarAction `auto_fix` union variant, dispatch case, dispatch tests) - Backlog: "Run Autonomously" button on items in `ready` status (SpawnSessionFromItem proto `autonomous=true` flag, backlog service handler) **E5 — LLM-assisted approval** - `FeatureKeyAutonomousFix` + `FeatureKeyAutonomousApproval` in headless features - `ApprovalHandler.SetAutonomousChecker` + `SetHeadlessPool` injection points - When a risky tool call hits an autonomous session: headless LLM pool is queried (APPROVE:/DENY: response) before falling back to human review queue - Wired via closure in `server.go` (avoids construction-time circular dep) **E6 — Goal completion, artifacts, notifications, GitHub PR plugin** - `onAutonomousDriverComplete` transitions backlog item to done/failed, stores PR URL, sends push notification - `backlog_plugin_github_prs.go`: new `github_prs` plugin fetching open PRs, tagging with `pr:review-requested`/`pr:ci-failing`, registered in default plugin registry **E7 — Feature registry** - `docs/registry/features/autonomous-fix.json` added All existing tests pass. New dispatch tests (24 total, 3 new). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Engineering blockers: - Goroutine leak: replace context.Background() with server lifecycleCtx in StartAutonomousDriverForInstance and CreateSession driver start; add SetLifecycleContext() wired in wireDepsIntoServer - Driver registry: add driverRegistry map + RWMutex to SessionService; register drivers on start, stop+deregister in DeleteSession and HibernateSession to eliminate use-after-delete hazard - Prompt injection: encode toolInput as JSON in buildApprovalQuery so raw command strings cannot embed APPROVE:/DENY: to hijack LLM decision UX blockers: - Autonomous badge: add autonomous_mode = 60 to Session proto; populate in InstanceToProto adapter; render "Auto" badge in SessionCard - Stop control: stopping/hibernating a session now stops the autonomous driver via the registry — existing pause/delete actions are sufficient PM gap: - Add measurable success metrics table to requirements.md (completion rate, time-to-PR, LLM approval utilization, goroutine leak rate, backlog conversion count) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code quality: - autonomous_driver.go: guard Start() against nil headlessPool (returns clear error instead of nil-deref panic in run()) - autonomous_driver.go: safe UUID slice for featureKey (guard short/empty UUID instead of panicking on sessionID[:8]) - autonomous_driver.go: cap waitForRateLimitClear with 4h maxRateLimitWait so an unrecoverable rate limit doesn't loop the driver indefinitely Security: - autonomous_driver.go: wrap goal and session tail in <goal>/</goal> and <session_output> XML delimiters in buildOrchestrationPrompt so that user-controlled PR body cannot spoof a NEXT_MESSAGE/DONE directive Ops: - session_service.go: deregister completed drivers in onAutonomousDriverComplete so the registry doesn't grow unbounded for long-lived servers - session_service.go: document intentional context.Background() in completion callback (must persist result even during concurrent shutdown) Tests: - autonomous_driver_test.go: TestAutonomousDriver_NilPool_Start (nil pool guard) - autonomous_driver_test.go: TestAutonomousDriver_ShortUUID (UUID < 8 chars) - autonomous_driver_test.go: TestBuildOrchestrationPrompt_GoalWrappedInDelimiters - approval_handler_integration_test.go: TestBuildApprovalQuery_PromptInjectionResistance UX: - SessionCard.tsx: add aria-label to autonomous badge for accessibility Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ver registry Covers: lifecycle context wiring, register/deregister semantics, delete-stops-driver, and completion-callback deregistration — the four behaviors identified by the Plan reviewer as missing from server/services coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g, UX hints, PM doc Engineering: - approval_handler.go: fill empty sessionTail in autonomous LLM approval query by fetching session preview via queueChecker; LLM now has context to assess tool safety - autonomous_driver.go: log warning when session does not become idle after a turn (return value of waitForIdle was silently ignored) UX: - OmnibarCreationPanel: clarify autonomous hint to mention delete/hibernate stops the run - SessionCard: add role="status" to Auto badge for screen reader announcements PM: - requirements.md: add Target Users personas, Risky Assumptions, and Observability & SLA Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Label "Fix Autonomously" → "Fix Autonomously (Beta)" to signal experimental status - Hint text now explicitly states LLM reviewer decides tool permissions and that a completion notification fires Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- session_driver.go: parseJSONField now uses escape-aware rune scan instead of naive strings.Index, correctly handles \" \\ \n \t \r and other escape sequences - instance_claude.go: GetConversationUUID now holds stateMutex.RLock to prevent data race with concurrent SetClaudeConversationUUID calls - instance_claude.go: SetClaudeConversationUUID is idempotent — no-op and no callback if the UUID hasn't changed, preventing spurious storage saves on restart Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts: # docs/registry/features/backend/approval/delete-rule.json # docs/registry/features/backend/approval/get-analytics.json # docs/registry/features/backend/approval/list-pending.json # docs/registry/features/backend/approval/list-rules.json # docs/registry/features/backend/approval/resolve.json # docs/registry/features/backend/approval/upsert-rule.json # gen/proto/go/session/v1/backlog.pb.go # gen/proto/go/session/v1/session.pb.go # gen/proto/go/session/v1/sessionv1connect/session.connect.go # gen/proto/go/session/v1/types.pb.go # proto/session/v1/session.proto # server/services/session_service.go # session/instance.go # session/instance_claude.go # session/session_driver.go
✅ Registry ValidationTest Coverage: 93/132 features have
|
Go Benchmarks (Tier 1) |
E2E RPC Latency |
UX Analysis
|
Frontend Terminal Throughput |
🎬 E2E Feature Demos2 shard(s) recorded feature flows for this PR. recordings shard 1 Demo preview opens directly in browser (single-file HTML). Raw WebM recordings in ZIP. Expires after 30 days. |
Summary
Complete end-to-end implementation for pointing Stapler Squad at a GitHub issue/PR and having it fix the issue autonomously.
session/autonomous_driver.go— NewAutonomousDrivergoroutine: waits forClaudeControlleridle state, calls headless LLM pool to decideNEXT_MESSAGEvsDONE, injects viaSendCommandImmediate. Idempotency guard + panic recovery + 20-turn cap.ExtractPRURLscans last 200 lines for newly-created PR URL.session/claude_controller.go— Fan-out[]StatusChangeListenerslice replaces single slot so both session driver and AutonomousDriver can coexist.SetStatusChangeListenerkept as compatibility shim.bool autonomous_mode = 23onCreateSessionRequest;InstanceOptions.AutonomousModethreaded through constructor;CreateSessionstarts the driver when flag is set.sessionTypeunion,OmnibarCreationPanelentry,OmnibarContextsessionTypeMap,useSessionServiceRPC fields,OmnibarAction auto_fixunion variant, dispatch case, dispatch tests.SpawnSessionFromItemRequest.autonomousproto flag; handler wiresAutonomousMode: true, PermissionMode: "auto"; UI button onreadyitems.FeatureKeyAutonomousApprovalin headless features;ApprovalHandler.SetAutonomousChecker+SetHeadlessPool; risky tool calls on autonomous sessions go to headless LLM (APPROVE:/DENY:) before falling back to human review queue. Wired via closure inserver.go.onAutonomousDriverCompletetransitions backlog item todone/failed, stores PR URL, sends push notification.backlog_plugin_github_prs.gofetches open PRs, tags withpr:review-requested/pr:ci-failing, registered in default plugin registry.Demo harness
TestAutonomousDriver_*insession/autonomous_driver_test.gousesFakeHeadlessPool+ fake controller — zero real credentials needed. Proves the full loop: spawn session → driver detects idle → injects message → detects completion → fires callback.Test plan
TestAutonomousDriver_MaxTurnsLimit— exits aftermaxTurnsinjectionsTestAutonomousDriver_DoneSignal— exits immediately onDONE:responseTestAutonomousDriver_IdempotencyGuard— secondStart()is a no-opTestAutonomousDriver_StatusChannelSignal— listener fires channel without blockingTestAutonomousDriver_PanicRecovery— panicking LLM pool is recoveredTestAutonomousDriver_Stop_CancelsLoop—Stop()cancels in-flight runTestExtractPRURL_*— PR URL regex (4 cases)TestParseOrchestrationResponse_*— response parsing (3 cases)TestClaudeController_StatusChangeListener_*— fan-out refactor doesn't break existing testsauto_fixandcreate_session (autonomous)casesgo build .— cleango test ./session ./server/services/...— all pass🤖 Generated with Claude Code