This repository was archived by the owner on Jun 8, 2026. It is now read-only.
Add a supervisor control surface over the always-on reflection-3 judge — porting the essence of Claude Code's /goal (an independent evaluator that keeps the agent working until a condition holds), which the existing OpenCode goal plugins lack (sentinel-based / todo-driven).
reflection-3.ts already implements the hard part: session.idle → independent judge (throwaway session) → continuation via promptAsync. This adds three things on top:
Configurable rubric — move the judge's hardcoded patterns/antipatterns (currently duplicated inline in two builders) into a single user-editable rubric.md (## Patterns / ## Antipatterns), with an embedded default so the single-file install keeps working.
/supervisor:retry <n> — make the retry budget configurable; raise default 3 → 16.
/supervisor:goal … — session-scoped goal; completion requires the condition AND all applicable workflow gates; auto-clears on achieve; resumes active (CC-faithful).
Design decisions captured in the spec: completion = "goal AND applicable gates"; resume = active; retry scope = all reflection; rubric = one file, two sections.
Spec: docs/superpowers/specs/2026-06-03-supervisor-mode-design.md
Plan: docs/superpowers/plans/2026-06-03-supervisor-mode.md (both on branch feat/supervisor-mode)
Phase 0 — Spikes (resolve OpenCode API unknowns)
These gate the command-capture code only. Rubric/retry/store phases do not depend on them and can proceed in parallel.
Task 0.1: Spike command namespacing + arg capture
Step 1: Create a throwaway probe plugin /tmp/probe/probe.ts that logs every event.type and full event.properties to a file, and registers a command.executed log.
Step 2: Add commands two ways and see which produces /supervisor:goal: (a) .opencode/command/supervisor/goal.md; (b) opencode.json command["supervisor:goal"]. Run opencode and invoke each.
Step 3: Record in the issue: does command.executed carry {name, arguments} for the invoked command? Does a supervisor-namespaced command appear as /supervisor:goal?
Step 4: Decide capture mechanism: A) command.executed event (preferred, deterministic) or B) control-marker in the command template parsed from the user message. Document the choice + payload shape in the issue before Phase 4.
Acceptance: issue comment states the namespacing mechanism, the command.executed payload (or its absence), and the chosen capture path with a concrete example.
Phase 1 — Configurable rubric (no API dependency)
Task 1.1: Extract inline antipatterns into DEFAULT_RUBRIC
Files: Modify reflection-3.ts (near :25); source text from :1140–:1143 (self-assessment) and :1400–:1403 (judge).
Step 1: Write failing test — test/supervisor.unit.test.ts:
```ts
import assert from "node:assert"
import { describe, it } from "@jest/globals"
import { DEFAULT_RUBRIC, parseRubric } from "../reflection-3.ts"

describe("rubric", () => {
  it("DEFAULT_RUBRIC has both sections and the mined antipatterns", () => {
    const r = parseRubric(DEFAULT_RUBRIC)
    assert.ok(r.patterns.length > 0, "patterns section present")
    assert.match(r.antipatterns, /PERMISSION-SEEKING/)
    assert.match(r.antipatterns, /STOPPED-WITH-TODOS/)
    assert.match(r.antipatterns, /FALSE-COMPLETE/)
  })
})
```
Step 2: Run, verify fail — npx jest test/supervisor.unit.test.ts -t rubric → FAIL (DEFAULT_RUBRIC/parseRubric not exported).
Step 3: Implement — add to reflection-3.ts:
```ts
export const DEFAULT_RUBRIC = `## Patterns
<verbatim positive-completion framing extracted from the requirement text in buildSelfAssessmentPrompt :1082-1092>

## Antipatterns
<verbatim PREMATURE-STOP ANTIPATTERNS block extracted from :1140-1143>
`

export function parseRubric(md: string): { patterns: string; antipatterns: string } {
  const section = (name: string) => {
    // The capture starts right after the heading line (before its newline), so an
    // empty section yields "" instead of swallowing the heading that follows it.
    const re = new RegExp(`##\\s+${name}[^\\n]*([\\s\\S]*?)(?=\\n##\\s|$)`, "i")
    return (md.match(re)?.[1] ?? "").trim()
  }
  return { patterns: section("Patterns"), antipatterns: section("Antipatterns") }
}
```
Copy the antipattern text verbatim from the two existing inline blocks (use the more complete :1140 wording).
Task 1.2: loadRubric(directory) with override precedence
Step 1: Failing test:
```ts
import { loadRubric } from "../reflection-3.ts"
import { mkdtempSync, writeFileSync, mkdirSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"

it("project .reflection/rubric.md overrides default", async () => {
  const dir = mkdtempSync(join(tmpdir(), "rub-"))
  mkdirSync(join(dir, ".reflection"), { recursive: true })
  writeFileSync(join(dir, ".reflection/rubric.md"), "## Patterns\nP\n## Antipatterns\nMY-RULE")
  const r = await loadRubric(dir)
  assert.strictEqual(r.source, "project")
  assert.match(r.antipatterns, /MY-RULE/)
})

it("falls back to default when no override / empty file", async () => {
  const dir = mkdtempSync(join(tmpdir(), "rub-"))
  const r = await loadRubric(dir)
  assert.strictEqual(r.source, "default")
  assert.match(r.antipatterns, /PERMISSION-SEEKING/)
})
```
Step 2: Run, verify fail.
Step 3: Implement loadRubric(directory): try ${directory}/.reflection/rubric.md (source project) → ~/.config/opencode/supervisor/rubric.md (source global) → DEFAULT_RUBRIC (source default). parseRubric each; if either section empty, fall through to default. Return { patterns, antipatterns, source }.
Step 4: Run, verify pass.
Step 5: Commit.
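The override precedence in Task 1.2 can be sketched as a pure function over already-read file contents, which keeps it testable without touching the filesystem. This is a minimal sketch, not the plugin's implementation: `pickRubric`, the `Rubric` type, and the stand-in default text are hypothetical, and it assumes a candidate with an empty section is skipped in favour of the next one in order. The section regex is a variant that lets an empty section parse as "".

```typescript
type RubricSource = "project" | "global" | "default";

interface Rubric {
  patterns: string;
  antipatterns: string;
  source: RubricSource;
}

// Stand-in for the embedded default; the real text is mined from reflection-3.ts.
const DEFAULT_RUBRIC =
  "## Patterns\nTask verified with evidence.\n\n## Antipatterns\nPERMISSION-SEEKING\n";

function parseRubric(md: string): { patterns: string; antipatterns: string } {
  const section = (name: string): string => {
    // Capture begins right after the heading line, so an empty section yields "".
    const re = new RegExp(`##\\s+${name}[^\\n]*([\\s\\S]*?)(?=\\n##\\s|$)`, "i");
    return (md.match(re)?.[1] ?? "").trim();
  };
  return { patterns: section("Patterns"), antipatterns: section("Antipatterns") };
}

// Candidates are ordered by precedence; text is null when the file is missing.
function pickRubric(
  candidates: Array<{ source: RubricSource; text: string | null }>,
): Rubric {
  for (const { source, text } of candidates) {
    if (text === null) continue;
    const parsed = parseRubric(text);
    // A candidate only wins when both sections are non-empty.
    if (parsed.patterns && parsed.antipatterns) return { ...parsed, source };
  }
  return { ...parseRubric(DEFAULT_RUBRIC), source: "default" };
}
```

Under these assumptions, loadRubric(directory) would read the project file and then the global file (mapping a missing file to null) and pass both to `pickRubric` in that order.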
Task 1.3: Wire loadRubric into both prompt builders
Files: reflection-3.ts buildSelfAssessmentPrompt :1053, analyzeSelfAssessmentWithLLM :1350, call site runReflection :1717.
Step 1: Failing test — buildSelfAssessmentPrompt accepts a rubric arg and interpolates rubric.antipatterns:
Step 3: Implement — add optional rubric param to both builders, replace the inline antipattern literals with ${rubric.antipatterns} / ${rubric.patterns}; default the param to parseRubric(DEFAULT_RUBRIC) for back-compat. In runReflection, call const rubric = await loadRubric(directory) once and thread it into both builders (and the judge path :1717).
Step 4: Run full unit suite + npm run typecheck; verify pass.
Step 5: Commit — feat(supervisor): load rubric from file with default fallback
Phase 2 — Configurable retry budget
Task 2.1: DEFAULT_MAX_ATTEMPTS = 16 + cap resolver
Step 3: Implement — rename const to DEFAULT_MAX_ATTEMPTS = 16; add resolveMaxAttempts({sessionOverride?, config?}) clamped to [1,100]. Read maxAttempts from reflection.yaml in the config loader.
Step 4: Replace the hardcoded MAX_ATTEMPTS use at :1927/:1929/:1080 with an effectiveMaxAttempts resolved per session (computed in runReflection, passed where needed).
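A minimal sketch of the resolver from Step 3, under two assumptions that the plan does not spell out: the session override takes precedence over the reflection.yaml value, and `config` is an object carrying an optional `maxAttempts` number. The constant names for the clamp bounds are illustrative.

```typescript
const DEFAULT_MAX_ATTEMPTS = 16;
const MIN_ATTEMPTS = 1; // lower clamp bound from the plan
const MAX_ATTEMPTS_CAP = 100; // upper clamp bound from the plan

interface ResolveInput {
  sessionOverride?: number; // e.g. set by /supervisor:retry <n>
  config?: { maxAttempts?: number }; // assumed shape of the parsed reflection.yaml
}

function resolveMaxAttempts({ sessionOverride, config }: ResolveInput = {}): number {
  // First finite number wins: session override, then config, then the default.
  const candidates = [sessionOverride, config?.maxAttempts];
  const chosen = candidates.find(
    (v): v is number => typeof v === "number" && Number.isFinite(v),
  );
  const value = Math.trunc(chosen ?? DEFAULT_MAX_ATTEMPTS);
  // Clamp into [1, 100] so a typo cannot disable or unbound the loop.
  return Math.min(MAX_ATTEMPTS_CAP, Math.max(MIN_ATTEMPTS, value));
}
```

Calling `resolveMaxAttempts()` with no arguments returns the default of 16, and out-of-range inputs such as 0 or 500 are pulled back to 1 and 100 respectively.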
Phase 4 — /supervisor:* command capture (after Phase 0)
Task 4.1: Ship the commands
Step 1: Per Phase-0 finding, create .opencode/command/supervisor/goal.md and retry.md (or opencode.json entries). Template carries $ARGUMENTS; if capture path B, prefix a control marker (e.g. <!--supervisor:goal-->).
Step 2: Document install in README.
Step 3: Commit — feat(supervisor): add /supervisor:goal and /supervisor:retry commands
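If the Phase 0 spike selects capture path B, the handler has to recover the command from the rendered user message. The sketch below shows one way to do that; `extractMarkedCommand` and `MarkedCommand` are hypothetical names, and it assumes the template renders as the control marker followed by the expanded $ARGUMENTS.

```typescript
interface MarkedCommand {
  name: string; // e.g. "goal" or "retry"
  args: string; // trimmed text after the marker
}

// Matches "<!--supervisor:NAME-->" and captures everything after it.
const MARKER = /<!--\s*supervisor:([a-z]+)\s*-->([\s\S]*)/i;

function extractMarkedCommand(message: string): MarkedCommand | null {
  const m = MARKER.exec(message);
  if (!m) return null; // ordinary user message, no supervisor command
  return { name: m[1].toLowerCase(), args: m[2].trim() };
}
```

A message without the marker yields null, so the event handler can ignore ordinary prompts cheaply.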
Task 4.2: Capture handler
Step 3: Implement parseSupervisorCommand(name, args); then in the event handler, on the captured command (path A: command.executed; path B: scan latest user message for the marker), call supervisorStore.setGoal/clearGoal/setRetry and showToast the status. Condition clamped to 4000 chars.
Step 4: Run, verify pass. Commit — feat(supervisor): capture /supervisor commands into store
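The parser in Task 4.2 could look roughly like the sketch below. The result type `SupervisorCommand` and its `kind` values are hypothetical, as is the choice to treat an empty goal as invalid; the clear aliases (stop|off|reset|none|cancel) and the 4000-character clamp come from the plan. Range clamping of the retry value is left to resolveMaxAttempts.

```typescript
type SupervisorCommand =
  | { kind: "setGoal"; condition: string }
  | { kind: "clearGoal" }
  | { kind: "setRetry"; maxAttempts: number }
  | { kind: "invalid"; reason: string };

const MAX_CONDITION_CHARS = 4000;
const CLEAR_ALIASES = new Set(["stop", "off", "reset", "none", "cancel"]);

function parseSupervisorCommand(name: string, args: string): SupervisorCommand {
  // Accept "supervisor:goal", "/supervisor:goal", or the bare "goal".
  const command = name.replace(/^\/?supervisor:/, "").toLowerCase();
  const text = args.trim();

  if (command === "goal") {
    if (text === "") return { kind: "invalid", reason: "goal needs a condition" };
    if (CLEAR_ALIASES.has(text.toLowerCase())) return { kind: "clearGoal" };
    return { kind: "setGoal", condition: text.slice(0, MAX_CONDITION_CHARS) };
  }
  if (command === "retry") {
    if (!/^\d+$/.test(text)) return { kind: "invalid", reason: "retry needs a whole number" };
    return { kind: "setRetry", maxAttempts: Number(text) };
  }
  return { kind: "invalid", reason: `unknown command: ${name}` };
}
```

Returning a tagged union keeps the event handler to a single switch that maps each `kind` onto supervisorStore.setGoal, clearGoal, or setRetry.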
Phase 5 — Goal loop integration
Task 5.1: buildGoalRequirementSection
Step 1: Failing test:
```ts
import { buildGoalRequirementSection } from "../reflection-3.ts"

const s = buildGoalRequirementSection("all tests in test/auth pass")
assert.match(s, /MANDATORY/i)
assert.match(s, /all tests in test\/auth pass/)
assert.match(s, /evidence/i) // reinforces no-false-complete
```
Step 2: Run, verify fail.
Step 3: Implement — returns a prompt fragment marking the condition as a mandatory completion requirement, restating that claims need transcript evidence. Appended after rubric.antipatterns in both builders when a goal is active.
Step 4: Run, verify pass. Commit.
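One possible shape for the prompt fragment in Task 5.1, written to satisfy the three assertions in the failing test. The exact wording is illustrative only; the plan fixes just the requirements that the section marks the condition as mandatory, includes it verbatim, and restates the evidence rule.

```typescript
function buildGoalRequirementSection(condition: string): string {
  return [
    "## MANDATORY GOAL (set via /supervisor:goal)",
    "The task is complete only if this condition holds AND every applicable workflow gate passes:",
    "",
    condition.trim(),
    "",
    "A claim that the condition holds counts only with evidence in the transcript",
    "(for example command output or test results). Without that evidence, report the task as incomplete.",
  ].join("\n");
}
```

Because the function is pure, both prompt builders can append its result after rubric.antipatterns without sharing any state.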
Task 5.2: Integrate into runReflection
Files: reflection-3.ts runReflection :1667, budget gate before judge, completion + continuation at :1925–:1976.
Step 1: Failing integration test — test/supervisor.integration.test.ts with a mocked client (mirror test/reflection.test.ts mock): a session with an active goal whose judge verdict is complete:false triggers client.session.promptAsync; verdict complete:true sets goal status:"achieved" (assert via supervisorStore.load) and injects no continuation; attempts >= effectiveMaxAttempts sets status:"exhausted" and injects nothing.
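The three outcomes asserted by the integration test can be modelled as a small state transition, sketched below. `stepGoal`, `GoalStep`, and the trimmed-down `GoalState` are hypothetical; the judge is a synchronous stand-in for the real asynchronous call, and the token-budget check is omitted.

```typescript
type GoalStatus = "active" | "achieved" | "exhausted";

interface GoalState {
  condition: string;
  status: GoalStatus;
  attempts: number;
  deadline: number; // epoch milliseconds
}

interface GoalStep {
  goal: GoalState; // state to persist
  injectContinuation: boolean; // true only when the agent should keep working
}

function stepGoal(
  goal: GoalState,
  effectiveMaxAttempts: number,
  now: number,
  judge: () => boolean, // stand-in for the judge call; returns the verdict's complete flag
): GoalStep {
  // Budget gate first, so no judge call is spent once the budget is gone.
  if (goal.attempts >= effectiveMaxAttempts || now >= goal.deadline) {
    return { goal: { ...goal, status: "exhausted" }, injectContinuation: false };
  }
  if (judge()) {
    return { goal: { ...goal, status: "achieved" }, injectContinuation: false };
  }
  // Incomplete: count the attempt; the caller reuses its promptAsync block.
  return { goal: { ...goal, attempts: goal.attempts + 1 }, injectContinuation: true };
}
```

Keeping the transition pure means the mocked-client test only has to assert that promptAsync is called exactly when `injectContinuation` is true.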
Phase 6 — Evals & docs
Task 6.1: Verification-theater fixtures
Add promptfoo cases to evals/ (or a new evals/supervisor-goal.yaml): (a) condition met with test evidence → judge complete:true; (b) bare "done" claim, no evidence → complete:false; (c) editing the ## Antipatterns section of a fixture rubric.md flips the verdict.
Run npm run eval:judge (or the new config); record pass rate in the issue. Commit.
Task 6.2: README
Document /supervisor:goal, /supervisor:retry, rubric override (rubric.md resolution order), resume behavior, and the anthropic-provider recommendation for long unattended runs. Commit.
Open items intentionally deferred to Phase 0 spike (not placeholders): exact command namespacing + command.executed payload. All other steps are concrete and code-complete.
Task 1.1: Extract inline antipatterns into DEFAULT_RUBRIC
Commit — git commit -m "feat(supervisor): extract default rubric into embedded constant"
Task 2.1: DEFAULT_MAX_ATTEMPTS = 16 + cap resolver
Files: reflection-3.ts :25 (MAX_ATTEMPTS), reflection.yaml loader (loadRoutingConfig :765 area).
Commit — feat(supervisor): make retry budget configurable, default 16
Phase 3 — supervisorStore (per-session state)
Task 3.1: Store round-trip
Files: reflection-3.ts; state at ${directory}/.reflection/supervisor/<sid>.json.
supervisorStore object: load, save, setGoal (init {condition, status:"active", attempts:0, tokenBaseline:0, startedAt:Date.now(), deadline:Date.now()+maxDurationMs, lastReason:""}), clearGoal, setRetry, list. Files 0600; corrupt JSON → {}. Mkdir .reflection/supervisor on save.
Commit — feat(supervisor): per-session goal+retry store
Task 4.2: Capture handler
Files: reflection-3.ts event handler (:1990), parser parseSupervisorCommand.
Aliases for clear: stop|off|reset|none|cancel.
Task 5.2: Integrate into runReflection
In runReflection: load supervisorState; effectiveMaxAttempts = resolveMaxAttempts({sessionOverride: state.maxAttempts, config}).
If state.goal?.status === "active": budget gate first — if goal.attempts >= effectiveMaxAttempts || tokens/deadline exceeded → set status:"exhausted", save, toast, return.
Thread buildGoalRequirementSection(goal.condition) into the prompt builders (Task 5.1).
On analysis.complete with active goal → set status:"achieved", save, ✓ toast, return (no continuation).
Otherwise increment goal.attempts alongside the existing attempts map and persist; reuse existing promptAsync block.
Run npm test + typecheck; verify pass.
Commit — feat(supervisor): goal loop — gates AND condition, auto-clear on achieve
Task 5.3: Resume active
An active goal loaded fresh stays active with attempts reset to 0 (unless supervisorResumePaused).
supervisorResumePaused (default false). Run, verify, commit.
Self-review (spec coverage)
… (/supervisor:retry) → Phase 2 + 4 ✓.
… (/supervisor:goal, gates AND condition, auto-clear, resume active) → Phases 3–5 ✓.
… promptAsync at :1957 ✓.