Supervisor controls for reflection-3: configurable rubric, retry budget, session goals (engine) by dvashchuk · Pull Request #144 · dzianisv/opencode-plugins

dvashchuk · 2026-06-03T16:53:28Z

Supervisor controls for `reflection-3` (engine + state)

Adds a configurable supervisor layer over the always-on reflection judge — porting the essence of Claude Code's /goal (an independent evaluator that keeps the agent working until a condition holds), reusing the plugin's existing session.idle → independent judge → continue loop.

Closes part of #143. See docs/superpowers/specs/2026-06-03-supervisor-mode-design.md and docs/superpowers/plans/2026-06-03-supervisor-mode.md.

What's implemented (engine + state, fully tested)

- Configurable rubric: judge Patterns/Antipatterns extracted from two divergent inline copies into one embedded DEFAULT_RUBRIC, overridable via a single rubric.md (project → global → embedded default). loadRubric wired into both prompt builders.
- Configurable retry budget: default raised 3 → 16; resolveMaxAttempts (clamped 1–100) + reflection.yaml maxAttempts; effective cap reflected in prompt strings.
- supervisorStore: per-session goal+retry state at .reflection/supervisor/<sid>.json (0600, path-traversal-guarded, corrupt-safe).
- Session goals: parseSupervisorCommand, buildGoalRequirementSection, and decideGoalTransition integrated into runReflection: an active goal augments the rubric with a mandatory completion requirement (bypassing file/tool prompt precedence), completion requires applicable gates AND the condition, achieved goals auto-clear, exhausted goals pause. Budget is burned only when a continuation is actually injected.
- README documents the features (honestly marking the command surface as being finalized).
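To make the supervisorStore guarantees concrete, here is a minimal sketch of a path-traversal guard and a corrupt-safe parse. It is not the code in this PR: the function names, the session-id character set, and the `SupervisorState` shape are all assumptions chosen for illustration. The 0600 permission is not shown; in Node it would presumably be applied by passing `mode: 0o600` when writing the file.

```typescript
// Hypothetical state shape; the PR only says the file holds "goal+retry state".
interface SupervisorState {
  goal: string | null;
  attemptsUsed: number;
}

// Assumed allow-list: anything outside it (slashes, dots, ..) is rejected outright.
const SAFE_SESSION_ID = /^[A-Za-z0-9_-]+$/;

// Build the per-session state path relative to the project root.
function supervisorStateRelPath(sessionId: string): string {
  if (!SAFE_SESSION_ID.test(sessionId)) {
    throw new Error(`invalid session id: ${JSON.stringify(sessionId)}`);
  }
  return `.reflection/supervisor/${sessionId}.json`;
}

// Corrupt-safe read: malformed JSON or an unexpected shape yields null
// instead of throwing, so a damaged file cannot crash the reflection loop.
function parseSupervisorState(raw: string): SupervisorState | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const rec = data as Record<string, unknown>;
    const g = rec["goal"];
    const goal = typeof g === "string" ? g : g === null ? null : undefined;
    if (goal === undefined) return null;
    const n = rec["attemptsUsed"];
    if (typeof n !== "number" || !Number.isInteger(n) || n < 0) return null;
    return { goal, attemptsUsed: n };
  } catch {
    return null;
  }
}
```

An allow-list on the id is stricter than resolving the path and checking its parent directory, but it needs no filesystem access and is easy to unit test.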

Tests: 233 passing across supervisor.unit, supervisor.integration, reflection-3.unit, reflection.test. Each task went through spec + code-quality review; a final whole-branch review flagged the budget-accounting bug now fixed in 2c7b165.

Why draft — remaining work (needs a running OpenCode)

The reflection engine drives goals correctly and is fully tested, but a user can't yet type /supervisor:goal. The remaining pieces need a live OpenCode instance to resolve the plugin command API honestly (rather than guessing):

- Spike: how supervisor:-namespaced commands map + whether command.executed carries args (vs a control-marker fallback).
- Ship .opencode/command/supervisor/{goal,retry}.md (or opencode.json entries) + wire the event handler to capture commands into supervisorStore.
- Resume-active (5.3): refresh deadline / reset counters on session resume so a goal set before a pause doesn't read as instantly "exhausted" (absolute deadline); add supervisorResumePaused.
- Token/time budget knobs (goalMaxTokens / goalMaxDurationMs): currently goals use a fixed 30-min timeout; README updated to not over-claim.
- promptfoo verification-theater eval fixtures.

🤖 Generated with Claude Code

Extract the judge's positive completion rules (Patterns) and mined premature-stop rules (Antipatterns) into an embedded DEFAULT_RUBRIC, overridable via .reflection/rubric.md (project) or ~/.config/opencode/supervisor/rubric.md (global). Falls back to the default if an override is missing either section. Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
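The override order described above can be sketched as a pure function over the file contents. This is an illustration under stated assumptions, not the PR's `loadRubric`: the name `resolveRubric`, the heading format used to detect the two sections, and the placeholder default text are hypothetical. The sketch also assumes that the first override file that exists is the one validated, and that an incomplete override goes straight to the embedded default; the commit message does not say whether the global file is tried next.

```typescript
// Placeholder for the embedded default; the real DEFAULT_RUBRIC text is not shown in the PR page.
const DEFAULT_RUBRIC =
  "## Patterns\n- (embedded default rules)\n\n## Antipatterns\n- (embedded default rules)\n";

// Assumed detection rule: each section starts with a markdown heading line.
function hasBothSections(text: string): boolean {
  const patterns = /^#+\s*Patterns\b/im.test(text);
  const antipatterns = /^#+\s*Antipatterns\b/im.test(text);
  return patterns && antipatterns;
}

// projectText / globalText are the contents of the project and global
// rubric.md files, or null when the file does not exist.
function resolveRubric(projectText: string | null, globalText: string | null): string {
  const override = projectText ?? globalText; // project takes priority over global
  if (override !== null && hasBothSections(override)) return override;
  return DEFAULT_RUBRIC; // missing file or missing section: embedded default
}
```

Keeping the resolution logic free of file I/O means the fallback rules can be unit tested with plain strings.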

Renames MAX_ATTEMPTS→DEFAULT_MAX_ATTEMPTS (3→16), exports pure resolveMaxAttempts() with sessionOverride>config>default priority and [1,100] clamping, and adds loadConfiguredMaxAttempts() to read maxAttempts: from ~/.config/opencode/reflection.yaml. runReflection now computes effectiveMaxAttempts at runtime. Tests and test-helpers updated to match new defaults (3/16, 2/16 display strings). Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
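The priority and clamping rule can be illustrated with a small pure function. The commit message confirms the name `resolveMaxAttempts`, the default of 16, the sessionOverride > config > default order, and the [1, 100] range; the parameter list, the handling of non-finite values, and the truncation of fractions below are assumptions.

```typescript
const DEFAULT_MAX_ATTEMPTS = 16;
const MIN_ATTEMPTS = 1;   // lower clamp bound from the commit message
const ATTEMPTS_CAP = 100; // upper clamp bound from the commit message

// Assumption: undefined, NaN, and Infinity all count as "not set".
function isUsable(value: number | undefined): value is number {
  return typeof value === "number" && Number.isFinite(value);
}

// Assumed signature: both inputs optional, highest-priority usable value wins.
function resolveMaxAttempts(sessionOverride?: number, configured?: number): number {
  const chosen = isUsable(sessionOverride)
    ? sessionOverride
    : isUsable(configured)
      ? configured
      : DEFAULT_MAX_ATTEMPTS;
  // Truncate fractions (an assumption), then clamp into [1, 100].
  return Math.min(ATTEMPTS_CAP, Math.max(MIN_ATTEMPTS, Math.trunc(chosen)));
}
```

For example, `resolveMaxAttempts(undefined, 5)` yields 5, while `resolveMaxAttempts(500, 5)` is clamped to 100 because the session override takes priority before clamping.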

Goal-active sessions augment the judge rubric with a mandatory completion requirement (bypassing file/tool prompt precedence so it composes with the gates), and a pure decideGoalTransition drives budget-exhaustion / achieved / continue. Completion = applicable gates AND condition met; achieved goals auto-clear, exhausted goals pause without continuation. Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
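A rough sketch of the transition logic described above follows. Only the name `decideGoalTransition`, the three outcomes, and the "gates AND condition" rule come from the PR; the input shape, the field names, the order in which the checks run, and the helper `nextAttemptsUsed` are hypothetical.

```typescript
type GoalTransition = "achieved" | "exhausted" | "continue";

// Hypothetical input shape for one judge verdict on a goal-active session.
interface GoalCheck {
  gatesPassed: boolean;  // applicable gates all passed
  conditionMet: boolean; // judge says the goal condition holds
  attemptsUsed: number;
  maxAttempts: number;
  nowMs: number;
  deadlineMs: number;    // absolute deadline, as mentioned in the PR description
}

function decideGoalTransition(c: GoalCheck): GoalTransition {
  // Completion requires applicable gates AND the condition.
  if (c.gatesPassed && c.conditionMet) return "achieved";
  // Assumption: either budget running out pauses the goal.
  if (c.attemptsUsed >= c.maxAttempts || c.nowMs >= c.deadlineMs) return "exhausted";
  return "continue";
}

// Budget is burned only when a continuation is actually injected.
function nextAttemptsUsed(
  attemptsUsed: number,
  transition: GoalTransition,
  continuationInjected: boolean,
): number {
  return transition === "continue" && continuationInjected ? attemptsUsed + 1 : attemptsUsed;
}
```

Checking for achievement before exhaustion means a goal met on the final attempt is still reported as achieved; whether the plugin orders the checks this way is not stated in the source.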

dzianisv and others added 14 commits June 3, 2026 01:07

docs(supervisor): add design spec and implementation plan

6b4a88b

Spec + plan for /supervisor:goal, /supervisor:retry, and a configurable rubric layered on the reflection-3 judge loop. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(supervisor): load rubric from file in both prompt builders

c12b0ff

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

test(supervisor): de-dupe buildSelfAssessmentPrompt helper + cover judge-prompt rubric

f375eb5

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(supervisor): show effective retry cap in prompts; strict maxAttempts parse

28c9bbb

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(supervisor): per-session goal+retry store

adcb18a

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(supervisor): enforce 0600 on store update; guard sessionId path traversal

1f6be47

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(supervisor): parseSupervisorCommand parser

447dd89

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(supervisor): buildGoalRequirementSection

9afb0d7

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

polish(supervisor): guard blank goal section, single achieve toast, clarify dual-counter

9398602

Addresses code-review minors on goal-loop integration. Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(supervisor): document goal, retry, and configurable rubric

f3dd970

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(supervisor): only burn goal budget when a continuation is actually injected

2c7b165

Refs #143 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

dvashchuk mentioned this pull request Jun 3, 2026

Supervisor controls for reflection-3: /supervisor:goal, /supervisor:retry, configurable rubric #143

Open

49 tasks

dzianisv closed this Jun 8, 2026

dzianisv deleted the feat/supervisor-mode branch June 8, 2026 09:06

dvashchuk wants to merge 14 commits into main from feat/supervisor-mode