This repository was archived by the owner on Jun 8, 2026. It is now read-only.

Supervisor controls for reflection-3: configurable rubric, retry budget, session goals (engine) #144

Closed
dvashchuk wants to merge 14 commits into main from feat/supervisor-mode

@dvashchuk

Collaborator

Supervisor controls for reflection-3 (engine + state)

Adds a configurable supervisor layer over the always-on reflection judge — porting the essence of Claude Code's /goal (an independent evaluator that keeps the agent working until a condition holds), reusing the plugin's existing session.idle → independent judge → continue loop.

Closes part of #143. See docs/superpowers/specs/2026-06-03-supervisor-mode-design.md and docs/superpowers/plans/2026-06-03-supervisor-mode.md.

What's implemented (engine + state, fully tested)

  • Configurable rubric — judge Patterns/Antipatterns extracted from two divergent inline copies into one embedded DEFAULT_RUBRIC, overridable via a single rubric.md (project → global → embedded default). loadRubric wired into both prompt builders.
  • Configurable retry budget — default raised 3 → 16; resolveMaxAttempts (clamped 1–100) + reflection.yaml maxAttempts; effective cap reflected in prompt strings.
  • supervisorStore — per-session goal+retry state at .reflection/supervisor/<sid>.json (0600, path-traversal-guarded, corrupt-safe).
  • Session goals: parseSupervisorCommand, buildGoalRequirementSection, and decideGoalTransition are integrated into runReflection. An active goal augments the rubric with a mandatory completion requirement (bypassing file/tool prompt precedence). Completion requires the applicable gates AND the condition; achieved goals auto-clear, and exhausted goals pause. Budget is burned only when a continuation is actually injected.
  • README documents the features (honestly marking the command surface as being finalized).
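
For reviewers, here is a minimal sketch of what a store with the properties listed for supervisorStore (0600 file mode, path-traversal guard, corrupt-safe reads) could look like. The SupervisorState fields, the helper names statePath/saveState/loadState, and the session-ID allowlist are assumptions made for illustration; this is not the code on the branch.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical state shape; the PR does not list the stored fields.
interface SupervisorState {
  goal?: string;
  attemptsUsed: number;
}

// Path-traversal guard (assumed rule): only plain ID characters are allowed,
// so "../x" or "a/b" can never escape .reflection/supervisor/.
function statePath(root: string, sessionId: string): string {
  if (!/^[A-Za-z0-9_-]+$/.test(sessionId)) {
    throw new Error(`invalid session id: ${sessionId}`);
  }
  return path.join(root, ".reflection", "supervisor", `${sessionId}.json`);
}

function saveState(root: string, sessionId: string, state: SupervisorState): void {
  const file = statePath(root, sessionId);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(state), { mode: 0o600 });
  // The mode option only applies when the file is created, so re-apply it on overwrite.
  fs.chmodSync(file, 0o600);
}

// Corrupt-safe read: a missing, unparsable, or malformed file yields null.
function loadState(root: string, sessionId: string): SupervisorState | null {
  const file = statePath(root, sessionId); // still throws on an unsafe ID
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(file, "utf8"));
    if (typeof parsed !== "object" || parsed === null) return null;
    const state = parsed as Partial<SupervisorState>;
    if (typeof state.attemptsUsed !== "number") return null;
    return state as SupervisorState;
  } catch {
    return null;
  }
}
```

Returning null for unreadable state lets the caller treat a damaged file the same as "no goal set" rather than crashing the reflection loop, which is one plausible reading of "corrupt-safe".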

Tests: 233 passing across supervisor.unit, supervisor.integration, reflection-3.unit, reflection.test. Each task went through spec + code-quality review; a final whole-branch review flagged the budget-accounting bug now fixed in 2c7b165.

Why draft — remaining work (needs a running OpenCode)

The reflection engine drives goals correctly and is fully tested, but a user can't yet type /supervisor:goal. The remaining pieces need a live OpenCode instance to resolve the plugin command API honestly (rather than guessing):

  • Spike: how supervisor:-namespaced commands map, and whether command.executed carries args (vs. a control-marker fallback).
  • Ship .opencode/command/supervisor/{goal,retry}.md (or opencode.json entries) + wire the event handler to capture commands into supervisorStore.
  • Resume-active (5.3) — refresh deadline / reset counters on session resume so a goal set before a pause doesn't read as instantly "exhausted" (absolute deadline); add supervisorResumePaused.
  • Token/time budget knobs (goalMaxTokens / goalMaxDurationMs) — currently goals use a fixed 30-min timeout; README updated to not over-claim.
  • promptfoo verification-theater eval fixtures.

🤖 Generated with Claude Code

dzianisv and others added 14 commits June 3, 2026 01:07
Spec + plan for /supervisor:goal, /supervisor:retry, and a configurable
rubric layered on the reflection-3 judge loop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extract the judge's positive completion rules (Patterns) and mined
premature-stop rules (Antipatterns) into an embedded DEFAULT_RUBRIC,
overridable via .reflection/rubric.md (project) or
~/.config/opencode/supervisor/rubric.md (global). Falls back to the
default if an override is missing either section.

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
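
The lookup order described in the commit message above (project override, then global override, then embedded default) can be sketched roughly as follows. This is an illustration under stated assumptions: the placeholder DEFAULT_RUBRIC text, the helpers readIfPresent and hasBothSections, the heading-based section check, and the loadRubric signature are all hypothetical. It also assumes that the first file found wins and that an override missing either section yields the default rather than falling through to the next location.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Placeholder text; the real DEFAULT_RUBRIC content is not shown in this PR.
const DEFAULT_RUBRIC =
  "## Patterns\n- (default rules)\n\n## Antipatterns\n- (default rules)\n";

// Assumption: a usable rubric has a heading for each of the two sections.
function hasBothSections(text: string): boolean {
  return /^#+\s*Patterns\b/im.test(text) && /^#+\s*Antipatterns\b/im.test(text);
}

// Returns the file contents, or null if the file is missing or unreadable.
function readIfPresent(file: string): string | null {
  try {
    return fs.readFileSync(file, "utf8");
  } catch {
    return null;
  }
}

function loadRubric(projectDir: string, homeDir: string = os.homedir()): string {
  const candidates = [
    path.join(projectDir, ".reflection", "rubric.md"), // project override
    path.join(homeDir, ".config", "opencode", "supervisor", "rubric.md"), // global override
  ];
  for (const file of candidates) {
    const text = readIfPresent(file);
    if (text === null) continue; // nothing here: try the next location
    // First override found wins; an incomplete one yields the embedded default.
    return hasBothSections(text) ? text : DEFAULT_RUBRIC;
  }
  return DEFAULT_RUBRIC;
}
```

Taking homeDir as a parameter keeps the function easy to exercise against a temporary directory in unit tests.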
Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dge-prompt rubric

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Renames MAX_ATTEMPTS→DEFAULT_MAX_ATTEMPTS (3→16), exports pure
resolveMaxAttempts() with sessionOverride>config>default priority
and [1,100] clamping, and adds loadConfiguredMaxAttempts() to read
maxAttempts: from ~/.config/opencode/reflection.yaml. runReflection
now computes effectiveMaxAttempts at runtime. Tests and test-helpers
updated to match new defaults (3/16, 2/16 display strings).

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
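
As a rough illustration of the sessionOverride > config > default priority and the [1,100] clamp described in the commit above, a pure resolver might look like the sketch below. The parameter list, the MIN_ATTEMPTS and ATTEMPTS_CEILING names, and the handling of non-finite or fractional input are assumptions; only DEFAULT_MAX_ATTEMPTS = 16 and the clamp range come from the PR.

```typescript
const DEFAULT_MAX_ATTEMPTS = 16;
const MIN_ATTEMPTS = 1; // hypothetical name for the lower clamp bound
const ATTEMPTS_CEILING = 100; // hypothetical name for the upper clamp bound

// Pure function: the first usable value in priority order wins, then it is clamped.
function resolveMaxAttempts(sessionOverride?: number, configured?: number): number {
  for (const value of [sessionOverride, configured]) {
    // Assumption: undefined, NaN, and infinite values are skipped, not clamped.
    if (typeof value === "number" && Number.isFinite(value)) {
      // Assumption: fractional values are truncated before clamping.
      return Math.min(ATTEMPTS_CEILING, Math.max(MIN_ATTEMPTS, Math.trunc(value)));
    }
  }
  return DEFAULT_MAX_ATTEMPTS;
}
```

Under these assumptions, resolveMaxAttempts(undefined, 5) yields 5, resolveMaxAttempts(8, 5) yields 8, and resolveMaxAttempts(500) yields 100.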
…pts parse

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…raversal

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Goal-active sessions augment the judge rubric with a mandatory completion
requirement (bypassing file/tool prompt precedence so it composes with the
gates), and a pure decideGoalTransition drives budget-exhaustion / achieved
/ continue. Completion = applicable gates AND condition met; achieved goals
auto-clear, exhausted goals pause without continuation.

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
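
The transition rules in the commit above (achieved, exhausted, continue) can be expressed as a small pure function. The sketch below is illustrative only: the GoalCheck fields, the GoalTransition type, the attemptsAfter helper, and the choice to test for completion before exhaustion are assumptions, not the branch's actual decideGoalTransition.

```typescript
type GoalTransition = "achieved" | "exhausted" | "continue";

// Hypothetical input shape for one judge pass over a goal-active session.
interface GoalCheck {
  gatesPassed: boolean; // all applicable judge gates passed
  conditionMet: boolean; // the user's goal condition holds
  attemptsUsed: number; // continuations injected so far
  maxAttempts: number; // effective retry budget
}

function decideGoalTransition(check: GoalCheck): GoalTransition {
  // Completion = applicable gates AND condition met.
  if (check.gatesPassed && check.conditionMet) return "achieved";
  // Assumption: exhaustion is only checked once completion has been ruled out.
  if (check.attemptsUsed >= check.maxAttempts) return "exhausted";
  return "continue";
}

// Budget is spent only when a continuation was actually injected.
function attemptsAfter(
  transition: GoalTransition,
  injected: boolean,
  attemptsUsed: number,
): number {
  return transition === "continue" && injected ? attemptsUsed + 1 : attemptsUsed;
}
```

Keeping the decision free of I/O means the caller (runReflection in this PR) owns the side effects: clearing an achieved goal, pausing an exhausted one, or injecting the continuation.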
…larify dual-counter

Addresses code-review minors on goal-loop integration.

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y injected

Refs #143

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>