Releases: dzianisv/opencode-plugins
v3.1.0 — Reflection for OpenCode + Claude Code, macOS keychain fix, 25× cheaper CI eval
What's new
Cross-platform support
Reflection now works on both OpenCode (session.idle event) and Claude Code (Stop hook).
Install on OpenCode:
{ "plugin": ["opencode-reflection"] }
Install on Claude Code:
/plugin marketplace add dzianisv/opencode-plugins
/plugin install reflection-cc
Real-data antipatterns in the judge prompt
Mined 143 sessions (OpenCode + Claude Code) and classified 227 stops, of which 78% were premature. Among them:
- 91 permission-seeking ("Want me to run the tests?")
- 68 stopped-with-todos ("Next steps: create PR" — then stopped)
- 41 legitimate (OAuth, 2FA, genuine completion)
The judge prompt now encodes these as PERMISSION-SEEKING and STOPPED-WITH-TODOS antipatterns with a decisive test: if the final turn is a yes/no question about something the agent can do itself → premature.
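The decisive test can be read as a small decision rule. The sketch below is only an illustration: the real judge is an LLM prompt, and the `FinalTurn` fields and `classifyStop` function are hypothetical names, not part of the plugin. It also assumes that a stop with unfinished steps counts as premature only when the agent could have done those steps itself.

```typescript
type StopVerdict = "PERMISSION_SEEKING" | "STOPPED_WITH_TODOS" | "LEGITIMATE";

// Hypothetical features a judge might extract from the agent's final turn.
interface FinalTurn {
  endsWithYesNoQuestion: boolean; // e.g. "Want me to run the tests?"
  listsUnfinishedSteps: boolean;  // e.g. "Next steps: create PR"
  agentCanDoItItself: boolean;    // false for OAuth, 2FA, and similar
}

// Mirrors the decisive test: a yes/no question about something the agent
// can do itself is premature. The todo rule is an assumed analogue.
function classifyStop(turn: FinalTurn): StopVerdict {
  if (turn.endsWithYesNoQuestion && turn.agentCanDoItItself) {
    return "PERMISSION_SEEKING";
  }
  if (turn.listsUnfinishedSteps && turn.agentCanDoItItself) {
    return "STOPPED_WITH_TODOS";
  }
  return "LEGITIMATE";
}

// Both antipatterns count as premature stops.
function isPremature(verdict: StopVerdict): boolean {
  return verdict !== "LEGITIMATE";
}
```

Under this reading, a turn that ends with "Want me to run the tests?" is premature, while a turn that stops for a 2FA code is not, because the agent cannot complete that step on its own.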
macOS keychain auth fix
Claude Code stores OAuth in the macOS login keychain (Claude Code-credentials), not ~/.claude/.credentials.json. The in-hook judge could never authenticate on macOS before this fix. Verified via a live claude -p session.
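A minimal sketch of the lookup order, not the plugin's actual code: on macOS, try the login keychain through the stock `security` CLI, otherwise read the credentials file. The service name and file path come from the note above. The `CredentialIO` seam, the `loadRawCredentials` name, and the fallback to the file when the keychain read fails are assumptions made for illustration.

```typescript
import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Names taken from the release note.
const KEYCHAIN_SERVICE = "Claude Code-credentials";
const CREDENTIALS_FILE = join(homedir(), ".claude", ".credentials.json");

// Hypothetical seam so the lookup order can be exercised without a keychain.
interface CredentialIO {
  readKeychain(service: string): string;
  readFile(path: string): string;
}

const nodeIO: CredentialIO = {
  // `security find-generic-password -s <service> -w` prints only the secret.
  readKeychain: (service) =>
    execFileSync("security", ["find-generic-password", "-s", service, "-w"], {
      encoding: "utf8",
    }).trim(),
  readFile: (path) => readFileSync(path, "utf8"),
};

// Returns the raw credential payload, or null if no source is readable.
// The payload is returned unparsed; its structure is not assumed here.
function loadRawCredentials(
  platform: string,
  io: CredentialIO = nodeIO,
): string | null {
  if (platform === "darwin") {
    try {
      return io.readKeychain(KEYCHAIN_SERVICE);
    } catch {
      // Assumed fallback: keychain entry missing or locked, try the file.
    }
  }
  try {
    return io.readFile(CREDENTIALS_FILE);
  } catch {
    return null;
  }
}
```

A caller would pass `process.platform` as the first argument, for example `loadRawCredentials(process.platform)`.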
25× cheaper CI eval
Switched the judge eval suite from gpt-5.1 to gpt-5.4-nano (~25× cheaper). Benchmarked all deployed models: the entire gpt-5.4 family scores 33/34 (one calibration-variance miss on a borderline case). Added EVAL_PASS_THRESHOLD=0.97 to tolerate the known borderline case while still catching real regressions. Both CI checks green.
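The threshold arithmetic, as a short sketch. `EVAL_PASS_THRESHOLD` is the setting named above; the `passRate` and `evalPasses` helpers are hypothetical names used only to show why 33/34 passes while a second miss would fail.

```typescript
const EVAL_PASS_THRESHOLD = 0.97;

// Fraction of eval cases the judge got right.
function passRate(passed: number, total: number): number {
  if (total <= 0) throw new RangeError("total must be positive");
  return passed / total;
}

// The suite passes when the pass rate meets or exceeds the threshold.
function evalPasses(
  passed: number,
  total: number,
  threshold: number = EVAL_PASS_THRESHOLD,
): boolean {
  return passRate(passed, total) >= threshold;
}

console.log(evalPasses(33, 34)); // true:  33/34 ≈ 0.9706
console.log(evalPasses(32, 34)); // false: 32/34 ≈ 0.9412
```

With 34 cases, 0.97 leaves room for exactly one miss, which matches the intent of tolerating the single borderline case.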
Reflexion taxonomy
README and docs now document the mapping to Reflexion (Shinn et al. 2023): actor=agent, evaluator=LLM judge, verbal self-reflection=injected feedback, MAX_ATTEMPTS=3 ≈ bounded reflection memory.
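The mapping can be sketched as a bounded loop. This is an illustration of the Reflexion roles, not the plugin's implementation: `Actor`, `Evaluator`, and `runWithReflection` are hypothetical names, and the calls are synchronous here for brevity, whereas a real agent and judge would be asynchronous.

```typescript
const MAX_ATTEMPTS = 3;

interface Verdict {
  complete: boolean;
  feedback: string; // verbal self-reflection to inject on the next attempt
}

// Actor = the agent; it sees the task plus all feedback gathered so far.
type Actor = (task: string, feedback: readonly string[]) => string;
// Evaluator = the LLM judge.
type Evaluator = (task: string, output: string) => Verdict;

interface ReflectionResult {
  output: string;
  attempts: number;
  complete: boolean;
}

function runWithReflection(
  task: string,
  actor: Actor,
  evaluator: Evaluator,
): ReflectionResult {
  const feedback: string[] = []; // bounded: at most MAX_ATTEMPTS entries
  let output = "";
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    output = actor(task, feedback);
    const verdict = evaluator(task, output);
    if (verdict.complete) {
      return { output, attempts: attempt, complete: true };
    }
    feedback.push(verdict.feedback);
  }
  // Out of attempts: return the last output without claiming completion.
  return { output, attempts: MAX_ATTEMPTS, complete: false };
}
```

Capping the loop keeps the feedback history small, which is the sense in which `MAX_ATTEMPTS=3` approximates bounded reflection memory.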
Upgrading
No breaking changes. Update your opencode.json to use opencode-reflection (npm package) or pull the latest reflection-3.ts directly.