feat(low-token): bundle caveman skill and wire into --low-token preset#67
feat(low-token): bundle caveman skill and wire into --low-token preset#67SamPlvs wants to merge 2 commits into
Conversation
First Tier 1 deliverable per the Q3/H2 roadmap (PR #66). Bundles caveman terse-output skill at .claude/skills/caveman/, wires it into the --low-token preset (default-on, opt out via --no-caveman or plan caveman: false), and extends the benchmark harness with an ablation arm to isolate caveman's contribution from the rest of the preset. What ships: - .claude/skills/caveman/{SKILL.md, LICENSE, README.md} — vendored from upstream (github.com/JuliusBrussee/caveman, MIT, fetched 2026-05-05). Canonical content from caveman/SKILL.md at upstream main. - src/zo/cli.py — _LOW_TOKEN_PRESET gains caveman: True. New _resolve_caveman() helper enforces precedence: CLI --no-caveman > plan caveman: false > preset default. --no-caveman flag added to both build and continue commands; banner becomes [low-token + caveman] when both active. - src/zo/orchestrator.py — Orchestrator.__init__ accepts caveman: bool (default False); caveman property added. _prompt_low_token_overrides() extended with a "Token Efficiency Skill: Caveman Mode" subsection that directs the lead and all sub-agents to invoke the auto-loaded skill, with explicit callouts to the safety guarantees (code blocks, quoted errors, tool inputs, structured artifacts all preserved verbatim — caveman only compresses chat prose). - src/zo/plan.py — PlanFrontmatter.caveman: bool | None added (None means "use preset default", explicit False is opt-out). - scripts/benchmark_low_token.sh — adds a third run variant "low-token-no-caveman" that runs --low-token --no-caveman in isolation. Comparing low-token vs low-token-no-caveman gives the caveman delta within the preset. Verified: - ruff src/zo/{cli,orchestrator,plan}.py: clean - bash -n scripts/benchmark_low_token.sh: OK - _resolve_caveman precedence: 4 cases match design (preset default on, --no-caveman wins, plan caveman: false wins, low_token=False forces caveman off regardless) - validate-docs.sh: 10/10 + 1 pre-existing warning Deferred to follow-up commits before PR opens: - Unit tests for new preset flag, _resolve_caveman, prompt directive - Docs: low-token-mode.mdx caveman section, low-token-preset.mdx knob row, cost-benchmark.mdx after-bench number - The actual bench run (needs interactive tmux per harness comments) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deploying zero-operators with
|
| Latest commit: |
6a87b41
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://1c4cbb36.zero-operators.pages.dev |
| Branch Preview URL: | https://claude-caveman-ablation.zero-operators.pages.dev |
… don't install After reading caveman's upstream CLAUDE.md, found that always-on activation in Claude Code is delivered by SessionStart and UserPromptSubmit hooks (their hooks/install.sh patches ~/.claude/settings.json and copies hook scripts into ~/.claude/hooks/). The previous commit's prompt directive said "ACTIVATE caveman skill, the vendored SKILL.md is auto-loaded by Claude Code". This was incorrect. Skills in .claude/skills/<name>/SKILL.md are *invocations*, not always-on enforcement. Without hooks, the file alone does nothing to make agents adopt caveman speech. We deliberately don't install upstream's hooks (would modify the user's ~/.claude/ globally — too invasive for an opt-in cost-saving feature). Fix: inline the caveman rules directly into the orchestrator prompt. The lead and every spawned sub-agent get the rules verbatim in their prompts. No skill or hook system required — savings come from agents adopting the style as instructed. Files: - src/zo/orchestrator.py — _prompt_low_token_overrides() caveman subsection rewritten. Drops "load this skill" framing; contains the rules inline (intensity full, what to compress, what stays verbatim, special handling for DECISION_LOG / gate rationales / hand-off summaries — those use lite intensity to stay readable cold). Docstring updated to reflect the inline approach. - .claude/skills/caveman/README.md — clarified that this directory is a reference copy, not a runtime activation mechanism. Corrected upstream source-of-truth path (skills/caveman/SKILL.md, not caveman/SKILL.md which is auto-generated). Updated the "Updating" command and "Activation summary" to match. Verified: - ruff src/zo/orchestrator.py: clean - smoke test confirms new "Token Efficiency: Caveman-Style Prose" section present, no "auto-loaded by Claude" claim remaining - validate-docs.sh: 10/10 + 1 pre-existing warning The file-vendoring stays — it's a reference for developers and forward-compat if Claude Code adds proper project-level skill auto-loading later. Bench remains the only honest measurement of actual savings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Correction in commit While reviewing caveman's upstream The first commit ( We deliberately don't run upstream's Fix in Net effect: same intent (caveman auto-on with The bench remains the only honest measurement of actual savings. |
Summary
First Tier 1 deliverable per the Q3/H2 roadmap (#66). Bundles the caveman terse-output skill and wires it into the
--low-tokenpreset so it activates automatically when users opt into low-token mode.How it works
Opt-out paths (precedence order):
--no-cavemanCLI flag > plancaveman: false> preset default. Banner shows[low-token + caveman]for visual confirmation.Why caveman is safe to auto-activate
The caveman skill explicitly preserves:
metrics.jsonl,result.md,training_status.json, agent contracts) — written via Write/Edit, not chatSo the gates that depend on structured outputs (Phase 4
ZOTrainingCallbackartifacts, oracle'sresult.md, etc.) are not at risk.Files
.claude/skills/caveman/{SKILL.md, LICENSE, README.md}— vendored from upstream (MIT, Copyright (c) 2026 Julius Brussee), fetched 2026-05-05src/zo/cli.py—_LOW_TOKEN_PRESET["caveman"] = True; new_resolve_caveman()helper;--no-cavemanflag onbuildandcontinue; banner badge updatesrc/zo/orchestrator.py—Orchestrator.__init__acceptscaveman: bool; newcavemanproperty;_prompt_low_token_overrides()extended with caveman directivesrc/zo/plan.py—PlanFrontmatter.caveman: bool | None(None = preset default; explicitFalse= opt-out)scripts/benchmark_low_token.sh— third run variantlow-token-no-cavemanfor ablationTest plan
ruffclean onsrc/zo/{cli,orchestrator,plan}.pybash -nclean onscripts/benchmark_low_token.sh_resolve_cavemanprecedence verified for all 4 cases (default-on, --no-caveman wins, plan opt-out wins, no-low-token forces off)validate-docs.sh10/10 + 1 pre-existing warning_resolve_caveman, prompt directive — follow-up commitlow-token-mode.mdx,low-token-preset.mdx,cost-benchmark.mdx— follow-up commitWhy draft
Wiring is in and unit-smoke-tested, but not yet validated end-to-end against a real
zo buildsession. The bench run is the proof-of-savings step. Tests + docs land after the bench so they reflect what actually happens, not what the prompt says should happen.🤖 Generated with Claude Code