Skip to content

feat(low-token): bundle caveman skill and wire into --low-token preset#67

Draft
SamPlvs wants to merge 2 commits into
mainfrom
claude/caveman-ablation
Draft

feat(low-token): bundle caveman skill and wire into --low-token preset#67
SamPlvs wants to merge 2 commits into
mainfrom
claude/caveman-ablation

Conversation

@SamPlvs
Copy link
Copy Markdown
Owner

@SamPlvs SamPlvs commented May 5, 2026

Summary

First Tier 1 deliverable per the Q3/H2 roadmap (#66). Bundles the caveman terse-output skill and wires it into the --low-token preset so it activates automatically when users opt into low-token mode.

How it works

zo build --low-token
   ↓
_LOW_TOKEN_PRESET["caveman"] = True
   ↓
_resolve_caveman() (CLI > plan > preset precedence)
   ↓
Orchestrator(caveman=True)
   ↓
_prompt_low_token_overrides() appends "Token Efficiency Skill: Caveman Mode"
   ↓
Lead prompt directs lead + sub-agents to use the auto-loaded skill

Opt-out paths (precedence order): --no-caveman CLI flag > plan caveman: false > preset default. Banner shows [low-token + caveman] for visual confirmation.

Why caveman is safe to auto-activate

The caveman skill explicitly preserves:

  • Code blocks (verbatim)
  • Quoted error strings (verbatim)
  • Tool inputs (Write/Edit args — caveman only touches chat output, not tool calls)
  • Structured artifacts (metrics.jsonl, result.md, training_status.json, agent contracts) — written via Write/Edit, not chat
  • Security warnings + irreversible-action confirmations — caveman auto-disables for these per its own rules

So the gates that depend on structured outputs (Phase 4 ZOTrainingCallback artifacts, oracle's result.md, etc.) are not at risk.

Files

  • .claude/skills/caveman/{SKILL.md, LICENSE, README.md} — vendored from upstream (MIT, Copyright (c) 2026 Julius Brussee), fetched 2026-05-05
  • src/zo/cli.py_LOW_TOKEN_PRESET["caveman"] = True; new _resolve_caveman() helper; --no-caveman flag on build and continue; banner badge update
  • src/zo/orchestrator.pyOrchestrator.__init__ accepts caveman: bool; new caveman property; _prompt_low_token_overrides() extended with caveman directive
  • src/zo/plan.pyPlanFrontmatter.caveman: bool | None (None = preset default; explicit False = opt-out)
  • scripts/benchmark_low_token.sh — third run variant low-token-no-caveman for ablation

Test plan

  • ruff clean on src/zo/{cli,orchestrator,plan}.py
  • bash -n clean on scripts/benchmark_low_token.sh
  • _resolve_caveman precedence verified for all 4 cases (default-on, --no-caveman wins, plan opt-out wins, no-low-token forces off)
  • validate-docs.sh 10/10 + 1 pre-existing warning
  • Unit tests for new preset flag, _resolve_caveman, prompt directive — follow-up commit
  • Docs cascade: low-token-mode.mdx, low-token-preset.mdx, cost-benchmark.mdx — follow-up commit
  • Bench run on MNIST plan (~25min, ~$5–7) to measure the actual savings delta — follow-up commit

Why draft

Wiring is in and unit-smoke-tested, but not yet validated end-to-end against a real zo build session. The bench run is the proof-of-savings step. Tests + docs land after the bench so they reflect what actually happens, not what the prompt says should happen.

🤖 Generated with Claude Code

First Tier 1 deliverable per the Q3/H2 roadmap (PR #66).

Bundles caveman terse-output skill at .claude/skills/caveman/, wires
it into the --low-token preset (default-on, opt out via --no-caveman
or plan caveman: false), and extends the benchmark harness with an
ablation arm to isolate caveman's contribution from the rest of the
preset.

What ships:

- .claude/skills/caveman/{SKILL.md, LICENSE, README.md} — vendored
  from upstream (github.com/JuliusBrussee/caveman, MIT, fetched
  2026-05-05). Canonical content from caveman/SKILL.md at upstream main.

- src/zo/cli.py — _LOW_TOKEN_PRESET gains caveman: True. New
  _resolve_caveman() helper enforces precedence: CLI --no-caveman >
  plan caveman: false > preset default. --no-caveman flag added to
  both build and continue commands; banner becomes
  [low-token + caveman] when both active.

- src/zo/orchestrator.py — Orchestrator.__init__ accepts caveman: bool
  (default False); caveman property added. _prompt_low_token_overrides()
  extended with a "Token Efficiency Skill: Caveman Mode" subsection
  that directs the lead and all sub-agents to invoke the auto-loaded
  skill, with explicit callouts to the safety guarantees (code blocks,
  quoted errors, tool inputs, structured artifacts all preserved
  verbatim — caveman only compresses chat prose).

- src/zo/plan.py — PlanFrontmatter.caveman: bool | None added (None
  means "use preset default", explicit False is opt-out).

- scripts/benchmark_low_token.sh — adds a third run variant
  "low-token-no-caveman" that runs --low-token --no-caveman in
  isolation. Comparing low-token vs low-token-no-caveman gives the
  caveman delta within the preset.

Verified:
- ruff src/zo/{cli,orchestrator,plan}.py: clean
- bash -n scripts/benchmark_low_token.sh: OK
- _resolve_caveman precedence: 4 cases match design
  (preset default on, --no-caveman wins, plan caveman: false wins,
  low_token=False forces caveman off regardless)
- validate-docs.sh: 10/10 + 1 pre-existing warning

Deferred to follow-up commits before PR opens:
- Unit tests for new preset flag, _resolve_caveman, prompt directive
- Docs: low-token-mode.mdx caveman section, low-token-preset.mdx
  knob row, cost-benchmark.mdx after-bench number
- The actual bench run (needs interactive tmux per harness comments)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 5, 2026

Deploying zero-operators with  Cloudflare Pages  Cloudflare Pages

Latest commit: 6a87b41
Status: ✅  Deploy successful!
Preview URL: https://1c4cbb36.zero-operators.pages.dev
Branch Preview URL: https://claude-caveman-ablation.zero-operators.pages.dev

View logs

… don't install

After reading caveman's upstream CLAUDE.md, found that always-on
activation in Claude Code is delivered by SessionStart and
UserPromptSubmit hooks (their hooks/install.sh patches
~/.claude/settings.json and copies hook scripts into ~/.claude/hooks/).

The previous commit's prompt directive said "ACTIVATE caveman skill,
the vendored SKILL.md is auto-loaded by Claude Code". This was
incorrect. Skills in .claude/skills/<name>/SKILL.md are *invocations*,
not always-on enforcement. Without hooks, the file alone does nothing
to make agents adopt caveman speech.

We deliberately don't install upstream's hooks (would modify the
user's ~/.claude/ globally — too invasive for an opt-in cost-saving
feature).

Fix: inline the caveman rules directly into the orchestrator prompt.
The lead and every spawned sub-agent get the rules verbatim in their
prompts. No skill or hook system required — savings come from agents
adopting the style as instructed.

Files:
- src/zo/orchestrator.py — _prompt_low_token_overrides() caveman
  subsection rewritten. Drops "load this skill" framing; contains
  the rules inline (intensity full, what to compress, what stays
  verbatim, special handling for DECISION_LOG / gate rationales /
  hand-off summaries — those use lite intensity to stay readable
  cold). Docstring updated to reflect the inline approach.
- .claude/skills/caveman/README.md — clarified that this directory
  is a reference copy, not a runtime activation mechanism. Corrected
  upstream source-of-truth path (skills/caveman/SKILL.md, not
  caveman/SKILL.md which is auto-generated). Updated the "Updating"
  command and "Activation summary" to match.

Verified:
- ruff src/zo/orchestrator.py: clean
- smoke test confirms new "Token Efficiency: Caveman-Style Prose"
  section present, no "auto-loaded by Claude" claim remaining
- validate-docs.sh: 10/10 + 1 pre-existing warning

The file-vendoring stays — it's a reference for developers and
forward-compat if Claude Code adds proper project-level skill
auto-loading later. Bench remains the only honest measurement of
actual savings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@SamPlvs
Copy link
Copy Markdown
Owner Author

SamPlvs commented May 5, 2026

Correction in commit 6a87b41 — please read before reviewing the first commit's prompt directive.

While reviewing caveman's upstream CLAUDE.md after a sharp question from the user, found that caveman's always-on activation in Claude Code is delivered by SessionStart and UserPromptSubmit hooks, not by being a skill. The standalone install (hooks/install.sh in their repo) copies hook scripts into ~/.claude/hooks/ and patches ~/.claude/settings.json.

The first commit (f190bc7) wrote a prompt directive that said "ACTIVATE caveman skill. The vendored SKILL.md is auto-loaded by Claude Code". This was incorrect — skills in .claude/skills/<name>/SKILL.md are invocations, not always-on enforcement. Without the hooks, the file alone doesn't make agents adopt caveman speech.

We deliberately don't run upstream's install.sh (would modify the user's ~/.claude/ globally — too invasive for an opt-in cost-saving feature).

Fix in 6a87b41: the caveman rules are now inlined directly into the orchestrator prompt. Lead and every spawned sub-agent see the rules verbatim in their prompts. No skill or hook system dependency. The vendored SKILL.md stays as a reference + forward-compat artifact (and as documentation for users who choose to install caveman properly with hooks).

Net effect: same intent (caveman auto-on with --low-token), more honest mechanism (inline directive instead of "skill auto-load"), no architectural surprises hidden behind the abstraction.

The bench remains the only honest measurement of actual savings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant