diff --git a/docs/COMMANDS.md b/docs/COMMANDS.md index 9f0f696..c4c8be8 100644 --- a/docs/COMMANDS.md +++ b/docs/COMMANDS.md @@ -25,10 +25,10 @@ zo build plans/project.md [--gate-mode supervised|auto|full-auto] [--no-tmux] ``` **Cost-saving options:** -- `--low-token`: activates the cost-saving preset (Sonnet lead, Haiku for code-reviewer / test-engineer / oracle-qa, 2 max iterations, full-auto gates, no headlines, earlier auto-compaction). Equivalent to `low_token: true` in plan frontmatter. **~30% measured savings** on MNIST ($7.75 vs ~$11 default); 50-60% targeted, 70-80% on the roadmap (prompt caching + Batch API + Files API via SDK refactor). See `docs/concepts/low-token-mode.mdx`. +- `--low-token`: activates the cost-saving preset (Sonnet lead, Haiku for code-reviewer / test-engineer / oracle-qa, 2 max iterations, full-auto gates, no end-of-session Haiku summary, earlier auto-compaction). Equivalent to `low_token: true` in plan frontmatter. **~30% measured savings** on MNIST ($7.75 vs ~$11 default); 50-60% targeted, 70-80% on the roadmap (prompt caching + Batch API + Files API via SDK refactor). See `docs/concepts/low-token-mode.mdx`. - `--lead-model`: override lead orchestrator model. Composes with `--low-token`. - `--max-iterations N`: hard cap on Phase-4 experiment iterations. Wins over plan and preset. -- `--no-headlines`: disable the Haiku headline ticker (saves ~60 small calls/hour). +- `--no-headlines`: skip the end-of-session Haiku bullet summary (~1 Haiku call per run). ### zo continue diff --git a/docs/cli/build.mdx b/docs/cli/build.mdx index f78137b..272fb49 100644 --- a/docs/cli/build.mdx +++ b/docs/cli/build.mdx @@ -39,7 +39,7 @@ You don't pass a flag for this, `zo build` figures it out. Creates a tmux pane (default) or runs headless (`--no-tmux`). Pastes the Lead Orchestrator's prompt. The Lead reads the full plan, creates the team via `TeamCreate`, and drives execution. - The Python `LifecycleWrapper` observes via `~/.claude/tasks/` and the tmux pane. Captures lifecycle events, pipes them to JSONL comms log, surfaces Haiku-summarised headlines every 60 seconds in your terminal. Auto-launches the training dashboard split-pane during Phase 4. + The Python `LifecycleWrapper` observes via `~/.claude/tasks/` and the tmux pane. Captures lifecycle events, pipes them to JSONL comms log, prints the live task list and recent agent events in your terminal. Auto-launches the training dashboard split-pane during Phase 4. At every phase boundary, checks artifact contracts and oracle thresholds. Automated gates auto-PROCEED. Blocking gates pause; you approve via `/gates:approve` or reject via `/gates:reject`. @@ -117,9 +117,6 @@ While `zo build` is running, you have several windows into what the team is doin `Ctrl-b n` switches to the agent session. Watch the Lead Orchestrator coordinate the team in real time. `Ctrl-b p` returns. - - Every 60 seconds, your invoking terminal gets a 1-line Haiku summary of recent agent events. Lightweight live signal. - Auto-launched as a split-pane during Phase 4. Rich Live panel: progress bar, metrics table, sparkline, checkpoints. See `zo watch-training`. @@ -138,7 +135,7 @@ While `zo build` is running, you have several windows into what the team is doin | `--low-token` | (off) | Activate the cost-saving preset. See [low-token mode](/concepts/low-token-mode). | | `--lead-model` | `opus` (or `sonnet` if `--low-token`) | Override the lead orchestrator model: `opus` / `sonnet` / `haiku`. | | `--max-iterations N` | `10` (or `2` if `--low-token`) | Hard cap on Phase-4 experiment iterations. Wins over plan and preset. | -| `--no-headlines` | (off) | Disable the Haiku headline ticker (saves ~60 small calls/hour). | +| `--no-headlines` | (off) | Skip the end-of-session Haiku bullet summary (~1 Haiku call per run, ~$0.0002). | | `--bypass-permissions` | (off) | Auto-approve Claude Code tool-call prompts. Implied by `--gate-mode full-auto`. See [Permission prompts](#permission-prompts-bypass-permissions). | ## Examples @@ -173,7 +170,7 @@ While `zo build` is running, you have several windows into what the team is doin ```bash zo build plans/my-project.md --low-token ``` - Sonnet lead, Haiku for code-reviewer / test-engineer / oracle-qa, 2 max iterations, no headlines, full-auto gates, earlier auto-compaction. **~30% measured savings** on MNIST (\$7.75 vs ~\$11 default); 50-60% targeted, 70-80% on the [roadmap](/roadmap). See [low-token mode](/concepts/low-token-mode) for the full preset and trade-offs. + Sonnet lead, Haiku for code-reviewer / test-engineer / oracle-qa, 2 max iterations, no end-of-session Haiku summary, full-auto gates, earlier auto-compaction. **~30% measured savings** on MNIST (\$7.75 vs ~\$11 default); 50-60% targeted, 70-80% on the [roadmap](/roadmap). See [low-token mode](/concepts/low-token-mode) for the full preset and trade-offs. ```bash diff --git a/docs/cli/overview.mdx b/docs/cli/overview.mdx index debfa66..f622871 100644 --- a/docs/cli/overview.mdx +++ b/docs/cli/overview.mdx @@ -77,7 +77,7 @@ Most project-aware commands also accept: | `--low-token` | Activate the cost-saving preset (Sonnet lead, max-iterations 2, full-auto gates, no headlines, earlier auto-compaction). See [low-token mode](/concepts/low-token-mode). | | `--lead-model` | Override the lead orchestrator model (`opus`/`sonnet`/`haiku`). Composes with `--low-token`. | | `--max-iterations N` | Hard cap on Phase-4 experiment iterations. Wins over plan-level `## Experiment Loop` and the low-token preset. | -| `--no-headlines` | Disable the Haiku headline ticker. | +| `--no-headlines` | Skip the end-of-session Haiku bullet summary (~1 small call per run). | | `--bypass-permissions` | Auto-approve Claude Code tool-call permission prompts. Implied by `--gate-mode full-auto`. Works in both tmux and headless modes. See [build → Permission prompts](/cli/build#permission-prompts-bypass-permissions). | ## Modes of operation diff --git a/docs/concepts/low-token-mode.mdx b/docs/concepts/low-token-mode.mdx index fbc6833..9746a82 100644 --- a/docs/concepts/low-token-mode.mdx +++ b/docs/concepts/low-token-mode.mdx @@ -43,7 +43,7 @@ The banner shows a `[low-token]` badge whenever the preset is active so you have | Phase-4 `max_iterations` | `10` | `2` | Phase 4 is the dominant cost; iteration count is the multiplier. | | Phase-4 `stop_on_tier` | `must_pass` | `could_pass` | Stops at the weakest acceptable oracle tier instead of pushing for the strongest. | | Cross-cutting `research-scout` | on every phase | dropped | Saves ~6 spawns and their contracts. | -| Haiku headline ticker | every 60s | disabled | ~60 small calls/hour. Small individually, cumulative across long runs. | +| End-of-session Haiku summary | 1 call per run | disabled | ~$0.0002 per run. The previous per-60-second ticker (~60 small calls/hour) was removed unconditionally; only the one-shot wrap-up remains, and `--low-token` skips even that. | | Default gate mode | `supervised` | `full-auto` | No human-loop overhead. You can still pass `--gate-mode supervised` to override. | | `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | unset (Claude Code default ~83%) | `60` | Auto-compacts conversation context earlier, prevents performance degradation near the window limit. | @@ -109,7 +109,7 @@ In each case, run on default settings, validate the result, then switch on `low_ The features below would help but require a larger architectural change (switching ZO from `claude` CLI launcher to direct Anthropic SDK). They're tracked as future work, not in scope for low-token mode v1: - **Prompt caching**: Anthropic's 5-minute TTL cache. Would save ~90% on cached input tokens. -- **Batch API**: 50% discount for async jobs. Suitable for the Haiku ticker but incompatible with interactive tmux. +- **Batch API**: 50% discount for async jobs. Applicable to other Haiku/Sonnet calls ZO might add in future (e.g., end-of-phase reports) but incompatible with interactive tmux. The per-60-second Haiku ticker that originally motivated this note has since been removed. - **Files API**: upload static artifacts (plans, specs) once, reference by ID. - **Extended thinking budget tuning**: cap thinking tokens explicitly. @@ -127,8 +127,7 @@ What changes, end-to-end, between a default run and a low-token run on the same | Phase 4 stop condition | `must_pass` tier | `could_pass` tier | Stops earlier on weakest acceptable result | | Cross-cutting `research-scout` | Spawned per phase | Skipped | ~6 spawns × ~1 KB contract | | Cross-cutting `code-reviewer` | Spawned per phase | Spawned per phase | (kept, quality safety net) | -| Haiku headline ticker | Every 60s | Disabled | ~60 small calls/hour | -| End-of-session Haiku summary | Generated | Skipped | 1 small call per session | +| End-of-session Haiku summary | Generated | Skipped | 1 small call per session. (The previous per-60-second headline ticker — ~60 small calls/hour — was removed unconditionally.) | | Auto-compaction trigger | ~83% of context window | 60% of context window | Earlier compaction → less degraded reasoning at the tail | ## Worked example: MNIST @@ -236,7 +235,7 @@ In each case: run on default settings, validate the result, then switch on `low_ Several Anthropic API features could provide additional savings but require switching ZO from the `claude` CLI launcher to direct Anthropic SDK calls, out of scope for low-token v1: - **Prompt caching**: 5-minute TTL cache. Would save ~90% on cached input tokens. - - **Batch API**: 50% discount for async jobs. Suitable for the Haiku ticker but incompatible with interactive tmux. + - **Batch API**: 50% discount for async jobs. Applicable to other Haiku/Sonnet calls ZO might add in future (e.g., end-of-phase reports) but incompatible with interactive tmux. The per-60-second Haiku ticker that originally motivated this note has since been removed. - **Files API**: upload static artifacts (plans, specs) once, reference by ID. - **Extended thinking budget tuning**: cap thinking tokens explicitly. diff --git a/docs/quickstart.mdx b/docs/quickstart.mdx index b896427..32f2791 100644 --- a/docs/quickstart.mdx +++ b/docs/quickstart.mdx @@ -99,8 +99,9 @@ The **Lead Orchestrator** reads the plan, decomposes it into phases, builds cont - A tmux pane with the orchestrator's live session - Phase progress + gate status streamed to your terminal -- Haiku-summarised headline events every 60 seconds +- Live task list + recent agent events from the comms log - A live training dashboard (split-pane) once Phase 4 starts +- At the end of the run, a short Haiku-generated bullet summary of what the team accomplished Press Ctrl-b n to switch into the agent pane and watch the team work in real time. Ctrl-b p jumps back. @@ -113,7 +114,7 @@ The **Lead Orchestrator** reads the plan, decomposes it into phases, builds cont zo build plans/mnist-demo.md --low-token ``` - Sonnet lead instead of Opus, Haiku for code-reviewer / test-engineer / oracle-qa, 2 Phase-4 iterations instead of 10, full-auto gates, no headlines, earlier auto-compaction. **~30% measured savings** on the MNIST bench (\$7.75 vs ~\$11 default); 50-60% targeted with the Haiku routing + per-phase trims (pending second bench), 70-80% on the [roadmap](/roadmap) via SDK refactor (prompt caching, Batch API, Files API). Slight quality trade at the lead step. See [low-token mode](/concepts/low-token-mode) for the full preset and trade-offs. + Sonnet lead instead of Opus, Haiku for code-reviewer / test-engineer / oracle-qa, 2 Phase-4 iterations instead of 10, full-auto gates, no end-of-session Haiku summary, earlier auto-compaction. **~30% measured savings** on the MNIST bench (\$7.75 vs ~\$11 default); 50-60% targeted with the Haiku routing + per-phase trims (pending second bench), 70-80% on the [roadmap](/roadmap) via SDK refactor (prompt caching, Batch API, Files API). Slight quality trade at the lead step. See [low-token mode](/concepts/low-token-mode) for the full preset and trade-offs. ## 6. Verify diff --git a/docs/reference/low-token-preset.mdx b/docs/reference/low-token-preset.mdx index a7f20fb..5028bd2 100644 --- a/docs/reference/low-token-preset.mdx +++ b/docs/reference/low-token-preset.mdx @@ -57,8 +57,7 @@ Authoritative locations: [`src/zo/cli.py`](https://github.com/SamPlvs/zero-opera | Phase-4 `stop_on_tier` | `must_pass` | `could_pass` | `_LOW_TOKEN_LOOP_CLAMPS` in `experiment_loop.py` | | Cross-cutting `research-scout` | enabled | disabled | `_agents_for_phase` in `orchestrator.py` | | Cross-cutting `code-reviewer` | enabled | dropped from Phase 1; **on Haiku** elsewhere | `LOW_TOKEN_PHASE_DROPS` + `LOW_TOKEN_HAIKU_AGENTS` | -| Haiku headline ticker | every 60s | disabled | `_maybe_print_headline` in `cli.py` | -| End-of-session Haiku summary | generated | skipped | `_generate_session_summary` in `cli.py` | +| End-of-session Haiku summary | generated | skipped | `_generate_session_summary` in `cli.py`. (The previous per-60-second `_maybe_print_headline` ticker was removed unconditionally; only the one-shot wrap-up call remains, and `--low-token` skips even that.) | | Default gate mode | `supervised` | `full-auto` | `_resolve_gate_mode` in `cli.py` | | Lead prompt: dedicated adaptations section | included | omitted | `build_lead_prompt` in `orchestrator.py` | | Lead prompt: roster | full descriptive | compact comma-list | `_prompt_roster` in `orchestrator.py` | @@ -72,7 +71,7 @@ These compose with `--low-token` to fine-tune individual knobs: |---|---|---| | `--lead-model {opus,sonnet,haiku}` | choice | Override the lead model | | `--max-iterations N` | int | Hard cap on Phase-4 iterations | -| `--no-headlines` | bool | Disable the headline ticker (independent of low-token) | +| `--no-headlines` | bool | Skip the end-of-session Haiku bullet summary (independent of low-token) | | `--gate-mode {supervised,auto,full-auto}` | choice | Override the gate mode | ## Plan-level fields diff --git a/memory/zo-platform/DECISION_LOG.md b/memory/zo-platform/DECISION_LOG.md index e487925..e8a94ad 100644 --- a/memory/zo-platform/DECISION_LOG.md +++ b/memory/zo-platform/DECISION_LOG.md @@ -1082,3 +1082,36 @@ The tmux mechanism (settings.local.json overlay) is novel for ZO and the risk su - **Single-flag shorthand `--unattended` aliasing both `--gate-mode full-auto` and `--bypass-permissions`** — defer for future polish; the current explicit form is more learnable and the two-flag combo is short enough. **Outcome:** Single feature commit on branch `claude/bypass-permissions-flag`, PR opened against `main`. validate-docs 10/11 (improved from baseline as the test-count warning resolved). 760 pytest pass / 7 skipped. **No new PRIOR added** — this is a clean feature with no failure trace; the design choices are auditable here. Power-user usage: `zo continue --repo /path/to/prod-001 --gate-mode full-auto` for an unattended overnight tmux run with full agent-team visibility AND no permission prompts. + +--- + +## Decision: 2026-05-28T20:00:00Z +**Type:** FEATURE-REMOVAL + BEHAVIOR-CHANGE +**Title:** Remove the per-60-second Haiku headline ticker (keep the end-of-session bullet summary) + +**Decision:** Remove the periodic Haiku headline ticker from `zo build` / `zo continue` runs. The ticker spawned `claude -p --model haiku` every 60 seconds throughout a session to summarise the last 15 buffered events into a one-line headline printed in the lead pane. User feedback: nobody uses the headlines. Cost: ~60 subprocess spawns per hour at ~$0.0001-$0.0003 each, totalling ~$0.06-$0.18 per 10-hour overnight run, plus the latency/CPU cost of repeatedly spawning a Claude Code subprocess. Pure waste if the output is going unread. + +The companion `_generate_session_summary` call at session end is **kept** — it's a single Haiku call per run (~$0.0002) that prints a 2-3 bullet wrap-up of what the team accomplished. Useful, cheap, one-shot. User explicitly confirmed keeping this piece. + +Scope (single PR to `main`): +- **`src/zo/cli.py`** — `_launch_and_monitor`: deleted `_maybe_print_headline` function (~30 lines), the `_last_headline_time` and `_headline_interval` timer variables, and the `_maybe_print_headline()` call inside `_print_status`. Kept `_headline_buffer` (still consumed by `_generate_session_summary`) and all the `_headline_buffer.append(...)` event-capture calls in `_print_status`. Kept `_generate_session_summary` and its end-of-session invocation. Updated `headlines_disabled` parameter docstring + the auxiliary-call comment to reflect the narrowed meaning. Updated `_LOW_TOKEN_PRESET["headlines_disabled"]` inline comment from "disable Haiku ticker (~60 calls/hr)" to "skip end-of-session Haiku summary (~1 call/session)". Updated `--no-headlines` flag help text on both `build` and `continue` commands. +- **`docs/cli/build.mdx`** — Step 5 monitoring sentence rewritten ("surfaces Haiku-summarised headlines every 60 seconds" → "prints the live task list and recent agent events"), "Headlines" Live-monitoring `` removed entirely, options table `--no-headlines` row updated, `--low-token` accordion text updated. +- **`docs/cli/overview.mdx`** — shared options table `--no-headlines` row updated. +- **`docs/quickstart.mdx`** — "What you'll see" bullet list rewritten + new end-of-session bullet added, `--low-token` Note text updated. +- **`docs/COMMANDS.md`** — `--low-token` and `--no-headlines` lines updated. +- **`docs/concepts/low-token-mode.mdx`** — two preset-comparison tables collapsed the "Haiku headline ticker" row into the existing "End-of-session Haiku summary" row with an explanatory note that the ticker has been removed unconditionally. Updated the Batch-API forward-looking note that referenced the (now-removed) ticker. +- **`docs/reference/low-token-preset.mdx`** — preset table same collapse, `--no-headlines` override-flag entry updated. + +**Rationale:** This is the cheapest large-payoff cleanup left on the cost-reduction roadmap. The ticker output was visible (one line every minute in the lead pane) but redundant — the lead pane already shows the live task list, agent events, and gate decisions in real time. The Haiku summary added decoration, not signal. Removing it cuts ~$0.06-$0.18 per overnight run AND removes 60/hr subprocess churn, with no observed user impact (per user-confirmed "no one is using this feature"). Keeping the end-of-session wrap-up preserves the genuinely useful one-shot summary at session close without re-introducing the per-tick cost. + +The `--no-headlines` flag is preserved (not removed) for backwards compatibility. Its meaning narrows from "disable both the ticker and the end-of-session summary" to "disable the end-of-session summary only" — the ticker is gone for everyone now, with or without the flag. + +**Alternatives considered:** +- **Remove the end-of-session summary too** — rejected. User explicitly wanted to keep it. Cost is trivial (~$0.0002/run), and the bullet wrap-up is genuinely useful at session close. +- **Remove the `--no-headlines` flag entirely** — rejected. Backwards-compatibility concern; the flag still has a meaningful (if narrower) effect. Keeping it is one line with positive UX value. +- **Move the ticker off Haiku to a non-LLM summariser** — rejected. The redundancy with the live task list pane is the actual problem, not the Haiku call. A non-LLM summary would still be ignored noise. +- **Make the ticker opt-in via a `--headlines` flag** — rejected. Adds complexity for a feature with zero confirmed users. If demand surfaces later, easy to re-add. + +**Outcome:** Single feature-removal commit on branch `claude/remove-haiku-ticker`, PR opened against `main`. After rebase onto main (which now includes PR #92), pytest 760 passed / 7 skipped on both Python 3.11 and 3.12. ruff `src/` clean. validate-docs clean. Full CI-equivalent matrix run locally per PR-039 protocol before push. **No new PRIOR added** — this is a clean feature-removal driven by user feedback, not a failure-trace self-evolution. + +**Cross-reference:** PR-039 (pre-push verification must mirror full CI matrix) — this PR's verification followed that protocol end-to-end. Related to session 024's `--low-token` design where `headlines_disabled` was first introduced as a cost-saving preset field; this PR extends that thinking to the unconditional case (everyone gets the cost saving, not just `--low-token` users). diff --git a/memory/zo-platform/STATE.md b/memory/zo-platform/STATE.md index 7364b29..e896d84 100644 --- a/memory/zo-platform/STATE.md +++ b/memory/zo-platform/STATE.md @@ -8,7 +8,9 @@ status: complete ## Current Position -**Session 031 hand-off — pick up here.** New CLI flag `--bypass-permissions` added to `zo build` / `zo continue` to give users an explicit opt-in for auto-approving Claude Code tool-call permission prompts. Design lesson captured in **PRIORS PR-038** (a CLI flag should map to one concern; coupling visibility-mode to safety-mode silently bypasses user expectations). Behavior change worth flagging: previously `--no-tmux` ALWAYS injected `--dangerously-skip-permissions` into the Claude CLI command (`src/zo/wrapper.py:376` unconditional); now it's conditional on the resolved bypass setting. Truth table (resolver: `_resolve_bypass_permissions` at `src/zo/cli.py:301`): `--gate-mode supervised|auto` → bypass off unless explicit flag; `--gate-mode full-auto` → bypass implicitly on (no-human-on-gates + must-click-every-tool was a contradiction). **Tmux mode now also supports bypass** via a new `src/zo/permissions_overlay.py` module: writes `permissions.defaultMode: "bypassPermissions"` into `/.claude/settings.local.json` on launch, backs up the original to a sibling `.zo-backup` file, restores via `atexit` on exit; `cleanup_stale_overlay()` called at every `_launch_and_monitor` invocation handles crashed-mid-run recovery from a leftover backup. Settings.local.json was also added to `.gitignore` in PR #91 (already merged) so the per-user file doesn't accumulate transient secrets in tracked state. **Tests added: +17** (`tests/unit/test_permissions_overlay.py` 12 cases covering existing-settings / no-settings / malformed / stale-cleanup / non-dict-permissions; `tests/unit/test_wrapper.py` 3 new cases for headless conditional flag + tmux overlay apply/skip; `tests/unit/test_cli.py` 1 case for the resolver truth table). pytest 760 passed / 7 skipped (was 743 + 7). validate-docs 10/11 (1 pre-existing warning, 0 failures — the previously-flagged test-count warning resolved naturally as the suite grew above 743). Docs updated: `docs/cli/build.mdx` gains a "Permission prompts" section with the full truth table and the behavior-change note; `docs/cli/overview.mdx` adds the flag to the shared options table. **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). Power-user UX note: for an unattended prod-001 overnight run, `zo continue --repo /path/to/prod-001 --gate-mode full-auto` now Just Works in tmux without any manual settings.local.json tinkering. +**Session 032 hand-off — pick up here.** Periodic Haiku headline ticker removed from `_launch_and_monitor` in `src/zo/cli.py`. The ticker spawned `claude -p --model haiku` every 60s during a run to summarise recent agent events into a 1-line headline — ~60 subprocess launches/hour at ~$0.0001-$0.0003 each, totalling ~$0.06-$0.18 per overnight run. User confirmed nobody uses it. Removed the `_maybe_print_headline` function, the `_last_headline_time` and `_headline_interval` timer variables, and the `_maybe_print_headline()` call inside `_print_status`. Kept the `_headline_buffer` list (still consumed once at session end by `_generate_session_summary` for a 2-3 bullet Haiku wrap-up, ~1 call/run, ~$0.0002 — cheap and useful). Kept `_generate_session_summary` and the `--no-headlines` flag; the flag's meaning narrows to "skip the end-of-session bullet summary too" (it can no longer disable the ticker, because the ticker doesn't exist). Updated docstrings, `_LOW_TOKEN_PRESET` inline comment, the auxiliary-call comment near the end-of-session call. **Docs cascade:** `docs/cli/build.mdx` (Step 5 monitoring sentence rewritten, "Headlines" Live-monitoring Card removed, options table `--no-headlines` row + `--low-token` accordion text updated), `docs/cli/overview.mdx` shared options table, `docs/quickstart.mdx` "What you'll see" bullet list + `--low-token` Note, `docs/COMMANDS.md` `--low-token` + `--no-headlines` lines, `docs/concepts/low-token-mode.mdx` two tables + the Batch API forward-looking note, `docs/reference/low-token-preset.mdx` preset table + override flag entry. **Behaviour change:** anyone running `zo build` today sees a console headline every 60s; after this PR they don't. End-of-session summary unchanged. After rebase onto main (now includes PR #92's bypass-permissions feature + 17 new tests), pytest 760 passed / 7 skipped on both Python 3.11 and 3.12. ruff src/ clean. validate-docs clean. **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing, then resume Tier 1 (caveman ablation → onboarding hardening). + +**Session 031 hand-off (prior).** New CLI flag `--bypass-permissions` added to `zo build` / `zo continue` to give users an explicit opt-in for auto-approving Claude Code tool-call permission prompts. Design lesson captured in **PRIORS PR-038** (a CLI flag should map to one concern; coupling visibility-mode to safety-mode silently bypasses user expectations). Behavior change worth flagging: previously `--no-tmux` ALWAYS injected `--dangerously-skip-permissions` into the Claude CLI command (`src/zo/wrapper.py:376` unconditional); now it's conditional on the resolved bypass setting. Truth table (resolver: `_resolve_bypass_permissions` at `src/zo/cli.py:301`): `--gate-mode supervised|auto` → bypass off unless explicit flag; `--gate-mode full-auto` → bypass implicitly on (no-human-on-gates + must-click-every-tool was a contradiction). **Tmux mode now also supports bypass** via a new `src/zo/permissions_overlay.py` module: writes `permissions.defaultMode: "bypassPermissions"` into `/.claude/settings.local.json` on launch, backs up the original to a sibling `.zo-backup` file, restores via `atexit` on exit; `cleanup_stale_overlay()` called at every `_launch_and_monitor` invocation handles crashed-mid-run recovery from a leftover backup. Settings.local.json was also added to `.gitignore` in PR #91 (already merged) so the per-user file doesn't accumulate transient secrets in tracked state. **Tests added: +17** (`tests/unit/test_permissions_overlay.py` 12 cases covering existing-settings / no-settings / malformed / stale-cleanup / non-dict-permissions; `tests/unit/test_wrapper.py` 3 new cases for headless conditional flag + tmux overlay apply/skip; `tests/unit/test_cli.py` 1 case for the resolver truth table). pytest 760 passed / 7 skipped (was 743 + 7). validate-docs 10/11 (1 pre-existing warning, 0 failures — the previously-flagged test-count warning resolved naturally as the suite grew above 743). Docs updated: `docs/cli/build.mdx` gains a "Permission prompts" section with the full truth table and the behavior-change note; `docs/cli/overview.mdx` adds the flag to the shared options table. Power-user UX note: for an unattended prod-001 overnight run, `zo continue --repo /path/to/prod-001 --gate-mode full-auto` now Just Works in tmux without any manual settings.local.json tinkering. **Session 030 hand-off (prior).** Single-file copy + styling tweak on the public website hero. `website/src/pages/index.html:84` byline extended from "by Samyakh (Sam) Tukra" to "by Samyakh (Sam) Tukra / with Callum Adamson" laid out as a two-line credit; Callum's name links to his LinkedIn (already present in the footer Contributors list at line 1053, now also surfaced in the opening eyebrow). Visual treatment in `website/public/styles.css:251` (new `.author-secondary` rule): `display: block` forces the secondary clause onto its own line below the main byline, `font-size: 0.9em` (renders at 9px against the eyebrow's 10px) + `opacity: 0.55` produce a subsidiary read while keeping the link discoverable (hover lifts opacity to 0.85, matching the existing `.author-link` hover affordance). The two-line layout was chosen over an inline same-line credit after preview testing — comma + "with" on one line wraps unpredictably at narrow viewports and can strand "with" at the end of line 1. Verified live in Astro 5 dev preview at 1280×720: Sam's name renders full coral oklch(0.58 0.16 35) at 10px, secondary line at 9px / opacity 0.55, both links underlined per `.author-link` rule. No code, tests, agents, commands, version, or model tiers touched — pure copy + styling, single PR scope. validate-docs 9/11 (2 pre-existing warnings, 0 failures — identical state to session 029). **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). diff --git a/src/zo/cli.py b/src/zo/cli.py index a269a79..c96eb19 100644 --- a/src/zo/cli.py +++ b/src/zo/cli.py @@ -263,7 +263,7 @@ def _gate_mode_from_str(value: str) -> GateMode: "max_iterations": 2, # cuts the dominant Phase-4 multiplier "stop_on_tier": "could_pass", # earlier stop on weakest acceptable tier "drop_research_scout": True, # skip cross-cutting literature review - "headlines_disabled": True, # disable Haiku ticker (~60 calls/hr) + "headlines_disabled": True, # skip end-of-session Haiku summary (~1 call/session) "gate_mode": "full-auto", # no human-loop overhead "compact_threshold": "60", # CLAUDE_AUTOCOMPACT_PCT_OVERRIDE } @@ -731,8 +731,11 @@ def _launch_and_monitor( extra_env: Extra environment variables passed to the Claude Code subprocess. The low-token preset uses this to set ``CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60``. - headlines_disabled: When True, skips the periodic Haiku headline - summaries. Set by ``--low-token`` and ``--no-headlines``. + headlines_disabled: When True, skips the end-of-session Haiku + bullet summary. Set by ``--low-token`` and ``--no-headlines``. + The previous per-60-second headline ticker has been removed + unconditionally; this flag now controls only the one-shot + session-end summary. bypass_permissions: When True, Claude Code tool-call permission prompts are auto-approved. Set by ``--bypass-permissions`` or implied by ``--gate-mode full-auto``. @@ -773,42 +776,12 @@ def _launch_and_monitor( console.print() _seen_events: set[str] = set() + # Buffer of recent agent events. Accumulated in _print_status and + # consumed once at session end by _generate_session_summary to print + # a 2-3 bullet Haiku-generated wrap-up. The previous per-60-second + # ticker that also consumed this buffer was removed — Haiku + # output is now strictly one call per session, not 60/hr. _headline_buffer: list[str] = [] - _last_headline_time: float = 0.0 - _headline_interval = 60 # seconds between Haiku summaries - - def _maybe_print_headline() -> None: - """Send buffered events to Haiku for a 1-line summary.""" - import time as _time - - nonlocal _last_headline_time - if headlines_disabled: - return - now = _time.monotonic() - if not _headline_buffer: - return - if now - _last_headline_time < _headline_interval: - return - - events_text = "\n".join(_headline_buffer[-15:]) - _headline_buffer.clear() - _last_headline_time = now - - try: - result = __import__("subprocess").run( - ["claude", "-p", "--model", "haiku", - f"Summarise these agent events in ONE short " - f"headline (max 80 chars). No preamble, just " - f"the headline:\n\n{events_text}"], - capture_output=True, text=True, timeout=15, - ) - headline = result.stdout.strip().split("\n")[0][:80] - if headline: - console.print( - f" [{_AMBER}]▸ {headline}[/]" - ) - except Exception: - pass # Non-critical — skip if Haiku unavailable def _print_status(team_status, pane_snapshot=""): # noqa: ANN001 from datetime import UTC, datetime @@ -897,7 +870,6 @@ def _print_status(team_status, pane_snapshot=""): # noqa: ANN001 if not tasks and not header_parts: console.print(f" [{_DIM}][{elapsed}] Waiting for agents...[/]") - _maybe_print_headline() console.print() process = wrapper.wait_for_completion( @@ -911,8 +883,10 @@ def _print_status(team_status, pane_snapshot=""): # noqa: ANN001 else: console.print(f"[red bold]Session ended with status:[/] {process.status}") - # Generate a Haiku summary of the session from buffered events. - # Low-token / --no-headlines opts out of this auxiliary call too. + # Generate a Haiku-summarised 2-3 bullet wrap-up from buffered + # events. This is the only Haiku call ZO makes during a run + # (the per-60-second ticker was removed). ``--low-token`` and + # ``--no-headlines`` skip even this one ~$0.0002 call. if _headline_buffer and not headlines_disabled: _generate_session_summary(_headline_buffer, team_name) @@ -952,7 +926,7 @@ def _print_status(team_status, pane_snapshot=""): # noqa: ANN001 ) @click.option( "--no-headlines", is_flag=True, - help="Disable the Haiku headline ticker (saves ~60 small calls/hour).", + help="Skip the end-of-session Haiku bullet summary (~1 call per run).", ) @click.option( "--bypass-permissions", is_flag=True, @@ -1150,7 +1124,7 @@ def build( ) @click.option( "--no-headlines", is_flag=True, - help="Disable the Haiku headline ticker.", + help="Skip the end-of-session Haiku bullet summary.", ) @click.option( "--bypass-permissions", is_flag=True,