diff --git a/docs/cli/build.mdx b/docs/cli/build.mdx index faf0f72..f78137b 100644 --- a/docs/cli/build.mdx +++ b/docs/cli/build.mdx @@ -79,6 +79,26 @@ You don't pass a flag for this, `zo build` figures it out. Toggle mid-session with `zo gates set -p `. +## Permission prompts: --bypass-permissions + +`--gate-mode` controls human involvement at **ZO's phase gates**. Independently, **Claude Code's tool-call permission prompts** can also be auto-approved with `--bypass-permissions`. The two are independent, but `--gate-mode full-auto` implicitly turns bypass on (no-human-on-gates plus must-click-every-tool is a contradiction). + +| Flags | Gates | Permission prompts | +|---|---|---| +| `--gate-mode supervised` (default) | human approves | human approves each | +| `--gate-mode supervised --bypass-permissions` | human approves gates | auto-approved (walk away) | +| `--gate-mode auto` | auto on success, pause on must-pass fail | human approves each | +| `--gate-mode auto --bypass-permissions` | auto on success, pause on fail | auto-approved | +| `--gate-mode full-auto` | all auto | auto-approved (implied) | +| `--gate-mode full-auto --bypass-permissions` | all auto | auto-approved (redundant) | + +Works in both **tmux** and **headless** modes: + +- **Headless** (`--no-tmux`): passes `--dangerously-skip-permissions` to the Claude Code CLI. +- **Tmux** (default): temporarily overlays `permissions.defaultMode: "bypassPermissions"` onto your project's `.claude/settings.local.json` for the duration of the run. The original file is backed up to `.claude/settings.local.json.zo-backup` and restored automatically when the run exits (via `atexit`). If ZO crashes mid-run, the next `zo` command auto-detects the stale backup and restores. + +**Behavior change note:** previously, `--no-tmux` runs unconditionally bypassed permissions. Now they only bypass when `--bypass-permissions` or `--gate-mode full-auto` is set. 
If you previously relied on `--no-tmux --gate-mode supervised` for prompt-free runs, add `--bypass-permissions` to keep that behavior. + ## Cross-machine: --repo When you've moved a project from one machine to another (Mac dev → GPU server), use `--repo` to point at the delivery repo containing the `.zo/` layout: @@ -119,6 +139,7 @@ While `zo build` is running, you have several windows into what the team is doin | `--lead-model` | `opus` (or `sonnet` if `--low-token`) | Override the lead orchestrator model: `opus` / `sonnet` / `haiku`. | | `--max-iterations N` | `10` (or `2` if `--low-token`) | Hard cap on Phase-4 experiment iterations. Wins over plan and preset. | | `--no-headlines` | (off) | Disable the Haiku headline ticker (saves ~60 small calls/hour). | +| `--bypass-permissions` | (off) | Auto-approve Claude Code tool-call prompts. Implied by `--gate-mode full-auto`. See [Permission prompts](#permission-prompts-bypass-permissions). | ## Examples diff --git a/docs/cli/overview.mdx b/docs/cli/overview.mdx index 28c1a31..debfa66 100644 --- a/docs/cli/overview.mdx +++ b/docs/cli/overview.mdx @@ -78,6 +78,7 @@ Most project-aware commands also accept: | `--lead-model` | Override the lead orchestrator model (`opus`/`sonnet`/`haiku`). Composes with `--low-token`. | | `--max-iterations N` | Hard cap on Phase-4 experiment iterations. Wins over plan-level `## Experiment Loop` and the low-token preset. | | `--no-headlines` | Disable the Haiku headline ticker. | +| `--bypass-permissions` | Auto-approve Claude Code tool-call permission prompts. Implied by `--gate-mode full-auto`. Works in both tmux and headless modes. See [build → Permission prompts](/cli/build#permission-prompts-bypass-permissions). 
| ## Modes of operation diff --git a/memory/zo-platform/DECISION_LOG.md b/memory/zo-platform/DECISION_LOG.md index b284b40..e487925 100644 --- a/memory/zo-platform/DECISION_LOG.md +++ b/memory/zo-platform/DECISION_LOG.md @@ -1055,3 +1055,30 @@ Scope (single PR to `main`): - **Equal-billing treatment** ("by Sam & Callum") — rejected per user direction; Sam is the lead/creator and the visual hierarchy needs to reflect that without making it cringe-y to read. **Outcome:** Single conventional commit on branch `claude/website-byline-callum`, PR opened against `main`. validate-docs 9/11 (2 pre-existing warnings, 0 failures — identical state to session 029, unchanged by this PR). No code, tests, agents, commands, version, or model tiers touched. **No new PRIOR added** — this is a copy/styling decision, not a failure-driven self-evolution; design rationale captured in this DECISION_LOG entry is the auditable record. + +--- + +## Decision: 2026-05-28T18:00:00Z +**Type:** FEATURE + BEHAVIOR-CHANGE +**Title:** Add `--bypass-permissions` flag; decouple Claude permission bypass from `--no-tmux` mode + +**Decision:** Introduce an explicit `--bypass-permissions` CLI flag on `zo build` and `zo continue` for auto-approving Claude Code's tool-call permission prompts. Effective bypass resolves as `cli_bypass OR gate_mode == "full-auto"` — the latter implicit because no-human-on-gates plus must-click-every-tool is a self-contradicting UX. The flag works identically in tmux and headless modes via two different mechanisms (tmux: settings-file overlay; headless: existing CLI flag, now conditional). Also: previously `--no-tmux` unconditionally bypassed permissions (a baked-in `--dangerously-skip-permissions` at `wrapper.py:376`); that is now conditional on the resolved bypass setting, restoring symmetry between tmux and headless behavior. 
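As a rough illustration of the resolution rule stated in the decision above, the sketch below restates `cli_bypass OR gate_mode == "full-auto"` and enumerates all six flag combinations. The names `resolve_bypass`, `GATE_MODES`, and `truth_table` are assumptions made for this example; they are not the identifiers used in `src/zo/cli.py`.

```python
from itertools import product

# Hypothetical stand-ins for illustration only; not the names in src/zo/cli.py.
GATE_MODES = ("supervised", "auto", "full-auto")


def resolve_bypass(*, cli_bypass: bool, gate_mode: str) -> bool:
    """Return True when tool-call permission prompts should be auto-approved."""
    # The explicit flag always wins; full-auto is the only implicit trigger.
    return cli_bypass or gate_mode == "full-auto"


def truth_table() -> dict[tuple[str, bool], bool]:
    """Map every (gate_mode, cli_bypass) pair to its resolved bypass value."""
    return {
        (mode, flag): resolve_bypass(cli_bypass=flag, gate_mode=mode)
        for mode, flag in product(GATE_MODES, (False, True))
    }


if __name__ == "__main__":
    for (mode, flag), bypass in truth_table().items():
        print(f"gate_mode={mode:<10} cli_bypass={flag!s:<5} -> bypass={bypass}")
```

Enumerating the full product, rather than spot-checking a few rows, is what makes a regression in the single implicit coupling visible.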
+ +Scope: +- **`src/zo/permissions_overlay.py`** (new, ~140 LOC): `apply_bypass_overlay(claude_dir)` writes the overlay (with safe backup) and returns a `restore_fn` callable; `cleanup_stale_overlay(claude_dir)` detects + restores from a crashed previous run (sentinel-marker pattern for the "no-original-file" case). +- **`src/zo/wrapper.py`**: `launch_lead_session` / `_launch_tmux` / `_launch_headless` each gain `bypass_permissions: bool = False` param. Tmux path calls `apply_bypass_overlay` + `atexit.register(restore_fn)` when bypass is True. Headless path appends `--dangerously-skip-permissions` to the Claude CLI invocation only when bypass is True. +- **`src/zo/cli.py`**: new helper `_resolve_bypass_permissions(*, cli_bypass, gate_mode)` (visible truth table in docstring); new `--bypass-permissions` flag on `build` and `continue`; `_launch_and_monitor` accepts and threads `bypass_permissions` through to `launch_lead_session`; calls `cleanup_stale_overlay(zo_root / ".claude")` at every invocation to recover from prior crashes. +- **Tests:** `tests/unit/test_permissions_overlay.py` (12 cases) covers existing/no/malformed settings, stale-cleanup paths, and the defensive non-dict-permissions case. `tests/unit/test_wrapper.py` gains 3 new cases (headless conditional flag + tmux overlay apply/skip). `tests/unit/test_cli.py` gains the resolver truth-table case. **+17 tests total**; pytest 760 passed (was 743) + 7 skipped, validate-docs 10/11 (0 failures, 1 pre-existing warning). +- **Docs:** `docs/cli/build.mdx` gains a "Permission prompts: --bypass-permissions" section with the full truth table + behavior-change note. `docs/cli/overview.mdx` adds the flag to the shared options table. + +**Rationale:** Two independent concerns were previously coupled in confusing ways. `--gate-mode` controls ZO-level human checkpoints (Phase 2, Phase 4 reviews); Claude Code's permission prompts are a separate layer about *tool-call* approval (each Bash/Edit/Read/Write). 
The pre-PR behavior bypassed Claude permissions whenever `--no-tmux` was set — which is wrong because (a) `--no-tmux` is a visibility-mode choice, not a safety-mode choice, and (b) users running `zo continue --no-tmux --gate-mode supervised` (e.g., on a CI box but wanting human review of gates via Slack) had no way to keep prompts on. The new design makes the user's intent the source of truth via an explicit, named flag, with the only implicit behavior being a contradiction-avoiding default (`full-auto` ⇒ bypass on). + +The tmux mechanism (settings.local.json overlay) is novel for ZO and the risk surface is the restore-on-exit logic. Three layers of safety: (1) `atexit` handler fires on normal exit and Python-level exceptions; (2) sibling `.zo-backup` file is left on disk if the process dies before atexit (crash, kill -9); (3) `cleanup_stale_overlay()` runs at every subsequent `_launch_and_monitor` invocation, detects the orphan backup, and restores. Combined coverage handles every termination path short of filesystem corruption. + +**Alternatives considered:** +- **Single `--full-auto` shorthand** — rejected. Two independent concerns deserve two flags; combining them removes the supervised+bypass-permissions case the user explicitly wanted ("walk away from the terminal but still review gates"). +- **Always inject `--dangerously-skip-permissions` (preserve old `--no-tmux` behavior)** — rejected. The asymmetry between tmux (prompts) and headless (no prompts) is itself the bug. Symmetric behavior with explicit opt-in is the cleaner contract. +- **Write the overlay to a temp file and point Claude Code at it via env var** — would be cleaner (no mutation of user files) but no Claude Code env var exists for this; settings.local.json mutation with safe-restore is the available path. 
+- **Single-flag shorthand `--unattended` aliasing both `--gate-mode full-auto` and `--bypass-permissions`** — defer for future polish; the current explicit form is more learnable and the two-flag combo is short enough. + +**Outcome:** Single feature commit on branch `claude/bypass-permissions-flag`, PR opened against `main`. validate-docs 10/11 (improved from baseline as the test-count warning resolved). 760 pytest pass / 7 skipped. **No new PRIOR added** — this is a clean feature with no failure trace; the design choices are auditable here. Power-user usage: `zo continue --repo /path/to/prod-001 --gate-mode full-auto` for an unattended overnight tmux run with full agent-team visibility AND no permission prompts. diff --git a/memory/zo-platform/PRIORS.md b/memory/zo-platform/PRIORS.md index 871b958..c381d5f 100644 --- a/memory/zo-platform/PRIORS.md +++ b/memory/zo-platform/PRIORS.md @@ -1150,3 +1150,103 @@ Test count 738 → 743 + 7 skipped. ruff `src/zo/orchestrator.py tests/unit/test The fix would have caught the original failure: with ACTIVE handled, the user's `zo continue` on prod-001 with `phase_3: active` would have resumed cleanly into the Phase 3 main wave kickoff. With ACTIVE handled, any session interrupted during the autonomous experiment loop's CONTINUE iteration also recovers correctly — the more important durable benefit. **Cross-reference:** PR-036 (parse-time status validation, same investigation, same domain). Both ship in PR #64. + +--- + +## PR-038: A CLI Flag Should Map to One Concern — Coupling Visibility-Mode to Safety-Mode Silently Bypasses User Expectations + +**Failure ref:** Before PR #92, `zo build --no-tmux` (a visibility flag — "don't render the lead in a tmux pane") unconditionally appended `--dangerously-skip-permissions` to the Claude CLI invocation at `wrapper.py:376`. The original rationale was reasonable in isolation: headless mode is automation-ish, so prompts don't make sense. 
But the coupling produced a silent UX bug: `zo build --no-tmux --gate-mode supervised` — a perfectly legitimate combination for someone supervising gates via Slack/email while ZO runs on a CI box — bypassed every tool-call permission prompt without the user realising it. + +The bug was undetected by tests because the failing combination's expected behaviour was never asserted. The test at the time read `assert "--dangerously-skip-permissions" in cmd` under `use_tmux=False`, locking in the bug rather than catching it. + +The deeper failure mode: **two independent concerns sharing one toggle**. `--no-tmux` answers *"where does output go?"*. Permission bypass answers *"what is the agent allowed to do without asking?"*. Coupling them meant a user who only intended to change rendering also (silently) changed safety posture. + +The user discovered it organically when pushing back on my recommendation: *"by doing --gate-mode supervised I want gates to be human passes ... but all permissions to be asked to me for approval / denial (like normal)"*. The supervised + prompts combination should be possible. It wasn't. + +### Rules + +1. **A CLI flag controls one semantic concern; never two.** When you find yourself writing "well, since they passed X they probably also want Y," stop and add Y as its own flag. The user can always combine; they cannot decouple. + - **Why:** Silent side-effects are non-debuggable. A user who reads `--no-tmux` in the help text reasonably believes it only changes output rendering. If it also changes safety mode, that's hidden behaviour with high blast radius. + - **How to apply:** When adding a flag that does N things, check whether each thing is the same concept. "Where output renders" and "what safety prompts fire" are not the same concept. Split them. 
Provide a shorthand later if combos are common (e.g., `--gate-mode full-auto` implicitly enables bypass because the supervised-gates-without-tool-prompts variant is genuinely incoherent — but that *one* implication is explicit and documented in the resolver function's docstring). + +2. **Safe-by-default is the only sane default for safety modes.** If a flag controls whether a destructive operation needs approval, the default must be "require approval." Skipping prompts is something the user opts into, never something they inherit by accident. + - **Why:** "Off by default, on by explicit user request" is the only configuration that scales across user contexts. The previous design meant headless users on a remote server (often the people most in need of audit trails) got the least safety. + - **How to apply:** Every safety-bypass flag must default to False. Any implicit coupling to another flag must be one-directional and named explicitly in the resolver (e.g., `full-auto → bypass=True` is a contradiction-avoiding coupling; the inverse `supervised → bypass=False` is genuinely the default, not a side effect). + +3. **Behaviour coupling needs a contract test that fails when the coupling is wrong.** Before this PR, there was no test asserting `--no-tmux --gate-mode supervised` should preserve the safety prompts. The test suite tolerated a contract violation because the contract wasn't expressed as a test. The new resolver truth-table test (`test_resolve_bypass_permissions_truth_table` in `tests/unit/test_cli.py`) covers all six (gate_mode × cli_bypass) combinations and would catch any future regression where someone re-introduces the coupling. + - **Why:** Coupling regressions are common (a future contributor sees the lazy-default code and thinks "I'll just OR these together for convenience"). A truth-table test makes the right behaviour the *only* behaviour that compiles to green. 
+ - **How to apply:** Any resolver function that combines multiple input flags into a single behavioural output must have a test that enumerates every input combination. Six lines of test for six possible flag combinations is cheap insurance against the next contributor accidentally re-introducing a silent coupling. + +### Verified Solution + +`src/zo/cli.py` adds `_resolve_bypass_permissions(*, cli_bypass: bool, gate_mode: str) -> bool` returning `cli_bypass OR gate_mode == "full-auto"`. The single allowed implicit coupling (`full-auto ⇒ bypass`) is justified inline in the function's docstring: no-human-on-gates with must-click-every-tool is itself a UX contradiction; the coupling avoids that and only that. + +`src/zo/wrapper.py`'s `_launch_headless` now appends `--dangerously-skip-permissions` only when `bypass_permissions=True` (parameter threaded through `launch_lead_session` from `cli.py`). The tmux path gains the parallel mechanism via `src/zo/permissions_overlay.py`: writes `permissions.defaultMode: "bypassPermissions"` into `/.claude/settings.local.json` for the run, backs up the original to `.zo-backup`, restores via `atexit` on exit. `cleanup_stale_overlay()` runs at every `_launch_and_monitor` invocation to recover from a crashed previous run via the sibling-backup pattern. + +`tests/unit/test_cli.py:TestLowTokenFlags::test_resolve_bypass_permissions_truth_table` locks all six combinations as explicit assertions. `tests/unit/test_wrapper.py` gains three cases proving `--dangerously-skip-permissions` is present iff bypass=True (covering bypass=True, bypass=False, and default-arg). `tests/unit/test_permissions_overlay.py` (new, 12 cases) covers existing/no/malformed settings, idempotent restore, and the full stale-cleanup matrix including the no-original sentinel path. + +Test count 743 → 760 + 7 skipped. validate-docs 10/11 (1 pre-existing warning, 0 failures). 
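The backup-and-restore pattern described for the tmux path can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `src/zo/permissions_overlay.py`: the function name `apply_overlay_sketch` is hypothetical, malformed JSON is not handled, and the sentinel marker for the no-original-file case is replaced by simply deleting the file on restore.

```python
import json
from pathlib import Path
from typing import Callable

SETTINGS_NAME = "settings.local.json"
BACKUP_NAME = SETTINGS_NAME + ".zo-backup"  # sibling backup named in the text


def apply_overlay_sketch(claude_dir: Path) -> Callable[[], None]:
    """Write the bypass overlay and return an idempotent restore callable.

    Assumption: the settings file is either absent or valid JSON.
    """
    settings = claude_dir / SETTINGS_NAME
    backup = claude_dir / BACKUP_NAME
    claude_dir.mkdir(parents=True, exist_ok=True)

    had_original = settings.exists()
    data: object = {}
    if had_original:
        original_text = settings.read_text()
        # Back up before mutating so a crash leaves a recoverable copy.
        backup.write_text(original_text)
        data = json.loads(original_text)
    if not isinstance(data, dict):
        data = {}

    permissions = data.get("permissions")
    if not isinstance(permissions, dict):
        permissions = {}  # defensive: ignore a non-dict "permissions" value
    data["permissions"] = {**permissions, "defaultMode": "bypassPermissions"}
    settings.write_text(json.dumps(data, indent=2))

    restored = False

    def restore() -> None:
        nonlocal restored
        if restored:
            return  # safe to call more than once
        if had_original:
            backup.replace(settings)  # move the backup over the overlay
        else:
            settings.unlink(missing_ok=True)
        restored = True

    return restore
```

A caller could register the returned callable with `atexit.register(restore)`, which matches the restore-on-exit behaviour described above; recovery after a hard kill would still depend on a separate stale-backup check at the next launch.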
+ +**Secondary lesson (testability):** During implementation, my first cut used a lazy `import atexit` inside `_launch_tmux`. The mock-based test `@mock.patch("zo.wrapper.atexit.register")` failed with `AttributeError: module 'zo.wrapper' has no attribute 'atexit'` because the lazy import never made `atexit` a module-level attribute that `mock.patch` could find. Moving the import to module level fixed both tests. Generic Python testing lesson worth a footnote: any module attribute you intend to mock from outside must be imported at module load time, not lazily inside a function. + +**Cross-reference:** Adjacent in design philosophy to PR-035 (aspirational vs. hard contracts) — same family of failures where a value the producer writes isn't honoured the way the consumer interprets it. Ships in PR #92. + +--- + +## PR-039: Pre-Push Verification Must Mirror the Full CI Matrix, Not a Subset + +**Failure ref:** PR #92's first push had `pytest -q` and `validate-docs.sh` passing locally but failed CI immediately on the `ruff check src/` step (8 errors in newly-added code). A second push fixed the src/ violations but introduced 5 more in the new test files. A third push was needed to close everything out. The user observed: *"This is unacceptable. You should have caught this the first time."* + +The root cause was a partial pre-push checklist. The local protocol I had been following was: + +1. `uv run pytest -q` +2. `bash scripts/validate-docs.sh` + +The actual CI surface (`.github/workflows/ci.yml`) is: + +1. `uv sync --extra dev` +2. `uv run ruff check src/` ← **missed** +3. `uv run pytest -q` (matrix: Python 3.11 AND Python 3.12) ← **only ran default Python** +4. `./scripts/validate-docs.sh` + +Three gaps in the local verification: no ruff at all, no explicit Python 3.11 run, and no scan of the workflow files at the start of the task to discover what gates exist. The consequence was a CI red, a corrective push, and the user re-prompting with frustration. 
+ +### Rules + +1. **Before claiming a PR ready, read `.github/workflows/*.yml` and execute every step locally.** Workflow files are the contract; the local protocol must mirror them. Treating any locally-known check (pytest, validate-docs) as a proxy for "all CI checks" is a category error — CI has authoritative scope that local memory does not. + - **Why:** The CI surface evolves. New checks get added without coordination ("we now lint with mypy too"). A stale mental model of "the usual checks" silently loses coverage. The workflow YAML is the source of truth and reads in 30 seconds. + - **How to apply:** First action in any code-change task: `ls .github/workflows/*.yml && cat ...` to enumerate every gate. Build the local check list from the YAML, not from memory. Run every step before pushing. If running a step is genuinely impractical locally (cloud-only steps, secrets-dependent jobs), name them explicitly in the PR body as "verified by CI only, not locally." + +2. **Run language/runtime matrices fully, not just the default.** When the CI matrix lists Python 3.11 + 3.12, both must be verified locally (`uv run --python 3.11 pytest -q` and `uv run --python 3.12 pytest -q`). Default-Python-only verification leaves the other matrix entry as a coin flip. + - **Why:** Cross-version bugs are common and subtle (f-string syntax, walrus operator, generic syntax, stdlib API changes). The matrix exists because the project's compatibility floor is broader than the developer's daily environment. Trusting one cell of the matrix to represent all is exactly the assumption matrix testing is designed to break. + - **How to apply:** `uv python list` to discover what's installed; `uv python install ` if a matrix entry isn't local yet; explicit `--python X.Y` invocation per pytest run; one combined command at the end of the pre-push checklist that runs both. Cheap in wall-clock (seconds per Python version on a small suite) compared to the cost of a failed CI run + corrective push. 
+ +3. **Lint scope in CI may differ from optimal lint scope locally. Lint your own additions everywhere, not just where CI gates.** This project's CI runs `ruff check src/` (production code only). My new test files weren't gated by CI, but they had 5 ruff violations, which is sloppy regardless of whether CI catches them. The convention "CI gates production; tests are advisory" is fine as a project policy but a poor one for the author to adopt for their own code — at minimum the author should ensure their additions pass on every scope ruff is configured to consider. + - **Why:** Test lint debt accumulates silently (this project had 49 pre-existing ruff errors in `tests/` when I checked). Each contributor leaving their own additions lint-dirty grows the debt. Linting your additions broadly is a one-command discipline; cleaning up someone else's mess after the fact is harder. + - **How to apply:** After CI-required checks pass, also run `ruff check tests/unit/.py` and clean those. If pre-existing violations in other files surface, leave them alone (out of scope) but record an observation so the project owner can decide whether to schedule a cleanup. + +### Verified Solution + +Local pre-push checklist now codified as the complete CI mirror plus a "your own additions" sweep: + +```bash +# 1. Sync (matches `uv sync --extra dev` in CI) +uv sync --extra dev + +# 2. Lint src/ (matches `uv run ruff check src/` in CI) +uv run ruff check src/ + +# 3. Bonus: lint my own test-file additions (not CI-gated, but mine) +uv run ruff check tests/unit/.py + +# 4. pytest on both matrix Python versions (matches CI matrix) +uv run --python 3.11 pytest -q +uv run --python 3.12 pytest -q + +# 5. 
Doc validation (matches `./scripts/validate-docs.sh` in validate-docs.yml) +bash scripts/validate-docs.sh +``` + +For PR #92 the corrective sequence was: identify the actual CI gates by reading `.github/workflows/ci.yml`, fix 8 ruff errors in `src/zo/permissions_overlay.py` and `src/zo/wrapper.py`, fix 5 more in `tests/unit/test_permissions_overlay.py`, then run the full checklist locally and watch CI green via `gh pr checks 92 --watch`. All three pushes verified end-to-end after the third corrective commit. + +**Cross-reference:** Process-discipline prior, not a code-design prior. Pairs with PR-005 ("enforcement > aspiration") applied to the developer's own pre-push verification: aspirational adherence to "I usually run the checks" produced a real failure; the checklist must be enforced by execution, not by intention. diff --git a/memory/zo-platform/STATE.md b/memory/zo-platform/STATE.md index ecca567..7364b29 100644 --- a/memory/zo-platform/STATE.md +++ b/memory/zo-platform/STATE.md @@ -8,7 +8,9 @@ status: complete ## Current Position -**Session 030 hand-off — pick up here.** Single-file copy + styling tweak on the public website hero. `website/src/pages/index.html:84` byline extended from "by Samyakh (Sam) Tukra" to "by Samyakh (Sam) Tukra / with Callum Adamson" laid out as a two-line credit; Callum's name links to his LinkedIn (already present in the footer Contributors list at line 1053, now also surfaced in the opening eyebrow). Visual treatment in `website/public/styles.css:251` (new `.author-secondary` rule): `display: block` forces the secondary clause onto its own line below the main byline, `font-size: 0.9em` (renders at 9px against the eyebrow's 10px) + `opacity: 0.55` produce a subsidiary read while keeping the link discoverable (hover lifts opacity to 0.85, matching the existing `.author-link` hover affordance). 
The two-line layout was chosen over an inline same-line credit after preview testing — comma + "with" on one line wraps unpredictably at narrow viewports and can strand "with" at the end of line 1. Verified live in Astro 5 dev preview at 1280×720: Sam's name renders full coral oklch(0.58 0.16 35) at 10px, secondary line at 9px / opacity 0.55, both links underlined per `.author-link` rule. No code, tests, agents, commands, version, or model tiers touched — pure copy + styling, single PR scope. validate-docs 9/11 (2 pre-existing warnings, 0 failures — identical state to session 029). **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). +**Session 031 hand-off — pick up here.** New CLI flag `--bypass-permissions` added to `zo build` / `zo continue` to give users an explicit opt-in for auto-approving Claude Code tool-call permission prompts. Design lesson captured in **PRIORS PR-038** (a CLI flag should map to one concern; coupling visibility-mode to safety-mode silently bypasses user expectations). Behavior change worth flagging: previously `--no-tmux` ALWAYS injected `--dangerously-skip-permissions` into the Claude CLI command (`src/zo/wrapper.py:376` unconditional); now it's conditional on the resolved bypass setting. Truth table (resolver: `_resolve_bypass_permissions` at `src/zo/cli.py:301`): `--gate-mode supervised|auto` → bypass off unless explicit flag; `--gate-mode full-auto` → bypass implicitly on (no-human-on-gates + must-click-every-tool was a contradiction). 
**Tmux mode now also supports bypass** via a new `src/zo/permissions_overlay.py` module: writes `permissions.defaultMode: "bypassPermissions"` into `/.claude/settings.local.json` on launch, backs up the original to a sibling `.zo-backup` file, restores via `atexit` on exit; `cleanup_stale_overlay()` called at every `_launch_and_monitor` invocation handles crashed-mid-run recovery from a leftover backup. Settings.local.json was also added to `.gitignore` in PR #91 (already merged) so the per-user file doesn't accumulate transient secrets in tracked state. **Tests added: +17** (`tests/unit/test_permissions_overlay.py` 12 cases covering existing-settings / no-settings / malformed / stale-cleanup / non-dict-permissions; `tests/unit/test_wrapper.py` 3 new cases for headless conditional flag + tmux overlay apply/skip; `tests/unit/test_cli.py` 1 case for the resolver truth table). pytest 760 passed / 7 skipped (was 743 + 7). validate-docs 10/11 (1 pre-existing warning, 0 failures — the previously-flagged test-count warning resolved naturally as the suite grew above 743). Docs updated: `docs/cli/build.mdx` gains a "Permission prompts" section with the full truth table and the behavior-change note; `docs/cli/overview.mdx` adds the flag to the shared options table. **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). Power-user UX note: for an unattended prod-001 overnight run, `zo continue --repo /path/to/prod-001 --gate-mode full-auto` now Just Works in tmux without any manual settings.local.json tinkering. + +**Session 030 hand-off (prior).** Single-file copy + styling tweak on the public website hero. 
`website/src/pages/index.html:84` byline extended from "by Samyakh (Sam) Tukra" to "by Samyakh (Sam) Tukra / with Callum Adamson" laid out as a two-line credit; Callum's name links to his LinkedIn (already present in the footer Contributors list at line 1053, now also surfaced in the opening eyebrow). Visual treatment in `website/public/styles.css:251` (new `.author-secondary` rule): `display: block` forces the secondary clause onto its own line below the main byline, `font-size: 0.9em` (renders at 9px against the eyebrow's 10px) + `opacity: 0.55` produce a subsidiary read while keeping the link discoverable (hover lifts opacity to 0.85, matching the existing `.author-link` hover affordance). The two-line layout was chosen over an inline same-line credit after preview testing — comma + "with" on one line wraps unpredictably at narrow viewports and can strand "with" at the end of line 1. Verified live in Astro 5 dev preview at 1280×720: Sam's name renders full coral oklch(0.58 0.16 35) at 10px, secondary line at 9px / opacity 0.55, both links underlined per `.author-link` rule. No code, tests, agents, commands, version, or model tiers touched — pure copy + styling, single PR scope. validate-docs 9/11 (2 pre-existing warnings, 0 failures — identical state to session 029). **Next action when picking up:** unchanged — monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). **Session 029 hand-off (prior).** Launch-prep cascade (no code, copy + metadata only). 
Three external-review fixes shipped: (1) stale 70-80% token claims in `docs/quickstart.mdx:116`, `docs/cli/build.mdx:155`, `docs/COMMANDS.md:28` replaced with measured ~30% + 50-60% targeted + 70-80% roadmap framing (matches the cost-benchmark page already shipped in session 025); (2) stale test counts updated — `docs/installation.mdx:130` 675→743, `README.md` tests badge + line 508 735→743 (verified via `pytest -q`: 743 passed, 7 skipped); (3) website origin story at `website/src/pages/index.html:951` stripped of industry detail ("high-stakes client / power plant" → "eight-week production ML project") to remove confidentiality risk before broad launch push; (4) footer heading at `website/src/pages/index.html:1046` "The people behind it." → "Contributors." per user direction; (5) §03 headline at `website/src/pages/index.html:281` "Same six phases. *Now in days.*" → "Weeks of work. *Now in days.*" — the prior phrasing forward-referenced "six phases" four sections before §07 introduces them, and the §03 lede only enumerates five items, so the framing was both unearned and self-inconsistent. **GitHub repo metadata applied** via `gh repo edit`: description set, homepage `https://zerooperators.com`, 9 topics (ai-agents, claude-code, machine-learning, mlops, autonomous-agents, agent-orchestration, research-engineering, pytorch, devtools), Discussions enabled. **3 seed discussion threads posted** by SamPlvs: #80 "Show us your use case" (Show and tell), #81 "Roadmap: VS Code extension vs cost work" (Ideas, soliciting Tier 2 sequencing input), #82 "Help wanted: try the MNIST/CIFAR demos" (General). validate-docs 9/11 (2 pre-existing warnings, 0 failures). File changes shipped via PR (this session). Memory + DECISION_LOG updated per auto-protocol. **Next action when picking up:** monitor Discussions for early external-user signal on Tier 2 sequencing (extension vs cost), then resume the Tier 1 recommendation from session 028 (caveman ablation → onboarding hardening). 
diff --git a/src/zo/cli.py b/src/zo/cli.py
index 493f20f..a269a79 100644
--- a/src/zo/cli.py
+++ b/src/zo/cli.py
@@ -298,6 +298,35 @@ def _resolve_gate_mode(
     return "supervised"
 
 
+def _resolve_bypass_permissions(
+    *,
+    cli_bypass: bool,
+    gate_mode: str,
+) -> bool:
+    """Resolve whether Claude Code permission prompts should be bypassed.
+
+    Truth table:
+
+    +------------------------------------+--------+
+    | flags                              | bypass |
+    +====================================+========+
+    | --gate-mode supervised (default)   | False  |
+    | --gate-mode supervised --bypass-p* | True   |
+    | --gate-mode auto                   | False  |
+    | --gate-mode auto --bypass-p*       | True   |
+    | --gate-mode full-auto              | True   |
+    | --gate-mode full-auto --bypass-p*  | True   |
+    +------------------------------------+--------+
+
+    The flag is its own opt-in; ``--gate-mode full-auto`` implicitly
+    enables it because "no human in the loop for gates" + "human must
+    click every tool prompt" is a contradiction.
+    """
+    if cli_bypass:
+        return True
+    return gate_mode == "full-auto"
+
+
 def _show_banner(
     project: str = "",
     mode: str = "",
@@ -694,6 +723,7 @@ def _launch_and_monitor(
     add_dirs: list[str] | None = None,
     extra_env: dict[str, str] | None = None,
     headlines_disabled: bool = False,
+    bypass_permissions: bool = False,
 ) -> None:
     """Shared launch → monitor → end-session flow for build and draft.
 
@@ -703,7 +733,20 @@ def _launch_and_monitor(
             ``CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60``.
         headlines_disabled: When True, skips the periodic Haiku headline
             summaries. Set by ``--low-token`` and ``--no-headlines``.
+        bypass_permissions: When True, Claude Code tool-call permission
+            prompts are auto-approved. Set by ``--bypass-permissions``
+            or implied by ``--gate-mode full-auto``.
     """
+    # Clean up any stale settings.local.json overlay left by a crashed
+    # previous run before launching this one.
+    from zo.permissions_overlay import cleanup_stale_overlay
+    cleaned = cleanup_stale_overlay(zo_root / ".claude")
+    if cleaned:
+        console.print(
+            f"[{_DIM}]Restored .claude/settings.local.json from a previous "
+            f"interrupted run.[/]"
+        )
+
     use_tmux = not no_tmux
     console.print(f"\n[{_AMBER}]Launching lead session:[/] team={team_name}")
     process = wrapper.launch_lead_session(
@@ -711,6 +754,7 @@ def _launch_and_monitor(
         model=model, max_turns=max_turns, use_tmux=use_tmux,
         add_dirs=add_dirs or [],
         extra_env=extra_env or {},
+        bypass_permissions=bypass_permissions,
     )
 
     if process.tmux_pane_id:
@@ -910,6 +954,11 @@ def _print_status(team_status, pane_snapshot=""):  # noqa: ANN001
     "--no-headlines", is_flag=True,
     help="Disable the Haiku headline ticker (saves ~60 small calls/hour).",
 )
+@click.option(
+    "--bypass-permissions", is_flag=True,
+    help="Auto-approve Claude Code tool-call permission prompts. Useful when "
+         "you want to walk away from the terminal. Implied by --gate-mode full-auto.",
+)
 def build(
     plan_path: Path,
     gate_mode: str | None,
@@ -918,6 +967,7 @@ def build(
     lead_model: str | None,
     max_iterations: int | None,
     no_headlines: bool,
+    bypass_permissions: bool,
 ) -> None:
     """Launch a project from a plan.md file.
 
@@ -963,6 +1013,9 @@ def build(
     effective_headlines_disabled = (
         no_headlines or effective_low_token
     )
+    effective_bypass_permissions = _resolve_bypass_permissions(
+        cli_bypass=bypass_permissions, gate_mode=effective_gate_mode,
+    )
     extra_env: dict[str, str] = {}
     if effective_low_token:
         extra_env["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = str(
@@ -1063,6 +1116,7 @@ def build(
         add_dirs=[str(delivery_path)],
         extra_env=extra_env,
         headlines_disabled=effective_headlines_disabled,
+        bypass_permissions=effective_bypass_permissions,
     )
 
 
@@ -1098,6 +1152,11 @@ def build(
     "--no-headlines", is_flag=True,
     help="Disable the Haiku headline ticker.",
 )
+@click.option(
+    "--bypass-permissions", is_flag=True,
+    help="Auto-approve Claude Code tool-call permission prompts. "
+         "Implied by --gate-mode full-auto.",
+)
 def continue_(
     project_name: str | None,
     repo: str | None,
@@ -1107,6 +1166,7 @@ def continue_(
     lead_model: str | None,
     max_iterations: int | None,
     no_headlines: bool,
+    bypass_permissions: bool,
 ) -> None:
     """Resume a paused project or reconnect on a new machine.
 
@@ -1172,6 +1232,7 @@ def continue_(
         lead_model=lead_model,
         max_iterations=max_iterations,
         no_headlines=no_headlines,
+        bypass_permissions=bypass_permissions,
     )
 
 
diff --git a/src/zo/permissions_overlay.py b/src/zo/permissions_overlay.py
new file mode 100644
index 0000000..d035e3e
--- /dev/null
+++ b/src/zo/permissions_overlay.py
@@ -0,0 +1,149 @@
+"""Temporarily overlay ``.claude/settings.local.json`` with bypassPermissions mode.
+
+Used when the user passes ``--bypass-permissions`` (explicitly or implied
+by ``--gate-mode full-auto``). In tmux mode the Claude Code CLI flag
+``--dangerously-skip-permissions`` can't be used (Claude Code exits
+immediately in interactive mode), so the equivalent effect is achieved
+via the settings-file mechanism: write
+``permissions.defaultMode: "bypassPermissions"`` into the project's
+``.claude/settings.local.json`` for the duration of the run.
+
+The overlay is restored on three paths:
+
+1. Normal exit — via the ``restore_fn`` returned to the caller, who
+   wires it to ``atexit.register`` and SIGINT/SIGTERM handlers.
+2. Crash before restore — a sibling backup file
+   (``settings.local.json.zo-backup``) is left on disk. The next
+   ``zo`` invocation that calls :func:`cleanup_stale_overlay` will
+   detect it and restore.
+3. Manual recovery — the backup file is human-readable; a user can
+   move it back themselves if both auto-paths fail.
+
+Designed to be **idempotent and crash-resilient**: never mutates the
+user's settings file without leaving a recoverable backup, and the
+restore step is safe to call multiple times.
+""" +from __future__ import annotations + +import contextlib +import json +from typing import TYPE_CHECKING + +if TYPE_CHECKING: + from collections.abc import Callable + from pathlib import Path + +__all__ = ["apply_bypass_overlay", "cleanup_stale_overlay"] + +_BACKUP_SUFFIX = ".zo-backup" +_NO_ORIGINAL_MARKER = "__ZO_NO_ORIGINAL_FILE__" + + +def _settings_path(claude_dir: Path) -> Path: + return claude_dir / "settings.local.json" + + +def _backup_path(claude_dir: Path) -> Path: + return claude_dir / f"settings.local.json{_BACKUP_SUFFIX}" + + +def apply_bypass_overlay(claude_dir: Path) -> Callable[[], None]: + """Write a bypass-permissions overlay; return a restore callable. + + Backs up the original ``settings.local.json`` (if any) to a sibling + ``.zo-backup`` file, then writes a merged version that contains the + user's existing settings plus ``permissions.defaultMode: + "bypassPermissions"``. + + The returned callable restores the original file (or removes the + one we created if there was no original) and deletes the backup. + Safe to call multiple times — subsequent calls are no-ops. + + Args: + claude_dir: The ``.claude/`` directory whose ``settings.local.json`` + should be overlaid. Created if missing. + + Returns: + A zero-argument callable that restores the file. + """ + claude_dir.mkdir(parents=True, exist_ok=True) + settings_file = _settings_path(claude_dir) + backup_file = _backup_path(claude_dir) + + # 1. Capture original (or mark its absence). + if settings_file.exists(): + original_content = settings_file.read_text(encoding="utf-8") + backup_file.write_text(original_content, encoding="utf-8") + try: + existing = json.loads(original_content) + if not isinstance(existing, dict): + existing = {} + except json.JSONDecodeError: + existing = {} + else: + original_content = None + # Leave a sentinel so cleanup_stale_overlay knows to delete, + # not restore, in a crash-recovery scenario. 
+ backup_file.write_text(_NO_ORIGINAL_MARKER, encoding="utf-8") + existing = {} + + # 2. Merge defaultMode into permissions (preserve allow/deny/etc.). + permissions = existing.get("permissions") or {} + if not isinstance(permissions, dict): + permissions = {} + permissions["defaultMode"] = "bypassPermissions" + existing["permissions"] = permissions + + # 3. Write overlay. + settings_file.write_text(json.dumps(existing, indent=2) + "\n", encoding="utf-8") + + # 4. Build restore callable. + restored = False + + def restore() -> None: + nonlocal restored + if restored: + return + restored = True + if original_content is not None: + settings_file.write_text(original_content, encoding="utf-8") + else: + with contextlib.suppress(FileNotFoundError): + settings_file.unlink() + with contextlib.suppress(FileNotFoundError): + backup_file.unlink() + + return restore + + +def cleanup_stale_overlay(claude_dir: Path) -> bool: + """Detect + restore a leftover overlay from a crashed previous run. + + Called at ZO command startup. If a ``.zo-backup`` file is present + in ``claude_dir``, restore the original settings file (or remove + it if the backup is the no-original sentinel) and delete the + backup. + + Args: + claude_dir: The ``.claude/`` directory to inspect. + + Returns: + True if a stale overlay was found and cleaned; False otherwise. 
+ """ + backup_file = _backup_path(claude_dir) + if not backup_file.exists(): + return False + + settings_file = _settings_path(claude_dir) + content = backup_file.read_text(encoding="utf-8") + + if content == _NO_ORIGINAL_MARKER: + with contextlib.suppress(FileNotFoundError): + settings_file.unlink() + else: + settings_file.write_text(content, encoding="utf-8") + + with contextlib.suppress(FileNotFoundError): + backup_file.unlink() + + return True diff --git a/src/zo/wrapper.py b/src/zo/wrapper.py index 3c2458a..20f8570 100644 --- a/src/zo/wrapper.py +++ b/src/zo/wrapper.py @@ -16,6 +16,7 @@ from __future__ import annotations +import atexit import contextlib import json import os @@ -81,6 +82,8 @@ def __init__( self._log_dir.mkdir(parents=True, exist_ok=True) self._max_retries = max_retries self._base_backoff = base_backoff + # Restore callable for the settings.local.json overlay (set in _launch_tmux). + self._bypass_restore_fn: object | None = None # --- Launch --- @@ -95,6 +98,7 @@ def launch_lead_session( use_tmux: bool = True, add_dirs: list[str] | None = None, extra_env: dict[str, str] | None = None, + bypass_permissions: bool = False, ) -> LeadProcess: """Launch one Claude Code session as the Lead Orchestrator. @@ -112,16 +116,27 @@ def launch_lead_session( merged into the subprocess ``env``. Used by the low-token preset to set ``CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60``. + bypass_permissions: When True, Claude Code's tool-call + permission prompts are suppressed. In headless mode + this is achieved via ``--dangerously-skip-permissions``; + in tmux mode (where that flag exits Claude Code + immediately) it's achieved by overlaying + ``permissions.defaultMode: "bypassPermissions"`` onto + the project's ``.claude/settings.local.json`` for the + duration of the run. See :mod:`zo.permissions_overlay` + for the safe-overlay mechanism. Default False. 
""" extra = add_dirs or [] env = extra_env or {} if use_tmux and self._is_in_tmux(): return self._launch_tmux(prompt, cwd=cwd, team_name=team_name, model=model, max_turns=max_turns, - add_dirs=extra, extra_env=env) + add_dirs=extra, extra_env=env, + bypass_permissions=bypass_permissions) return self._launch_headless(prompt, cwd=cwd, team_name=team_name, model=model, max_turns=max_turns, - add_dirs=extra, extra_env=env) + add_dirs=extra, extra_env=env, + bypass_permissions=bypass_permissions) def _launch_tmux( self, @@ -133,6 +148,7 @@ def _launch_tmux( max_turns: int, add_dirs: list[str] | None = None, extra_env: dict[str, str] | None = None, + bypass_permissions: bool = False, ) -> LeadProcess: """Launch Claude Code interactively in a visible tmux window. @@ -144,7 +160,20 @@ def _launch_tmux( 2. Wait for the TUI to render 3. Paste the prompt into the TUI via tmux's paste buffer 4. Send Enter — Claude processes it with the TUI visible + + If ``bypass_permissions`` is True, a settings-file overlay is + applied to ``/.claude/settings.local.json`` before launch + and restored via an atexit handler. This is the only way to + suppress permission prompts in interactive mode (the CLI flag + is rejected by Claude Code in TUI mode). """ + # Apply bypass overlay (if requested) BEFORE Claude reads settings. + if bypass_permissions: + from zo.permissions_overlay import apply_bypass_overlay + restore_fn = apply_bypass_overlay(Path(cwd) / ".claude") + atexit.register(restore_fn) + self._bypass_restore_fn = restore_fn + prompt_file = self._log_dir / f"{team_name}-prompt.txt" prompt_file.write_text(prompt, encoding="utf-8") @@ -365,16 +394,25 @@ def _launch_headless( max_turns: int, add_dirs: list[str] | None = None, extra_env: dict[str, str] | None = None, + bypass_permissions: bool = False, ) -> LeadProcess: - """Launch Claude Code as a headless subprocess (--print mode).""" + """Launch Claude Code as a headless subprocess (--print mode). 
+ + When ``bypass_permissions`` is True, ``--dangerously-skip-permissions`` + is appended to the Claude CLI invocation so tool-call prompts + are auto-approved. When False (default), prompts fire as + normal — useful for ``--gate-mode supervised`` runs where the + user wants to review each tool call. + """ cmd: list[str] = [ self._claude_bin, "--print", "--output-format", "json", "--model", model, "--max-turns", str(max_turns), "--add-dir", cwd, - "--dangerously-skip-permissions", ] + if bypass_permissions: + cmd.append("--dangerously-skip-permissions") for d in (add_dirs or []): cmd.extend(["--add-dir", d]) cmd.extend(["-p", prompt]) diff --git a/tests/unit/test_cli.py b/tests/unit/test_cli.py index 1afdd36..98a982e 100644 --- a/tests/unit/test_cli.py +++ b/tests/unit/test_cli.py @@ -1002,6 +1002,44 @@ def test_resolve_gate_mode_precedence(self) -> None: cli_gate_mode=None, low_token=False, ) == "supervised" + def test_resolve_bypass_permissions_truth_table(self) -> None: + """The full truth table from the user-facing contract. 
+ + +------------------------------------+--------+ + | flags | bypass | + +====================================+========+ + | --gate-mode supervised | False | + | --gate-mode supervised --bypass | True | + | --gate-mode auto | False | + | --gate-mode auto --bypass | True | + | --gate-mode full-auto | True | + | --gate-mode full-auto --bypass | True | + +------------------------------------+--------+ + """ + from zo.cli import _resolve_bypass_permissions + + # supervised: bypass off by default, on with explicit flag + assert _resolve_bypass_permissions( + cli_bypass=False, gate_mode="supervised", + ) is False + assert _resolve_bypass_permissions( + cli_bypass=True, gate_mode="supervised", + ) is True + # auto: same shape + assert _resolve_bypass_permissions( + cli_bypass=False, gate_mode="auto", + ) is False + assert _resolve_bypass_permissions( + cli_bypass=True, gate_mode="auto", + ) is True + # full-auto: bypass implied, redundant flag still True + assert _resolve_bypass_permissions( + cli_bypass=False, gate_mode="full-auto", + ) is True + assert _resolve_bypass_permissions( + cli_bypass=True, gate_mode="full-auto", + ) is True + def test_low_token_preset_constant_shape(self) -> None: """The preset has the documented keys with documented values.""" from zo.cli import _LOW_TOKEN_PRESET diff --git a/tests/unit/test_permissions_overlay.py b/tests/unit/test_permissions_overlay.py new file mode 100644 index 0000000..c4be6ef --- /dev/null +++ b/tests/unit/test_permissions_overlay.py @@ -0,0 +1,261 @@ +"""Unit tests for zo.permissions_overlay. + +Covers four scenarios: + 1. Existing settings.local.json — overlay merges defaultMode, restore preserves original + 2. No settings.local.json — overlay creates one, restore deletes it + 3. Malformed settings.local.json — overlay treats it as empty + 4. 
Stale overlay from a crashed run — startup cleanup restores +""" +from __future__ import annotations + +import json +from typing import TYPE_CHECKING + +from zo.permissions_overlay import apply_bypass_overlay, cleanup_stale_overlay + +if TYPE_CHECKING: + from pathlib import Path + +# ------------------------------------------------------------------ # +# Scenario 1: existing settings.local.json +# ------------------------------------------------------------------ # + + +class TestExistingSettings: + def test_overlay_merges_default_mode_preserving_allow_list( + self, tmp_path: Path + ) -> None: + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + original = { + "permissions": { + "allow": ["mcp__Foo__*", "Bash(npm install *)"], + "deny": ["Bash(rm -rf *)"], + }, + } + settings.write_text(json.dumps(original)) + + apply_bypass_overlay(claude_dir) + + # Overlay merged in defaultMode; existing allow/deny preserved + new_content = json.loads(settings.read_text()) + assert new_content["permissions"]["defaultMode"] == "bypassPermissions" + assert new_content["permissions"]["allow"] == [ + "mcp__Foo__*", "Bash(npm install *)", + ] + assert new_content["permissions"]["deny"] == ["Bash(rm -rf *)"] + + # Backup file exists with original content + backup = claude_dir / "settings.local.json.zo-backup" + assert backup.exists() + assert json.loads(backup.read_text()) == original + + def test_restore_returns_original_and_removes_backup( + self, tmp_path: Path + ) -> None: + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + original = {"permissions": {"allow": ["mcp__Foo__*"]}} + settings.write_text(json.dumps(original)) + + restore = apply_bypass_overlay(claude_dir) + restore() + + # Settings.local.json restored exactly + assert json.loads(settings.read_text()) == original + # Backup deleted + assert not (claude_dir / "settings.local.json.zo-backup").exists() + + def 
test_restore_is_idempotent(self, tmp_path: Path) -> None: + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + settings.write_text('{"permissions": {"allow": ["X"]}}') + + restore = apply_bypass_overlay(claude_dir) + restore() + # Second call is a no-op — does not raise + restore() + + assert settings.exists() + + +# ------------------------------------------------------------------ # +# Scenario 2: no settings.local.json +# ------------------------------------------------------------------ # + + +class TestNoExistingSettings: + def test_overlay_creates_settings_when_none_exists( + self, tmp_path: Path + ) -> None: + claude_dir = tmp_path / ".claude" + # Directory may not even exist yet — apply should create it + assert not claude_dir.exists() + + apply_bypass_overlay(claude_dir) + + settings = claude_dir / "settings.local.json" + assert settings.exists() + content = json.loads(settings.read_text()) + assert content["permissions"]["defaultMode"] == "bypassPermissions" + + # Backup file holds the no-original sentinel + backup = claude_dir / "settings.local.json.zo-backup" + assert backup.exists() + assert backup.read_text() == "__ZO_NO_ORIGINAL_FILE__" + + def test_restore_removes_overlay_when_no_original( + self, tmp_path: Path + ) -> None: + claude_dir = tmp_path / ".claude" + + restore = apply_bypass_overlay(claude_dir) + restore() + + # Both the overlay AND the sentinel backup should be gone + assert not (claude_dir / "settings.local.json").exists() + assert not (claude_dir / "settings.local.json.zo-backup").exists() + + +# ------------------------------------------------------------------ # +# Scenario 3: malformed existing settings +# ------------------------------------------------------------------ # + + +class TestMalformedSettings: + def test_overlay_handles_invalid_json(self, tmp_path: Path) -> None: + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / 
"settings.local.json" + settings.write_text("{ not valid json ... ") + + # Should not raise — treats malformed file as empty + apply_bypass_overlay(claude_dir) + + new_content = json.loads(settings.read_text()) + assert new_content["permissions"]["defaultMode"] == "bypassPermissions" + + def test_restore_returns_original_malformed_content( + self, tmp_path: Path + ) -> None: + """Even if the original was malformed, restore returns it + verbatim — we don't silently 'fix' the user's file.""" + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + bad_text = "{ not valid json ... " + settings.write_text(bad_text) + + restore = apply_bypass_overlay(claude_dir) + restore() + + assert settings.read_text() == bad_text + + +# ------------------------------------------------------------------ # +# Scenario 4: stale overlay (crash recovery) +# ------------------------------------------------------------------ # + + +class TestStaleCleanup: + def test_cleanup_restores_original_from_backup( + self, tmp_path: Path + ) -> None: + """Simulate a crashed previous run: backup file present, + settings.local.json holds the overlay. 
Cleanup should + restore.""" + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + backup = claude_dir / "settings.local.json.zo-backup" + + original = {"permissions": {"allow": ["mcp__Foo__*"]}} + backup.write_text(json.dumps(original)) + # Overlay-state file (what crashed-mid-run would have left) + settings.write_text(json.dumps({ + "permissions": { + "allow": ["mcp__Foo__*"], + "defaultMode": "bypassPermissions", + }, + })) + + cleaned = cleanup_stale_overlay(claude_dir) + + assert cleaned is True + assert json.loads(settings.read_text()) == original + assert not backup.exists() + + def test_cleanup_removes_overlay_when_no_original( + self, tmp_path: Path + ) -> None: + """Simulate a crashed run where no original settings.local.json + existed before — cleanup should delete the overlay, not + restore it as a no-original sentinel.""" + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + backup = claude_dir / "settings.local.json.zo-backup" + + backup.write_text("__ZO_NO_ORIGINAL_FILE__") + settings.write_text(json.dumps({ + "permissions": {"defaultMode": "bypassPermissions"}, + })) + + cleaned = cleanup_stale_overlay(claude_dir) + + assert cleaned is True + assert not settings.exists() + assert not backup.exists() + + def test_cleanup_no_op_when_no_backup(self, tmp_path: Path) -> None: + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + # Normal state: no backup file, possibly a regular settings file + settings = claude_dir / "settings.local.json" + settings.write_text('{"permissions": {"allow": ["X"]}}') + + cleaned = cleanup_stale_overlay(claude_dir) + + assert cleaned is False + # Normal settings file untouched + assert json.loads(settings.read_text()) == { + "permissions": {"allow": ["X"]}, + } + + def test_cleanup_no_op_when_directory_missing( + self, tmp_path: Path + ) -> None: + """Calling cleanup against a non-existent .claude directory + must 
not raise.""" + claude_dir = tmp_path / "does-not-exist" / ".claude" + cleaned = cleanup_stale_overlay(claude_dir) + assert cleaned is False + + +# ------------------------------------------------------------------ # +# Scenario 5: permissions block existed but wasn't a dict +# ------------------------------------------------------------------ # + + +class TestPermissionsBlockNotDict: + def test_overlay_replaces_non_dict_permissions_block( + self, tmp_path: Path + ) -> None: + """Defensive: if permissions is somehow a list or string, + overlay should still produce a valid merged config.""" + claude_dir = tmp_path / ".claude" + claude_dir.mkdir() + settings = claude_dir / "settings.local.json" + settings.write_text('{"permissions": ["not", "a", "dict"]}') + + apply_bypass_overlay(claude_dir) + + new_content = json.loads(settings.read_text()) + assert isinstance(new_content["permissions"], dict) + assert ( + new_content["permissions"]["defaultMode"] + == "bypassPermissions" + ) diff --git a/tests/unit/test_wrapper.py b/tests/unit/test_wrapper.py index 60d8264..711c1b9 100644 --- a/tests/unit/test_wrapper.py +++ b/tests/unit/test_wrapper.py @@ -95,7 +95,8 @@ class TestLaunchLeadSession: def test_headless_builds_correct_command( self, mock_popen: mock.MagicMock, wrapper: LifecycleWrapper ) -> None: - """When use_tmux=False, launches headless with --print.""" + """When use_tmux=False with bypass_permissions=True, launches + headless with --print and --dangerously-skip-permissions.""" mock_popen.return_value.pid = 42 result = wrapper.launch_lead_session( @@ -105,6 +106,7 @@ def test_headless_builds_correct_command( model="opus", max_turns=100, use_tmux=False, + bypass_permissions=True, ) args = mock_popen.call_args @@ -129,6 +131,49 @@ def test_headless_builds_correct_command( assert result.stdout_log is not None assert result.tmux_pane_id is None + @mock.patch("zo.wrapper.subprocess.Popen") + def test_headless_omits_skip_flag_when_bypass_false( + self, mock_popen: 
mock.MagicMock, wrapper: LifecycleWrapper + ) -> None: + """When bypass_permissions=False, --dangerously-skip-permissions + must NOT be in the Claude command. Prompts are expected to fire + for each tool call.""" + mock_popen.return_value.pid = 43 + + wrapper.launch_lead_session( + "do the thing", + cwd="/target", + team_name="beta", + model="opus", + max_turns=50, + use_tmux=False, + bypass_permissions=False, + ) + + cmd = mock_popen.call_args[0][0] + assert "--dangerously-skip-permissions" not in cmd + # Sanity: rest of the command is still well-formed + assert "--print" in cmd + assert "-p" in cmd + + @mock.patch("zo.wrapper.subprocess.Popen") + def test_headless_default_bypass_is_false( + self, mock_popen: mock.MagicMock, wrapper: LifecycleWrapper + ) -> None: + """Default (no bypass_permissions arg) is the safe behavior: + --dangerously-skip-permissions is NOT included.""" + mock_popen.return_value.pid = 44 + + wrapper.launch_lead_session( + "do the thing", + cwd="/target", + team_name="gamma", + use_tmux=False, + ) + + cmd = mock_popen.call_args[0][0] + assert "--dangerously-skip-permissions" not in cmd + @mock.patch("zo.wrapper.subprocess.Popen") def test_add_dir_flag_present( self, mock_popen: mock.MagicMock, wrapper: LifecycleWrapper @@ -194,6 +239,72 @@ def test_tmux_falls_back_headless_when_not_in_tmux( assert result.tmux_pane_id is None assert mock_popen.called + @mock.patch("zo.wrapper.atexit.register") + @mock.patch("zo.wrapper.time.sleep") + @mock.patch("zo.wrapper.subprocess.run") + def test_tmux_with_bypass_applies_overlay( + self, + mock_run: mock.MagicMock, + mock_sleep: mock.MagicMock, + mock_atexit: mock.MagicMock, + wrapper: LifecycleWrapper, + tmp_path: Path, + ) -> None: + """tmux + bypass_permissions=True writes the settings overlay + and registers a restore callback with atexit before launching.""" + mock_run.return_value = mock.MagicMock(stdout="%9\n", returncode=0) + + with mock.patch.dict(os.environ, {"TMUX": "/tmp/tmux,1,0"}): + 
wrapper.launch_lead_session( + "prompt", + cwd=str(tmp_path), + team_name="overlay-team", + use_tmux=True, + bypass_permissions=True, + ) + + # Overlay should be on disk during launch + settings = tmp_path / ".claude" / "settings.local.json" + assert settings.exists() + content = json.loads(settings.read_text()) + assert content["permissions"]["defaultMode"] == "bypassPermissions" + + # Restore was registered with atexit + assert mock_atexit.called + # The wrapper holds a reference so an explicit restore is possible + assert wrapper._bypass_restore_fn is not None # noqa: SLF001 + + @mock.patch("zo.wrapper.atexit.register") + @mock.patch("zo.wrapper.time.sleep") + @mock.patch("zo.wrapper.subprocess.run") + def test_tmux_without_bypass_skips_overlay( + self, + mock_run: mock.MagicMock, + mock_sleep: mock.MagicMock, + mock_atexit: mock.MagicMock, + wrapper: LifecycleWrapper, + tmp_path: Path, + ) -> None: + """tmux without bypass_permissions does NOT touch + settings.local.json or register any atexit handler.""" + mock_run.return_value = mock.MagicMock(stdout="%10\n", returncode=0) + + with mock.patch.dict(os.environ, {"TMUX": "/tmp/tmux,1,0"}): + wrapper.launch_lead_session( + "prompt", + cwd=str(tmp_path), + team_name="no-overlay", + use_tmux=True, + bypass_permissions=False, + ) + + # No overlay was written + settings = tmp_path / ".claude" / "settings.local.json" + assert not settings.exists() + # No atexit handler was registered for an overlay + assert not mock_atexit.called + assert wrapper._bypass_restore_fn is None # noqa: SLF001 + # ------------------------------------------------------------------ # # monitor_team / read_task_list