You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
ROADMAP ultraworkers#108: subcommand typos silently fall through to LLM prompt dispatch, burning billed tokens
Dogfooded 2026-04-18 on main HEAD 91c79ba from /tmp/cdCC.
Unrecognized first-positional tokens fall through the
_other => Ok(CliAction::Prompt { ... }) arm at main.rs:707.
Per --help this is 'Shorthand non-interactive prompt mode' —
documented behavior — but it eats known-subcommand typos too:
claw doctorr → Prompt("doctorr") → LLM API call
claw skilsl → Prompt("skilsl") → LLM API call
claw statuss → Prompt("statuss") → LLM API call
claw deply → Prompt("deply") → LLM API call
With credentials set, each burns real tokens. Without creds,
returns 'missing Anthropic credentials' — indistinguishable
from a legitimate prompt failure. No 'did you mean' suggestion.
Infrastructure exists:
slash command typos:
claw --resume s /skilsl
→ 'Unknown slash command: /skilsl. Did you mean /skill, /skills'
flag typos:
claw --fake-flag
→ structured error 'unknown option: --fake-flag'
subcommand typos:
→ silently become LLM prompts
The did-you-mean helper exists for slash commands. Flag
validation exists. Only subcommand dispatch has the silent-
fallthrough.
Fix shape (~60 lines):
- suggest_similar_subcommand(token) using levenshtein ≤ 2
against the ~16-item known-subcommand list
- gate the Prompt fallthrough on a shape heuristic:
single-token + near-match → return structured error with
did-you-mean. Otherwise fall through unchanged.
- preserve shorthand-prompt mode for multi-word inputs,
quoted inputs, and non-near-match tokens
- regression tests per typo shape + legit prompt + quoted
workaround
Cross-claw orchestration hazard: claws constructing subcommand
names from config or other claws' output have a latent 'typo →
live LLM call' vector. Over CI matrix with 1% typo rate, that's
billed-token waste + structural signal loss (error handler
can't distinguish typo from legit prompt failure).
Joins silent-flag cluster (ultraworkers#96-ultraworkers#101, ultraworkers#104) on subcommand axis —
6th instance of 'malformed input silently produces unintended
behavior.' Joins parallel-entry-point asymmetry (ultraworkers#91, ultraworkers#101,
ultraworkers#104, ultraworkers#105) — slash vs subcommand disagree on typo handling.
Natural bundles: ultraworkers#96 + ultraworkers#98 + ultraworkers#108 (--help/dispatch surface
hygiene triangle), ultraworkers#91 + ultraworkers#101 + ultraworkers#104 + ultraworkers#105 + ultraworkers#108 (parallel-
entry-point 5-way).
Filed in response to Clawhip pinpoint nudge 1494849975530815590
in #clawcode-building-in-public.
**Blocker.** None. All additive. `HookProgressEvent` already exists in the runtime — this is pure plumbing and surfacing. Parallel to #102's MCP preflight fix — same pattern, different subsystem.
2949
2949
2950
2950
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdBB` on main HEAD `a436f9e` in response to Clawhip pinpoint nudge at `1494834879127486544`. Joins **truth-audit / diagnostic-integrity** (#80–#87, #89, #100, #102, #103, #105) — `doctor: ok` is a lie when hooks are nonexistent or hostile. Joins **unplumbed-subsystem** (#78, #96, #100, #102, #103) — hook progress event model exists but JSON-invisible; `/hooks` is a declared-but-stubbed slash command. Joins **subsystem-doctor-coverage** (#100, #102, #103) as the fourth subsystem (git state / MCP / agents / **hooks**) that doctor fails to report on. Cross-cluster with **Permission-audit** (#94, #97, #101, #106) because hooks are effectively a permission mechanism that runs without audit. Compounds with #106 specifically: #106 says downstream layers can silently replace hook arrays; #107 says the resulting effective hook set is invisible; together they constitute a policy-erasure-plus-hide pair. Natural bundle: **#102 + #103 + #107** — subsystem-doctor-coverage 3-way (MCP + agents + hooks), closing the "subsystem silently opaque" class. Also **#106 + #107** — policy-erasure mechanism + policy-visibility gap = the complete hook-security story.
2951
+
2952
+
108. **CLI subcommand typos fall through to the LLM prompt dispatch path and silently burn tokens — `claw doctorr`, `claw skilsl`, `claw statuss`, `claw deply` all resolve to `CliAction::Prompt { prompt: "doctorr", ... }` and attempt a live LLM turn. Slash commands have a "Did you mean /skill, /skills" suggestion system that works correctly; subcommands have the same infrastructure available but it is never applied. A claw or CI pipeline that typos a subcommand name gets no structural signal — just the prompt API error (usually "missing credentials" in local dev, or actual billed LLM output with provider keys configured)** — dogfooded 2026-04-18 on main HEAD `91c79ba` from `/tmp/cdCC`. Every unrecognized first-positional falls through the `_other => Ok(CliAction::Prompt { ... })` arm at `main.rs:707`, which is the documented shorthand-prompt mode — but with no levenshtein / prefix matching against the known subcommand set to offer a suggestion first. A claw running with `ANTHROPIC_API_KEY` set that runs `claw doctorr` actually sends the string "doctorr" to the configured LLM provider and pays for the tokens.
2953
+
2954
+
**Concrete repro.**
2955
+
```
2956
+
$ cd /tmp/cdCC && git init -q .
2957
+
2958
+
# Correct subcommand — works
2959
+
$ claw --output-format json doctor | jq '.kind'
2960
+
"doctor"
2961
+
2962
+
# Typo subcommand — falls through to prompt dispatch
"unknown option: --fake-flag\nRun `claw --help` for usage."
2987
+
# Flags are rejected structurally. Subcommands are silently promptified.
2988
+
```
2989
+
2990
+
**Trace path.**
2991
+
- `rust/crates/rusty-claude-cli/src/main.rs:696-718` — the end of `parse_args`'s subcommand match. After matching specific strings (`"help"`, `"version"`, `"status"`, `"sandbox"`, `"doctor"`, `"state"`, `"dump-manifests"`, `"bootstrap-plan"`, `"agents"`, `"mcp"`, `"skills"`, `"system-prompt"`, `"acp"`, `"login"`/`"logout"`, `"init"`, `"export"`, `"prompt"`), the final arm is:
2992
+
```rust
2993
+
other if other.starts_with('/') => parse_direct_slash_cli_action(...),
`_other` covers "literally anything that wasn't a known subcommand or a slash command" — no levenshtein, no prefix match, no warning. It just assumes the operator meant to send a prompt.
3001
+
- `rust/crates/rusty-claude-cli/src/main.rs` slash-command dispatch — contains a `bare_slash_command_guidance` / "did you mean" helper that accepts the unknown slash name and suggests close matches. The same function-shape (distance + prefix / substring match) is trivially reusable for subcommand names.
3002
+
- `rust/crates/rusty-claude-cli/src/main.rs:755-765` — `parse_single_word_command_alias` is the place where a known-subcommand-alias list is matched for status/sandbox/doctor/state. This is the same point at which a "did you mean" suggestion could be hooked when the match fails.
3003
+
- `grep 'did you mean\|Did you mean' rust/crates/rusty-claude-cli/src/main.rs | wc -l` — matches exist for slash commands and flags, not for subcommands.
3004
+
- `rust/crates/rusty-claude-cli/src/main.rs:8307` — `--help` line: `claw [...] TEXT Shorthand non-interactive prompt mode`. The shorthand mode is the *documented* behavior — so the typo-becomes-prompt path is technically-correct per the spec. The clawability gap is the missing safety net for known-subcommand typos.
3005
+
3006
+
**Why this is specifically a clawability gap.**
3007
+
1. *Silent LLM spend on typos.* A claw or CI pipeline with `ANTHROPIC_API_KEY` set that typos `claw doctorr` sends "doctorr" to the LLM provider as a live prompt. The cost is not zero: a minimal turn costs 10s–100s of input tokens plus whatever the model responds with. Over a CI matrix of 100 lanes per day with a 1% typo rate, that's ~1 spurious API call per day per lane per typo class.
3008
+
2. *Structural signal lost.* The returned error — "missing Anthropic credentials" or actual LLM output — is indistinguishable from a *real* prompt failure. A claw's error handler cannot tell "my subcommand was a typo" from "my prompt legitimately failed." Structured error signaling is a claw-code design principle (Product Principle "Events over scraped prose"); the subcommand typo surface violates it.
3009
+
3. *Infrastructure already exists.* The slash-command dispatch already does levenshtein-style "Did you mean /skills" suggestions. Flag parsing already rejects unknown `--flags` with a structured error. Only the subcommand path has the silent-fallthrough behavior. The asymmetry is the gap, not a missing feature.
3010
+
4. *Joins the "silent acceptance of malformed input" class.* #97 (empty `--allowedTools`), #98 (`--compact` ignored in 9 paths), #99 (unvalidated `--cwd`/`--date`), #101 (fail-open env-var), #104 (silent `.txt` suffix), **#108** (silent subcommand-to-prompt fallthrough). Six flavors of "operator typo silently produces unintended behavior."
3011
+
5. *Cross-claw orchestration hazard.* A claw that dynamically constructs subcommand names from config or from another claw's output has a latent "subcommand name typo → live LLM call" vector. The fix (did-you-mean before Prompt fallthrough) is a one-function additional dispatcher that preserves the shorthand-prompt behavior for *actual* prose inputs while catching obvious subcommand typos.
3012
+
6. *Bounded intent detection.* "Is this input a typo of a known subcommand?" is decidable with cheap heuristics: exact-prefix match of the known subcommand list (`dotr` → prefix of `doctor`), bounded-edit-distance (levenshtein ≤ 2), single-character-swap. Prose inputs rarely match any of these against the subcommand list; subcommand typos almost always do.
3013
+
3014
+
**Fix shape — insert a did-you-mean guard before the Prompt fallthrough.**
3015
+
1. *Extract a `suggest_similar_subcommand(token) -> Option<Vec<String>>` helper.* Compute against the static list of known subcommands: `["help", "version", "status", "sandbox", "doctor", "state", "dump-manifests", "bootstrap-plan", "agents", "mcp", "skills", "system-prompt", "acp", "init", "export", "prompt"]`. Use levenshtein ≤ 2, or prefix/substring match length ≥ 4. ~40 lines.
3016
+
2. *Gate the fallthrough on a shape heuristic.* Before `_other => CliAction::Prompt`, check: (a) single-token input (no spaces) that (b) matches a known-subcommand typo via the suggester. If both true, return `Err(format!("unknown subcommand: {token}. Did you mean: {suggestions}? Run `claw --help` for the full list. If you meant to send a prompt literally, wrap in quotes or prefix with `claw prompt`."))`. If either false, fall through to Prompt as today. ~20 lines.
3017
+
3. *Preserve the shorthand-prompt mode for real prose.* Multi-word inputs (`claw explain this code`), quoted inputs (`claw "doctor"`), and inputs that don't match any known-subcommand typo continue through the existing fallthrough. The fix only catches the single-token near-match shape. ~0 extra lines — the guard is short-circuit.
3018
+
4. *Regression tests.* One per typo shape (`doctorr`, `skilsl`, `statuss`, `deply`, `mcpp`, `sklils`). One for legitimate short prompt (`claw hello`) that should NOT trigger the guard. One for quoted workaround (`claw prompt "doctorr"`) that should dispatch to Prompt unchanged.
3019
+
3020
+
**Acceptance.** `claw doctorr` exits non-zero with structured JSON error `{"type":"error","error":"unknown subcommand: doctorr. Did you mean: doctor? ..."}`. `claw hello world this is a prompt` still dispatches to Prompt unchanged (multi-token, no near-match). `claw "doctorr"` (quoted single token) dispatches to Prompt unchanged, since operator explicitly opted into shorthand-prompt. Zero billed LLM calls from subcommand typos.
3021
+
3022
+
**Blocker.** None. ~60 lines of dispatcher logic + regression tests. The levenshtein helper is 20 lines of pure arithmetic. Shorthand-prompt mode preserved for all non-near-match inputs.
3023
+
3024
+
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdCC` on main HEAD `91c79ba` in response to Clawhip pinpoint nudge at `1494849975530815590`. Joins **silent-flag / documented-but-unenforced** (#96–#101, #104) on the subcommand-dispatch axis — sixth instance of "malformed operator input silently produces unintended behavior." Joins **parallel-entry-point asymmetry** (#91, #101, #104, #105) as another pair-axis: slash commands vs subcommands disagree on typo handling. Sibling of **#96** on the `--help` / flag-validation hygiene axis: #96 is "help advertises commands that don't work," #108 is "help doesn't advertise that subcommand typos silently become LLM prompts." Natural bundle: **#96 + #98 + #108** — three `--help`-and-dispatch-surface hygiene fixes that together remove the operator footguns in the command-parsing pipeline (help leak + flag silent-drop + subcommand typo fallthrough). Also **#91 + #101 + #104 + #105 + #108** — the full 5-way parallel-entry-point asymmetry audit.
0 commit comments