Skip to content

Add did-you-mean hint when --token/--key credentials are swapped#916

Draft
epompeii wants to merge 1 commit into
develfrom
claude/cli-token-key-help-po6i5z
Draft

Add did-you-mean hint when --token/--key credentials are swapped#916
epompeii wants to merge 1 commit into
develfrom
claude/cli-token-key-help-po6i5z

Conversation

@epompeii

@epompeii epompeii commented Jun 24, 2026

Copy link
Copy Markdown
Member

Context

Bencher recently moved from recommending --token (a JWT) to --key (a bencher_user_* / bencher_run_* API key) for CLI authentication. Users and agents still reach for the old --token flag and sometimes pass the wrong credential type into the wrong flag. Previously this produced an opaque validation error:

  • Passing a key to --token failed with Failed to validate JWT (JSON Web Token): ...
  • Passing a JWT to --key failed with Failed to validate Bencher API key: ...

Neither message told the user they simply used the wrong flag.

Change

Add clap value_parser functions for the token and key fields of CliBackend that wrap the existing FromStr parsing and, on failure, detect whether the supplied value is actually the other credential type and emit a redirecting hint:

error: invalid value 'bencher_user_...' for '--token <TOKEN>': You supplied a Bencher API key to `--token`/`BENCHER_API_TOKEN`. Use `--key`/`BENCHER_API_KEY` instead.

error: invalid value 'eyJ...' for '--key <KEY>': You supplied a JWT API token to `--key`/`BENCHER_API_KEY`. Use `--token`/`BENCHER_API_TOKEN` instead.

Detection reuses the existing FromStr impls (no new prefix/format logic): JWTs and API keys are structurally disjoint, so a value that fails one parser and succeeds the other is unambiguously a swap. Genuinely malformed values fall through to the original validation error (no false hint). Because clap runs value parsers on env-var values too, the hint also fires for BENCHER_API_TOKEN / BENCHER_API_KEY.

This mirrors the existing check_env / check_volume value-parser pattern already used in the CLI parser.

Tests

Added 8 unit tests covering both swap directions, both key types (bencher_user_* and bencher_run_*), valid inputs, and the garbage-without-hint fall-through.

Verification

  • cargo test -p bencher_cli (8 new tests pass)
  • cargo clippy -p bencher_cli --all-targets --all-features --no-deps -- -Dwarnings clean
  • cargo fmt
  • Manual smoke test confirmed all three behaviors (key to --token hint, JWT to --key hint, garbage to plain error)

Passing a Bencher API key to `--token` or a JWT to `--key` previously
failed with an opaque validation error. Add clap value parsers for both
fields that detect the swap (the two credential types are structurally
disjoint) and redirect the user to the correct flag.
@epompeii epompeii added the claude label Jun 24, 2026 — with Claude
@github-actions

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

PR: #916
Base: devel
Head: claude/cli-token-key-help-po6i5z
Commit: 80963c44361dd4d6eb1d1f9906bab8bd9a0fd644


Code Review: --token/--key swap hint

Overall this is a small, well-targeted, well-tested change. It adds clap value_parser functions that detect when a credential was passed to the wrong flag and emit a "did you mean" hint. Quality is high and it follows TDD as required by CLAUDE.md.

Correctness

  • The logic is sound. parse_token falls back to suggesting --key only when the value fails as Jwt and succeeds as BencherKey (and vice versa), so the hint can never misfire: Jwt (base64 dot-segments) and BencherKey (bencher_user_/bencher_run_ prefix) are genuinely disjoint, as the doc comment claims.
  • Nice bonus: clap applies value_parser to env values too, so a swapped BENCHER_API_TOKEN/BENCHER_API_KEY environment variable also gets the hint, not just the CLI flag.
  • Tests cover the matrix well: accept-valid, swapped-suggests-other, and garbage-no-hint for both directions. The garbage tests assert on the underlying ValidError message text ("JWT", "Bencher API key"), which matches the actual error strings in bencher_valid/src/error.rs.

Security

  • Credential echo in the fallback path (pre-existing, worth noting): in the non-swapped failure branch, err.to_string() returns ValidError::Jwt(jwt) / ValidError::BencherKey(key), both of which interpolate the raw rejected value ({0}) into the message printed to the terminal. So a malformed-but-real credential gets echoed back to stdout/stderr and potentially into CI logs. This behavior predates the PR (clap's default FromStr parser produced the same ValidError display), so it's not a regression, but since this PR is now the single owner of these parsers it would be a low-cost improvement to redact the value here rather than forwarding err.to_string(). The swapped-flag branches correctly avoid echoing the value.

Style / standards

  • No emdashes; error strings use proper sentence punctuation. ✅
  • Module ordering (struct → parsers → tests) is consistent with the file. ✅
  • Returning Result<_, String> is the required clap value_parser convention, so this does not conflict with the CLAUDE.md "wrap original error, not String" rule (that applies to library error enums). ✅
  • parse_token and parse_key are near-mirror images. Minor: could be unified with a small helper/macro, but at this size the explicit duplication is more readable than an abstraction, so I'd leave it.

Nits

  • The two doc comments are nearly identical; fine as-is.
  • I was unable to run cargo nextest run -p bencher_cli / cargo test --doc (permission denied in this session). Please confirm the suite passes locally before merge, and as CLAUDE.md notes, run cargo fmt + cargo clippy --all-features -- -Dwarnings one final time.

Verdict: Approve. The only substantive item is the pre-existing credential echo in the fallback error path, which is optional to address but cheap to fix while you're in this code.


Model: claude-opus-4-8

@github-actions

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchclaude/cli-token-key-help-po6i5z
Testbedintel-v1
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
microseconds (µs)
(Result Δ%)
Upper Boundary
microseconds (µs)
(Limit %)
Adapter::Json📈 view plot
🚷 view threshold
4.58 µs
(-0.70%)Baseline: 4.62 µs
4.86 µs
(94.25%)
Adapter::Magic (JSON)📈 view plot
🚷 view threshold
4.46 µs
(-0.69%)Baseline: 4.49 µs
4.70 µs
(94.87%)
Adapter::Magic (Rust)📈 view plot
🚷 view threshold
26.08 µs
(+2.19%)Baseline: 25.52 µs
26.54 µs
(98.25%)
Adapter::Rust📈 view plot
🚷 view threshold
3.54 µs
(+1.58%)Baseline: 3.48 µs
3.59 µs
(98.48%)
Adapter::RustBench📈 view plot
🚷 view threshold
3.52 µs
(+1.04%)Baseline: 3.48 µs
3.58 µs
(98.11%)
🐰 View full continuous benchmarking report in Bencher

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants