feat(cli): improve agent experience — new commands, background login, actionable errors#10

Open
caffeinum wants to merge 30 commits into main from fix/organisations-list

Conversation

@caffeinum caffeinum commented May 27, 2026

Description

Improve the Sanity CLI's experience for AI agents and non-interactive environments. The changes are based on 2027 eval traces that show agents struggling with auth, flag discovery, and error recovery.

Follows the AX framework: error messages are the primary teaching channel for AI agents — copy-pasteable hints collapse the hypothesis space and eliminate retry loops.

New commands:

  • sanity organizations list — list orgs with --json, --sort, --order flags
  • sanity auth status — check login state (exit 0/1), supports --json
  • sanity auth cancel — stop a running background login process

Non-interactive login:

  • sanity login in non-interactive mode spawns a detached background child (backgroundLoginChild.ts) that handles the OAuth callback, then returns immediately
  • sanity init -y auto-triggers background login when not authenticated, polls for the token up to 120s
  • Pidfile guard (.auth-callback.json) prevents multiple background login children from stacking
  • Pre-populates telemetryDisclosed to prevent a race condition where the next CLI command clobbers the token
  • PID/port hidden from output to prevent agents from killing the callback server
  • When multiple providers are available, errors with list + hint instead of auto-picking
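
The pidfile guard mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in `backgroundLogin.ts`: the `pid` field and the helper names are assumptions, and the real `.auth-callback.json` may store more (the commits mention a provider URL).

```typescript
import {mkdtempSync, readFileSync, writeFileSync} from 'node:fs'
import {tmpdir} from 'node:os'
import {join} from 'node:path'

// Hypothetical helper: read a positive integer `pid` from a JSON pidfile.
// Returns undefined when the file is missing, unreadable, or malformed.
function readPid(pidfilePath: string): number | undefined {
  try {
    const parsed: unknown = JSON.parse(readFileSync(pidfilePath, 'utf8'))
    const pid = (parsed as {pid?: unknown} | null)?.pid
    return typeof pid === 'number' && Number.isInteger(pid) && pid > 0 ? pid : undefined
  } catch {
    return undefined
  }
}

// True when the pidfile names a process that still appears to be alive.
function isBackgroundLoginRunning(pidfilePath: string): boolean {
  const pid = readPid(pidfilePath)
  if (pid === undefined) return false
  try {
    process.kill(pid, 0) // signal 0 only checks for existence, nothing is delivered
    return true
  } catch (err) {
    // EPERM: the process exists but is owned by another user
    return (err as NodeJS.ErrnoException).code === 'EPERM'
  }
}

// Demo against a temporary directory rather than the real config location.
const demoDir = mkdtempSync(join(tmpdir(), 'auth-callback-'))
const demoPidfile = join(demoDir, '.auth-callback.json')
const runningBefore = isBackgroundLoginRunning(demoPidfile)
writeFileSync(demoPidfile, JSON.stringify({pid: process.pid}))
const runningAfter = isBackgroundLoginRunning(demoPidfile)
console.log(runningBefore, runningAfter) // prints "false true"
```

A caller would skip spawning a second child when the guard returns true, and treat a missing or corrupt pidfile as stale.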

Actionable error messages (formatHint helper):

  • formatHint(...commands) renders consistent [Hint] blocks with copy-pasteable commands
  • All sanity init flag conflicts include hints with working examples
  • --bare --output-path conflict moved from oclif generic error to our code with two-step hint
  • --create-project without value intercepted via catch override with --project-name hint
  • projects list and organizations list detect auth errors (both requireUser plain Error and HTTP 401/403) and show "Not logged in" with login hint
  • Multiple orgs in unattended mode: lists org IDs with names and provides --organization hint
  • Multiple providers in unattended mode: lists providers and provides --provider hint

Flag improvements:

  • --project hidden alias for --project-id on all commands using shared flags
  • --json flag on projects list with sort/order support
  • Project name auto-derived from package.json name or basename(cwd) in unattended mode
  • Organization auto-picked (first with attach grant) when exactly one is available
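
One way the project-name fallback could work is sketched below. The function name `deriveProjectName` is hypothetical, and the trimming and empty-name handling are assumptions; only the order (package.json `name`, then `basename(cwd)`) comes from the description.

```typescript
import {mkdtempSync, readFileSync, writeFileSync} from 'node:fs'
import {tmpdir} from 'node:os'
import {basename, join} from 'node:path'

// Prefer the "name" field of package.json; fall back to the directory name.
function deriveProjectName(cwd: string): string {
  try {
    const pkg: unknown = JSON.parse(readFileSync(join(cwd, 'package.json'), 'utf8'))
    const name = (pkg as {name?: unknown} | null)?.name
    if (typeof name === 'string' && name.trim() !== '') return name.trim()
  } catch {
    // Missing or invalid package.json: use the fallback below.
  }
  return basename(cwd)
}

// Demo in a temporary directory: first without, then with a package.json.
const scratchDir = mkdtempSync(join(tmpdir(), 'init-demo-'))
const nameFromDirectory = deriveProjectName(scratchDir)
writeFileSync(join(scratchDir, 'package.json'), JSON.stringify({name: 'my-studio'}))
const nameFromPackage = deriveProjectName(scratchDir)
console.log(nameFromDirectory, nameFromPackage)
```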

What to review

  • backgroundLogin.ts / backgroundLoginChild.ts — detached child process for OAuth callback. Pidfile guard, port selection, nonce verification. cancelBackgroundLogin() export for the cancel command.
  • initAction.ts — the auto-login polling loop (120s max, 3s interval). effectiveProjectName derivation. checkFlagsInUnattendedMode with formatHint.
  • login.ts (action) — the !isInteractive() branch with telemetryDisclosed pre-population. PID/port stripped from output.
  • getProvider.ts — error with list + hint when multiple providers available in non-interactive mode.
  • createProjectFromName.ts — unattended org selection: auto-pick when one, error with hint when multiple.
  • formatHint.ts — shared helper for consistent hint rendering.
  • projects/list.ts and organizations/list.ts — auth error detection covers both requireUser Error and HTTP 401/403.
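
For reviewers, the auto-login polling loop can be pictured as the sketch below. It is an illustration under assumptions, not the code in `initAction.ts`: `pollForToken`, `PollOptions`, and the injectable `now`/`sleep` hooks are hypothetical names added so the loop can be exercised without real timers.

```typescript
interface PollOptions {
  timeoutMs: number
  intervalMs: number
  now?: () => number
  sleep?: (ms: number) => Promise<void>
}

// Calls readToken repeatedly until it yields a token or the deadline passes.
async function pollForToken(
  readToken: () => Promise<string | undefined>,
  options: PollOptions,
): Promise<string | undefined> {
  const now = options.now ?? Date.now
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)))
  const deadline = now() + options.timeoutMs

  for (;;) {
    const token = await readToken()
    if (token) return token
    // Stop when the next attempt would start after the deadline.
    if (now() + options.intervalMs > deadline) return undefined
    await sleep(options.intervalMs)
  }
}
```

With the values stated above, a call would pass `timeoutMs: 120_000` and `intervalMs: 3_000`, and treat an `undefined` result as a login timeout.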

Testing

  • 35+ new/updated tests across login, init, auth status, auth cancel, organizations list, projects list, cors add, and shared flags
  • All tests pass (pnpm test on affected packages)
  • Verified via 2027 eval platform — background login, auto-login from init, and hint text all confirmed working in e2b sandboxes
  • Auth error detection tested for both requireUser path (no token) and HTTP 401/403 path

caffeinum and others added 10 commits May 26, 2026 13:13
implement new `sanity organizations list` command to display all organizations accessible by the current user. includes command handler, tests, and snapshot fixtures.

also adds organization command topic and updates topic aliases for discoverability.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add hidden `--project` alias on all commands that use `--project-id`,
so agents don't get "Nonexistent flag" when they try the more natural
flag name. This affects cors, datasets, tokens, users, and other
commands using the shared getProjectIdFlag helper.

Update the unattended auth error to mention SANITY_AUTH_TOKEN env var
as an alternative to `sanity login`, so agents in headless environments
can skip the browser OAuth dance entirely.

Add SANITY_AUTH_TOKEN example to `sanity login --help` output.

Motivated by 2027 eval trace sanity-30620c19 where an agent spent
~2min on auth flow and tried `--project` on `cors add`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Agents consistently try `sanity projects list --json` and get
"Nonexistent flag: --json". This was observed in both eval traces
(sanity-94b5a32e and the local run). The flag outputs projects as
a JSON array with id, name, members count, manage URL, and created
timestamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
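
A rough sketch of how the `--json` output could combine with the sort and order flags. The key names (`members`, `manageUrl`, `createdAt`) and the function names are assumptions for illustration; the commit only states which pieces of information are included.

```typescript
// Hypothetical row shape for one project in the JSON array.
interface ProjectRow {
  id: string
  name: string
  members: number
  manageUrl: string
  createdAt: string
}

type SortField = 'id' | 'name' | 'members' | 'createdAt'
type SortOrder = 'asc' | 'desc'

// Numbers compare numerically, everything else by plain string comparison.
function compareValues(left: string | number, right: string | number): number {
  if (typeof left === 'number' && typeof right === 'number') return left - right
  const a = String(left)
  const b = String(right)
  return a < b ? -1 : a > b ? 1 : 0
}

// Sorts a copy of the rows and serializes them as a JSON array.
function projectsToJson(projects: ProjectRow[], sort: SortField, order: SortOrder): string {
  const direction = order === 'asc' ? 1 : -1
  const sorted = [...projects].sort((a, b) => compareValues(a[sort], b[sort]) * direction)
  return JSON.stringify(sorted, null, 2)
}
```

Sorting before serializing keeps the JSON output consistent with whatever order the table view would show for the same flags.
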
spawn a detached child process for oauth callback when running in
non-interactive mode (ci, containers, agents). the child handles
port binding, browser launch, and token persistence — parent cli
returns immediately so automation can continue.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
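
The detach-and-return pattern described in this commit can be sketched with Node's `child_process.spawn`. This is a minimal illustration, not the PR's implementation: `startDetachedChild` and the injectable `SpawnLike` parameter are hypothetical, and the child script path is whatever the caller supplies.

```typescript
import {spawn} from 'node:child_process'

// Narrow view of spawn so a fake can be substituted in tests.
type SpawnLike = (
  command: string,
  args: string[],
  options: {detached: boolean; stdio: 'ignore'},
) => {unref(): void}

const nodeSpawn: SpawnLike = (command, args, options) => spawn(command, args, options)

// Starts the child with the current Node binary and lets the parent exit
// without waiting for it.
function startDetachedChild(childScript: string, spawnFn: SpawnLike = nodeSpawn): void {
  const child = spawnFn(process.execPath, [childScript], {
    detached: true, // child keeps running after the parent exits
    stdio: 'ignore', // no inherited pipes that would keep the parent attached
  })
  child.unref() // remove the child from the parent's event loop reference count
}
```

All three pieces matter together: without `stdio: 'ignore'` or `unref()`, the parent process can stay alive waiting on the child even when `detached` is set.
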
… pidfile guard

- Pre-populate telemetryDisclosed before spawning background child to
  prevent the next CLI command's telemetry hook from clobbering the token
- Respect --no-open flag in non-interactive mode (was always opening browser)
- Add pidfile guard to prevent multiple background login children from stacking
- Remove dead forceBrowser/--background flag code
- Improve wait time messaging (~30-60 seconds)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…uards

- Add missing changeset for runtime behavior changes
- Stop mutating options.projectName; use effectiveProjectName local
- Store providerUrl in pidfile; kill stale child on provider mismatch
- Replace execSync shell string with execFileSync to prevent injection
- Add --no-open test for non-interactive login path
- Make --project exclusive with --project-id to prevent silent conflict

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ngeset

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Print ~/.config/sanity/config.json path in background login output
  so agents know where to look for the token
- Detect 401/403 in projects list and organizations list, show
  "Not logged in" instead of generic "Failed to list" error
- Auto-trigger background login from `sanity init -y` when not
  authenticated, poll for token up to 120s instead of just failing

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…r messages

- Add `sanity auth status` command with --json flag for checking
  authentication state (exits 0 when logged in, 1 when not)
- Add hint: lines with working examples to all sanity init flag
  conflict errors so agents can self-correct in one step
- Add login hint to projects/organizations list 401/403 errors

Based on AX framework: error messages are the primary teaching
channel for AI agents — copy-pasteable examples collapse the
hypothesis space and eliminate retry loops.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions Bot commented May 27, 2026

📦 Bundle Stats — @sanity/cli

Compared against main (c645f9ba)

@sanity/cli

| Metric | Value | vs main (c645f9b) |
| --- | --- | --- |
| Internal (raw) | 2.1 KB | - |
| Internal (gzip) | 799 B | - |
| Bundled (raw) | 10.97 MB | - |
| Bundled (gzip) | 2.06 MB | - |
| Import time | 847ms | +4ms, +0.5% |

bin:sanity

| Metric | Value | vs main (c645f9b) |
| --- | --- | --- |
| Internal (raw) | 1023 B | - |
| Internal (gzip) | 486 B | - |
| Bundled (raw) | 9.84 MB | - |
| Bundled (gzip) | 1.77 MB | - |
| Import time | 944ms | -0ms, -0.0% |

🗺️ View treemap · Artifacts

Details
  • Import time regressions over 10% are flagged with ⚠️
  • Sizes shown as raw / gzip 🗜️. Internal bytes = own code only. Total bytes = with all dependencies. Import time = Node.js cold-start median.

📦 Bundle Stats — @sanity/cli-core

Compared against main (c645f9ba)

| Metric | Value | vs main (c645f9b) |
| --- | --- | --- |
| Internal (raw) | 96.5 KB | +235 B, +0.2% |
| Internal (gzip) | 22.7 KB | +53 B, +0.2% |
| Bundled (raw) | 21.61 MB | +235 B, +0.0% |
| Bundled (gzip) | 3.42 MB | +41 B, +0.0% |
| Import time | 797ms | -3ms, -0.3% |


📦 Bundle Stats — create-sanity

Compared against main (c645f9ba)

| Metric | Value | vs main (c645f9b) |
| --- | --- | --- |
| Internal (raw) | 908 B | - |
| Internal (gzip) | 483 B | - |
| Bundled (raw) | 931 B | - |
| Bundled (gzip) | 491 B | - |
| Import time | ❌ ChildProcess denied: node | - |

oclif's built-in flag exclusion produces a generic error without hints.
Move the validation to initAction where we control the message, and add
a copy-pasteable two-step example showing the correct workflow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@caffeinum caffeinum changed the title fix(cli): add organizations list command and improve unattended init #8 feat(cli): improve agent experience — new commands, background login, actionable errors May 27, 2026
@caffeinum
Author

@2027dev trigger eval

Comment thread packages/@sanity/cli/src/actions/auth/login/getProvider.ts Outdated
…w findings

- Replace 110-line inline script template with backgroundLoginChild.ts
  that imports authServer.ts and @sanity/cli-core — gets type-checking,
  linting, and proper stack traces
- Fix empty catch lint error, dynamic import() lint violation
- Fix login.vercel.test.ts missing isInteractive mock
- Fix auth status JSON output conflicting with error throw
- Invert empty if-body anti-pattern in login command
- Fix import ordering in projects/list.ts
- Replace lodash-es/size with .length for strings in organizations/list
- Align child timeout (150s) with parent polling (120s)
- Bump @sanity/cli-core changeset to minor

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread packages/@sanity/cli/src/actions/auth/backgroundLogin.ts Outdated
Comment thread packages/@sanity/cli/src/actions/init/project/createProjectFromName.ts Outdated
if (options.bare && options.outputPath) {
  throw new InitError(
    '--bare cannot be used with --output-path. Use --bare to create the project only, then scaffold the studio separately.\n' +
    'hint: sanity init --bare --project-name "my-project" --dataset production -y\n' +

If you do integrate the hint system into the CLI, the hints might need to be refactored into a separate module.

Context: the hints are inspired by my experience working on CLI AX; see Section 1, "Error messages are your only reliable teaching channel", in my write-up: https://noninteractive.org/blog/agent-experience. The main idea: the best way to teach Claude to use your CLI is to fine-tune the model. The second-best way is to over-communicate and give it clear information.

Comment thread packages/@sanity/cli/src/commands/schemas/deploy.ts
caffeinum and others added 3 commits May 27, 2026 13:59
When multiple login providers or organizations are available in
non-interactive mode, list the options and provide a copy-pasteable
hint instead of silently picking one. Auto-pick only when there's
exactly one choice (no ambiguity).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
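
The "auto-pick only when unambiguous" rule for organizations might look like the sketch below. The function name, error wording, and the exact hint command are illustrative assumptions; the empty case simply throws here, whereas a later commit in this PR auto-creates a personal org instead.

```typescript
interface Organization {
  id: string
  name: string
}

// Returns the only organization, or throws an error that lists every choice
// together with a copy-pasteable hint.
function selectOrganization(organizations: Organization[]): Organization {
  const [first] = organizations
  if (first === undefined) {
    throw new Error('No organizations available for this user.')
  }
  if (organizations.length === 1) return first

  const listing = organizations.map((org) => `  ${org.id}  ${org.name}`)
  throw new Error(
    [
      'Multiple organizations found. Choose one explicitly:',
      ...listing,
      '',
      '[Hint]',
      // Illustrative command; the real hint text may differ.
      `  sanity init -y --organization ${first.id}`,
    ].join('\n'),
  )
}
```

Listing the IDs next to the names lets an agent copy a valid value straight from the error instead of guessing.
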
@caffeinum caffeinum added the trigger: preview Publishes a preview build via pkg.pr.new + runs 2027 eval label May 27, 2026

github-actions Bot commented May 27, 2026

Preview this PR with pkg.pr.new

Run the Sanity CLI

npx https://pkg.pr.new/team2027/sanity-cli/@sanity/cli@2d656d7 <command>

...Or upgrade project dependencies

📦 @sanity/cli
pnpm install https://pkg.pr.new/@sanity/cli@2d656d7
📦 @sanity/cli-build
pnpm install https://pkg.pr.new/@sanity/cli-build@2d656d7
📦 @sanity/cli-core
pnpm install https://pkg.pr.new/@sanity/cli-core@2d656d7
📦 @sanity/cli-test
pnpm install https://pkg.pr.new/@sanity/cli-test@2d656d7
📦 @sanity/eslint-config-cli
pnpm install https://pkg.pr.new/@sanity/eslint-config-cli@2d656d7

View Commit (2d656d7)


github-actions Bot commented May 27, 2026

2027 // Getting Started: Staging CLI — D 58.3/100

+ ████████████░░░░░░░░   ▲ +6.8 pts vs baseline
| Time | Cost | Errors | Interruptions |
| --- | --- | --- | --- |
| 6m 4s (▼ -2m 3s) | $2.33 (▼ -$0.94) | 3 | 0 (▼ -1) |

Tested: example.com → daytona.io
Vars: cliInstall: npm i -g https://pkg.pr.new/team2027/sanity-cli/@sanity/cli@2d656d7

Commit 2d656d7 · View report → · Dashboard

@team2027 team2027 deleted a comment from claude Bot May 27, 2026
@team2027 team2027 deleted a comment from claude Bot May 27, 2026

claude Bot commented May 27, 2026

Claude encountered an error —— View job


I'll analyze this and get back to you.

oclif's "Flag --create-project expects a value" fires before our
code runs with no way to customize the message. Override the catch
lifecycle to intercept this specific error and add a hint pointing
to --project-name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
caffeinum and others added 11 commits May 27, 2026 16:37
Replace inline `hint:` strings with `formatHint()` helper that renders
hints as a visually distinct block:

  [Hint]
    sanity init --project-name "my-project" -y

Accepts variadic args for multi-line hints. Uses cyan styleText for
the [Hint] label. Applied across all error messages in init, login,
projects list, and organizations list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
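
A minimal sketch of a helper with the shape this commit describes. It is not the code in `formatHint.ts`: the two-space indent is an assumption, and the cyan `styleText` coloring of the label is left out so the output stays plain text.

```typescript
// Renders a "[Hint]" label followed by one indented line per command.
function formatHint(...commands: string[]): string {
  const label = '[Hint]'
  return [label, ...commands.map((command) => `  ${command}`)].join('\n')
}

console.log(formatHint('sanity init --project-name "my-project" -y'))
```

Passing several commands produces one line each, which is how a two-step hint (for example, create the project and then scaffold the studio) would be rendered.
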
The auto-login-from-init calls login() with no provider, which now
errors when multiple providers exist (per our explicit-selection
change). Fix by defaulting to google in the init auto-login path.

Direct `sanity login` still errors with the provider list — this
only affects the automatic login triggered by `sanity init -y`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- detect 'must login first' errors from client in addition to HTTP status codes
- add auth error tests with hints for organizations and projects list
- apply sort and order flags to JSON output in projects list
- update test assertions to verify error types with instanceof checks
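
The combined auth-error check could be sketched like this. The `statusCode` property name and the exact message match are assumptions about how the client surfaces these errors; only the three conditions (401, 403, and a "must login first" message) come from the commits.

```typescript
// True for HTTP 401/403 responses and for client errors that say the user
// must log in first.
function isAuthError(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false
  const {statusCode, message} = err as {statusCode?: unknown; message?: unknown}
  if (statusCode === 401 || statusCode === 403) return true
  return typeof message === 'string' && /must login first/i.test(message)
}
```

A list command would catch the error, call `isAuthError`, and print "Not logged in" with a login hint instead of the generic "Failed to list" message.
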
…ng callback server

The background login output was too passive — agents treated "~30-60 seconds"
as informational and killed the callback server PID before OAuth completed.
Now explicitly warns not to kill the process and provides a clear wait+check
workflow using `sanity auth status`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…er login prompt

- Add `sanity organizations create <name>` command for explicit org creation
  (mirrors `organizations list` pattern, supports --json, has auth error detection)
- Auto-create personal org in unattended `init` when user has none, instead of
  erroring. Eliminates the most common post-signup blocker for AI agents
  (observed in evals: agents bypassed CLI to hit raw management API)
- Route org-not-found error through formatHint for consistency
- Strengthen background login message: explicit "wait for login to complete"
  and "do not run other sanity commands until auth status confirms"
- Rename pidfile `.bg-login.json` → `.auth-callback.json` (less inviting name)
- Add `sanity auth cancel` to stop background login cleanly
- Strip PID/port from login output (move to debug logs)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Change "Please complete login in the browser" → "Wait for the user to complete
the login in their browser". The agent is not the one logging in — the user is.
Framing it this way is a clearer directive to wait, not act.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a background login child is running but the token hasn't arrived yet,
any auth-checking command (sanity auth status, projects list, organizations
list/create, debug --secrets) now reports "Login pending" instead of
"Not logged in".

We only know the callback server is alive, not whether the user is actively
logging in — so the message stays honest: "callback server is running,
waiting for OAuth redirect."

Observed in eval traces: agent runs login, immediately checks debug --secrets,
sees "Not logged in", concludes login failed — while the user is still
entering their password in the browser. This message gives agents the
information they need to wait instead of giving up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
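
The three-way state described in this commit can be sketched as a small decision function. The type and function names are hypothetical; the messages reuse the wording quoted in the commit.

```typescript
type LoginState = 'logged-in' | 'pending' | 'logged-out'

// A valid token always wins; otherwise a live callback server means the
// login may still complete.
function resolveLoginState(hasValidToken: boolean, callbackServerAlive: boolean): LoginState {
  if (hasValidToken) return 'logged-in'
  return callbackServerAlive ? 'pending' : 'logged-out'
}

function describeLoginState(state: LoginState): string {
  switch (state) {
    case 'logged-in':
      return 'Logged in'
    case 'pending':
      return 'Login pending: callback server is running, waiting for OAuth redirect.'
    case 'logged-out':
      return 'Not logged in'
  }
}
```

Keeping "pending" distinct from "logged out" is what lets an agent decide to wait rather than restart the login.
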
Adds a `--wait` flag that synchronously polls validateSession() up to 5
minutes after starting the background login. Agents and CI can run a
single deterministic command and get a clear success/failure exit code
instead of having to async-poll `sanity auth status`.

Bumps CHILD_TIMEOUT_MS and child-side TIMEOUT_MS from 150s to 300s to
match the new wait window and accommodate slower OAuth flows (TOTP,
multi-step browser automation).

Motivated by ax-eval rounds where agents under cognitive load reached
for browser-automation shortcuts (CDP, puppeteer, signup API) instead
of waiting for the async background login to complete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

Coverage Delta

| File | Statements |
| --- | --- |
| packages/@sanity/cli-core/src/SanityCommand.ts | 84.9% (- 1.1%) |
| packages/@sanity/cli/scripts/check-topic-aliases.ts | 0.0% (±0%) |
| packages/@sanity/cli/src/actions/auth/backgroundLogin.ts | 15.9% (new) |
| packages/@sanity/cli/src/actions/auth/backgroundLoginChild.ts | 0.0% (new) |
| packages/@sanity/cli/src/actions/auth/login/getProvider.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/actions/auth/login/login.ts | 81.2% (- 18.8%) |
| packages/@sanity/cli/src/actions/auth/login/storeAuthToken.ts | 100.0% (new) |
| packages/@sanity/cli/src/actions/debug/gatherDebugInfo.ts | 94.9% (- 1.6%) |
| packages/@sanity/cli/src/actions/deploy/deployStudioSchemasAndManifests.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/actions/init/initAction.ts | 86.1% (- 11.0%) |
| packages/@sanity/cli/src/actions/init/project/createProjectFromName.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/commands/auth/cancel.ts | 100.0% (new) |
| packages/@sanity/cli/src/commands/auth/status.ts | 89.5% (new) |
| packages/@sanity/cli/src/commands/init.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/commands/login.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/commands/organizations/create.ts | 96.3% (new) |
| packages/@sanity/cli/src/commands/organizations/list.ts | 97.0% (new) |
| packages/@sanity/cli/src/commands/projects/list.ts | 97.4% (- 2.6%) |
| packages/@sanity/cli/src/commands/schemas/deploy.ts | 95.0% (±0%) |
| packages/@sanity/cli/src/topicAliases.ts | 100.0% (±0%) |
| packages/@sanity/cli/src/util/formatHint.ts | 100.0% (new) |
| packages/@sanity/cli/src/util/sharedFlags.ts | 100.0% (±0%) |

Comparing 22 changed files against main @ c645f9ba5b692a1c3bea691dd9b56aaf2032d662

Overall Coverage

| Metric | Coverage |
| --- | --- |
| Statements | 83.4% (- 0.9%) |
| Branches | 73.8% (- 0.5%) |
| Functions | 83.4% (- 0.8%) |
| Lines | 83.9% (- 0.9%) |
