
fix(setup): harden setup-aws/setup-gcp exits, confirmation, and output#115

Draft
tcerqueira wants to merge 3 commits into main from tc/setup-hardening


@tcerqueira tcerqueira commented Jun 29, 2026

Member

What

Hardens the setup-aws / setup-gcp cloud-connection wizards, fixing three coupled rough edges from the setup work in #112.

D — bare Deno.exit(1) bypassed the error envelope. Every direct exit (missing aws/gcloud CLI, CLI command failure, JSON parse failure, no active GCP account/project, cancelled prompt) now routes through util.ts error(). Agents get a structured { error: { code, message, hint, traceId } } envelope on stderr and a stable ExitCode. runAwsCommand/runGcloudCommand now take the GlobalContext so they can emit the envelope.
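For reviewers, a minimal sketch of the envelope shape described above. This is not the code in util.ts: `buildErrorEnvelope`, `reportError`, and the numeric exit-code values are hypothetical, chosen only to illustrate "one structured line on stderr plus a stable exit code".

```typescript
// Hypothetical numeric values: the PR names USAGE and GENERIC but not their numbers.
const ExitCode = { GENERIC: 1, USAGE: 2 } as const;
type ExitCodeName = keyof typeof ExitCode;

// Shape taken from the PR description: { error: { code, message, hint, traceId } }.
interface ErrorEnvelope {
  error: { code: string; message: string; hint?: string; traceId?: string };
}

// Build the envelope without exiting, so the shape can be checked in isolation.
function buildErrorEnvelope(
  code: string,
  message: string,
  hint?: string,
  traceId?: string,
): ErrorEnvelope {
  return { error: { code, message, hint, traceId } };
}

// Write exactly one JSON line to the stderr sink and return the exit code.
// The caller decides when to actually exit the process.
function reportError(
  exit: ExitCodeName,
  envelope: ErrorEnvelope,
  writeStderr: (line: string) => void,
): number {
  writeStderr(JSON.stringify(envelope));
  return ExitCode[exit];
}
```

Passing the stderr sink in as a parameter is a choice made for this sketch so the behaviour can be asserted without touching the real process streams.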

E — non-interactive runs auto-applied infra with no confirmation. Previously confirmApply() returned true whenever non-interactive, so CI silently created IAM roles / service accounts / workload identity. Now there is an explicit --apply opt-in.

B — wizards printed human text to stdout and emitted no JSON. All progress/status/plan chrome moved to stderr; under --json the decorative output is suppressed and a single JSON result object is written to stdout via writeJsonResult (provider, org, app, contexts, role/service-account name, ARNs, policies/roles, enabled APIs).
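The stream discipline can be sketched roughly as follows. `Sinks`, `progress`, and `writeJsonResultSketch` are hypothetical names for this illustration; the real `writeJsonResult` signature is not shown in this PR description, and what the wizards print as a final result without `--json` is left out because the description does not specify it.

```typescript
// Injected output sinks so the sketch can run without real process streams.
interface Sinks {
  stdout: (s: string) => void;
  stderr: (s: string) => void;
}

// Progress/status/plan chrome: never on stdout, and suppressed under --json.
function progress(sinks: Sinks, json: boolean, text: string): void {
  if (!json) sinks.stderr(text);
}

// The machine-readable result: a single JSON object on stdout under --json.
function writeJsonResultSketch(
  sinks: Sinks,
  json: boolean,
  result: Record<string, unknown>,
): void {
  if (json) sinks.stdout(JSON.stringify(result));
}
```

With this split, a consumer can parse stdout as JSON without filtering out human-readable lines first.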

Confirmation contract (chosen)

In non-interactive mode (--non-interactive/-y or no TTY), the wizards refuse to create or modify cloud infrastructure unless the caller passes the explicit --apply flag — otherwise they exit with USAGE / CONFIRMATION_REQUIRED explaining how to opt in. In interactive mode the existing confirmation prompt is shown (and --apply skips it). The GCP API-enable step keeps its dedicated --enable-apis opt-in: missing APIs in non-interactive mode without --enable-apis now error (USAGE/APIS_NOT_ENABLED) instead of being silently enabled. The decision lives in a pure, unit-tested applyGate() helper, reused by both the apply gate and the API-enable gate.
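A minimal sketch of the contract, assuming the three outcomes named in the Tests section (apply / refuse / prompt). `applyGateSketch` and `GateInput` are hypothetical names; the real `applyGate()` signature may differ.

```typescript
type GateDecision = "apply" | "refuse" | "prompt";

interface GateInput {
  // True for --non-interactive/-y or when there is no TTY.
  nonInteractive: boolean;
  // True when the caller passed the explicit opt-in flag (--apply).
  optIn: boolean;
}

// Pure decision: no I/O, no prompting, no exiting.
function applyGateSketch({ nonInteractive, optIn }: GateInput): GateDecision {
  // An explicit opt-in always applies; in interactive mode it skips the prompt.
  if (optIn) return "apply";
  // Without the opt-in, unattended runs refuse and interactive runs ask.
  return nonInteractive ? "refuse" : "prompt";
}
```

Because the function only maps flags to a decision, the caller stays responsible for turning "refuse" into the USAGE / CONFIRMATION_REQUIRED error and "prompt" into the existing confirmation prompt.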

Why this approach

  • Safe-by-default in CI: the dangerous path (mutating cloud infra unattended) requires a deliberate, greppable flag. Nothing is applied silently.
  • Consistent with existing conventions: mirrors the agent/CI contract (USAGE exit code + structured envelope + hint) and the existing --policies/--roles/--enable-apis "pre-supply the input" pattern.
  • Composable: a full unattended run is --json --non-interactive --policies ... --apply (AWS) or --json --non-interactive --roles ... --enable-apis --apply (GCP).

Rejected alternatives

  • Reuse -y/--non-interactive as the apply confirmation — conflates "don't prompt me" with "yes, mutate my cloud account"; too easy to trigger destructive changes by habit. Kept them orthogonal.
  • Keep auto-applying non-interactively (status quo) — the surprising, dangerous behavior this PR removes.
  • Interactive-only prompt with no flag — would make the wizards unusable in CI.

Tests

  • New deploy/setup-cloud.test.ts unit-tests applyGate() (apply / refuse / prompt) — pure, runs without aws/gcloud or a backend token.
  • deno fmt, deno lint, deno check all pass.
  • Verified stdout discipline: setup-aws --json --non-interactive ... against an unreachable endpoint emits nothing on stdout and a single structured envelope on stderr (exit 3).

e2e gap / follow-up

The wizards query the backend, then shell out to real aws/gcloud before reaching the apply gate, so the create-IAM-role / missing-CLI / API-enable paths can only be exercised end-to-end with live credentials and the CLIs installed. Those weren't runnable in this environment; the control-flow fixes are code-level and the core guard is unit-tested. Worth a manual e2e pass with real AWS/GCP creds before release.

Review follow-up: GCP API-enablement ordering

Fixed a partial-mutation ordering bug: setup-gcp previously enabled missing APIs (a real cloud mutation) as soon as --enable-apis was passed, then the master apply gate would refuse later when --apply was absent in non-interactive mode — leaving APIs enabled but nothing else created.

--apply is now the master gate for all cloud mutations. The new pure gcpApiEnableDecision() evaluates --apply before the API-specific --enable-apis, so a non-interactive run without --apply exits via error() (USAGE/CONFIRMATION_REQUIRED) before any API is enabled. Net contract: non-interactive API enablement requires both --enable-apis (authorizes the action) and --apply (authorizes mutating the account); without --apply, nothing mutates. The interactive prompt path and the AWS gating are unchanged. Added unit assertions covering the "no mutation before refuse" ordering (pure logic, no gcloud).
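The ordering can be sketched as below. This models only the non-interactive branch with at least one missing API, since that is the case the fix targets; `nonInteractiveApiDecision` and its types are hypothetical and are not the actual `gcpApiEnableDecision()` signature.

```typescript
type ApiDecision =
  | { action: "enable" }
  | { action: "refuse"; code: "CONFIRMATION_REQUIRED" | "APIS_NOT_ENABLED" };

interface ApiFlags {
  apply: boolean; // --apply: authorizes mutating the account at all
  enableApis: boolean; // --enable-apis: authorizes the API-enable action
}

// Assumes a non-interactive run in which some required APIs are missing.
function nonInteractiveApiDecision({ apply, enableApis }: ApiFlags): ApiDecision {
  // Master gate first: without --apply nothing may mutate, even with --enable-apis.
  if (!apply) return { action: "refuse", code: "CONFIRMATION_REQUIRED" };
  // Then the API-specific opt-in.
  if (!enableApis) return { action: "refuse", code: "APIS_NOT_ENABLED" };
  return { action: "enable" };
}
```

The case that previously produced the partial mutation is `--enable-apis` without `--apply`: checking `apply` first makes it refuse before any API is touched.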

Addresses three coupled rough edges in the cloud-connection wizards
(setup rough edges from #112):

D — bare `Deno.exit(1)` bypassed the error envelope. Every direct exit
(missing aws/gcloud CLI, CLI command failure, JSON parse failure, no
active GCP account / project, cancelled prompt) now routes through
util.ts `error()`, so agents get a structured `{error:{code,...}}`
envelope on stderr and a stable ExitCode (USAGE for missing
prerequisites/cancellation, GENERIC for CLI failures).

E — non-interactive runs auto-applied infra with no confirmation. Added
an explicit `--apply` opt-in: in non-interactive mode the wizards now
refuse to create/modify IAM roles, service accounts, or workload
identity unless `--apply` is passed (USAGE/CONFIRMATION_REQUIRED
otherwise); interactive mode still prompts. The GCP API-enable step is
gated the same way behind its existing `--enable-apis` flag instead of
silently enabling APIs in CI. The decision lives in a pure, unit-tested
`applyGate()` helper.

B — the wizards printed human text to stdout and emitted no JSON. All
progress/status/plan chrome now goes to stderr; under `--json` the
decorative output is suppressed and a single JSON result object
(provider, role/service-account, ARNs, policies/roles, enabled APIs) is
written to stdout via `writeJsonResult`.

Also fixes a latent bug where the GCP role-grant and the plan preview
stringified the `{label,value}` option object instead of its value.
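
To illustrate the class of bug, here is a small sketch with made-up values. The `Option` interface, the role, and the `--role=` argument are placeholders for this example, not the wizard's actual code.

```typescript
// Prompt options carry a display label and the underlying value.
interface Option {
  label: string;
  value: string;
}

const picked: Option = { label: "Viewer", value: "roles/viewer" };

// Bug: interpolating the whole object calls Object.prototype.toString.
const buggyArg = `--role=${picked}`;

// Fix: interpolate the option's value.
const fixedArg = `--role=${picked.value}`;
```

Here `buggyArg` evaluates to `--role=[object Object]`, while `fixedArg` is `--role=roles/viewer`.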

Unit-test `applyGate()`, the safety contract that the wizards must
never auto-apply cloud infra in non-interactive mode without an explicit
opt-in flag. Pure, so it runs without aws/gcloud or a backend token.
… mutation

setup-gcp enabled missing APIs (a real cloud mutation) as soon as
`--enable-apis` was passed, then the later master apply gate would refuse
when `--apply` was absent in non-interactive mode — leaving APIs enabled
but nothing else created (a surprise partial mutation despite the gate
"refusing").

Make `--apply` the master gate for all cloud mutations: the new pure
`gcpApiEnableDecision()` evaluates `--apply` *before* the API-specific
`--enable-apis`, so a non-interactive run without `--apply` exits via
`error()` (USAGE/CONFIRMATION_REQUIRED) before any API is enabled. Net
contract: non-interactive API enablement now requires BOTH `--enable-apis`
(authorizes the action) AND `--apply` (authorizes mutating the account);
without `--apply` nothing mutates. The interactive prompt path is
preserved, and the AWS path's already-correct gating is unchanged.

Updates the `--apply`/`--enable-apis` help text and adds unit assertions
covering the "no mutation before refuse" ordering.
@tcerqueira tcerqueira marked this pull request as draft June 29, 2026 15:41