fix(aws): skip STS pre-validation for ambient (OIDC) providers; don't bake creds#354

Merged: Cre-eD merged 1 commit into main from fix/aws-ambient-explicit-provider-creds on Jun 30, 2026

@Cre-eD Cre-eD commented Jun 30, 2026

Problem

A real GitHub-OIDC sc deploy to a throwaway env failed at the explicit AWS provider:

pulumi-aws: error validating credentials: Invalid credentials configured.
pulumi:providers:aws '...--ecs-fargate--...--provider--testtmp' has a problem: Invalid credentials configured.

Everything else under ambient creds worked — OIDC federation, the S3 Pulumi state backend, KMS, StackReference reads.

Root cause (verified, not guessed)

  • In pulumi-aws v6.83.4, "Invalid credentials configured" means incomplete credentials, not missing ones (the message for missing credentials is "No valid credential sources found"). The provider does fall back to the environment (the token input if set, else AWS_SESSION_TOKEN), and the runner environment had complete credentials, including the session token (confirmed in the run log).
  • Inspecting the stack's Pulumi state: the aws provider resource has accessKey + secretKey baked in (from prior static-key deploys), skipCredentialsValidation: false, no token.
  • So this is a static→ambient transition: the provider's eager STS pre-validation chokes on the incomplete (token-less) credential picture, even though the real API calls would resolve the complete ambient env creds.
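
The distinction between the two error messages can be captured in a small diagnostic helper. This is only an illustrative sketch: `credDiagnosis` and `classifyProviderError` are hypothetical names that do not exist in the repository or in pulumi-aws, and the message-to-meaning mapping is the one this PR reports for pulumi-aws v6.83.4, which may not hold for other versions.

```go
package main

import "strings"

// credDiagnosis is a hypothetical label type used only in this sketch.
type credDiagnosis string

const (
	diagIncomplete credDiagnosis = "incomplete credentials"
	diagNone       credDiagnosis = "no credential source"
	diagUnknown    credDiagnosis = "unrecognized"
)

// classifyProviderError maps the two provider messages discussed above to
// the meaning this PR attributes to them (assumed for pulumi-aws v6.83.4).
func classifyProviderError(msg string) credDiagnosis {
	switch {
	case strings.Contains(msg, "Invalid credentials configured"):
		// Some credential fields are present but the set is not usable.
		return diagIncomplete
	case strings.Contains(msg, "No valid credential sources found"):
		// Nothing was found in any credential source.
		return diagNone
	default:
		return diagUnknown
	}
}
```

Feeding it the failing log line from the Problem section yields `diagIncomplete`, which points at the stale provider inputs in state rather than at the runner environment.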

Fix

In ambient mode (empty static keys), leave the provider credential-less, so that the default chain resolves the env creds at call time, and set SkipCredentialsValidation: true to skip the brittle pre-validation. Static keys keep validation on. A shared applyAWSProviderCreds helper applies this to both Provider() and the CloudTrail region provider. Tests cover both paths.
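
The branching described above could look roughly like the following. This is a minimal sketch, not the merged helper: `providerArgs` is a hypothetical plain-struct stand-in for the pulumi-aws provider inputs, `applyAWSProviderCredsSketch` is an illustrative name, and treating "ambient" as "both static keys empty" is an assumption about how the real code detects the mode.

```go
package main

// providerArgs is a hypothetical stand-in for the provider inputs
// (access key, secret key, skipCredentialsValidation) named in this PR.
type providerArgs struct {
	AccessKey                 string
	SecretKey                 string
	SkipCredentialsValidation bool
}

// applyAWSProviderCredsSketch sets credentials on args for the two modes.
// Assumption: ambient mode is detected by both static keys being empty.
func applyAWSProviderCredsSketch(args *providerArgs, accessKey, secretKey string) {
	if accessKey == "" && secretKey == "" {
		// Ambient mode: leave the provider credential-less so the default
		// chain resolves env creds at call time, and skip pre-validation.
		args.AccessKey = ""
		args.SecretKey = ""
		args.SkipCredentialsValidation = true
		return
	}
	// Static mode: pass the keys through and keep validation on.
	args.AccessKey = accessKey
	args.SecretKey = secretKey
	args.SkipCredentialsValidation = false
}
```

Routing every provider construction through one function of this shape is what keeps Provider() and the CloudTrail region provider from drifting apart.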

The first ambient deploy self-heals the state (provider becomes creds-less + skipCredentialsValidation: true); no manual state surgery.

What this deliberately does NOT do

The original version of this PR baked the ambient env creds (incl. the rotating AWS_SESSION_TOKEN) onto the provider inputs. A Codex (gpt-5.5) + Gemini review rejected that: it persists ephemeral creds in the encrypted checkpoint and produces a provider diff every run. Dropped.
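
The "provider diff every run" objection can be illustrated with a toy comparison of provider inputs across two CI runs. Everything here is a hypothetical sketch: `providerInputs`, `bakeAmbient`, `credentialLess`, and `inputsChanged` are invented for illustration, and real Pulumi diffing is more involved than a struct comparison.

```go
package main

// providerInputs is a hypothetical stand-in for the provider inputs that
// end up in the Pulumi checkpoint.
type providerInputs struct {
	AccessKey                 string
	SecretKey                 string
	Token                     string
	SkipCredentialsValidation bool
}

// bakeAmbient models the rejected approach: copy the runner's ambient
// credentials, including the rotating session token, onto the inputs.
func bakeAmbient(env map[string]string) providerInputs {
	return providerInputs{
		AccessKey: env["AWS_ACCESS_KEY_ID"],
		SecretKey: env["AWS_SECRET_ACCESS_KEY"],
		Token:     env["AWS_SESSION_TOKEN"],
	}
}

// credentialLess models the merged approach: the inputs do not depend on
// the environment at all, so the env argument is ignored.
func credentialLess(_ map[string]string) providerInputs {
	return providerInputs{SkipCredentialsValidation: true}
}

// inputsChanged reports whether two runs would produce different inputs.
func inputsChanged(prev, next providerInputs) bool {
	return prev != next
}
```

With two runs whose `AWS_SESSION_TOKEN` values differ, `inputsChanged` is true for the baked inputs and false for the credential-less ones, which matches the review's objection.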

Verification

  • go build / go vet / go test ./pkg/clouds/pulumi/aws/ green.
  • End-to-end confirmation is pending: a testtmp re-deploy on this build (preview or post-release). The ambient code path is dormant for every current caller (it is only reached when aws-oidc-role is set), so merging creates no production exposure.

Follow-up to #349.

github-actions Bot commented Jun 30, 2026

Semgrep Scan Results

Repository: api | Commit: eff0c1d

| Check | Status | Details |
|---|---|---|
| ⚠️ Semgrep | Warning | 1 warning(s), 1 total |

Scanned at 2026-06-30 15:45 UTC

github-actions Bot commented Jun 30, 2026

Security Scan Results

Repository: api | Commit: eff0c1d

| Check | Status | Details |
|---|---|---|
| ✅ Secret Scan | Pass | No secrets detected |
| ✅ Dependencies (Trivy) | Pass | 0 total (no critical/high) |
| ✅ Dependencies (Grype) | Pass | 0 total (no critical/high) |
| 📦 SBOM | Generated | 523 components (CycloneDX) |

Scanned at 2026-06-30 15:45 UTC

@github-actions

📊 Statement coverage

Measured on the documented included set (see docs/TESTING.md → Coverage scope). Observe-only — no regression gate is enforced yet.

| Scope | This PR | main baseline | Δ |
|---|---|---|---|
| Included set (Gold-tier denominator) | 90.3% | 90.3% | +0.0 pp |
| Full set (whole repo, transparency) | 27.9% | 27.9% | +0.0 pp |

Baseline: main @ dea14c1

… bake creds

Re-deploying a stack that was previously deployed with static keys, now under
ambient GitHub-OIDC creds, fails at the explicit pulumi-aws provider with
'Invalid credentials configured'. In pulumi-aws v6.83.4 that message means
INCOMPLETE creds (not 'no creds' -> that is 'No valid credential sources
found'): the eager STS pre-validation chokes on the static->ambient provider
transition, even though the runner's ambient env creds are complete (incl.
AWS_SESSION_TOKEN, which does reach the plugin) and authorize the real calls.

Fix: in ambient mode (empty static keys) keep the provider credential-less so
the AWS default chain resolves env creds at call time, and set
SkipCredentialsValidation=true to skip the brittle pre-validation. Static keys
keep validation on. Shared applyAWSProviderCreds helper (Provider() + cloudtrail
region provider) + tests.

Deliberately does NOT copy the rotating ambient creds (incl. session token) onto
provider inputs: that persists ephemeral creds in the Pulumi checkpoint and
diffs the provider every run. Supersedes the first approach after a Codex + Gemini
review rejected credential-baking.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>

@Cre-eD force-pushed the fix/aws-ambient-explicit-provider-creds branch from f53cd2d to 3c7848e on June 30, 2026 15:44
@Cre-eD changed the title from "fix(aws): pass ambient OIDC creds (incl. session token) to explicit providers" to "fix(aws): skip STS pre-validation for ambient (OIDC) providers; don't bake creds" on Jun 30, 2026
@Cre-eD Cre-eD merged commit adb026b into main Jun 30, 2026
22 checks passed