feat: Anthropic Workload Identity Federation #2658
dgageot wants to merge 3 commits into docker:main
- validate.go now resolves the effective provider type through the providers map, so a model with provider: <custom-key> and a model-level auth: block validates against the underlying provider type (was incorrectly rejecting valid configs).
- urlSource: cap response body at 1 MiB, refuse to follow redirects (Go strips Authorization on cross-origin redirects but not arbitrary user-defined headers, which would leak header secrets), and apply a 30s request timeout as defence in depth on top of the caller's context.
- commandSource: apply a 30s timeout for the same reason.
- Error-snippet truncation in urlSource is now rune-aware (no mid-codepoint splits).
- Added regression tests for all of the above.
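As an illustration of the rune-aware truncation mentioned above, here is a minimal sketch. The helper name truncateUTF8 and its byte-budget signature are assumptions for this example, not the PR's actual code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 (hypothetical helper) returns at most maxBytes bytes of s,
// backing up to the nearest rune boundary so that a multi-byte code point
// is never split in the middle.
func truncateUTF8(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	// s[cut] is the first byte that would be dropped; if it is a
	// continuation byte, the cut point sits inside a code point.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	snippet := truncateUTF8("héllo", 2) // a plain s[:2] would split the "é"
	fmt.Printf("%q valid=%v\n", snippet, utf8.ValidString(snippet))
}
```

A plain byte slice would be simpler, but it can emit invalid UTF-8 whenever the limit falls inside a multi-byte character.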
Assessment: 🟡 NEEDS ATTENTION
Reviewed feat: Anthropic Workload Identity Federation (#2658).
The overall implementation is well-structured with good defensive measures (no-redirect HTTP client, 1 MiB body cap, 30 s timeouts, context propagation, UTF-8-safe error truncation). Validation logic, provider inheritance, and the EffectiveAuth/EffectiveProviderType helpers are correct.
Two medium-confidence issues worth addressing before merge, plus one low-severity diagnostic gap:
| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🟡 MEDIUM | | Nil dereference if cfg.Auth.Federation is nil (bypasses YAML validation) |
| 2 | 🟡 MEDIUM | | Expanded URL (may contain secret values) logged verbatim in errors |
| 3 | ⚪ LOW | | Silent body truncation at 1 MiB LimitReader boundary |

        return nil, fmt.Errorf("anthropic: unsupported auth.type %q", cfg.Auth.Type)
    }
    slog.Debug("Anthropic Workload Identity Federation configured",
        "federation_rule_id", cfg.Auth.Federation.FederationRuleID)

[MEDIUM] Potential nil dereference on cfg.Auth.Federation before nil guard
cfg.Auth.Federation.FederationRuleID is accessed at line 150 inside the block that checks cfg.Auth != nil and cfg.Auth.Type == "workload_identity_federation", but there is no intermediate nil check on cfg.Auth.Federation itself.

    slog.Debug("Anthropic Workload Identity Federation configured",
        "federation_rule_id", cfg.Auth.Federation.FederationRuleID) // panics if Federation == nil

Config loaded from YAML goes through FederationAuthConfig.validate(), which correctly rejects a nil Federation block, so the normal YAML path is safe. However, any code that constructs a ModelConfig programmatically (tests, future callers, fuzz inputs) with Auth = &AuthConfig{Type: "workload_identity_federation", Federation: nil} will reach this line and panic.
Suggestion: add a guard or assert cfg.Auth.Federation != nil before the slog.Debug call (the same guard already exists in federation.RequestOptions implicitly via field access, so a simple nil check with a returned error would be consistent with the pattern used elsewhere in this function).
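The guard suggested here could look roughly like the sketch below. The types are trimmed-down, hypothetical stand-ins for the PR's AuthConfig and FederationAuthConfig, and federationRuleID is an illustrative helper name rather than a function from the codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, reduced versions of the config types named in the review.
type FederationAuthConfig struct {
	FederationRuleID string
}

type AuthConfig struct {
	Type       string
	Federation *FederationAuthConfig
}

var errMissingFederation = errors.New(
	"anthropic: auth.federation is required for workload_identity_federation")

// federationRuleID returns the rule ID to log, or an error (instead of a
// panic) when the Federation block is absent.
func federationRuleID(auth *AuthConfig) (string, error) {
	if auth == nil || auth.Federation == nil {
		return "", errMissingFederation
	}
	return auth.Federation.FederationRuleID, nil
}

func main() {
	// A programmatically built config that skipped YAML validation.
	_, err := federationRuleID(&AuthConfig{Type: "workload_identity_federation"})
	fmt.Println(err)
}
```

Returning an error keeps the behaviour consistent for callers that bypass YAML validation, such as tests or fuzzers.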

    defer cancel()
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, expandedURL, http.NoBody)
    if err != nil {
        return "", fmt.Errorf("build request for %q: %w", expandedURL, err)

[MEDIUM] Fully-expanded URL (may contain secret values) logged verbatim in error messages
After environment.Expand substitutes env-var values into rawURL, the resulting expandedURL string is embedded verbatim in multiple error messages:

    return "", fmt.Errorf("build request for %q: %w", expandedURL, err)
    return "", fmt.Errorf("fetch %q: %w", expandedURL, err)
    return "", fmt.Errorf("fetch %q: status %d: %s", expandedURL, resp.StatusCode, ...)

These errors propagate through the SDK's ErrorEvent path and appear in the TUI and logs. The documented GitHub Actions pattern (${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=...) only expands a URL, not a secret, so the common case is harmless. But the implementation places no restriction on embedding secrets in the URL, and if a user writes url: https://endpoint.com/token?secret=${MY_SECRET}, the expanded secret will appear in every error message from this source (network errors, 4xx/5xx responses, etc.).
Suggestion: log/report rawURL (the unexpanded template) in error messages rather than expandedURL, keeping the template in logs and the expanded value only in memory for the actual request.

    }
    defer resp.Body.Close()

    body, err := io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes))

[LOW] Silent truncation when response body hits 1 MiB limit — no diagnostic error
io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes)) silently stops reading at 1 MiB and returns a truncated (and likely broken) byte slice with no error. A subsequent json.Unmarshal or raw-token parse on the truncated data would fail with a confusing message that hides the real cause.

    body, err := io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes))
    // err is nil even if the body was truncated at exactly 1 MiB

Suggestion: after reading, compare len(body) to maxTokenResponseBytes and return an explicit error such as "response from %q exceeded %d bytes; check identity_token.url endpoint" before attempting to parse the token.
Closes #2640.
Summary
Adds support for Anthropic Workload Identity Federation (WIF) so that agents can authenticate Claude API requests with short-lived OIDC-derived tokens instead of a long-lived ANTHROPIC_API_KEY.

The feature is configured with a typed auth: block on a ProviderConfig or ModelConfig (model-level wins, otherwise inherited from the provider).

identity_token supports four mutually exclusive sources, covering the common runtimes:

- file
- env
- command (gcloud auth print-identity-token, az account get-access-token, …)
- url

url supports ${VAR} expansion in both the URL and any header values, so the GitHub Actions OIDC endpoint can be wired without baking secrets into YAML.
A complete walkthrough of all four sources lives in examples/anthropic_wif.yaml.

How it works
The official anthropic-sdk-go (v1.39+) already exposes option.WithFederationTokenProvider(IdentityTokenFunc, FederationOptions), so no token-exchange / RFC 7523 wire format had to be implemented locally.

- pkg/config/latest/auth.go: typed AuthConfig / FederationAuthConfig / IdentityTokenSourceConfig with strict validation (prefix checks, exactly-one-source). Plus EffectiveAuth, EffectiveProviderType, and (*AuthConfig).EnvVars() helpers, reused by gather/availability/runtime.
- pkg/model/provider/anthropic/federation/: turns the typed config into option.RequestOptions. Wraps token-source errors with a clear "anthropic workload identity federation: failed to refresh identity token from <kind> source (federation_rule=fdrl_…): <cause>" message that flows through the existing runtime ErrorEvent path, so refresh failures appear in the TUI immediately.
- pkg/model/provider/anthropic/client.go: single buildDirectAuthOptions helper picks between WIF and the legacy ANTHROPIC_API_KEY path; rejects combining auth: with --gateway.
- pkg/config/gather.go, pkg/runtime/model_switcher.go: auto-detection / required-env / availability checks understand WIF: a model with WIF auth no longer requires ANTHROPIC_API_KEY, and the env vars referenced by the chosen identity-token source (env name, ${VAR} references in URL / headers) are surfaced instead.
- pkg/model/provider/defaults.go: propagates provider-level Auth to models that don't override it.
- agent-schema.json: JSON Schema for the new types, with oneOf for the four token sources and ^fdrl_ / ^svac_ regex constraints.
- docs/providers/anthropic/index.md: new "Workload Identity Federation" section.
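To illustrate how the env vars referenced by a url source might be surfaced, the sketch below collects variable names from URL and header templates. It is not the PR's EnvVars() implementation: referencedEnvVars is a hypothetical helper, and os.Expand is used as a stand-in parser (it also recognises the bare $VAR form, which the PR may not). ACTIONS_ID_TOKEN_REQUEST_TOKEN is the companion variable GitHub Actions provides alongside the request URL; the audience value is a placeholder.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// referencedEnvVars returns the sorted, de-duplicated names of all
// variables referenced in the given templates.
func referencedEnvVars(templates ...string) []string {
	seen := make(map[string]struct{})
	for _, t := range templates {
		// The mapping function records each name; the expanded result
		// is discarded because only the names matter here.
		os.Expand(t, func(name string) string {
			seen[name] = struct{}{}
			return ""
		})
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	vars := referencedEnvVars(
		"${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=example", // URL template
		"Bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}",         // header value template
	)
	fmt.Println(vars)
}
```

Listing these names up front lets an availability check report which variables are missing before any request is attempted.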
Defensive behaviour worth calling out
The url source ships with three pieces of belt-and-braces hardening (against hostile or misconfigured token endpoints):

- Redirects are not followed. Go strips Authorization/Cookie on cross-origin redirects but not arbitrary user-defined headers, so a redirect from the configured endpoint could leak a header secret (e.g. X-OIDC-Token: ${SECRET}) to an attacker-controlled host. Redirects surface as non-2xx errors.
- The response body is capped at 1 MiB, which prevents OOM from a hostile endpoint streaming an unbounded body.
- Each request gets a 30s timeout on top of the caller's context. The command source has an analogous 30s timeout.

UTF-8-aware truncation is used for non-2xx error snippets so we never produce invalid UTF-8 in error messages.
Tests
- pkg/config/latest/auth_test.go: full validation matrix (incl. custom-provider indirection, both positive and negative cases).
- pkg/model/provider/anthropic/federation/federation_test.go: file/env/command/url sources, ${VAR} expansion, JSON-field extraction, redirect refusal, body-size cap, UTF-8-safe error truncation.
- pkg/model/provider/anthropic/auth_test.go: NewClient covers WIF success, broken config, unknown auth type, and the gateway-mutex.
- pkg/config/gather_auth_test.go: env-var detection across all source kinds and provider inheritance.
- pkg/runtime/model_switcher_test.go: availability surfaces anthropic for both model- and provider-level WIF, and stays unavailable for plain configs without ANTHROPIC_API_KEY.
- pkg/model/provider/model_defaults_test.go: Auth inheritance from ProviderConfig with model-override precedence.

mise dev (lint + test + build) green.

Out of scope
- The Vertex path (pkg/model/provider/anthropic/vertex.go) is left unchanged; it has its own auth flow.
- Endpoints configured with provider: anthropic + base_url: may or may not understand Bearer sk-ant-oat01-… tokens; WIF is intended for api.anthropic.com.

Commits

- feat: add Anthropic Workload Identity Federation support
- refactor: simplify Anthropic WIF wiring
- fix: tighten Anthropic WIF validation and harden urlSource