
feat: Anthropic Workload Identity Federation #2658

Open
dgageot wants to merge 3 commits into docker:main from dgageot:feat/anthropic-wif

Conversation

@dgageot
Member

@dgageot dgageot commented May 6, 2026

Closes #2640.

Summary

Adds support for Anthropic Workload Identity Federation (WIF) so that
agents can authenticate Claude API requests with short-lived OIDC-derived
tokens instead of a long-lived ANTHROPIC_API_KEY.

The feature is configured with a typed auth: block on a ProviderConfig
or ModelConfig (model-level wins, otherwise inherited from the provider):

providers:
  anthropic-wif:
    provider: anthropic
    auth:
      type: workload_identity_federation
      workload_identity_federation:
        federation_rule_id: fdrl_REPLACE_ME
        organization_id: 00000000-0000-0000-0000-000000000000
        # service_account_id: svac_...   # optional, target_type=SERVICE_ACCOUNT only
        identity_token:
          # Pick exactly one of: file, env, command, url
          file: /var/run/secrets/anthropic.com/token

models:
  claude:
    provider: anthropic-wif
    model: claude-sonnet-4-5

identity_token supports four mutually exclusive sources, covering the common
runtimes:

| Source  | When to use |
|---------|-------------|
| file    | Kubernetes projected SA tokens, SPIFFE/SPIRE helpers, Vault sidecars (anything that rotates a file on disk) |
| env     | The token is already exported in an environment variable |
| command | Shell out on every refresh (gcloud auth print-identity-token, az account get-access-token, …) |
| url     | Fetch from an HTTP(S) endpoint (cloud metadata servers, GitHub Actions OIDC token endpoint, …) |
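
For reviewers who want to picture the "exactly one source" rule, here is a minimal sketch of how such a check could look. The `IdentityTokenSource` type and its `Kind` method are hypothetical stand-ins, not the PR's actual `IdentityTokenSourceConfig` code; only the four YAML keys come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// IdentityTokenSource is a hypothetical stand-in for the PR's
// IdentityTokenSourceConfig; the fields mirror the four YAML keys.
type IdentityTokenSource struct {
	File    string
	Env     string
	Command string
	URL     string
}

// Kind returns the name of the single configured source, or an error when
// zero or more than one source is set.
func (s IdentityTokenSource) Kind() (string, error) {
	var set []string
	if s.File != "" {
		set = append(set, "file")
	}
	if s.Env != "" {
		set = append(set, "env")
	}
	if s.Command != "" {
		set = append(set, "command")
	}
	if s.URL != "" {
		set = append(set, "url")
	}
	switch len(set) {
	case 0:
		return "", errors.New("identity_token: one of file, env, command, url is required")
	case 1:
		return set[0], nil
	default:
		return "", fmt.Errorf("identity_token: sources are mutually exclusive, got %v", set)
	}
}

func main() {
	kind, err := IdentityTokenSource{File: "/var/run/secrets/anthropic.com/token"}.Kind()
	fmt.Println(kind, err)

	_, err = IdentityTokenSource{File: "/tmp/token", Env: "OIDC_TOKEN"}.Kind()
	fmt.Println(err)
}
```

Returning the source kind from the same function keeps the validation and the later dispatch on a single code path, which is one way to avoid the two drifting apart.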

url supports ${VAR} expansion in both the URL and any header values, so
the GitHub Actions OIDC endpoint can be wired without baking secrets into
YAML:

identity_token:
  url: ${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=https://api.anthropic.com
  headers:
    Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}
  response_field: value

A complete walkthrough of all four sources lives in
examples/anthropic_wif.yaml.
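
As an illustration of the ${VAR} expansion described above, the sketch below uses the standard library's `os.Expand` with an injected lookup function. This is an assumption about the general technique, not the PR's implementation (the review mentions an `environment.Expand` helper whose behaviour is not shown here). The `expand` function and the example URL are hypothetical. Note that `os.Expand` also accepts the bare `$VAR` form.

```go
package main

import (
	"fmt"
	"os"
)

// expand substitutes ${VAR} references in s using lookup and returns the
// names of referenced variables that lookup could not resolve.
// Unresolved references expand to the empty string.
func expand(s string, lookup func(string) (string, bool)) (string, []string) {
	var missing []string
	out := os.Expand(s, func(name string) string {
		v, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
		}
		return v
	})
	return out, missing
}

func main() {
	// Hypothetical environment: only the request URL is set.
	env := map[string]string{
		"ACTIONS_ID_TOKEN_REQUEST_URL": "https://token.example/oidc?api-version=2",
	}
	lookup := func(name string) (string, bool) {
		v, ok := env[name]
		return v, ok
	}

	tokenURL, missing := expand("${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=https://api.anthropic.com", lookup)
	fmt.Println(tokenURL, missing)

	header, missing := expand("bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}", lookup)
	fmt.Printf("%q %v\n", header, missing)
}
```

Collecting the unresolved names is also a convenient way to report which environment variables a url source depends on.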

How it works

The official anthropic-sdk-go (v1.39+) already exposes
option.WithFederationTokenProvider(IdentityTokenFunc, FederationOptions),
so no token-exchange / RFC 7523 wire format had to be implemented locally.

  • pkg/config/latest/auth.go — typed AuthConfig / FederationAuthConfig
    / IdentityTokenSourceConfig with strict validation (prefix checks,
    exactly-one-source). Plus EffectiveAuth, EffectiveProviderType, and
    (*AuthConfig).EnvVars() helpers, reused by gather/availability/runtime.
  • pkg/model/provider/anthropic/federation/ — turns the typed config into
    option.RequestOptions. Wraps token-source errors with a clear
    anthropic workload identity federation: failed to refresh identity token from <kind> source (federation_rule=fdrl_…): <cause> message that
    flows through the existing runtime ErrorEvent path, so refresh failures
    appear in the TUI immediately.
  • pkg/model/provider/anthropic/client.go — single buildDirectAuthOptions
    helper picks between WIF and the legacy ANTHROPIC_API_KEY path; rejects
    combining auth: with --gateway.
  • pkg/config/gather.go, pkg/runtime/model_switcher.go — auto-detection /
    required-env / availability checks understand WIF: a model with WIF auth
    no longer requires ANTHROPIC_API_KEY, and the env vars referenced by
    the chosen identity-token source (env name, ${VAR} references in URL /
    headers) are surfaced instead.
  • pkg/model/provider/defaults.go — propagates provider-level Auth to
    models that don't override it.
  • agent-schema.json — JSON Schema for the new types, with oneOf for the
    four token sources and ^fdrl_ / ^svac_ regex constraints.
  • docs/providers/anthropic/index.md — new "Workload Identity Federation"
    section.
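
The precedence rule (model-level auth wins, otherwise inherit from the provider) and the custom-provider indirection can be sketched as follows. The types and the lowercase helper names are simplified, hypothetical stand-ins for the PR's `EffectiveAuth` and `EffectiveProviderType`, whose real signatures are not shown in this description.

```go
package main

import "fmt"

// Simplified, hypothetical versions of the config types.
type AuthConfig struct{ Type string }

type ProviderConfig struct {
	Provider string // underlying provider type, e.g. "anthropic"
	Auth     *AuthConfig
}

type ModelConfig struct {
	Provider string // provider type or a key into the providers map
	Auth     *AuthConfig
}

// effectiveAuth returns the model-level auth block when present and
// otherwise falls back to the auth block of the referenced provider.
func effectiveAuth(m ModelConfig, providers map[string]ProviderConfig) *AuthConfig {
	if m.Auth != nil {
		return m.Auth
	}
	if p, ok := providers[m.Provider]; ok {
		return p.Auth
	}
	return nil
}

// effectiveProviderType resolves a custom provider key such as
// "anthropic-wif" to its underlying provider type.
func effectiveProviderType(m ModelConfig, providers map[string]ProviderConfig) string {
	if p, ok := providers[m.Provider]; ok && p.Provider != "" {
		return p.Provider
	}
	return m.Provider
}

func main() {
	providers := map[string]ProviderConfig{
		"anthropic-wif": {
			Provider: "anthropic",
			Auth:     &AuthConfig{Type: "workload_identity_federation"},
		},
	}
	model := ModelConfig{Provider: "anthropic-wif"}

	fmt.Println(effectiveProviderType(model, providers))
	fmt.Println(effectiveAuth(model, providers).Type)
}
```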

Defensive behaviour worth calling out

The url source ships with three pieces of belt-and-braces hardening
(against hostile or misconfigured token endpoints):

  • No redirect following. Go strips Authorization/Cookie on
    cross-origin redirects but not arbitrary user-defined headers, so a
    redirect from the configured endpoint could leak a header secret (e.g.
    X-OIDC-Token: ${SECRET}) to an attacker-controlled host. Redirects
    surface as non-2xx errors.
  • Bounded response body (1 MiB). JWTs are well under 16 KiB; the cap
    prevents OOM from a hostile endpoint streaming an unbounded body.
  • 30s request timeout layered on top of the SDK-supplied context.

The command source has an analogous 30s timeout. Error snippets taken from
non-2xx responses are truncated on UTF-8 boundaries, so error messages never
contain invalid UTF-8.

Tests

  • pkg/config/latest/auth_test.go — full validation matrix (incl.
    custom-provider indirection both positive and negative cases).
  • pkg/model/provider/anthropic/federation/federation_test.go — file/env/
    command/url sources, ${VAR} expansion, JSON-field extraction, redirect
    refusal, body-size cap, UTF-8-safe error truncation.
  • pkg/model/provider/anthropic/auth_test.go: NewClient covers WIF
    success, broken config, unknown auth type, and the gateway-mutex.
  • pkg/config/gather_auth_test.go — env-var detection across all source
    kinds and provider inheritance.
  • pkg/runtime/model_switcher_test.go — availability surfaces anthropic
    for both model- and provider-level WIF, and stays unavailable for plain
    configs without ANTHROPIC_API_KEY.
  • pkg/model/provider/model_defaults_test.go: Auth inheritance from
    ProviderConfig with model-override precedence.

mise dev (lint + test + build) green.

Out of scope

  • Vertex AI client (pkg/model/provider/anthropic/vertex.go) is left
    unchanged — it has its own auth flow.
  • Custom Anthropic-compatible gateways behind provider: anthropic +
    base_url: may or may not understand Bearer sk-ant-oat01-… tokens;
    WIF is intended for api.anthropic.com.

Commits

  1. feat: add Anthropic Workload Identity Federation support
  2. refactor: simplify Anthropic WIF wiring
  3. fix: tighten Anthropic WIF validation and harden urlSource

dgageot added 3 commits May 6, 2026 10:11
- validate.go now resolves the effective provider type through the providers map, so a model with provider: <custom-key> and a model-level auth: block validates against the underlying provider type (was incorrectly rejecting valid configs).

- urlSource: cap response body at 1 MiB, refuse to follow redirects (Go strips Authorization on cross-origin redirects but not arbitrary user-defined headers, which would leak header secrets), and apply a 30s request timeout as defence in depth on top of the caller's context.

- commandSource: apply a 30s timeout for the same reason.

- Error-snippet truncation in urlSource is now rune-aware (no mid-codepoint splits).

- Added regression tests for all of the above.
@dgageot dgageot requested a review from a team as a code owner May 6, 2026 12:00

@docker-agent docker-agent Bot left a comment


Assessment: 🟡 NEEDS ATTENTION

Reviewed feat: Anthropic Workload Identity Federation (#2658).

The overall implementation is well-structured with good defensive measures (no-redirect HTTP client, 1 MiB body cap, 30 s timeouts, context propagation, UTF-8-safe error truncation). Validation logic, provider inheritance, and the EffectiveAuth/EffectiveProviderType helpers are correct.

Two medium-confidence issues worth addressing before merge, plus one low-severity diagnostic gap:

| # | Severity  | File | Issue |
|---|-----------|------|-------|
| 1 | 🟡 MEDIUM |      | Nil dereference if cfg.Auth.Federation is nil (bypasses YAML validation) |
| 2 | 🟡 MEDIUM |      | Expanded URL (may contain secret values) logged verbatim in errors |
| 3 | ⚪ LOW    |      | Silent body truncation at 1 MiB LimitReader boundary |

return nil, fmt.Errorf("anthropic: unsupported auth.type %q", cfg.Auth.Type)
}
slog.Debug("Anthropic Workload Identity Federation configured",
"federation_rule_id", cfg.Auth.Federation.FederationRuleID)

[MEDIUM] Potential nil dereference on cfg.Auth.Federation before nil guard

cfg.Auth.Federation.FederationRuleID is accessed at line 150 inside the block that checks cfg.Auth != nil and cfg.Auth.Type == "workload_identity_federation", but there is no intermediate nil check on cfg.Auth.Federation itself.

slog.Debug("Anthropic Workload Identity Federation configured",
    "federation_rule_id", cfg.Auth.Federation.FederationRuleID)  // panics if Federation == nil

Config loaded from YAML goes through FederationAuthConfig.validate() which correctly rejects a nil Federation block, so the normal YAML path is safe. However, any code that constructs a ModelConfig programmatically (tests, future callers, fuzz inputs) with Auth = &AuthConfig{Type: "workload_identity_federation", Federation: nil} will reach this line and panic.

Suggestion: add a guard or assert cfg.Auth.Federation != nil before the slog.Debug call (the same guard already exists in federation.RequestOptions implicitly via field access, so a simple nil check with a returned error would be consistent with the pattern used elsewhere in this function).
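
A guard along the lines suggested could look like the sketch below. The types are trimmed-down, hypothetical versions of the PR's `AuthConfig` and `FederationAuthConfig`, and `federationRuleID` is an illustrative helper rather than a function in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed-down, hypothetical versions of the config types.
type FederationAuthConfig struct{ FederationRuleID string }

type AuthConfig struct {
	Type       string
	Federation *FederationAuthConfig
}

// federationRuleID returns an error instead of panicking when a caller
// built the config programmatically and left the Federation block nil.
func federationRuleID(auth *AuthConfig) (string, error) {
	if auth == nil || auth.Federation == nil {
		return "", errors.New("anthropic: auth.workload_identity_federation block is missing")
	}
	return auth.Federation.FederationRuleID, nil
}

func main() {
	_, err := federationRuleID(&AuthConfig{Type: "workload_identity_federation"})
	fmt.Println(err)

	id, _ := federationRuleID(&AuthConfig{
		Type:       "workload_identity_federation",
		Federation: &FederationAuthConfig{FederationRuleID: "fdrl_REPLACE_ME"},
	})
	fmt.Println(id)
}
```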

defer cancel()
req, err := http.NewRequestWithContext(ctx, http.MethodGet, expandedURL, http.NoBody)
if err != nil {
return "", fmt.Errorf("build request for %q: %w", expandedURL, err)

[MEDIUM] Fully-expanded URL (may contain secret values) logged verbatim in error messages

After environment.Expand substitutes env-var values into rawURL, the resulting expandedURL string is embedded verbatim in multiple error messages:

return "", fmt.Errorf("build request for %q: %w", expandedURL, err)
return "", fmt.Errorf("fetch %q: %w", expandedURL, err)
return "", fmt.Errorf("fetch %q: status %d: %s", expandedURL, resp.StatusCode, ...)

These errors propagate through the SDK's ErrorEvent path and appear in the TUI and logs. The documented GitHub Actions pattern (${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=...) only expands a URL — not a secret — so the common case is harmless. But the implementation places no restriction on embedding secrets in the URL, and if a user writes url: https://endpoint.com/token?secret=${MY_SECRET}, the expanded secret will appear in every error message from this source (network errors, 4xx/5xx responses, etc.).

Suggestion: log/report rawURL (the unexpanded template) in error messages rather than expandedURL, keeping the template in logs and the expanded value only in memory for the actual request.

}
defer resp.Body.Close()

body, err := io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes))

[LOW] Silent truncation when response body hits 1 MiB limit — no diagnostic error

io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes)) silently stops reading at 1 MiB and returns a truncated (and likely broken) byte slice with no error. A subsequent json.Unmarshal or raw-token parse on the truncated data would fail with a confusing message that hides the real cause.

body, err := io.ReadAll(io.LimitReader(resp.Body, maxTokenResponseBytes))
// err is nil even if the body was truncated at exactly 1 MiB

Suggestion: after reading, compare len(body) to maxTokenResponseBytes and return an explicit error such as "response from %q exceeded %d bytes; check identity_token.url endpoint" before attempting to parse the token.

@rumpl
Member

rumpl commented May 6, 2026

conflicts


Development

Successfully merging this pull request may close these issues.

[anthropic] Workload Identity Federation support

2 participants