feat: dynamic vault OIDC callback port and in-process token capture#910
Conversation
Read the dynamically-assigned host port from /tmp/vault_callback_port (written by ocm-container's ports feature) and pass it as callbackport to vault login, enabling multiple containers to run simultaneously. When re-authenticating inside a container, first try writing ~/.vault-token normally. If that fails (read-only bind mount or atomic rename issues), fall back to capturing the token in-process via the VAULT_TOKEN environment variable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WalkthroughRefactors Vault token refresh into smaller helpers: adds readCallbackPort(), builds container OIDC args with optional callback port, separates container/local setup, and implements a container fallback that captures the token from stdout when the token file cannot be written. ChangesVault Token Setup Refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/utils/vault_test.go`:
- Around line 132-145: The tests TestSetupVaultTokenLocal_ArgsShape and
TestSetupVaultTokenContainer_ArgsShape are no-ops because they only reference
setupVaultTokenLocal and setupVaultTokenContainer without executing or asserting
behavior; replace them by invoking the functions (or the helper that builds the
vault login command) with a test-injected command factory or argument-builder
stub and assert the produced command, arguments, and environment variables match
expected values for non-container and container modes respectively.
Specifically, update the tests to call the builders used by
setupVaultTokenLocal/setupVaultTokenContainer (or refactor those functions to
accept an injectable commandFactory/argBuilder), inject a fake commandFactory
that captures the command and args, then assert on captured command string, args
list and any env vars to verify the login command construction.
- Around line 42-62: The test never calls readCallbackPort() and instead
duplicates its file-read/trim logic; refactor so the parsing is testable and
call production code directly: either change readCallbackPort to accept a path
(e.g., readCallbackPort(path string) string/error) or extract the trimming logic
into a new function parseCallbackPort(content string) (or parseCallbackPort(r
io.Reader)), update readCallbackPort to use that helper, and then update
vault_test.go to create a temp file (or a string/reader) and call the production
parseCallbackPort/readCallbackPort with that input and assert results for both
present and missing-file cases; reference readCallbackPort and
vaultCallbackPortFile to locate the code to change.
In `@pkg/utils/vault.go`:
- Around line 103-115: setupVaultTokenLocal (and setupVaultTokenContainer)
currently call exec.Command without a context which can hang indefinitely;
change both functions to accept a context.Context parameter, create a child
context with a reasonable timeout using context.WithTimeout, and use
exec.CommandContext instead of exec.Command so the spawned 'vault login' process
is cancelled when the context times out or is cancelled; propagate the ctx from
callers into setupVaultTokenLocal/setupVaultTokenContainer, check for ctx.Err()
and return a wrapped error (including context timeout/cancel reason) when the
command fails, and ensure any stdout/stderr handling remains correct so the
process is terminated on timeout.
- Around line 155-178: The fallback to in-process token capture is executed
whenever loginCmd.Run() returns any error, but the code logs "Token file write
failed" and switches to containerOIDCArgs(true) even when the error was
unrelated to writing ~/.vault-token; update setupVaultTokenContainer to inspect
the original error from the first vault login (loginCmd.Run()) and only perform
the -no-store fallback (rebuilding loginArgs with containerOIDCArgs(true) and
re-running loginCmd with tokenBuf capture) when the error indicates a token-file
write/rename permission failure (e.g., syscall errors, permission denied, or
specific vault CLI stderr text), otherwise return the original error (including
its context) instead of silently switching to container-based auth; keep
existing logging but change the message to reflect that the fallback is
conditional and include the original error when returning.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 01ffb963-6cc3-4b78-a4e5-d32dce055fe5
📒 Files selected for processing (2)
pkg/utils/vault.gopkg/utils/vault_test.go
Replace file-writing tests with mocked readFileFunc. Extract buildOIDCArgs to accept callbackPort as a parameter so tests exercise arg-building logic without any I/O or API calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add context.WithTimeout (5 minutes) to all vault login exec.Command calls so stalled OIDC flows don't block indefinitely. Gate the -no-store fallback in setupVaultTokenContainer to only trigger on token file write errors (rename, device busy, read-only, permission denied). Auth/network/timeout failures now return immediately instead of silently retrying. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/label tide/merge-method-squash 🤖 Claude claude@anthropic.com commenting on behalf of @clcollins |
|
@clcollins: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| return fmt.Errorf("vault login timed out after %s", vaultLoginTimeout) | ||
| } else if !isTokenFileError(err) { | ||
| return fmt.Errorf("vault login failed: %v\n\n"+ | ||
| "If authentication timed out or the callback failed:\n"+ |
There was a problem hiding this comment.
Can you make this less ocm-container focused? Seems like it would be a general UX issue?
There was a problem hiding this comment.
@dustman9000 Same as the other one: this is an ocm-container only problem & this code isn't reachable by normal osdctl users outside of ocm-container.
| } | ||
| return fmt.Errorf("vault login failed: %v\n\n"+ | ||
| "If authentication timed out or the callback failed:\n"+ | ||
| " 1. Ensure ocm-container has the vault port enabled in the ports feature\n"+ |
There was a problem hiding this comment.
Same UX issue here, please make more generic.
There was a problem hiding this comment.
@dustman9000 this is an ocm-container only problem, and this function is not reachable unless you are in ocm-container. Normal osdctl users outside of ocm-container won't ever see this message.
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: clcollins, dustman9000 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary
/tmp/vault_callback_port(written by ocm-container's ports feature) and pass it ascallbackporttovault login -method=oidc~/.vault-tokennormally; if that fails (read-only bind mount or atomic rename issues), fall back to capturing the token in-process viaVAULT_TOKENenv varlaunch-optsapproachCompanion PR
Jira
ROSAENG-39831
Test plan
go test ./pkg/utils/ -v~/.vault-token:osdctl rhobs logsuses it directlyosdctl rhobs logstriggers OIDC re-auth/tmp/vault_callback_portpresent: uses dynamic callback port/tmp/vault_callback_port: falls back to default port 8250VAULT_TOKENcapture🤖 Generated with Claude Code
Summary by CodeRabbit
Refactor
Tests