Add IMDSv2 migration command for ROSA Classic clusters#909
Conversation
Implements automated IMDSv2 migration workflow with: - Pre-flight health checks (ClusterOperators, CPMS, nodes, etcd) - Hive MachinePool patching for infra/worker nodes - Sequential infra machine replacement with health validation - ControlPlaneMachineSet update for master node rollout - Post-migration validation Supports selective migration via --nodes flag (all, master, infra, workers). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 30 minutes and 54 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (4)
WalkthroughThis PR introduces ChangesIMDSv2 Migration Command
Sequence DiagramsequenceDiagram
participant User as User/CLI
participant Cmd as osdctl imdsv2
participant OCM as OCM API
participant K8s as Kubernetes API
participant Hive as Hive<br/>(MachinePools)
User->>Cmd: cluster imdsv2 --cluster-id=... --nodes=...
Cmd->>OCM: Fetch cluster environment and ID
Cmd->>K8s: Pre-flight checks (operators, CPMS, nodes, etcd)
K8s-->>Cmd: Health status
rect rgba(100, 200, 150, 0.5)
Cmd->>Hive: List MachinePools in Hive namespace
Hive-->>Cmd: MachinePool specs
Cmd->>Hive: Patch selected pools for IMDSv2
end
rect rgba(150, 150, 200, 0.5)
Cmd->>K8s: Delete infra Machines (sequential)
K8s-->>Cmd: Machines deleted
Cmd->>K8s: Poll for replacement Machines Running
K8s-->>Cmd: Replacements ready
Cmd->>K8s: Wait for infra nodes Ready
Cmd->>K8s: Verify cluster operator health
end
rect rgba(200, 150, 100, 0.5)
Cmd->>K8s: Get ControlPlaneMachineSet
Cmd->>Cmd: User confirmation (destructive)
Cmd->>K8s: Patch CPMS for IMDSv2
Cmd->>K8s: Monitor CPMS rollout
K8s-->>Cmd: Rollout complete
end
Cmd->>K8s: Final validation (nodes Ready, machines IMDSv2)
K8s-->>Cmd: Migration verified
Cmd->>User: Success/warnings
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (12 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: givaldolins The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cmd/cluster/imdsv2.go (1)
348-395: ⚖️ Poor tradeoffConsider using the Kubernetes client instead of shelling out to
oc.Shelling out to
ocintroduces a dependency on the CLI being installed and configured. The kubeconfig context used byocmay differ from the client initialized via backplane ininit(). Consider using theo.clientto fetchClusterOperatorresources directly via theconfig.openshift.io/v1API group.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/cluster/imdsv2.go` around lines 348 - 395, The checkClusterOperators function currently shells out to `oc` to list ClusterOperators; replace that with direct API calls using the existing o.client so the same kubeconfig/backplane client is used. Use the OpenShift config v1 ClusterOperator type (e.g., configv1.ClusterOperatorList) or a dynamic client list against "config.openshift.io/v1" to fetch all ClusterOperator resources with o.client.List(ctx, &list) (or equivalent), then iterate list.Items and inspect each item.Status.Conditions exactly as done now to build unhealthyOps, returning errors on client.List or unmarshalling failures as appropriate; keep the same final behavior/message and error formatting but remove the exec.CommandContext call and JSON unmarshal logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 728-754: The function monitorCPMSRollout creates a timeout context
from context.Background(), breaking cancellation propagation; change the context
creation to derive from the passed ctx (use context.WithTimeout(ctx,
imdsv2RolloutPollTimeout) and keep cancel deferred) so parent cancellations
(SIGINT or upstream cancellation) interrupt the polling loop; update the context
creation in monitorCPMSRollout where pollCtx and cancel are defined (and
continue to pass pollCtx into wait.PollUntilContextTimeout as before).
---
Nitpick comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 348-395: The checkClusterOperators function currently shells out
to `oc` to list ClusterOperators; replace that with direct API calls using the
existing o.client so the same kubeconfig/backplane client is used. Use the
OpenShift config v1 ClusterOperator type (e.g., configv1.ClusterOperatorList) or
a dynamic client list against "config.openshift.io/v1" to fetch all
ClusterOperator resources with o.client.List(ctx, &list) (or equivalent), then
iterate list.Items and inspect each item.Status.Conditions exactly as done now
to build unhealthyOps, returning errors on client.List or unmarshalling failures
as appropriate; keep the same final behavior/message and error formatting but
remove the exec.CommandContext call and JSON unmarshal logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 993b8d49-1d85-43e7-9f2d-d04b1a1555f0
📒 Files selected for processing (2)
cmd/cluster/cmd.gocmd/cluster/imdsv2.go
- Fix monitorCPMSRollout to derive timeout context from parent ctx instead of context.Background(), allowing proper cancellation propagation (SIGINT/upstream cancellation) - Replace checkClusterOperators oc shell-out with direct API calls using configv1.ClusterOperatorList and existing o.client - Register configv1 scheme in init() for ClusterOperator resources - Remove unused os/exec import Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
/retest |
|
@givaldolins: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Implements automated IMDSv2 migration workflow with:
Supports selective migration via --nodes flag (all, master, infra, workers).
This MR automates SOP
Summary by CodeRabbit
osdctl cluster imdsv2command to migrate ROSA Classic cluster nodes to require IMDSv2 authentication.