feat: cleanup candidate reservations when confirming vm#871
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThe PR adds a Reservation field index keyed by allocation VM UUIDs, updates the controller to remove newly confirmed VM UUIDs from candidate reservations using that index, registers the index in test environments, and adds tests validating cleanup behavior. ChangesReservation Allocation Cleanup by VM UUID
Sequence Diagram(s)sequenceDiagram
participant Reconciler as reconcileAllocations
participant Status as Status.Allocations snapshot
participant Cleanup as cleanupCandidateReservations
participant Index as Reservation VM-UUID Index
participant Candidate as Candidate Reservation CRs
Reconciler->>Status: snapshot old allocations
Reconciler->>Reconciler: compute newStatusAllocations
Reconciler->>Cleanup: for each newly confirmed VM UUID
Cleanup->>Index: list reservations by VM UUID
Index-->>Cleanup: candidate reservations
Cleanup->>Candidate: patch Spec.CommittedResourceReservation.Allocations (remove VM UUID)
Candidate-->>Cleanup: patch result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/scheduling/reservations/commitments/reservation_controller.go`:
- Around line 398-403: The current logic copies
res.Status.CommittedResourceReservation.Allocations into
existingStatusAllocations and later uses presence in status to skip cleanup,
which allows a transient cleanup failure to be permanently skipped on the next
reconcile; change the reconcile/cleanup flow in reservation_controller.go so
cleanup of stale candidate allocations is not gated solely by membership in
res.Status.CommittedResourceReservation.Allocations: compare
existingStatusAllocations to the new patched allocations and only mark candidate
allocations removed when the cleanup step actually succeeded, and if the
patch/cleanup fails do not mutate status to record the VM as cleaned;
specifically update the code that reads/writes
res.Status.CommittedResourceReservation.Allocations and the cleanup branch (the
blocks around existingStatusAllocations and the cleanup calls referenced in your
comment) to always retry cleanup for VMs whose candidate allocations remain
stale until cleanup returns success (or add an explicit per-VM cleanup-needed
flag/finalizer in status that you clear only after successful cleanup).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8d9d8ac7-752f-4ae0-bbfc-99d65b9ce8b2
📒 Files selected for processing (5)
internal/scheduling/reservations/commitments/committed_resource_controller_test.gointernal/scheduling/reservations/commitments/field_index.gointernal/scheduling/reservations/commitments/integration_test.gointernal/scheduling/reservations/commitments/reservation_controller.gointernal/scheduling/reservations/commitments/reservation_controller_test.go
Test Coverage ReportTest Coverage 📊: 69.5% |
When a VM is confirmed onto a CommittedResource reservation, immediately
remove that VM's UUID from Spec.Allocations on all other candidate
reservations that still carry it. Previously those slots were only freed
after their grace period or the next periodic requeue, leaving transient
phantom blocks on non-selected hosts.