[SAP] Implement graceful shutdown for cinder services #314
Open
hemna wants to merge 1 commit into
Scsabiii
previously approved these changes
May 15, 2026
Three-phase graceful shutdown that allows in-flight volume and backup
operations to complete before the pod exits during Kubernetes rolling
updates. Covers both cinder-volume and cinder-backup services.
The fix: install SIG_IGN for SIGTERM/SIGINT/SIGHUP at the start of
Service.stop(). cinder-volume runs as N forked child processes (one per
backend) under oslo_service.ProcessLauncher. On SIGTERM, oslo_service's
child handler calls SignalHandler.clear() which resets all handlers to
SIG_DFL. The parent then sends a second SIGTERM via os.kill(child_pid,
SIGTERM). Without SIG_IGN, the child terminates immediately — even
though it's still inside pool.waitall() waiting for in-flight RPC
handlers to finish.
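
The signal handling described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this change: the helper names `ignore_shutdown_signals` and `restore_signals` are hypothetical, and the restore step exists only so the sketch leaves the interpreter as it found it. It assumes a POSIX platform (for SIGHUP) and that it is called from the main thread, which `signal.signal` requires.

```python
import signal

# The three signals named in the commit message.
SHUTDOWN_SIGNALS = (signal.SIGTERM, signal.SIGINT, signal.SIGHUP)


def ignore_shutdown_signals(signals=SHUTDOWN_SIGNALS):
    """Install SIG_IGN for each signal and return the previous handlers.

    Hypothetical helper: once SIG_IGN is installed, a repeated SIGTERM
    from the parent no longer terminates the process mid-drain.
    """
    previous = {}
    for signum in signals:
        # signal.signal() returns the handler that was installed before.
        previous[signum] = signal.signal(signum, signal.SIG_IGN)
    return previous


def restore_signals(previous):
    """Put back the handlers saved by ignore_shutdown_signals()."""
    for signum, handler in previous.items():
        # A handler that was not installed from Python is reported as
        # None; fall back to the default disposition in that case.
        signal.signal(signum, signal.SIG_DFL if handler is None else handler)
```

In a service, the first call would sit at the top of the stop path, before any blocking wait on in-flight work.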
How it works:
Phase 1: Skip consumer cancel. We do NOT call rpcserver.stop() because
it triggers an eventlet socket race ("simultaneous read on fileno")
that disrupts outbound HTTP/RPC connections used by in-flight ops.
Broker-side deregistration happens automatically at process exit.
During the gap, heartbeat stops (scheduler reroutes within
service_down_time) and @reject_if_draining rejects late arrivals.
Phase 2: Block in pool.waitall() until all in-flight RPC handler
greenthreads complete their operations.
Phase 3: Stop coordination, call super().stop(), clean up the threadpool
executor. The process exits cleanly.
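
As a rough illustration of the three phases, the sketch below substitutes a standard-library `ThreadPoolExecutor` for the eventlet GreenPool, so `concurrent.futures.wait()` plays the role of `pool.waitall()`. The class `DrainingService`, the exception `ServiceDraining`, and the body of `reject_if_draining` are assumptions made for this example; only the decorator's name and purpose come from the description above.

```python
import functools
import time
from concurrent.futures import ThreadPoolExecutor, wait


class ServiceDraining(Exception):
    """Hypothetical error for calls that arrive while draining."""


def reject_if_draining(method):
    """Refuse new work once the owning service has started to drain."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.draining:
            raise ServiceDraining(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class DrainingService:
    """Hypothetical stand-in for a cinder service (threads, not eventlet)."""

    def __init__(self):
        self.draining = False
        self.events = []
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._in_flight = []

    @reject_if_draining
    def create_volume(self, seconds):
        # Simulate a long-running operation in the worker pool.
        future = self._pool.submit(time.sleep, seconds)
        self._in_flight.append(future)
        return future

    def stop(self):
        # Phase 1: mark the service as draining; consumers are
        # deliberately left alone and late calls are rejected instead.
        self.draining = True
        self.events.append("draining")
        # Phase 2: block until every in-flight operation has finished.
        wait(self._in_flight)
        self.events.append("drained")
        # Phase 3: release the remaining resources.
        self._pool.shutdown(wait=True)
        self.events.append("stopped")
```

A call to `create_volume()` made after `stop()` has begun raises `ServiceDraining`, while work submitted earlier is allowed to run to completion.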
Supporting mechanisms:
- Worker entry heartbeat (set_workers decorator): touches worker DB
entries every 10s during operations, preventing new pod init_host
do_cleanup from resetting in-flight volumes to error. Uses short
interruptible sleeps (0.1s polling) for fast shutdown response.
- do_cleanup freshness check: skips worker entries updated within
service_down_time, only cleans up truly stale entries.
- Backup restore heartbeat: touches backup.updated_at every 10s,
preventing new backup pod from triggering BackupRestoreCancel.
- Backup _detach_device no-reraise: if detach fails during shutdown,
log and continue. Data integrity preserved.
- reject_if_draining decorator: unconditionally rejects new RPC calls
on a draining service so scheduler routes to healthy backends.
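
The heartbeat and the freshness check work as a pair, which the following sketch tries to show. It is an assumption-laden illustration: `WorkerEntry` is an in-memory stand-in for a worker DB row, `heartbeat` and `is_stale` are hypothetical names, and a `threading.Event` replaces whatever stop signal the real decorator uses. The 10s interval and 0.1s polling slice are the values quoted in the list above.

```python
import threading
import time
from datetime import datetime, timedelta, timezone


class WorkerEntry:
    """Hypothetical in-memory stand-in for a worker DB row."""

    def __init__(self):
        self.updated_at = datetime.now(timezone.utc)

    def touch(self):
        self.updated_at = datetime.now(timezone.utc)


def heartbeat(entry, stop_event, interval=10.0, poll=0.1):
    """Touch `entry` every `interval` seconds until `stop_event` is set."""
    while not stop_event.is_set():
        entry.touch()
        deadline = time.monotonic() + interval
        # Sleep in short slices so that setting the event is noticed
        # within roughly `poll` seconds instead of a full interval.
        while time.monotonic() < deadline:
            if stop_event.wait(poll):
                return


def is_stale(entry, service_down_time, now=None):
    """Return True if the entry was not touched within service_down_time."""
    now = now or datetime.now(timezone.utc)
    return now - entry.updated_at > timedelta(seconds=service_down_time)
```

A cleanup pass written against this sketch would skip every entry for which `is_stale()` returns False, leaving operations that are still sending heartbeats untouched.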
Test adaptations:
- Base test classes (BaseVolumeTestCase, BaseBackupTest) drain the
GreenPool in addCleanup to prevent greenthread leaks.
- test_volume_cleanup sets service_down_time=0 and runs threadpool
tasks synchronously (mock_object) to avoid race conditions.
- test_admin_actions tearDown explicitly stops fake RPC servers to
deregister stale endpoints from oslo.messaging's fake transport.
- test_service assertions updated to reflect intentional skip of
rpcserver.stop() in graceful shutdown, and signal flag reset.
- test_backup_messages updated for _detach_device no-reraise behavior.
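
The addCleanup pattern from the first item can be sketched as follows. The class names are hypothetical and a `ThreadPoolExecutor` stands in for the GreenPool, so this shows only the shape of the idea: register the drain in setUp so that it runs after every test, including failing ones.

```python
import unittest
from concurrent.futures import ThreadPoolExecutor


class PoolDrainingTestCase(unittest.TestCase):
    """Hypothetical base class; a thread pool stands in for the GreenPool."""

    def setUp(self):
        super().setUp()
        self.pool = ThreadPoolExecutor(max_workers=2)
        # Cleanups run after the test body even when it fails; this one
        # blocks until all submitted work is done, so no background task
        # leaks into the next test.
        self.addCleanup(self.pool.shutdown, wait=True)


class ExampleTest(PoolDrainingTestCase):
    def test_background_work_is_drained(self):
        # The test does not wait on the future; the cleanup does.
        self.future = self.pool.submit(sum, [1, 2, 3])
```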
Requires (separate changes):
- dumb-init --single-child on container commands
- terminationGracePeriodSeconds: 900 on pod spec
Change-Id: Icdd28affc73fd34491b656a68410dce8e46264d4
[SAP] Implement graceful shutdown for cinder services
Three-phase graceful shutdown that allows in-flight volume and backup operations to complete before the pod exits during Kubernetes rolling updates. Covers both cinder-volume and cinder-backup services.

The fix
Install SIG_IGN for SIGTERM/SIGINT/SIGHUP at the start of Service.stop(). cinder-volume runs as N forked child processes (one per backend) under oslo_service.ProcessLauncher. On SIGTERM, oslo_service's child handler calls SignalHandler.clear(), which resets all handlers to SIG_DFL. The parent then sends a second SIGTERM via os.kill(child_pid, SIGTERM). Without SIG_IGN, the child terminates immediately, even though it is still inside pool.waitall() waiting for in-flight RPC handlers to finish.

How it works
Phase 1: Skip consumer cancel. We do NOT call rpcserver.stop() because it triggers an eventlet socket race ("simultaneous read on fileno") that disrupts outbound HTTP/RPC connections used by in-flight ops. Broker-side deregistration happens automatically when the OS closes the AMQP socket at process exit. During the gap, heartbeat stops (scheduler reroutes within service_down_time) and @reject_if_draining rejects late arrivals.

Phase 2: Block in pool.waitall() until all in-flight RPC handler greenthreads complete their operations.

Phase 3: Stop coordination, call super().stop(), clean up the green thread pool. Process exits cleanly.

Supporting mechanisms
- Worker entry heartbeat (set_workers decorator): touches worker DB entries during operations via a spawned greenthread. Uses short interruptible sleeps (0.1s polling on the stop event) so the heartbeat exits within ~100ms when the operation completes, rather than blocking for up to 10s.
- do_cleanup freshness check: skips worker entries updated within service_down_time, only cleans up truly stale entries. Prevents a new pod from resetting in-flight operations on a draining pod.
- Backup restore heartbeat: touches backup.updated_at every 10s, preventing a new backup pod from triggering BackupRestoreCancel.
- _cleanup_one_backup freshness check: skips recent backups in creating/restoring state.
- _detach_device no-reraise: if detach fails during shutdown, log and continue. Data integrity preserved; dangling export cleaned on next startup.
- reject_if_draining decorator: unconditionally rejects new RPC calls on a draining service so the scheduler routes to healthy backends.
- _add_to_threadpool / signal_shutdown / cleanup_threadpool: ThreadPoolManager uses eventlet GreenPool (cooperative greenthreads, same process) for async task dispatch. signal_shutdown() rejects new tasks; cleanup_threadpool() drains and re-creates the pool for potential service restart.

Requires (separate changes)
- dumb-init --single-child on cinder-volume AND cinder-backup container commands
- terminationGracePeriodSeconds: 900 on pod spec

Test results (qa-de-1, 2026-05-22)
- test_nova_cinder_interactions.py: 7/7 PASS
- test_graceful_shutdown.py: single-op tests 6/6 PASS on KVM (premium volume type)
- test_graceful_shutdown.py: concurrency suite 5/6 PASS on VMware

oslo.messaging companion patch: NOT required
Connection.stop_consuming() is never called in cinder's shutdown path. Verified empirically by adding [GS-AMQP] instrumentation: zero hits across all pod-kill logs. The full sweep passes identically with the patch enabled or disabled. The patch (review.opendev.org/986243) is orthogonal to this PR.

Files changed
- cinder/service.py: _drain_pool() with pool.waitall()
- cinder/volume/manager.py: reject_if_draining decorator, signal_shutdown(), direct flow execution
- cinder/manager.py: ThreadPoolManager (GreenPool + shutdown coordination), do_cleanup freshness check
- cinder/objects/cleanable.py: set_workers decorator (interruptible sleep)
- cinder/backup/manager.py
- cinder/volume/flows/manager/create_volume.py: CreateVolumeOnFinishTask unconditional write
- cinder/opts.py: graceful_shutdown_timeout registration
- cinder/tests/unit/test_manager.py
- cinder/tests/unit/test_service.py
- cinder/tests/unit/test.py: _common_cleanup deregisters fake RPC endpoints
- cinder/tests/unit/volume/__init__.py: BaseVolumeTestCase GreenPool drain in addCleanup
- cinder/tests/unit/backup/test_backup.py: BaseBackupTest GreenPool drain in addCleanup
- cinder/tests/unit/test_volume_cleanup.py: service_down_time=0 for test determinism
- cinder/tests/unit/api/contrib/test_admin_actions.py
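
The ThreadPoolManager behaviour listed for cinder/manager.py (reject new tasks after signal_shutdown(), then drain and re-create the pool in cleanup_threadpool()) might look roughly like the sketch below. It is not the code from this change: a `ThreadPoolExecutor` replaces the eventlet GreenPool, the class name is hypothetical, and both the use of RuntimeError for rejected tasks and the reset of the shutdown flag during cleanup are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


class ThreadPoolManagerSketch:
    """Hypothetical illustration of shutdown coordination around a pool."""

    def __init__(self, size=4):
        self._size = size
        self._pool = ThreadPoolExecutor(max_workers=size)
        self._shutting_down = False

    def add_to_threadpool(self, func, *args, **kwargs):
        # Assumption: rejected tasks surface as an exception.
        if self._shutting_down:
            raise RuntimeError("shutting down; task rejected")
        return self._pool.submit(func, *args, **kwargs)

    def signal_shutdown(self):
        # From here on, add_to_threadpool() refuses new tasks.
        self._shutting_down = True

    def cleanup_threadpool(self):
        # Drain: wait for everything already submitted to finish.
        self._pool.shutdown(wait=True)
        # Re-create the pool so a restarted service can dispatch again.
        self._pool = ThreadPoolExecutor(max_workers=self._size)
        self._shutting_down = False
```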