[SAP] Implement graceful shutdown for cinder services by hemna · Pull Request #314 · sapcc/cinder

hemna · 2026-02-20T20:05:32Z

[SAP] Implement graceful shutdown for cinder services

Three-phase graceful shutdown that allows in-flight volume and backup operations to complete before the pod exits during Kubernetes rolling updates. Covers both cinder-volume and cinder-backup services.

The fix

Install SIG_IGN for SIGTERM/SIGINT/SIGHUP at the start of Service.stop().

cinder-volume runs as N forked child processes (one per backend) under oslo_service.ProcessLauncher. On SIGTERM, oslo_service's child handler calls SignalHandler.clear(), which resets all handlers to SIG_DFL. The parent then sends a second SIGTERM via os.kill(child_pid, SIGTERM). Without SIG_IGN, that second signal terminates the child immediately, even though it is still inside pool.waitall() waiting for in-flight RPC handlers to finish.
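A minimal sketch of the signal handling described above, using only the Python standard library on a POSIX system. The helper names `ignore_termination_signals` and `restore_handlers` are hypothetical and are not taken from the patch; the real change installs the handlers inside Service.stop().

```python
import os
import signal


def ignore_termination_signals():
    """Install SIG_IGN for the signals a launcher may re-send during drain.

    Returns the previous handlers so a caller (or a test) can restore them.
    Must be called from the main thread, as signal.signal() requires.
    """
    previous = {}
    for signum in (signal.SIGTERM, signal.SIGINT, signal.SIGHUP):
        previous[signum] = signal.signal(signum, signal.SIG_IGN)
    return previous


def restore_handlers(previous):
    """Put back the handlers saved by ignore_termination_signals()."""
    for signum, handler in previous.items():
        # signal.signal() may have returned None for a handler that was not
        # installed from Python; fall back to the default action in that case.
        signal.signal(signum, handler if handler is not None else signal.SIG_DFL)


if __name__ == "__main__":
    saved = ignore_termination_signals()
    # Simulate the parent launcher's second SIGTERM. With SIG_IGN installed
    # the process keeps running instead of dying in the middle of a drain.
    os.kill(os.getpid(), signal.SIGTERM)
    print("still draining after second SIGTERM")
    restore_handlers(saved)
```

Ignoring the signals rather than installing a no-op Python handler means the kernel discards them outright, so nothing has to run in the interpreter while the process is blocked waiting for handlers to finish.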

How it works

Phase 1: Skip consumer cancel. We do NOT call rpcserver.stop() because it triggers an eventlet socket race ("simultaneous read on fileno") that disrupts outbound HTTP/RPC connections used by in-flight ops. Broker-side deregistration happens automatically when the OS closes the AMQP socket at process exit. During the gap, heartbeat stops (scheduler reroutes within service_down_time) and @reject_if_draining rejects late arrivals.

Phase 2: Block in pool.waitall() until all in-flight RPC handler greenthreads complete their operations.

Phase 3: Stop coordination, call super().stop(), cleanup green thread pool. Process exits cleanly.
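The three phases can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in cinder/service.py: standard-library threads stand in for eventlet greenthreads, joining the workers stands in for pool.waitall(), and `DrainingService`, `ServiceDraining`, and `extend_volume` are hypothetical names. Only the decorator name `reject_if_draining` comes from the PR.

```python
import functools
import threading
import time


class ServiceDraining(Exception):
    """Raised to a caller whose request arrived after shutdown began."""


def reject_if_draining(method):
    """Refuse new work once the owning service has started draining."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.draining.is_set():
            raise ServiceDraining(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class DrainingService:
    """Hypothetical stand-in for a service with in-flight RPC handlers."""

    def __init__(self):
        self.draining = threading.Event()
        self._handlers = []
        self.completed = []

    @reject_if_draining
    def extend_volume(self, volume_id, seconds):
        # Each accepted request runs in its own worker, like an RPC handler.
        worker = threading.Thread(target=self._run, args=(volume_id, seconds))
        self._handlers.append(worker)
        worker.start()

    def _run(self, volume_id, seconds):
        time.sleep(seconds)  # placeholder for the real backend operation
        self.completed.append(volume_id)

    def stop(self):
        # Phase 1: leave the consumer alone, only flag the service as draining.
        self.draining.set()
        # Phase 2: block until every in-flight handler has finished.
        for worker in self._handlers:
            worker.join()
        # Phase 3: release remaining resources (coordination, pools, ...).
        self._handlers.clear()
```

A request accepted before stop() runs to completion, while one that arrives afterwards is rejected:

```python
service = DrainingService()
service.extend_volume("vol-1", 0.2)
service.stop()                       # returns only after vol-1 has finished
# service.extend_volume("vol-2", 0.2) would now raise ServiceDraining
```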

Supporting mechanisms

- Worker entry heartbeat (set_workers decorator): touches worker DB entries during operations via a spawned greenthread. Uses short interruptible sleeps (0.1s polling on the stop event) so the heartbeat exits within ~100ms when the operation completes, rather than blocking for up to 10s.
- do_cleanup freshness check: skips worker entries updated within service_down_time, only cleans up truly stale entries. Prevents new pod from resetting in-flight operations on a draining pod.
- Backup restore heartbeat: touches backup.updated_at every 10s, preventing new backup pod from triggering BackupRestoreCancel.
- Backup _cleanup_one_backup freshness check: skips recent backups in creating/restoring state.
- Backup _detach_device no-reraise: if detach fails during shutdown, log and continue. Data integrity preserved; dangling export cleaned on next startup.
- reject_if_draining decorator: unconditionally rejects new RPC calls on a draining service so scheduler routes to healthy backends.
- _add_to_threadpool / signal_shutdown / cleanup_threadpool: ThreadPoolManager uses eventlet GreenPool (cooperative greenthreads, same process) for async task dispatch. signal_shutdown() rejects new tasks; cleanup_threadpool() drains and re-creates the pool for potential service restart.
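The heartbeat and freshness-check mechanisms above might look something like the sketch below. It is a simplified illustration, not the code in cinder/objects/cleanable.py or cinder/manager.py: a standard-library thread replaces the spawned greenthread, and `run_with_heartbeat`, `touch`, and `is_stale` are hypothetical names. The 10s interval, the 0.1s poll, and service_down_time come from the description.

```python
import threading
import time
from datetime import datetime, timedelta, timezone


def run_with_heartbeat(operation, touch, interval=10.0, poll=0.1):
    """Run operation() while a background thread calls touch() every interval.

    The heartbeat waits in short poll-sized slices so it notices the stop
    event quickly instead of sleeping through a full interval.
    """
    stop = threading.Event()

    def beat():
        while not stop.is_set():
            touch()  # e.g. refresh updated_at on the worker entry
            deadline = time.monotonic() + interval
            while time.monotonic() < deadline:
                # Event.wait() returns True as soon as the event is set, so
                # the poll slice bounds how long the heartbeat can linger.
                if stop.wait(poll):
                    return

    heartbeat = threading.Thread(target=beat, daemon=True)
    heartbeat.start()
    try:
        return operation()
    finally:
        stop.set()
        heartbeat.join()


def is_stale(updated_at, service_down_time, now=None):
    """True when an entry has not been touched within service_down_time."""
    now = now or datetime.now(timezone.utc)
    return now - updated_at > timedelta(seconds=service_down_time)
```

The two pieces work together: as long as the draining pod keeps touching its entries, a cleanup pass on a newly started pod that applies a check like `is_stale` will leave those entries alone.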

Requires (separate changes)

- dumb-init --single-child on cinder-volume AND cinder-backup container commands
- terminationGracePeriodSeconds: 900 on pod spec

Test results (qa-de-1, 2026-05-22)

test_nova_cinder_interactions.py — 7/7 PASS:

| # | Test | Duration |
|---|---|---|
| T1 | attach with ConnectorRejected baseline | 195s |
| T2 | kill OTHER-side pod during initialize_connection | 283s |
| T3 | kill VM-side pod during cross-shard FCD relocate | 329s |
| T4 | kill VM-side pod during Nova retry attach | 309s |
| T9 | operator migrate of attached volume (baseline) | 213s |
| T10 | kill SOURCE pod during operator migrate (canonical SIG_IGN reproducer) | 285s |
| T11 | kill DEST pod during operator migrate | 238s |

test_graceful_shutdown.py — single-op tests 6/6 PASS on KVM (premium volume type):

| Test | Duration |
|---|---|
| volume create (from image) | 20s |
| volume delete | 74s |
| volume extend (16→32 GB) | 68s |
| volume clone | 67s |
| snapshot create | 66s |
| snapshot delete | 87s |

test_graceful_shutdown.py — concurrency suite 5/6 PASS on VMware:

| Test | Duration | Ops on killed pod |
|---|---|---|
| sameop_create_x3 | 215s | 3/3 |
| sameop_extend_x3 | 202s | 3/3 |
| sameop_clone_x3 | 524s | 3/3 |
| mixed_create_extend_clone | 392s | 3/3 |
| mixed_create_extend_snapshot_delete | 336s | 3/3 |
| mixed_create3_extend1 | 222s | 3/4 (1 vCenter InvalidArgument flake) |

oslo.messaging companion patch — NOT required

Connection.stop_consuming() is never called in cinder's shutdown path. Verified empirically by adding [GS-AMQP] instrumentation — zero hits across all pod-kill logs. Full sweep passes identically with the patch enabled or disabled. The patch (review.opendev.org/986243) is orthogonal to this PR.

Files changed

| File | Purpose |
|---|---|
| `cinder/service.py` | Three-phase shutdown, SIG_IGN install, `_drain_pool()` with `pool.waitall()` |
| `cinder/volume/manager.py` | `reject_if_draining` decorator, `signal_shutdown()`, direct flow execution |
| `cinder/manager.py` | `ThreadPoolManager` (GreenPool + shutdown coordination), `do_cleanup` freshness check |
| `cinder/objects/cleanable.py` | Worker heartbeat greenthread in `set_workers` decorator (interruptible sleep) |
| `cinder/backup/manager.py` | Backup restore heartbeat + freshness check + no-reraise on detach |
| `cinder/volume/flows/manager/create_volume.py` | `CreateVolumeOnFinishTask` unconditional write |
| `cinder/opts.py` | `graceful_shutdown_timeout` registration |
| `cinder/tests/unit/test_manager.py` | ThreadPoolManager unit tests |
| `cinder/tests/unit/test_service.py` | Service shutdown unit tests + signal flag reset |
| `cinder/tests/unit/test.py` | `_common_cleanup` deregisters fake RPC endpoints |
| `cinder/tests/unit/volume/__init__.py` | `BaseVolumeTestCase` GreenPool drain in addCleanup |
| `cinder/tests/unit/backup/test_backup.py` | `BaseBackupTest` GreenPool drain in addCleanup |
| `cinder/tests/unit/test_volume_cleanup.py` | Sync threadpool + `service_down_time=0` for test determinism |
| `cinder/tests/unit/api/contrib/test_admin_actions.py` | Explicit fake RPC server teardown |

Three-phase graceful shutdown that allows in-flight volume and backup operations to complete before the pod exits during Kubernetes rolling updates. Covers both cinder-volume and cinder-backup services.

The fix: install SIG_IGN for SIGTERM/SIGINT/SIGHUP at the start of Service.stop(). cinder-volume runs as N forked child processes (one per backend) under oslo_service.ProcessLauncher. On SIGTERM, oslo_service's child handler calls SignalHandler.clear() which resets all handlers to SIG_DFL. The parent then sends a second SIGTERM via os.kill(child_pid, SIGTERM). Without SIG_IGN, the child terminates immediately, even though it's still inside pool.waitall() waiting for in-flight RPC handlers to finish.

How it works:

Phase 1: Skip consumer cancel. We do NOT call rpcserver.stop() because it triggers an eventlet socket race ("simultaneous read on fileno") that disrupts outbound HTTP/RPC connections used by in-flight ops. Broker-side deregistration happens automatically at process exit. During the gap, heartbeat stops (scheduler reroutes within service_down_time) and @reject_if_draining rejects late arrivals.

Phase 2: Block in pool.waitall() until all in-flight RPC handler greenthreads complete their operations.

Phase 3: Stop coordination, call super().stop(), cleanup threadpool executor. Process exits cleanly.

Supporting mechanisms:

- Worker entry heartbeat (set_workers decorator): touches worker DB entries every 10s during operations, preventing new pod init_host do_cleanup from resetting in-flight volumes to error. Uses short interruptible sleeps (0.1s polling) for fast shutdown response.
- do_cleanup freshness check: skips worker entries updated within service_down_time, only cleans up truly stale entries.
- Backup restore heartbeat: touches backup.updated_at every 10s, preventing new backup pod from triggering BackupRestoreCancel.
- Backup _detach_device no-reraise: if detach fails during shutdown, log and continue. Data integrity preserved.
- reject_if_draining decorator: unconditionally rejects new RPC calls on a draining service so scheduler routes to healthy backends.

Test adaptations:

- Base test classes (BaseVolumeTestCase, BaseBackupTest) drain the GreenPool in addCleanup to prevent greenthread leaks.
- test_volume_cleanup sets service_down_time=0 and runs threadpool tasks synchronously (mock_object) to avoid race conditions.
- test_admin_actions tearDown explicitly stops fake RPC servers to deregister stale endpoints from oslo.messaging's fake transport.
- test_service assertions updated to reflect intentional skip of rpcserver.stop() in graceful shutdown, and signal flag reset.
- test_backup_messages updated for _detach_device no-reraise behavior.

Requires (separate changes):

- dumb-init --single-child on container commands
- terminationGracePeriodSeconds: 900 on pod spec

Change-Id: Icdd28affc73fd34491b656a68410dce8e46264d4

hemna force-pushed the graceful-shutdown branch 3 times, most recently from 5a72074 to 1f69e00 Compare February 23, 2026 14:24

hemna force-pushed the graceful-shutdown branch 3 times, most recently from 4c183ec to d331087 Compare April 30, 2026 12:37

hemna force-pushed the graceful-shutdown branch 2 times, most recently from 1dd055a to d2fddd7 Compare May 14, 2026 22:25

hemna changed the title ~~[SAP] Try graceful shutdown~~ [SAP] Implement graceful shutdown for cinder services May 14, 2026

hemna force-pushed the graceful-shutdown branch 4 times, most recently from a235a58 to b77438d Compare May 14, 2026 22:41

Scsabiii previously approved these changes May 15, 2026

hemna dismissed Scsabiii’s stale review via 81fe034 May 15, 2026 14:21

hemna force-pushed the graceful-shutdown branch 7 times, most recently from d972062 to ea28218 Compare May 22, 2026 20:29

hemna requested a review from Scsabiii May 22, 2026 20:33

hemna force-pushed the graceful-shutdown branch 7 times, most recently from db7847d to 8e7da2c Compare May 28, 2026 13:53

hemna force-pushed the graceful-shutdown branch from 8e7da2c to db7847d Compare May 28, 2026 14:24

hemna mentioned this pull request May 28, 2026

[SAP] Implement graceful shutdown for cinder services #338

Open

hemna force-pushed the graceful-shutdown branch from db7847d to e5a3ed9 Compare June 4, 2026 12:25

hemna wants to merge 1 commit into stable/2023.1-m3 from graceful-shutdown
