Add transport URL secret rotation with consumer finalizer #663

Open
lmiccini wants to merge 1 commit into openstack-k8s-operators:main from lmiccini:finalize-secret-rotation

@lmiccini lmiccini commented Jun 23, 2026

Contributor

When infra-operator rotates a RabbitMQ transport URL (creating a new secret and user), consumer operators must hold a consumer finalizer on the old secret until all their pods have rolled out with the new credentials. Without this, infra-operator cleans up the old RabbitMQ user while pods are still connected with old credentials, causing message bus outages.

Design:

  1. Add consumer finalizer to the current transport URL secret early in reconcile. Set instance.Status.TransportURLSecret for first-time setup only (when empty or unchanged); during rotation the status is updated solely by FinalizeSecretRotation at end of reconcile.

  2. Pass transportURL.Status.SecretName directly to sub-CR creation functions and config generation as a parameter — never read from instance.Status.TransportURLSecret for sub-CR specs.

  3. Use statefulset.IsReady() / deployment.IsReady() from lib-common in all sub-CR controllers for accurate rollout status.

  4. Use object.ManageRotationGracePeriod() to enforce a 60-second grace period before evaluating the rotation guard. This gives sub-CRs time to detect config changes, update their workloads, and roll pods — without relying on informer cache freshness.

  5. Guard: CredentialRotationGuardReady(true, conditions) — evaluates AllSubConditionIsTrue after the grace period expires. Only when all sub-CR conditions are True does FinalizeSecretRotation remove the consumer finalizer from the old secret.

  6. Delete path: reconcileDelete now cross-references TransportURL CRs directly (by name) as a fallback when removing consumer finalizers, handling the edge case where status was not yet updated before a crash. This matches the pattern used by manila, heat, and barbican.
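The delete-path fallback in item 6 can be pictured as taking the union of the secret recorded in status and the secret names read directly from the TransportURL CRs. The sketch below is only an illustration under that assumption: `secretsToRelease` and its parameters are hypothetical names, not functions from this PR or from lib-common, and the real code operates on Kubernetes objects rather than plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// secretsToRelease is a hypothetical helper (not part of this PR).
// It returns the de-duplicated, sorted names of secrets whose consumer
// finalizer should be removed during delete.
//
// statusSecret is the name last recorded in status; it may be empty or
// stale if the operator crashed before updating status.
// transportURLSecrets are names read directly from TransportURL CRs
// looked up by name, which covers that crash window.
func secretsToRelease(statusSecret string, transportURLSecrets []string) []string {
	seen := map[string]struct{}{}
	if statusSecret != "" {
		seen[statusSecret] = struct{}{}
	}
	for _, name := range transportURLSecrets {
		if name != "" {
			seen[name] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Status still points at the old secret, the TransportURL at the new one:
	// both need their consumer finalizer removed.
	fmt.Println(secretsToRelease("transport-old", []string{"transport-new"}))
	// Status was never written: the TransportURL lookup is the only source.
	fmt.Println(secretsToRelease("", []string{"transport-new"}))
}
```

Taking the union rather than trusting status alone means a stale or empty status cannot leave a finalizer behind on a secret the TransportURL still references.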

The same pattern applies to notification transport URL secrets and application credential secrets where applicable.

@openshift-ci openshift-ci Bot requested review from abays and stuggi June 23, 2026 10:53
@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lmiccini
Once this PR has been reviewed and has the lgtm label, please assign abays for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lmiccini lmiccini force-pushed the finalize-secret-rotation branch from 5897bda to 4f0979f Compare June 24, 2026 12:09
@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/59a2745820f04da09ce53e84225a8c6b

openstack-k8s-operators-content-provider FAILURE in 7m 40s
⚠️ cinder-operator-kuttl SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cinder-operator-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@lmiccini lmiccini force-pushed the finalize-secret-rotation branch from 4f0979f to 2f9c5d8 Compare June 24, 2026 12:23
When infra-operator rotates a RabbitMQ transport URL (creating a new
secret and user), consumer operators must hold a consumer finalizer on
the old secret until all their pods have rolled out with the new
credentials. Without this, infra-operator cleans up the old RabbitMQ
user while pods are still connected with old credentials, causing
message bus outages.

Design:

1. Add consumer finalizer to the current transport URL secret early
   in reconcile. Set instance.Status.TransportURLSecret for first-time
   setup only (when empty or unchanged); during rotation the status
   is updated solely by FinalizeSecretRotation at end of reconcile.

2. Pass transportURL.Status.SecretName directly to sub-CR creation
   functions and config generation as a parameter — never read from
   instance.Status.TransportURLSecret for sub-CR specs.

3. Use statefulset.IsReady() / deployment.IsReady() from lib-common
   in all sub-CR controllers for accurate rollout status.

4. Use object.ManageRotationGracePeriod() to enforce a 60-second
   grace period before evaluating the rotation guard. This gives
   sub-CRs time to detect config changes, update their workloads,
   and roll pods — without relying on informer cache freshness.

5. Guard: CredentialRotationGuardReady(true, conditions) — evaluates
   AllSubConditionIsTrue after the grace period expires. Only when
   all sub-CR conditions are True does FinalizeSecretRotation remove
   the consumer finalizer from the old secret.

The same pattern applies to notification transport URL secrets and
application credential secrets where applicable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lmiccini lmiccini force-pushed the finalize-secret-rotation branch from 2f9c5d8 to fd6bdd2 Compare June 24, 2026 13:31
@openshift-ci

openshift-ci Bot commented Jun 27, 2026

Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
