fix(orchestrator): fail DB creation job on actual errors instead of silently succeeding #407
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
Review Summary by Qodo
Fix DB creation job error handling and make backoffLimit configurable
Walkthroughs
Description
• Replace blanket error suppression with proper database creation error handling
• Distinguish between "database already exists" (acceptable) and real failures (exit non-zero)
• Make job backoffLimit configurable via dbCreationJobBackoffLimit in values.yaml
• Enable Kubernetes to properly retry on actual database creation failures
Diagram
flowchart LR
A["DB Creation Job"] -->|"Old: || echo WARNING"| B["Always Succeeds"]
A -->|"New: Check if exists"| C{"Database exists?"}
C -->|"Yes"| D["Skip & Succeed"]
C -->|"No"| E["Exit 1"]
E -->|"Retry via backoffLimit"| A
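The behavior change summarized above can be sketched in shell. This is a minimal illustration under stated assumptions, not the chart's actual script: the function name create_database is hypothetical, psql is assumed to pick up its connection settings from the usual PG* environment variables, and the database name is assumed to be a trusted value. The two log messages are the ones quoted in this PR's test notes.

```shell
# Hypothetical sketch: create the database unless it already exists,
# and return non-zero on any real failure so the Job is marked failed.
create_database() {
  db_name="$1"

  # -t (tuples only) and -A (unaligned) make psql print a bare "1"
  # when a row for the database is found, and nothing otherwise.
  if exists=$(psql -tAc "SELECT 1 FROM pg_database WHERE datname = '${db_name}'") \
    && [ "$exists" = "1" ]; then
    echo "Database '${db_name}' already exists, skipping creation."
    return 0
  fi

  # No "|| echo WARNING" here: the exit status of psql decides the outcome.
  if psql -c "CREATE DATABASE \"${db_name}\""; then
    return 0
  fi

  echo "ERROR: Failed to create database '${db_name}'." >&2
  return 1
}

# Possible use as the last step of the Job container command:
#   create_database sonataflow || exit 1
```

Because the function returns non-zero on failure, the container exits with an error and Kubernetes can retry the pod up to backoffLimit times. If the existence query itself fails (for example with wrong credentials), the sketch falls through to CREATE DATABASE, which fails the same way and produces the error path.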
File Changes
1. charts/backstage/templates/sonataflows.yaml
|
…job-should-be-able-to-retry-properly-and-fail-if-needed-it-shoud-not-silently-succeed-if-there-are-errors
….3, adding dbCreationJobBackoffLimit parameter.
|
/cherry-pick release-1.10 |
|
@rm3l: once the present PR merges, I will cherry-pick it on top of release-1.10. |
|
/review |
…json and values.schema.tmpl.json with minimum and maximum constraints. Update sonataflows.yaml to simplify database creation command.
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
|
/agentic_review |
|
Persistent review updated to latest commit 635fd18 |
…exists in PostgreSQL and matches the behavior of the internal branch
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
|
/agentic_review |
|
Persistent review updated to latest commit 8f6b80f |
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
|
cc/ @rm3l for review |
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
…pgrade compatibility
The CI "Test Latest Release" check fails because helm upgrade tries to patch the existing Job's spec.template, which Kubernetes rejects as immutable. The old chart created the Job without ttlSecondsAfterFinished, so it persists indefinitely and blocks the upgrade. Adding helm.sh/hook and helm.sh/hook-delete-policy annotations makes Helm delete the old Job before creating the new one on upgrade.
…job-should-be-able-to-retry-properly-and-fail-if-needed-it-shoud-not-silently-succeed-if-there-are-errors
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 13b5e0f |
Add hook-succeeded to the Helm hook delete policy so that successful Jobs are cleaned up immediately while failed Jobs are kept for log inspection. TTL still handles cleanup for ArgoCD users after 5 minutes.
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 41fbb3e |
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 333fd0c |
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
…ce names and replace dots with dashes, expose ttlSecondsAfterFinished and activeDeadlineSeconds in values.yaml so users can configure them.
Signed-off-by: Fortune-Ndlovu <fndlovu@redhat.com>
|
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 56b08b4 |
Merged commit 0654311 into redhat-developer:main
|
@rm3l: #407 failed to apply on top of branch "release-1.10". |



Description of the change
The create-sonataflow-database Job was using || echo WARNING, which swallowed all psql errors and caused the Job to always report success. This prevented Kubernetes from retrying via backoffLimit when there were real failures (e.g. wrong credentials).
This PR:
• … (<release>-create-sf-db-<chart-version>) instead of Helm hooks, ensuring ArgoCD compatibility and avoiding immutable field errors on upgrades
• … backoffLimit, ttlSecondsAfterFinished, and activeDeadlineSeconds configurable via values.yaml
Breaking change (version 6.0.0)
The Job name changed from <release>-create-sonataflow-database to <release>-create-sf-db-<version>. On upgrade, the old Job remains as an orphan (harmless, cleaned up by TTL or manually) and the new versioned Job is created fresh.
Which issue(s) does this PR fix or relate to
https://redhat.atlassian.net/browse/RHDHBUGS-2577
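The versioned Job name described under "Breaking change" can be illustrated with a small shell helper. This is a sketch only: the function name job_name is hypothetical, it covers just the dots-to-dashes replacement mentioned in the commit history, and any other normalization the chart template may apply is not shown.

```shell
# Hypothetical helper: build the versioned Job name from a Helm release
# name and a chart version, e.g. for locating the Job after an upgrade.
job_name() {
  release="$1"
  version="$2"

  # Replace each dot in the chart version with a dash (6.0.0 -> 6-0-0).
  dashed_version=$(printf '%s' "$version" | tr '.' '-')

  printf '%s-create-sf-db-%s\n' "$release" "$dashed_version"
}
```

For a release named rhdh on chart version 6.0.0 this yields rhdh-create-sf-db-6-0-0, the name used in the upgrade test in this description. The result could be passed to a command such as kubectl get job "$(job_name rhdh 6.0.0)" to check the Job's status.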
How to test changes / Special notes to the reviewer
1. Fresh database creation
Expected: Job completes 1/1, logs show "CREATE DATABASE".
2. Database already exists
Expected: Job completes 1/1, logs show "Database 'sonataflow' already exists, skipping creation."
3. Failure and retry (wrong credentials)
Expected: Job fails, retries twice (3 pods total), logs show "ERROR: Failed to create database 'sonataflow'.", job status shows Failed.
4. Schema validation of configurable values
5. Upgrade from 5.11.1 to 6.0.0
Expected: Upgrade succeeds without errors. Old rhdh-create-sonataflow-database Job remains as harmless orphan. New rhdh-create-sf-db-6-0-0 Job is created and completes successfully.
Checklist
• … Chart.yaml according to Semantic Versioning.
• … values.yaml and added to the corresponding README.md. The pre-commit utility can be used to generate the necessary content. Run pre-commit run --all-files to run the hooks and then push any resulting changes. The pre-commit Workflow will enforce this and warn you if needed.
• … pre-commit hook.
• … ct lint command.