Skip to content

fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle#8520

Open
dpateriya wants to merge 1 commit into
openshift:mainfrom
dpateriya:fix/hcco-additionaltrustbundle-trusted-ca-bundle
Open

fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle#8520
dpateriya wants to merge 1 commit into
openshift:mainfrom
dpateriya:fix/hcco-additionaltrustbundle-trusted-ca-bundle

Conversation

@dpateriya

@dpateriya dpateriya commented May 14, 2026

Copy link
Copy Markdown
Contributor

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

    • Proxy trusted CA handling now sets the proxy to use the user CA bundle when an additional trust bundle is present, avoids removing that bundle during cleanup, and correctly creates/merges CA data so concatenated bundles preserve newline formatting.
  • Tests

    • Added unit tests validating trusted CA creation, error cases, precedence behavior, and correct merging between proxy trusted CA and additional trust bundles.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is invalid:

  • expected the bug to target either version "5.0." or "openshift-5.0.", but it targets "4.22.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Made with Cursor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented May 14, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

When reconciling proxy config, if Proxy.Spec.TrustedCA.Name is empty and HostedCluster.Spec.AdditionalTrustBundle is set, the code sets Proxy.Spec.TrustedCA.Name to the user-ca-bundle ConfigMap name. reconcileProxyTrustedCAConfigMap now initializes destCM.Data by copying all key/value pairs from the referenced source ConfigMap, and if AdditionalTrustBundle is present, fetches its control-plane ConfigMap and appends its ca-bundle.crt (certs.UserCABundleMapKey) onto destCM.Data["ca-bundle.crt"] with newline normalization. Tests covering these scenarios were added.

Sequence Diagram(s)

sequenceDiagram
    participant HCO as HostedClusterConfigOperator
    participant API as Kubernetes API (Control Plane)
    participant SourceCM as Proxy Source ConfigMap
    participant ATB_CM as AdditionalTrustBundle ConfigMap
    participant DestCM as Destination TrustedCA ConfigMap

    HCO->>API: Read Proxy object
    alt Proxy.trustedCA empty and AdditionalTrustBundle set
        HCO->>HCO: Set Proxy.Spec.TrustedCA.Name = "user-ca-bundle"
    end
    HCO->>API: Read SourceCM referenced by Proxy.trustedCA
    API-->>HCO: return SourceCM data
    HCO->>DestCM: initialize destCM.Data = {}
    HCO->>HCO: copy all key/value pairs from SourceCM -> destCM.Data
    alt AdditionalTrustBundle set
        HCO->>API: Read ATB_CM from control plane
        API-->>HCO: return ATB_CM data
        alt ATB_CM["ca-bundle.crt"] non-empty
            HCO->>DestCM: append ATB_CM["ca-bundle.crt"] to destCM.Data["ca-bundle.crt"]
        end
    end
    HCO->>API: Create/Update DestCM with destCM.Data
    API-->>HCO: acknowledge create/update
Loading
🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ❓ Inconclusive The check asks to review "Ginkgo test code" with Describe/It/BeforeEach patterns, but the PR adds standard Go unit tests (func TestX(t *testing.T)) with Gomega matchers, not Ginkgo BDD tests. Clarify: Are the check requirements for all tests or only Ginkgo-based tests? If all tests: the PR tests lack assertion messages per requirement 4. If only Ginkgo: check is not applicable (PASS).
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: merging additionalTrustBundle into proxy trusted CA bundle, which is the primary purpose of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Added tests use standard Go t.Run() table-driven approach, not Ginkgo (It/Describe/Context). Check applies only to Ginkgo tests; not applicable here.
Microshift Test Compatibility ✅ Passed The PR adds only standard Go unit tests (TestReconcileProxyTrustedCAConfigMap, TestReconcileConfigProxyTrustedCA) using testing.T, not Ginkgo e2e tests. The custom check is inapplicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests added; only standard Go unit tests (TestReconcileProxyTrustedCAConfigMap, TestReconcileConfigProxyTrustedCA). Check applies only to Ginkgo e2e tests.
Topology-Aware Scheduling Compatibility ✅ Passed PR does not introduce scheduling constraints. Changes only modify ConfigMap and Proxy object configurations for trust bundle management, with no pod/deployment/workload scheduling logic.
Ote Binary Stdout Contract ✅ Passed No process-level stdout writes found. PR adds standard Go unit tests and production logic with no main/init/TestMain/suite setup code that could corrupt JSON stdout.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed New tests are standard Go unit tests (testing.T), not Ginkgo e2e tests. Check for Ginkgo e2e tests with IPv4/external connectivity is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from jparrill and sjenning May 14, 2026 12:58
@openshift-ci

openshift-ci Bot commented May 14, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dpateriya
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels May 14, 2026
@codecov

codecov Bot commented May 14, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 77.77778% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.74%. Comparing base (222a19f) to head (b0cc962).
⚠️ Report is 522 commits behind head on main.

Files with missing lines Patch % Lines
...rconfigoperator/controllers/resources/resources.go 77.77% 5 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8520      +/-   ##
==========================================
+ Coverage   36.30%   39.74%   +3.43%     
==========================================
  Files         764      773       +9     
  Lines       93015    94913    +1898     
==========================================
+ Hits        33772    37721    +3949     
+ Misses      56530    54494    -2036     
+ Partials     2713     2698      -15     
Files with missing lines Coverage Δ
...rconfigoperator/controllers/resources/resources.go 56.35% <77.77%> (+5.83%) ⬆️

... and 164 files with indirect coverage changes

Flag Coverage Δ
cmd-support 32.68% <ø> (+2.65%) ⬆️
cpo-hostedcontrolplane 41.76% <ø> (+4.71%) ⬆️
cpo-other 40.58% <77.77%> (+4.88%) ⬆️
hypershift-operator 50.72% <ø> (+2.83%) ⬆️
other 31.58% <ø> (+3.89%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dpateriya

Copy link
Copy Markdown
Contributor Author

/retest

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 45d3f02 to 4319c77 Compare May 14, 2026 13:11
@dpateriya dpateriya changed the title OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle May 14, 2026
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now correctly creates and merges trusted CA data with any additional trust bundle, ensuring proper certificate validation.

  • Tests

  • Added unit tests covering trusted CA creation and merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 4319c77 to 7cd780c Compare May 14, 2026 13:50
@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now automatically points to the merged user CA bundle when an additional trust bundle is present, and correctly creates and merges trusted CA data to ensure proper certificate validation.

  • Tests

  • Added unit tests covering trusted CA creation and merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya

dpateriya commented May 19, 2026

Copy link
Copy Markdown
Contributor Author

Verified

Tested manually on a HyperShift hosted cluster with spec.additionalTrustBundle set.

Before the fix:

  • proxy.config.openshift.io/cluster on the guest had spec.trustedCA.name: ""
  • openshift-config-managed/trusted-ca-bundle did not contain the internal CA
  • CVO failed with x509: certificate signed by unknown authority when contacting spec.updateService

After patching HCCO with this fix:

  • proxy.config.openshift.io/cluster is now updated with spec.trustedCA.name: "user-ca-bundle"
  • CNO picks up user-ca-bundle and merges it into openshift-config-managed/trusted-ca-bundle
  • CVO reads trusted-ca-bundle for TLS verification and can now trust the internal CA

Validation:

$ oc get proxy cluster -o jsonpath='{.spec.trustedCA.name}' --kubeconfig hosted-kubeconfig
user-ca-bundle

The full data flow is now:

hc.spec.additionalTrustBundle
        │
        ▼
HCCO writes "openshift-config/user-ca-bundle" (already existed)
        │
        ▼  ← YOUR FIX: point proxy.spec.trustedCA.Name at "user-ca-bundle"
        │
CNO sees proxy.spec.trustedCA.Name is set
        │
        ▼
CNO merges it into "openshift-config-managed/trusted-ca-bundle"
        │
        ▼
CVO reads trusted-ca-bundle → TLS now trusts the internal CA → updateService works

/verified by me

@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: The /verified command must be used with one of the following actions: by, later, remove, or bypass. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

Details

In response to this:

Verified

Tested manually on a HyperShift hosted cluster with spec.additionalTrustBundle set.

Before the fix:

  • proxy.config.openshift.io/cluster on the guest had spec.trustedCA.name: ""
  • openshift-config-managed/trusted-ca-bundle did not contain the internal CA
  • CVO failed with x509: certificate signed by unknown authority when contacting spec.updateService

After patching HCCO with this fix:

  • proxy.config.openshift.io/cluster is now updated with spec.trustedCA.name: "user-ca-bundle"
  • CNO picks up user-ca-bundle and merges it into openshift-config-managed/trusted-ca-bundle
  • CVO reads trusted-ca-bundle for TLS verification and can now trust the internal CA

Validation:

$ oc get proxy cluster -o jsonpath='{.spec.trustedCA.name}' --kubeconfig hosted-kubeconfig
user-ca-bundle

The full data flow is now:

  1. HostedCluster.spec.additionalTrustBundle → HCCO syncs to openshift-config/user-ca-bundle (existing behavior)
  2. HCCO sets proxy.config.openshift.io/cluster spec.trustedCA.name = "user-ca-bundle" (this fix)
  3. CNO merges the referenced ConfigMap into openshift-config-managed/trusted-ca-bundle
  4. CVO reads trusted-ca-bundle → TLS verification succeeds for internal update service

/verified

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya

Copy link
Copy Markdown
Contributor Author

/verified by me

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 19, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: This PR has been marked as verified by me.

Details

In response to this:

/verified by me

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jparrill jparrill left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped some comments. Thanks!

@jparrill

jparrill commented May 25, 2026

Copy link
Copy Markdown
Contributor

/verified by me

@dpateriya the /verified label should come from someone other than the PR author — ideally someone from QE or another engineer who has independently tested the fix in a real cluster. Self-verification doesn't give us the confidence that the change actually works end-to-end in a live environment, especially for a trust bundle fix like this where the failure mode (silent TLS rejection) is easy to miss in unit tests alone.

Could someone from QE pick this up and verify it on a hosted cluster with additionalTrustBundle + a custom update service behind an internal CA?

@dpateriya

dpateriya commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

The initial /verified label was added based on my own testing in a development environment, and I acknowledge that independent QE verification is needed before this can be considered fully validated.

What was tested (dev environment):

  • Patched HCCO image deployed to a HyperShift hosted cluster
  • Confirmed proxy.config.openshift.io/cluster gets spec.trustedCA.name: user-ca-bundle when only hc.spec.additionalTrustBundle is set
  • Confirmed CNO subsequently creates openshift-config-managed/trusted-ca-bundle with the CA data

Request: Could someone from QE independently verify this fix covers the expected scenarios (ATB-only, both ATB+proxy.trustedCA, removal/cleanup)?

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 7cd780c to 928ff98 Compare May 25, 2026 14:39
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label May 25, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now targets the user CA bundle when an additional trust bundle is present, avoids removing that bundle during cleanup, and correctly creates/merges CA data so concatenated bundles preserve newline formatting.

  • Tests

  • Added unit tests validating trusted CA creation, error cases, and correct merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go`:
- Around line 777-791: The test currently returns immediately when tc.wantErr is
true which skips assertions about the destination ConfigMap; instead, remove the
early return and always run the dest CM checks based on tc.wantDestCMExists so
error-paths also validate whether manifests.ProxyTrustedCAConfigMap(proxyCAName)
was created or not using r.client.Get and apierrors.IsNotFound; keep the
existing expectations for destCM.Data["ca-bundle.crt"] == tc.wantDestCMData when
wantDestCMExists is true and assert apierrors.IsNotFound(err) when it is false.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 3656122c-c14d-412a-b36b-f5db86560311

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd780c and 928ff98.

📒 Files selected for processing (2)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

…ted CA bundle

When HostedCluster.spec.additionalTrustBundle is set, HCCO already syncs
its contents into openshift-config/user-ca-bundle on the guest cluster.
However, that ConfigMap is not referenced by the guest cluster's
proxy.config.openshift.io/cluster object, so CNO never merges it into
openshift-config-managed/trusted-ca-bundle. Any component reading
trusted-ca-bundle (including CVO's update service TLS client) therefore
cannot verify TLS certificates signed by the internal CA provided via
additionalTrustBundle.

Fix this in HCCO with two changes:

1. reconcileProxyTrustedCAConfigMap: when both proxy.trustedCA and
   additionalTrustBundle are set, merge the additionalTrustBundle CA
   data into the destination ConfigMap alongside proxy.trustedCA data.
   CNO then merges both into trusted-ca-bundle.

2. reconcileConfig (Proxy object reconciliation): when no proxy.trustedCA
   is configured but additionalTrustBundle is set, point the Proxy
   object's spec.trustedCA.Name at user-ca-bundle (already written by
   reconcileUserCertCABundle), so CNO picks up the additionalTrustBundle
   CA via the standard trusted-ca-bundle path.

This benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) and
avoids adding HyperShift-specific special-casing in upstream components.

Co-authored-by: Cursor <cursoragent@cursor.com>
@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 928ff98 to b0cc962 Compare May 25, 2026 15:29

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go (1)

1145-1157: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not delete the management-cluster source ConfigMap.

Line 1150/1151 deletes the ConfigMap from hcp.Namespace, but this ref comes from hcp.Spec.Configuration.Proxy.TrustedCA. That is the source bundle, so switching away from proxy.trustedCA or moving to ATB-only will destroy the admin-managed ConfigMap in the control-plane namespace.

Suggested fix
-			// log and ignore deletion errors, should not disrupt normal workflow
-			cm.Namespace = hcp.Namespace
-			if err := r.cpClient.Delete(ctx, cm); err != nil {
-				log.Error(err, "failed to delete configmap", "name", cm.Name, "namespace", cm.Namespace)
-			}
-
 			cm.Namespace = manifests.ProxyTrustedCAConfigMap("").Namespace
 			if err := r.client.Delete(ctx, cm); err != nil {
 				log.Error(err, "failed to delete configmap in hosted cluster", "name", cm.Name, "namespace", cm.Namespace)
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`
around lines 1145 - 1157, The code currently unconditionally deletes the
ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the
management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA;
change the cleanup logic in the resource reconciliation (around
currentConfigMapRef handling in resources.go) to skip deleting the
management-source ConfigMap: only call r.cpClient.Delete when
currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or
alternatively verify a controller-owned label/ownerReference before deletion),
and continue to delete the hosted-cluster copy with r.client.Delete (using
manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1180-1184: The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.

---

Duplicate comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1145-1157: The code currently unconditionally deletes the
ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the
management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA;
change the cleanup logic in the resource reconciliation (around
currentConfigMapRef handling in resources.go) to skip deleting the
management-source ConfigMap: only call r.cpClient.Delete when
currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or
alternatively verify a controller-owned label/ownerReference before deletion),
and continue to delete the hosted-cluster copy with r.client.Delete (using
manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 2abf2bdb-eef3-44b5-ad74-0976e2ace52b

📥 Commits

Reviewing files that changed from the base of the PR and between 928ff98 and b0cc962.

📒 Files selected for processing (2)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

Comment on lines +1180 to +1184
if hcp.Spec.AdditionalTrustBundle != nil {
atbCM := &corev1.ConfigMap{}
if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil {
return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the canonical control-plane ATB ConfigMap here.

This path reads hcp.Spec.AdditionalTrustBundle.Name, but reconcileUserCertCABundle below reads cpomanifests.UserCAConfigMap(hcp.Namespace). If those names ever diverge, the guest user-ca-bundle sync still works while this merge path fails with a not-found.

Suggested fix
-			atbCM := &corev1.ConfigMap{}
-			if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil {
-				return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err)
+			atbCM := cpomanifests.UserCAConfigMap(hcp.Namespace)
+			if err := r.cpClient.Get(ctx, client.ObjectKeyFromObject(atbCM), atbCM); err != nil {
+				return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", atbCM.Namespace, atbCM.Name, err)
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`
around lines 1180 - 1184, The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.

@openshift-ci

openshift-ci Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

@dpateriya: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@dpateriya

Copy link
Copy Markdown
Contributor Author

@jparrill could you please review this PR once you LGTM, I will review this in a real env.

@hypershift-jira-solve-ci

Copy link
Copy Markdown

Test Failure Analysis Complete

Job Information

  • Prow Job: Red Hat Konflux / hypershift-operator-main-enterprise-contract / hypershift-operator-main
  • Build ID: hypershift-operator-main-enterprise-contract-4zmlz (and hypershift-operator-enterprise-contract-wttd2)
  • PR: fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle #8520 — fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle
  • Component: hypershift-operator-main
  • Snapshot: hypershift-operator-20260525-152929-000
  • Check Run Date: 2026-05-25 (17 days stale — never re-run)

Test Failure Analysis

Error

Integration test for component hypershift-operator-main snapshot
hypershift-operator-20260525-152929-000 and scenario
hypershift-operator-main-enterprise-contract has failed

verify step: 250 success(es), 22 warning(s), 6 failure(s)

Summary

Both failing checks are stale Enterprise Contract (EC) verification results from May 25, 2026 that have never been re-run. The EC verify step evaluated 278 policy rules against the hypershift-operator-main container image and found 6 violations. These same 278 rules now yield 0 failures on PRs built today (June 11), confirming the violations were resolved by upstream EC policy updates between May 25 and June 11. The PR's build pipeline (hypershift-operator-main-on-pull-request) succeeded — only the post-build EC policy verification failed due to transient policy state at the time.

Root Cause

The root cause is stale Enterprise Contract verification results caused by the PR having no new commits since May 25, 2026. The Konflux EC checks run once per commit push and are not automatically re-triggered when upstream EC policies change.

What happened:

  1. PR fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle #8520's single commit was pushed on May 25, 2026.
  2. Konflux built the hypershift-operator-main image successfully (check hypershift-operator-main-on-pull-request → SUCCESS).
  3. The EC verification pipeline (hypershift-operator-main-enterprise-contract) ran against the built image's SLSA provenance attestation and evaluated 278 policy rules.
  4. 6 of those rules failed — likely related to trusted task bundle references or policy rule changes that were in flux at the time.
  5. No new commits have been pushed, so the EC check has never been re-run.

Evidence of staleness (same 278 total checks, failure count decreasing over time as EC policies were updated):

The control-plane-operator component, which uses the identical build pipeline (hypershift-common-operator-build), passed its EC check (484 successes, 0 failures) on the same commit, indicating the failures are specific to the hypershift-operator-main EC scenario's policy configuration at that point in time — not a build or code issue.

This is not related to the PR's code changes (additionalTrustBundle reconciliation in HCCO). The failure is purely an infrastructure/policy timing issue.

Recommendations
  1. Rebase or push an empty commit to the PR branch to trigger fresh Konflux builds and EC verification. Based on current EC policy state (as verified on PR CNTRLPLANE-3276: Add Azure endpoint access transition e2e test #8718 today), the checks will pass with 0 failures.

    git fetch upstream main
    git rebase upstream/main
    git push --force-with-lease

    Or if no conflicts exist:

    git commit --allow-empty -m "chore: retrigger CI checks"
    git push
  2. No code changes needed — the PR's build succeeds and the EC policy violations have been resolved upstream.

  3. For future reference: Konflux EC checks are point-in-time evaluations. Long-lived PRs may accumulate stale EC failures that self-resolve when the checks are re-triggered against current policy state.

Evidence
Evidence Detail
PR build status hypershift-operator-main-on-pull-requestSUCCESS (build is fine)
EC check 1 hypershift-operator-main-enterprise-contractFAILURE (250 pass, 22 warn, 6 fail)
EC check 2 hypershift-operator-enterprise-contractFAILURE (250 pass, 22 warn, 6 fail)
Check run date 2026-05-25T15:29:32Z (17 days ago, never re-run)
Last commit date 2026-05-25T15:29:15Z (single commit on PR)
Same checks on PR #8718 (today) 256 pass, 22 warn, 0 fail → neutral (PASS)
Same checks on PR #8713 (today) 256 pass, 22 warn, 0 fail → success (PASS)
Control-plane-operator EC (same commit) 484 pass, 0 warn, 0 fail → success (PASS)
Main branch EC (Jun 10) 512 pass, 44 warn, 0 fail → neutral (PASS)
Pipeline used hypershift-common-operator-build (same for all components)
Image tag on-pr-b0cc962974293382409cf93be7c6c68207142bdf (expired May 30, 5d TTL)
Pipeline run 1 hypershift-operator-main-enterprise-contract-4zmlz
Pipeline run 2 hypershift-operator-enterprise-contract-wttd2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants