CNTRLPLANE-3250,CNTRLPLANE-430: API-driven Azure topology and private connectivity (Phase 1)#8537

Merged
openshift-merge-bot[bot] merged 8 commits into openshift:main from muraee:api-driven-azure-topology
Jun 10, 2026

Conversation

@muraee

@muraee muraee commented May 18, 2026

Contributor

Summary

  • Extend AzurePrivateType enum with Swift variant and add AzureSwiftSpec struct with immutable podNetworkInstance field
  • Add per-cluster API-driven helpers (UseSwiftNetworkingHCP/HC, UseSharedIngressHCP/HC, IsAroHCPByHCP/HC) that derive behavior from Topology and Private API fields instead of isAroHCP() env-var checks
  • Replace all per-cluster isAroHCP() / UseSharedIngress() calls in visibility, router, shared ingress, infra, nodepool, and CPO v2 component controllers with API-driven helpers
  • Add PublicEndpointExposed condition type for tracking public endpoint exposure state
  • Phase 1 backward compatibility: legacy annotation fallback in UseSwiftNetworkingHCP() for unmigrated clusters

This is Phase 1 of the API-driven Azure topology and private connectivity enhancement. Phase 2 (ARO-RP data migration) and Phase 3 (legacy cleanup) are out of scope.

Key Changes

API (api/hypershift/v1beta1/azure.go)

  • AzurePrivateType: Enum=PrivateLink;Swift
  • AzureSwiftSpec: podNetworkInstance (required, immutable, max 255 chars)
  • AzurePrivateSpec: struct-level type immutability (!has(oldSelf.type) || self.type == oldSelf.type) enabling Phase 2 migration, positive/negative union member rules
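
The type-immutability rule listed above, `!has(oldSelf.type) || self.type == oldSelf.type`, can be read as "the type may be set once, then never changed". A minimal Go sketch of that truth table follows. It is an illustration only: `transitionAllowed` is a hypothetical name, and a nil pointer stands in for "type was not set on the old object"; the real check runs as CEL in the API server.

```go
package main

import "fmt"

// transitionAllowed mirrors the CEL rule
//
//	!has(oldSelf.type) || self.type == oldSelf.type
//
// A nil oldType models an old object whose type field was unset.
func transitionAllowed(oldType *string, newType string) bool {
	return oldType == nil || *oldType == newType
}

func main() {
	privateLink := "PrivateLink"

	// Setting the type for the first time (the path Phase 2 migration relies on).
	fmt.Println("unset -> Swift:", transitionAllowed(nil, "Swift"))
	// Leaving the type unchanged.
	fmt.Println("PrivateLink -> PrivateLink:", transitionAllowed(&privateLink, "PrivateLink"))
	// Changing the type after it was set.
	fmt.Println("PrivateLink -> Swift:", transitionAllowed(&privateLink, "Swift"))
}
```

Only the last transition is rejected, which is what lets an unmigrated object gain a type later without allowing a switch between PrivateLink and Swift.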

Helper Functions (support/netutil/visibility.go, support/azureutil/azureutil.go)

  • UseSwiftNetworkingHCP/HC() — API field first, env var + annotation fallback
  • UseSharedIngressHCP/HC() = UseSwiftNetworking && IsPublic
  • IsAroHCPByHCP/HC() — per-cluster ARO wrapper for component predicates
  • IsPrivateHCP/HC() — Phase 1 fallback for Topology="" + Swift annotation
  • LabelHCPRoutes() = UseSharedIngress || UseSwiftNetworking
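
The derivation order in the list above can be sketched in a few lines of Go. This is a simplified illustration, not the code in support/netutil/visibility.go: the `cluster` struct and its fields are hypothetical stand-ins for the HostedCluster/HostedControlPlane fields, and it assumes the fallback requires both the env var and the legacy annotation.

```go
package main

import "fmt"

// cluster is a hypothetical, flattened stand-in for the fields the real
// helpers read from a HostedCluster or HostedControlPlane.
type cluster struct {
	PrivateType      string // "", "PrivateLink" or "Swift"
	LegacySwiftAnnot bool   // assumed: legacy Swift annotation present (unmigrated cluster)
	Public           bool   // assumed: result of the existing public-visibility check
}

// useSwiftNetworking consults the API field first and only then falls back
// to the env var combined with the legacy annotation.
func useSwiftNetworking(c cluster, aroEnvSet bool) bool {
	if c.PrivateType == "Swift" {
		return true
	}
	return aroEnvSet && c.LegacySwiftAnnot
}

// useSharedIngress is UseSwiftNetworking && IsPublic.
func useSharedIngress(c cluster, aroEnvSet bool) bool {
	return useSwiftNetworking(c, aroEnvSet) && c.Public
}

// labelHCPRoutes is UseSharedIngress || UseSwiftNetworking.
func labelHCPRoutes(c cluster, aroEnvSet bool) bool {
	return useSharedIngress(c, aroEnvSet) || useSwiftNetworking(c, aroEnvSet)
}

func main() {
	privateSwift := cluster{PrivateType: "Swift", Public: false}
	fmt.Println("swift:", useSwiftNetworking(privateSwift, false))
	fmt.Println("shared ingress:", useSharedIngress(privateSwift, false))
	fmt.Println("label routes:", labelHCPRoutes(privateSwift, false))
}
```

The point of the sketch is that a Private cluster on Swift still gets labelled routes but never opts into shared ingress, regardless of the management-cluster env var.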

Controller Changes

  • Config-generator: cluster-level UseSharedIngressHC() filter (security-critical — prevents Private cluster endpoints from leaking into public HAProxy config)
  • CPO v2 components (CNO, CCM, storage, registry, ingress, KAS): IsAroHCPByHCP(cpContext.HCP) replaces env-var predicates
  • Infra: UseSwiftNetworkingHCP() for ClusterIP vs LB router service decision
  • NodePool HAProxy, network policies, KAS healthcheck: per-cluster UseSharedIngressHCP/HC()
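
The config-generator filter described in the list above amounts to dropping every cluster that does not opt into shared ingress before any backend is rendered. A hedged sketch, assuming a hypothetical `hostedCluster` struct whose `UsesSharedIngress` field stands in for the result of UseSharedIngressHC(); the real generator works on HostedCluster objects and HAProxy templates.

```go
package main

import "fmt"

// hostedCluster is a hypothetical stand-in for the real HostedCluster type.
type hostedCluster struct {
	Name              string
	UsesSharedIngress bool // assumed: precomputed result of UseSharedIngressHC
}

// publicBackends returns the names of the clusters that may appear in the
// public HAProxy config. Clusters that do not use shared ingress are skipped
// at the cluster level, before any of their endpoints are considered.
func publicBackends(clusters []hostedCluster) []string {
	names := make([]string, 0, len(clusters))
	for _, hc := range clusters {
		if !hc.UsesSharedIngress {
			continue
		}
		names = append(names, hc.Name)
	}
	return names
}

func main() {
	clusters := []hostedCluster{
		{Name: "public-swift", UsesSharedIngress: true},
		{Name: "private-swift", UsesSharedIngress: false},
	}
	fmt.Println(publicBackends(clusters))
}
```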

Not Changed (management-cluster-level, kept as env-var)

  • hypershift-operator/main.go — controller startup
  • control-plane-operator/main.go — controller startup
  • cmd/cluster/core/dump.go — dump command

Test plan

  • Unit tests: visibility helpers, serialization compatibility
  • Envtest CEL validation: 10 test cases (5 onCreate, 5 onUpdate) for Swift union rules
  • E2E: existing ARO HCP tests pass (CI)
  • Manual: verify UseSharedIngressHC() correctly filters Private+Swift clusters from HAProxy config

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Azure Swift networking added as an alternative to PrivateLink.
    • New PublicEndpointExposed condition to track public API reachability via shared ingress.
  • Improvements

    • Per-cluster Swift/shared-ingress and ARO detection for more accurate behavior in mixed environments.
    • Routing, ingress, router and storage flows now respect Swift/shared-ingress decisions.
  • Behavior/Validation

    • Azure private spec supports a Swift member with exclusive-selection and immutability rules.
  • Tests

    • Added serialization compatibility and public-endpoint condition tests.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests are triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci

openshift-ci Bot commented May 18, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 18, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 18, 2026
@openshift-ci-robot

openshift-ci-robot commented May 18, 2026


@muraee: This pull request references CNTRLPLANE-3250 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.

@coderabbitai

coderabbitai Bot commented May 18, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds Azure Swift support (AzureSwiftSpec) and XValidation for AzurePrivateSpec, JSON N-1 compatibility tests, netutil helpers for per-HCP/HC Swift and shared-ingress detection, per-cluster azureutil ARO helpers, migrates many ARO/shared-ingress checks to HCP/HC-scoped predicates across control-plane and hypershift controllers, and introduces a PublicEndpointExposed condition with probing logic and unit tests.

Sequence Diagram(s)

sequenceDiagram
  participant Reconciler as HostedClusterReconciler
  participant Router as SharedIngressService
  participant Probe as ProbeSharedIngressEndpoint
  participant HC as HostedCluster
  Reconciler->>Router: fetch router Service (ClusterIP / LB status)
  Router-->>Reconciler: service IP / LB hostname
  Reconciler->>Probe: probe(serviceIP, port, kasHostname)
  Probe-->>Reconciler: probe result (success/fail)
  Reconciler->>HC: update PublicEndpointExposed condition (Status/Reason)
🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | public_endpoint_condition_test.go has 3 assertions without meaningful failure messages (lines 181, 185, 202). These bare assertions hamper debugging when failures occur. | Add assertion messages to public_endpoint_condition_test.go lines 181, 185, 202. Other tests adequately follow all quality guidelines. |
✅ Passed checks (10 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names across 3 test files use stable, descriptive static strings with no dynamic content like pod names, UUIDs, IPs, timestamps, or generated identifiers. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. All test files are standard Go unit tests using the testing.T framework. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. All new test files use standard Go unit testing patterns (*testing.T), not Ginkgo patterns (Describe/It/When). SNO compatibility check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR does not introduce any scheduling constraints. Changes are API definitions, helper functions, and controller logic refactoring that maintain topology compatibility. |
| Ote Binary Stdout Contract | ✅ Passed | No OTE Binary Stdout Contract violations detected. PR adds no process-level code that writes to stdout. Test files properly structured. All logging isolated within function code. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added. Three new test files are standard Go unit tests using testing.T, not Ginkgo Describe/It patterns. Check not applicable. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding API-driven Azure topology and private connectivity support with Phase 1 implementation. |

@openshift-ci openshift-ci Bot added area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/azure PR/issue for Azure (AzurePlatform) platform and removed do-not-merge/needs-area labels May 18, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
hypershift-operator/controllers/hostedcluster/network_policies.go (1)

80-87: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Delete SharedIngressNetworkPolicy when shared ingress is no longer used.

This branch only creates the policy. With per-HC gating, a true→false transition leaves a stale policy that still permits ingress from the shared-ingress namespace.

Suggested fix
 	if netutil.UseSharedIngressHC(hcluster) {
 		policy := networkpolicy.SharedIngressNetworkPolicy(controlPlaneNamespaceName)
 		if _, err := createOrUpdate(ctx, r.Client, policy, func() error {
 			return reconcileSharedIngressNetworkPolicy(policy, hcluster)
 		}); err != nil {
 			return fmt.Errorf("failed to reconcile sharedingress network policy: %w", err)
 		}
+	} else {
+		policy := networkpolicy.SharedIngressNetworkPolicy(controlPlaneNamespaceName)
+		if _, err := k8sutil.DeleteIfNeeded(ctx, r.Client, policy); err != nil {
+			return fmt.Errorf("failed to delete sharedingress network policy: %w", err)
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/hostedcluster/network_policies.go` around
lines 80 - 87, The code only creates/updates SharedIngressNetworkPolicy when
netutil.UseSharedIngressHC(hcluster) is true, so when that function returns
false existing policies remain; update the logic in the enclosing function to
handle the false branch by locating the SharedIngressNetworkPolicy (use
networkpolicy.SharedIngressNetworkPolicy(controlPlaneNamespaceName) or the same
name/labels used by reconcileSharedIngressNetworkPolicy) and delete it via the
controller client (e.g., r.Client.Delete) if it exists, ignoring NotFound errors
and returning other errors; keep the existing createOrUpdate/reconcile path
unchanged for the true branch and ensure error messages reference the same
resources (SharedIngressNetworkPolicy) so failures to delete are surfaced.
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go (1)

970-988: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clear stale Azure PLS conditions when Swift networking is active.

At Line 970, Swift-enabled clusters skip this condition sync path, so previously set AzurePrivateLinkService* conditions can linger and misreport topology state after migration or fallback changes.

Suggested fix
 	if hcluster.Spec.Platform.Type == hyperv1.AzurePlatform && !netutil.UseSwiftNetworkingHC(hcluster) {
 		hcpNamespace := manifests.HostedControlPlaneNamespace(hcluster.Namespace, hcluster.Name)
 		var azPLSList hyperv1.AzurePrivateLinkServiceList
 		if err := r.List(ctx, &azPLSList, &client.ListOptions{Namespace: hcpNamespace}); err != nil {
 			condition := metav1.Condition{
 				Type:    string(hyperv1.AzurePrivateLinkServiceAvailable),
 				Status:  metav1.ConditionUnknown,
 				Reason:  hyperv1.NotFoundReason,
 				Message: fmt.Sprintf("error listing AzurePrivateLinkService in namespace %s: %v", hcpNamespace, err),
 			}
 			meta.SetStatusCondition(&hcluster.Status.Conditions, condition)
 		} else if len(azPLSList.Items) > 0 {
 			meta.SetStatusCondition(&hcluster.Status.Conditions, computeAzurePLSCondition(azPLSList, hyperv1.AzurePrivateLinkServiceAvailable))
 			meta.SetStatusCondition(&hcluster.Status.Conditions, computeAzurePLSCondition(azPLSList, hyperv1.AzurePLSCreated))
 			meta.SetStatusCondition(&hcluster.Status.Conditions, computeAzurePLSCondition(azPLSList, hyperv1.AzureInternalLoadBalancerAvailable))
 			meta.SetStatusCondition(&hcluster.Status.Conditions, computeAzurePLSCondition(azPLSList, hyperv1.AzurePrivateEndpointAvailable))
 			meta.SetStatusCondition(&hcluster.Status.Conditions, computeAzurePLSCondition(azPLSList, hyperv1.AzurePrivateDNSAvailable))
 		}
+	} else if hcluster.Spec.Platform.Type == hyperv1.AzurePlatform {
+		for _, t := range []hyperv1.ConditionType{
+			hyperv1.AzurePrivateLinkServiceAvailable,
+			hyperv1.AzurePLSCreated,
+			hyperv1.AzureInternalLoadBalancerAvailable,
+			hyperv1.AzurePrivateEndpointAvailable,
+			hyperv1.AzurePrivateDNSAvailable,
+		} {
+			meta.RemoveStatusCondition(&hcluster.Status.Conditions, string(t))
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`
around lines 970 - 988, When Swift networking is active the controller currently
skips the Azure PLS sync and can leave stale AzurePrivateLinkService*
conditions; update the logic around netutil.UseSwiftNetworkingHC(hcluster) so
that when it returns true you explicitly clear/reset the Azure PLS conditions
(AzurePrivateLinkServiceAvailable, AzurePLSCreated,
AzureInternalLoadBalancerAvailable, AzurePrivateEndpointAvailable,
AzurePrivateDNSAvailable) from hcluster.Status.Conditions instead of leaving
them unchanged—use meta.RemoveStatusCondition(...) for each condition type (or
set them to a neutral/Unknown condition via meta.SetStatusCondition if you
prefer a reset message) in the hostedcluster_controller.go code path that checks
UseSwiftNetworkingHC to ensure stale state is removed.
🧹 Nitpick comments (1)
support/netutil/visibility_test.go (1)

106-118: ⚡ Quick win

Remove duplicated Azure topology table cases.

These two entries duplicate scenarios already covered earlier in baseVisibilityCases() and add noise without new behavior coverage.

Proposed cleanup
-		{
-			name:          "When Azure topology is PublicAndPrivate it should be public and private",
-			platformType:  hyperv1.AzurePlatform,
-			azureTopology: hyperv1.AzureTopologyPublicAndPrivate,
-			wantPrivate:   true,
-			wantPublic:    true,
-		},
-		{
-			name:          "When Azure topology is Private it should be private and not public",
-			platformType:  hyperv1.AzurePlatform,
-			azureTopology: hyperv1.AzureTopologyPrivate,
-			wantPrivate:   true,
-			wantPublic:    false,
-		},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/netutil/visibility_test.go` around lines 106 - 118, Remove the
duplicated Azure topology table cases from the test table: delete the two
entries with names "When Azure topology is PublicAndPrivate it should be public
and private" and "When Azure topology is Private it should be private and not
public" in visibility_test.go because they duplicate scenarios already covered
by baseVisibilityCases(); rely on baseVisibilityCases() for those cases to avoid
redundant test rows.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d1a66cca-7f97-4534-947f-4982768bfce8

📥 Commits

Reviewing files that changed from the base of the PR and between cf2b91f and 340ccd7.

⛔ Files ignored due to path filters (42)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/azureprivatespec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/azureswiftspec.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.azure.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/azure.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_conditions.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (35)
  • api/hypershift/v1beta1/azure.go
  • api/hypershift/v1beta1/azure_private_spec_test.go
  • api/hypershift/v1beta1/hostedcluster_conditions.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/infra/infra.go
  • control-plane-operator/controllers/hostedcontrolplane/ingress/router.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/healthcheck.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/service.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/config.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cno/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cno/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/role.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/config.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/util/util.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/azure.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/deployment.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • hypershift-operator/controllers/hostedcluster/network_policies.go
  • hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go
  • sharedingress-config-generator/config.go
  • support/azureutil/azureutil.go
  • support/netutil/visibility.go
  • support/netutil/visibility_test.go

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 18, 2026 15:37 Inactive
@codecov

codecov Bot commented May 18, 2026


Codecov Report

❌ Patch coverage is 67.40088% with 74 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.53%. Comparing base (f13c62d) to head (e2aa9ff).
⚠️ Report is 22 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...trollers/hostedcluster/hostedcluster_controller.go | 66.66% | 27 Missing and 6 partials ⚠️ |
| ...controllers/hostedcontrolplane/v2/storage/azure.go | 0.00% | 6 Missing ⚠️ |
| ...lplane/v2/cloud_controller_manager/azure/config.go | 0.00% | 3 Missing and 2 partials ⚠️ |
| support/netutil/visibility.go | 94.02% | 3 Missing and 1 partial ⚠️ |
| ...ne/v2/cloud_controller_manager/azure/deployment.go | 0.00% | 3 Missing ⚠️ |
| ...rconfigoperator/controllers/resources/resources.go | 25.00% | 2 Missing and 1 partial ⚠️ |
| ...ane/v2/cloud_controller_manager/azure/component.go | 0.00% | 2 Missing ⚠️ |
| ...controlplane/v2/controlplaneoperator/deployment.go | 0.00% | 2 Missing ⚠️ |
| ...rollers/hostedcontrolplane/v2/router/deployment.go | 0.00% | 2 Missing ⚠️ |
| ...ostedcontrolplane/hostedcontrolplane_controller.go | 0.00% | 1 Missing ⚠️ |

... and 13 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8537      +/-   ##
==========================================
+ Coverage   41.43%   41.53%   +0.10%     
==========================================
  Files         756      756              
  Lines       93647    93797     +150     
==========================================
+ Hits        38802    38959     +157     
+ Misses      52124    52094      -30     
- Partials     2721     2744      +23     

| Files with missing lines | Coverage Δ |
|---|---|
| ...ator/controllers/hostedcontrolplane/infra/infra.go | 50.54% <100.00%> (ø) |
| .../controllers/hostedcontrolplane/kas/healthcheck.go | 100.00% <100.00%> (ø) |
| ...ator/controllers/hostedcontrolplane/kas/service.go | 40.36% <100.00%> (ø) |
| ...trollers/hostedcontrolplane/v2/router/util/util.go | 100.00% <100.00%> (ø) |
| ...ator/controllers/hostedcluster/network_policies.go | 77.19% <100.00%> (ø) |
| .../controllers/nodepool/apiserver-haproxy/haproxy.go | 62.72% <100.00%> (ø) |
| ...erator/controllers/nodepool/nodepool_controller.go | 43.06% <100.00%> (-0.07%) ⬇️ |
| sharedingress-config-generator/config.go | 75.38% <100.00%> (-6.65%) ⬇️ |
| support/azureutil/azureutil.go | 44.93% <100.00%> (+0.70%) ⬆️ |
| ...ostedcontrolplane/hostedcontrolplane_controller.go | 45.70% <0.00%> (ø) |

... and 22 more

... and 2 files with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 34.98% <94.36%> (+0.10%) ⬆️ |
| cpo-hostedcontrolplane | 43.50% <24.44%> (ø) |
| cpo-other | 43.17% <25.00%> (+0.42%) ⬆️ |
| hypershift-operator | 51.62% <68.57%> (+0.05%) ⬆️ |
| other | 31.56% <100.00%> (-0.08%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 5360-5362: The lbReady calculation only checks
svc.Status.LoadBalancer.Ingress[0] and can miss readiness on other ingress
entries; change the lbReady logic (where lbReady is computed) to iterate over
svc.Status.LoadBalancer.Ingress and set lbReady true if any ingress element has
a non-empty Hostname or IP, preserving the existing short-circuit with
netutil.UseSharedIngressHC(hc) usage and any surrounding logic that depends on
lbReady.
- Around line 2092-2096: The current code swallows errors from
reconcilePublicEndpointExposedCondition by only logging them; instead propagate
the error so the controller-runtime retries properly. In the reconcile loop
where reconcilePublicEndpointExposedCondition(ctx, hcluster) is called, change
the error branch to return the error (e.g., return ctrl.Result{}, err) rather
than just logging, and keep the existing logic that updates requeueAfter when
pubEndpointRequeue is non-nil; reference
reconcilePublicEndpointExposedCondition, pubEndpointRequeue, and requeueAfter to
locate and update the error-handling path.
- Around line 5342-5358: The early returns when kasStrategy is nil or missing
route info and when the shared ingress Service is missing or has no ClusterIP
leave the PublicEndpointExposed condition stale; change these code paths (the
checks around kasStrategy/kasHostname and the svc/svcKey lookup and clusterIP
check) to explicitly set the HostedCluster status condition
PublicEndpointExposed to False (with a clear Reason like "NoSharedIngress" or
"RouteNotReady"), persist the status update before returning, and return a
non-error requeue (so the controller will reconcile again) instead of silently
returning nil; ensure you use the existing status update helpers in this
controller to update the condition so retries and removal transitions observe
the explicit state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5724ac97-9b6c-412c-88a0-22faceb6d3b5

📥 Commits

Reviewing files that changed from the base of the PR and between 340ccd7 and 6448520.

⛔ Files ignored due to path filters (1)
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_conditions.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (2)
  • api/hypershift/v1beta1/hostedcluster_conditions.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/hypershift/v1beta1/hostedcluster_conditions.go

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 18, 2026 16:41 Inactive
@muraee muraee force-pushed the api-driven-azure-topology branch from 6448520 to a4756a0 Compare May 18, 2026 16:47
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 18, 2026 16:50 Inactive
@muraee muraee force-pushed the api-driven-azure-topology branch from a4756a0 to 2b04196 Compare May 18, 2026 16:52
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 18, 2026 16:55 Inactive
@muraee muraee force-pushed the api-driven-azure-topology branch from 2b04196 to 589ce0d Compare May 18, 2026 17:06
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 18, 2026 17:09 Inactive

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
control-plane-operator/controllers/hostedcontrolplane/infra/infra.go (1)

458-464: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the standard router service reconciliation in the Swift path.

This branch bypasses ingress.ReconcileRouterService and only mutates a few Service fields, so the private router service no longer gets the usual ownership/metadata setup and can be orphaned on HostedControlPlane deletion.

One way to keep the existing contract
 	if netutil.UseSwiftNetworkingHCP(hcp) {
 		if _, err := k8sutil.DeleteIfNeeded(ctx, r.Client, pubSvc); err != nil {
 			return fmt.Errorf("failed to delete public router service: %w", err)
 		}
 		if _, err := createOrUpdate(ctx, r.Client, privSvc, func() error {
-			return reconcileAROPrivateRouterService(privSvc, hcp)
+			if err := ingress.ReconcileRouterService(privSvc, true, true, hcp); err != nil {
+				return err
+			}
+			return reconcileAROPrivateRouterService(privSvc, hcp)
 		}); err != nil {
 			return fmt.Errorf("failed to reconcile private router service: %w", err)
 		}

As per coding guidelines, "Follow owner reference patterns for proper garbage collection of resources".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@control-plane-operator/controllers/hostedcontrolplane/infra/infra.go` around
lines 458 - 464, The Swift networking branch skips the standard
ingress.ReconcileRouterService flow and instead manually mutates the Service
(privSvc), losing ownership/metadata so the Service can be orphaned on
HostedControlPlane deletion; restore the normal contract by invoking
ingress.ReconcileRouterService (or reusing its logic) for the router Services in
the netutil.UseSwiftNetworkingHCP(hcp) path (rather than only calling
reconcileAROPrivateRouterService), and ensure you use
createOrUpdate/DeleteIfNeeded around pubSvc/privSvc while preserving owner
references and standard metadata so the HostedControlPlane becomes the
controller for garbage collection.
♻️ Duplicate comments (2)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go (2)

5365-5366: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Evaluate all LoadBalancer ingress entries for readiness.

At lines 5365-5366, the readiness check inspects only Ingress[0]. If a different ingress entry carries the IP or hostname, the check reports a false negative.

💡 Suggested fix
- lbReady := len(svc.Status.LoadBalancer.Ingress) > 0 &&
-   (svc.Status.LoadBalancer.Ingress[0].Hostname != "" || svc.Status.LoadBalancer.Ingress[0].IP != "")
+ lbReady := false
+ for _, ing := range svc.Status.LoadBalancer.Ingress {
+   if ing.Hostname != "" || ing.IP != "" {
+     lbReady = true
+     break
+   }
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`
around lines 5365 - 5366, The lbReady check currently only inspects
svc.Status.LoadBalancer.Ingress[0], causing false negatives; change the logic
that sets lbReady to scan all entries in svc.Status.LoadBalancer.Ingress and set
lbReady to true if any entry has a non-empty Hostname or IP. Update the code
that assigns lbReady (reference variable lbReady and the slice
svc.Status.LoadBalancer.Ingress) to iterate over the slice (handle nil/empty
slice as false) and break early when an ingress with Hostname != "" or IP != ""
is found. Ensure the new logic preserves existing behavior when the ingress list
is empty.
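
For reference, the "any entry is ready" check can be exercised on its own. The sketch below is dependency-free: `ingressEntry` is a local stand-in that mirrors only the `Hostname` and `IP` fields of a Kubernetes load balancer ingress entry, and `anyIngressReady` is a hypothetical helper name.

```go
package main

import "fmt"

// ingressEntry mirrors the Hostname and IP fields of a Kubernetes
// LoadBalancerIngress; it is defined locally to keep the sketch self-contained.
type ingressEntry struct {
	Hostname string
	IP       string
}

// anyIngressReady reports whether at least one entry carries a hostname or an
// IP. A nil or empty slice yields false, matching the original behavior.
func anyIngressReady(entries []ingressEntry) bool {
	for _, ing := range entries {
		if ing.Hostname != "" || ing.IP != "" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(anyIngressReady(nil))
	fmt.Println(anyIngressReady([]ingressEntry{{}, {IP: "10.0.0.4"}}))
}
```

The second call is the case the index-zero check misses: the first entry is empty, the second carries the address.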

5347-5363: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Set PublicEndpointExposed explicitly on missing prerequisites.

At line 5348 and lines 5356-5362, the function returns without updating the condition state. That can leave PublicEndpointExposed stale during convergence or removal transitions. Set it to False with a transition reason, and requeue when shared ingress is expected.

💡 Suggested fix direction
- if kasStrategy == nil || kasStrategy.Route == nil || kasStrategy.Route.Hostname == "" {
-   return nil, nil
- }
+ if kasStrategy == nil || kasStrategy.Route == nil || kasStrategy.Route.Hostname == "" {
+   cond := metav1.Condition{
+     Type:               string(hyperv1.PublicEndpointExposed),
+     Status:             metav1.ConditionFalse,
+     Reason:             hyperv1.PublicEndpointConvergenceInProgressReason,
+     Message:            "API server route hostname is not yet available",
+     ObservedGeneration: hc.Generation,
+   }
+   if meta.SetStatusCondition(&hc.Status.Conditions, cond) {
+     if err := r.Client.Status().Update(ctx, hc); err != nil {
+       return nil, fmt.Errorf("failed to update PublicEndpointExposed condition: %w", err)
+     }
+   }
+   d := 10 * time.Second
+   return &d, nil
+ }

As per coding guidelines, **/{hypershift-operator,control-plane-operator}/controllers/**/*.go: Follow standard controller-runtime patterns for operator controllers with proper error handling and requeuing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`
around lines 5347 - 5363, The controller early-returns when shared ingress is
missing (checks using ServicePublishingStrategyByTypeByHC and the Service
fetched via r.Client.Get) without updating the HostedCluster condition
PublicEndpointExposed; modify the control flow so that when kasStrategy is nil
or missing Route/Hostname, or when the shared ingress Service is not found or
has no ClusterIP, you explicitly set the HostedCluster's PublicEndpointExposed
condition to False with an appropriate transition reason/message (e.g.,
"SharedIngressUnavailable"/"MissingServiceClusterIP"), persist the status
update, and requeue the reconciliation (rather than returning nil,nil) so state
converges during creation/removal; ensure updates occur where kasStrategy, svc
retrieval (r.Client.Get) and svc.Spec.ClusterIP checks currently return early.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/hypershift/v1beta1/azure.go`:
- Around line 665-680: AzurePrivateSpec currently allows AzurePrivateTypeSwift
even when the cluster is self-managed; add an XValidation rule to gate Swift to
managed Azure auth by ensuring when AzurePrivateType == "Swift" the
azureAuthenticationConfigType is the managed value (e.g. not
"WorkloadIdentities" / must equal "Managed"). Concretely, add a
kubebuilder:validation:XValidation rule to the AzurePrivateSpec similar to the
existing rules that enforces self.type != 'Swift' || has(self.swift) AND
self.type != 'Swift' || (self.azureAuthenticationConfigType == 'Managed') (or
equivalent != 'WorkloadIdentities') so Swift is only valid for managed-auth
clusters; reference AzurePrivateTypeSwift, AzurePrivateSpec, and
azureAuthenticationConfigType when making the change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 16b9a6b0-a8fd-4187-9d8f-bea523d0ff8a

📥 Commits

Reviewing files that changed from the base of the PR and between 2b04196 and 589ce0d.

⛔ Files ignored due to path filters (42)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/azureprivatespec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/azureswiftspec.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.azure.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/azure.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_conditions.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (36)
  • api/hypershift/v1beta1/azure.go
  • api/hypershift/v1beta1/azure_private_spec_test.go
  • api/hypershift/v1beta1/hostedcluster_conditions.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/infra/infra.go
  • control-plane-operator/controllers/hostedcontrolplane/ingress/router.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/healthcheck.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/service.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/config.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cno/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cno/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/role.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/config.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/util/util.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/azure.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/deployment.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • hypershift-operator/controllers/hostedcluster/network_policies.go
  • hypershift-operator/controllers/hostedcluster/public_endpoint_condition_test.go
  • hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go
  • sharedingress-config-generator/config.go
  • support/azureutil/azureutil.go
  • support/netutil/visibility.go
  • support/netutil/visibility_test.go
🚧 Files skipped from review as they are similar to previous changes (26)
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/role.go
  • hypershift-operator/controllers/hostedcluster/network_policies.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cno/component.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/service.go
  • control-plane-operator/controllers/hostedcontrolplane/kas/healthcheck.go
  • sharedingress-config-generator/config.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/router/config.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/ingressoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/controlplaneoperator/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/ingress/router.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/config.go
  • support/azureutil/azureutil.go
  • hypershift-operator/controllers/hostedcluster/public_endpoint_condition_test.go
  • api/hypershift/v1beta1/hostedcluster_conditions.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/registryoperator/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/cloud_controller_manager/azure/deployment.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/storage/azure.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • api/hypershift/v1beta1/azure_private_spec_test.go
  • support/netutil/visibility.go

Comment thread api/hypershift/v1beta1/azure.go Outdated
@muraee muraee force-pushed the api-driven-azure-topology branch from 589ce0d to 9a3435a Compare May 19, 2026 10:10
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8537 May 19, 2026 10:18 Inactive
@muraee muraee force-pushed the api-driven-azure-topology branch 2 times, most recently from 5e9c12c to dd73d8f Compare May 19, 2026 15:18
@enxebre

enxebre commented Jun 8, 2026

Member

Other than the regression, LGTM overall.
TestReconcileInternalRouterServiceStatus needs adjusting so that it catches the regression.
TestReconcileInfrastructure needs to account for Private-only scenarios.

Add UseSwiftNetworkingHCP to reconcileAPIServerServiceStatus so Swift
clusters resolve the KAS hostname from the Route strategy (via
KasRouteHostname helper) instead of falling through to LB service
lookup. Update infra tests to cover Swift Private and PublicAndPrivate
topologies separately, and fix test helper to skip simulating LB
provisioning for router services that are ClusterIP under Swift/shared
ingress.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
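
The hostname selection described in this commit message can be sketched roughly as follows. This is an illustration only: `controlPlane` and `resolveKASHost` are hypothetical stand-ins, not the real `HostedControlPlane` type, `reconcileAPIServerServiceStatus`, or the `KasRouteHostname` helper, and the error handling is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// controlPlane is a hypothetical, trimmed-down stand-in for the inputs the
// commit message mentions. It is not the real HostedControlPlane type.
type controlPlane struct {
	SwiftNetworking  bool   // stand-in for the result of UseSwiftNetworkingHCP
	KASRouteHostname string // hostname published through the Route strategy
	LBHostname       string // LoadBalancer hostname, empty until provisioned
}

// resolveKASHost picks the Route hostname for Swift clusters and only falls
// through to the LoadBalancer lookup for everything else.
func resolveKASHost(cp controlPlane) (string, error) {
	if cp.SwiftNetworking {
		if cp.KASRouteHostname == "" {
			return "", errors.New("route hostname not set")
		}
		return cp.KASRouteHostname, nil
	}
	if cp.LBHostname == "" {
		return "", errors.New("load balancer not provisioned yet")
	}
	return cp.LBHostname, nil
}

func main() {
	swift := controlPlane{SwiftNetworking: true, KASRouteHostname: "api.example.hypershift.local"}
	host, err := resolveKASHost(swift)
	fmt.Println(host, err)
}
```

The point of the early branch is that a Swift cluster never waits on a LoadBalancer that will not be provisioned, since its router services are ClusterIP.
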
@enxebre

enxebre commented Jun 8, 2026

Member

/approve

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre, muraee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zgalor

zgalor commented Jun 8, 2026

Re-test After Fix — e2aa9ffd8

Retested with the latest commit (e2aa9ffd8 fix(cpo): handle Swift private topology in infra reconciliation) on fresh clusters using both the HyperShift operator and CPO built from the PR branch.

Results

| Cluster | Topology | InfraReady | Etcd | KAS | KAS Public Reachability |
|---|---|---|---|---|---|
| zg-public-4 | PublicAndPrivate | True | True | True | HTTP 200 ok |
| zg-private-4 | Private | True | True | True | Connection refused (expected) |

Verified

  1. Private-router issue fixed: Private topology cluster now reaches InfrastructureReady: True without manual intervention.
  2. Konnectivity routes auto-populated: Both clusters get konnectivity-server.apps.<cluster>.hypershift.local set by the CPO automatically.
  3. KAS visibility correct: Public cluster KAS reachable via shared ingress (HTTP 200). Private cluster KAS not reachable from public network (connection refused).
  4. Route structure correct for private cluster: No kube-apiserver public route created; only kube-apiserver-private and kube-apiserver-internal (hypershift.local) routes exist.
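
As a rough illustration of the route check in item 4, the expected KAS route names per topology could be expressed like this. The `topology` type and `expectedKASRoutes` function are hypothetical. The Private case follows the observation above; the PublicAndPrivate case (a public `kube-apiserver` route in addition to the other two) is an assumption that was not verified in this test run.

```go
package main

import "fmt"

// topology is a hypothetical stand-in for the cluster topology value.
type topology string

const (
	topologyPrivate          topology = "Private"
	topologyPublicAndPrivate topology = "PublicAndPrivate"
)

// expectedKASRoutes returns the kube-apiserver route names a cluster is
// expected to have for the given topology.
func expectedKASRoutes(t topology) []string {
	// Observed on the private cluster: only these two routes exist.
	routes := []string{"kube-apiserver-private", "kube-apiserver-internal"}
	if t == topologyPublicAndPrivate {
		// Assumption: a public cluster additionally exposes the public route.
		routes = append([]string{"kube-apiserver"}, routes...)
	}
	return routes
}

func main() {
	for _, t := range []topology{topologyPrivate, topologyPublicAndPrivate} {
		fmt.Println(t, expectedKASRoutes(t))
	}
}
```
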

LGTM from the CS testing side.

@openshift-ci-robot

@zgalor: The /verified command must be used with one of the following actions: by, later, remove, or bypass. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

Details

In response to this:

/verified

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@zgalor

zgalor commented Jun 8, 2026

/verified by @zgalor

@openshift-ci-robot

@zgalor: This PR has been marked as verified by @zgalor.

Details

In response to this:

/verified by @zgalor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@zgalor

zgalor commented Jun 8, 2026

/lgtm

@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@muraee

muraee commented Jun 8, 2026

Contributor Author

/test "ci/prow/security"
/test "ci/prow/verify-deps"

@openshift-merge-bot

Contributor

/retest-required

Remaining retests: 0 against base HEAD ad3da77 and 2 for PR HEAD e2aa9ff in total

@openshift-merge-bot

Contributor

/retest-required

Remaining retests: 0 against base HEAD 8ea786c and 1 for PR HEAD e2aa9ff in total

@muraee

muraee commented Jun 9, 2026

Contributor Author

/retest-required

@openshift-merge-bot

Contributor

/retest-required

Remaining retests: 0 against base HEAD ac1a1c2 and 0 for PR HEAD e2aa9ff in total

@openshift-merge-bot

Contributor

/hold

Revision e2aa9ff was retested 3 times: holding

@muraee

muraee commented Jun 9, 2026

Contributor Author

/retest-required

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented Jun 9, 2026

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

1) TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN3 (2700s):
   util.go:583: Failed to wait for 1 nodes to become ready for NodePool
   e2e-clusters-2svvn/node-pool-fvh4j-qhk7j in 45m0s: context deadline exceeded
   - observed **v1.Node collection invalid: expected 1 nodes, got 0

2) TestKarpenterUpgradeControlPlane/Main (1910s):
   karpenter_test.go:1551: Failed to wait for NodeClaims to be ready in 5m0s: context deadline exceeded
   - observed **v1.NodeClaim collection invalid: expected 1 NodeClaims, got 2
   - NodeClaim on-demand-7tdlm not ready: Launched=true, Registered=false, Initialized=false
   - Registered=Unknown: NodeNotFound(Node not registered with cluster)

Summary

This job has 2 independent infrastructure-level flakes, neither caused by PR #8537.

(1) TestNodePoolPrevReleaseN3: A NodePool using the 4.19 release image (4.19.0-0.ci-2026-06-03-210413) was created on a 4.22 HostedCluster, but its EC2 instance never joined the cluster within 45 minutes (0 nodes observed). The sibling tests using 4.21, 4.20, and 4.18 releases all passed.

(2) TestKarpenterUpgradeControlPlane/Main: After a successful control plane upgrade and drift detection, Karpenter launched a replacement NodeClaim (on-demand-7tdlm), but the new EC2 instance failed to bootstrap and register with the cluster within 5 minutes; the instance was Launched but Registered=Unknown: NodeNotFound.

The PR only modifies Azure-specific code paths (UseSharedIngressHC, UseSwiftNetworkingHCP, AzurePrivateSpec) that are gated behind IsAroHCPByHC/HCP checks, which return false for AWS clusters, so these failures are unrelated to the PR changes.

Root Cause

Failure 1 — TestNodePoolPrevReleaseN3 (4.19 node never joined):
The test creates a NodePool using the N-3 minor release image (4.19.0-0.ci-2026-06-03-210413) on a 4.22 HostedCluster. The EC2 instance was provisioned but the node never completed bootstrap and joined the cluster within the 45-minute timeout. This is a transient infrastructure/image issue — the 4.19 CI payload from June 3rd may have had a stale or broken machine-os image that prevented the node from bootstrapping on the 4.22 control plane. Other version-skew tests (N-1/4.21, N-2/4.20, N-4/4.18) all succeeded, isolating the issue to the specific 4.19 CI payload.

The parent test TestNodePool/HostedCluster0 then fails its post-test condition validation: the framework expects ClusterVersionAvailable=False and ClusterVersionProgressing=True (because the HostedCluster was created with 0 replicas), but finds ClusterVersionAvailable=True — this is because other subtests already created worker nodes that allowed the CVO to complete before those nodes were removed. This condition mismatch is a secondary symptom, not a separate bug.

Failure 2 — TestKarpenterUpgradeControlPlane/Main (NodeClaim never registered):
After successfully upgrading the control plane (21m15s rollout), Karpenter correctly detected the existing NodeClaim on-demand-xrqvv as drifted and launched a replacement on-demand-7tdlm. The replacement was Launched=True (EC2 instance created) but never progressed to Registered — the node never called back to the cluster API server. After 5 minutes the test timed out. The test also observed 2 NodeClaims instead of the expected 1, indicating the old NodeClaim was not yet fully terminated. This is a transient issue with the replacement EC2 instance failing to bootstrap (possibly due to ignition/network/timing issues during the cluster upgrade window).

PR #8537 is not the cause. The PR adds Azure-specific Swift networking support and changes UseSharedIngress() (global env var check) to UseSharedIngressHC/HCP() (per-cluster check via IsAroHCPByHC/HCP). The IsAroHCPByHCP function explicitly returns false when hcp.Spec.Platform.Type != hyperv1.AzurePlatform, so all AWS code paths are unchanged. The resolveHAProxyImage change in nodepool_controller.go — the only code path that could theoretically affect node bootstrap — produces identical results for AWS clusters (both old and new paths skip the shared ingress image).
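
The platform gate described above can be sketched as follows. The types and functions here are simplified, hypothetical stand-ins rather than the PR's actual code: `hostedControlPlane` collapses the Topology and Private API fields into two booleans, and the `managedService` parameter is an assumed placeholder for whatever else the real `IsAroHCPByHCP` checks beyond the platform type.

```go
package main

import "fmt"

type platformType string

const (
	awsPlatform   platformType = "AWS"
	azurePlatform platformType = "Azure"
)

// hostedControlPlane is a hypothetical stand-in, not the real API type.
type hostedControlPlane struct {
	Platform platformType
	Swift    bool // stand-in for "private connectivity type is Swift"
	Public   bool // stand-in for "topology exposes a public endpoint"
}

// isAroHCPByHCP returns false for every non-Azure platform, which is the
// property the analysis relies on. The managedService input is an assumption.
func isAroHCPByHCP(hcp hostedControlPlane, managedService bool) bool {
	if hcp.Platform != azurePlatform {
		return false
	}
	return managedService
}

// useSharedIngressHCP combines the gate with "Swift networking and public",
// following the PR summary's description of UseSharedIngressHCP.
func useSharedIngressHCP(hcp hostedControlPlane, managedService bool) bool {
	return isAroHCPByHCP(hcp, managedService) && hcp.Swift && hcp.Public
}

func main() {
	aws := hostedControlPlane{Platform: awsPlatform, Swift: true, Public: true}
	azure := hostedControlPlane{Platform: azurePlatform, Swift: true, Public: true}
	fmt.Println("aws:", useSharedIngressHCP(aws, true))
	fmt.Println("azure:", useSharedIngressHCP(azure, true))
}
```

Because the platform check short-circuits first, no combination of the other inputs can turn shared ingress on for an AWS cluster in this sketch.
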

Recommendations
  1. Retry the job: Both failures are transient infrastructure issues (node bootstrap failures), not caused by the PR. A rerun should pass.
  2. Monitor the 4.19 CI payload: If TestNodePoolPrevReleaseN3 consistently fails across multiple job runs, the 4.19.0-0.ci payload image may need investigation for staleness or bootstrap incompatibilities with 4.22 control planes.
  3. No code changes needed in PR #8537: The Azure-focused changes are correctly gated behind platform type checks and do not affect AWS test paths.

Evidence

| Evidence | Detail |
|---|---|
| Failed tests | TestNodePoolPrevReleaseN3 (2700s timeout), TestKarpenterUpgradeControlPlane/Main (1910s) |
| N3 release image | 4.19.0-0.ci-2026-06-03-210413: node never joined (0/1 nodes after 45m) |
| N1/N2/N4 status | All PASSED: 4.21 (492s), 4.20 (507s), 4.18 (15s) |
| Karpenter NodeClaim | on-demand-7tdlm: Launched=True, Registered=Unknown (NodeNotFound), Initialized=Unknown |
| Old NodeClaim | on-demand-xrqvv: correctly drifted after upgrade, but 2 NodeClaims observed (old not yet terminated) |
| PR scope | Azure-only: UseSharedIngressHC, UseSwiftNetworkingHCP, AzurePrivateSpec.Swift, PublicEndpointExposed condition |
| Platform gate | IsAroHCPByHCP() returns false for hcp.Spec.Platform.Type != AzurePlatform; AWS paths unaffected |
| HAProxy resolution | Old: sharedingress.UseSharedIngress() → New: netutil.UseSharedIngressHC(hcluster); both return false for AWS |
| HostedCluster0 conditions | Post-test validation expected ClusterVersionAvailable=False (0 replicas), got True (CVO already completed via sibling subtests); secondary symptom |
| Total test results | 597 tests, 33 skipped, 6 failures (2 root causes + 4 parent propagations) |

@muraee

muraee commented Jun 10, 2026

Contributor Author

/override e2e-aws-4-22

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@muraee: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • 2e-aws-4-22

Only the following failed contexts/checkruns were expected:

  • CodeRabbit
  • ci/prow/e2e-aks
  • ci/prow/e2e-aks-4-22
  • ci/prow/e2e-aws
  • ci/prow/e2e-aws-4-22
  • ci/prow/e2e-aws-upgrade-hypershift-operator
  • ci/prow/e2e-azure-self-managed
  • ci/prow/e2e-azure-v2-self-managed
  • ci/prow/e2e-kubevirt-aws-ovn-reduced
  • ci/prow/e2e-v2-aws
  • ci/prow/e2e-v2-gke
  • ci/prow/images
  • ci/prow/okd-scos-images
  • ci/prow/security
  • ci/prow/verify-deps
  • pull-ci-openshift-hypershift-main-e2e-aks
  • pull-ci-openshift-hypershift-main-e2e-aks-4-22
  • pull-ci-openshift-hypershift-main-e2e-aws
  • pull-ci-openshift-hypershift-main-e2e-aws-4-22
  • pull-ci-openshift-hypershift-main-e2e-aws-upgrade-hypershift-operator
  • pull-ci-openshift-hypershift-main-e2e-azure-self-managed
  • pull-ci-openshift-hypershift-main-e2e-azure-v2-self-managed
  • pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn-reduced
  • pull-ci-openshift-hypershift-main-e2e-v2-aws
  • pull-ci-openshift-hypershift-main-e2e-v2-gke
  • pull-ci-openshift-hypershift-main-images
  • pull-ci-openshift-hypershift-main-okd-scos-images
  • pull-ci-openshift-hypershift-main-security
  • pull-ci-openshift-hypershift-main-verify-deps
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

Details

In response to this:

/override 2e-aws-4-22
/unhold

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@muraee: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • e2e-aws-4-22

Only the following failed contexts/checkruns were expected:

  • CodeRabbit
  • ci/prow/e2e-aks
  • ci/prow/e2e-aks-4-22
  • ci/prow/e2e-aws
  • ci/prow/e2e-aws-4-22
  • ci/prow/e2e-aws-upgrade-hypershift-operator
  • ci/prow/e2e-azure-self-managed
  • ci/prow/e2e-azure-v2-self-managed
  • ci/prow/e2e-kubevirt-aws-ovn-reduced
  • ci/prow/e2e-v2-aws
  • ci/prow/e2e-v2-gke
  • ci/prow/images
  • ci/prow/okd-scos-images
  • ci/prow/security
  • ci/prow/verify-deps
  • pull-ci-openshift-hypershift-main-e2e-aks
  • pull-ci-openshift-hypershift-main-e2e-aks-4-22
  • pull-ci-openshift-hypershift-main-e2e-aws
  • pull-ci-openshift-hypershift-main-e2e-aws-4-22
  • pull-ci-openshift-hypershift-main-e2e-aws-upgrade-hypershift-operator
  • pull-ci-openshift-hypershift-main-e2e-azure-self-managed
  • pull-ci-openshift-hypershift-main-e2e-azure-v2-self-managed
  • pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn-reduced
  • pull-ci-openshift-hypershift-main-e2e-v2-aws
  • pull-ci-openshift-hypershift-main-e2e-v2-gke
  • pull-ci-openshift-hypershift-main-images
  • pull-ci-openshift-hypershift-main-okd-scos-images
  • pull-ci-openshift-hypershift-main-security
  • pull-ci-openshift-hypershift-main-verify-deps
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

Details

In response to this:

/override e2e-aws-4-22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@muraee

muraee commented Jun 10, 2026

Contributor Author

/override "ci/prow/e2e-aws-4-22"

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@muraee: Overrode contexts on behalf of muraee: ci/prow/e2e-aws-4-22

Details

In response to this:

/override "ci/prow/e2e-aws-4-22"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
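
The three override attempts in this thread are consistent with the bot matching the given name exactly against its list of known contexts. A minimal sketch of that behavior, assuming exact string matching (this is not Prow's actual implementation, and `knownContexts` and `canOverride` are hypothetical names), using a few entries taken from the list the bot printed:

```go
package main

import "fmt"

// knownContexts is a short excerpt of the contexts the bot listed as valid.
var knownContexts = map[string]bool{
	"ci/prow/e2e-aws-4-22":                           true,
	"pull-ci-openshift-hypershift-main-e2e-aws-4-22": true,
	"tide": true,
}

// canOverride reports whether name exactly matches a listed context.
// Exact matching is an assumption inferred from the bot's replies.
func canOverride(name string) bool {
	return knownContexts[name]
}

func main() {
	for _, name := range []string{"2e-aws-4-22", "e2e-aws-4-22", "ci/prow/e2e-aws-4-22"} {
		fmt.Printf("%-24s accepted=%v\n", name, canOverride(name))
	}
}
```

Under this assumption the short job name `e2e-aws-4-22` is rejected even though it works with `/test`, while the full context `ci/prow/e2e-aws-4-22` or the prowjob name is accepted.
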

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@muraee: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/platform/azure: PR/issue for Azure (AzurePlatform) platform
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.
  • verified: Signifies that the PR passed pre-merge verification criteria

6 participants