Skip to content

CNTRLPLANE-3022: Add osImageStream to NodePool spec and status#8675

Merged
openshift-merge-bot[bot] merged 2 commits into
openshift:mainfrom
sdminonne:CNTRLPLANE-3022/Dual-Stream-NodePool-API-add-osImageStream-to-spec-status
Jun 9, 2026
Merged

CNTRLPLANE-3022: Add osImageStream to NodePool spec and status#8675
openshift-merge-bot[bot] merged 2 commits into
openshift:mainfrom
sdminonne:CNTRLPLANE-3022/Dual-Stream-NodePool-API-add-osImageStream-to-spec-status

Conversation

@sdminonne

@sdminonne sdminonne commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it:

Phase 0 of the Dual-Stream RHEL 9/10 NodePool enhancement (OCPSTRAT-3014 / CNTRLPLANE-3022). Adds osImageStream fields to NodePool spec and status, gated behind the OSStreams feature gate (TechPreview).

  • New type OSImageStreamReference with name field restricted to rhel-9 or rhel-10 via Enum validation
  • NodePoolSpec.osImageStream — allows users to override the default OS images for a NodePool by referencing a known OSImageStream. A field-level CEL rule prevents downgrades from rhel-10 to rhel-9
  • NodePoolStatus.osImageStream — reports the observed OS stream on nodes in the pool
  • OSStreams feature gate registered in all 4 feature gate files (disabled in Default, enabled in TechPreview)
  • Envtest CEL validation tests covering enum validation, downgrade prevention, and upgrade/add scenarios

Which issue(s) this PR fixes:

Fixes CNTRLPLANE-3022

Special notes for your reviewer:

The CEL downgrade-prevention rule is placed at the field level (on osImageStream) rather than the struct level (on NodePoolSpec). This ensures the rule is co-gated with the field itself and avoids CRD installation failures in the Default variant where the field doesn't exist in the schema.

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Test plan

  • make api — CRDs regenerated correctly (field present only in TechPreview/CustomNoUpgrade)
  • make update — Full dependency sync passed
  • make verify — Linting and generation passed
  • make test-envtest-ocp — All envtests passed across K8s 1.30–1.35
  • make test — All unit tests passed
  • make run-gitlint — Commit message validated

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Node pools can opt into an OS image stream to select supported RHEL streams with validation to prevent unsupported downgrades.
  • Chores
    • Feature gate "OSStreams" added to defaults and enabled for TechPreview variants; relevant default/TechPreview configurations updated.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 4, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 4, 2026
@openshift-ci

openshift-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot

openshift-ci-robot commented Jun 4, 2026

Copy link
Copy Markdown

@sdminonne: This pull request references CNTRLPLANE-3022 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

Phase 0 of the Dual-Stream RHEL 9/10 NodePool enhancement (OCPSTRAT-3014 / CNTRLPLANE-3022). Adds osImageStream fields to NodePool spec and status, gated behind the OSStreams feature gate (TechPreview).

  • New type OSImageStreamReference with name field restricted to rhel-9 or rhel-10 via Enum validation
  • NodePoolSpec.osImageStream — allows users to override the default OS images for a NodePool by referencing a known OSImageStream. A field-level CEL rule prevents downgrades from rhel-10 to rhel-9
  • NodePoolStatus.osImageStream — reports the observed OS stream on nodes in the pool
  • OSStreams feature gate registered in all 4 feature gate files (disabled in Default, enabled in TechPreview)
  • Envtest CEL validation tests covering enum validation, downgrade prevention, and upgrade/add scenarios

Which issue(s) this PR fixes:

Fixes CNTRLPLANE-3022

Special notes for your reviewer:

The CEL downgrade-prevention rule is placed at the field level (on osImageStream) rather than the struct level (on NodePoolSpec). This ensures the rule is co-gated with the field itself and avoids CRD installation failures in the Default variant where the field doesn't exist in the schema.

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Test plan

  • make api — CRDs regenerated correctly (field present only in TechPreview/CustomNoUpgrade)
  • make update — Full dependency sync passed
  • make verify — Linting and generation passed
  • make test-envtest-ocp — All envtests passed across K8s 1.30–1.35
  • make test — All unit tests passed
  • make run-gitlint — Commit message validated

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 20d1afd5-0ab3-48bb-a868-03cac446e690

📥 Commits

Reviewing files that changed from the base of the PR and between ee12e05 and e6eb131.

⛔ Files ignored due to path filters (6)
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/OSStreams.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/nodepool_types.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (1)
  • api/hypershift/v1beta1/nodepool_types.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/hypershift/v1beta1/nodepool_types.go

📝 Walkthrough

Walkthrough

This pull request adds support for OS image stream selection in HyperShift node pools. The change introduces a new OSStreams feature gate and associated API fields to NodePoolSpec and NodePoolStatus. A new OSImageStreamReference type allows specifying either rhel-9 or rhel-10 as the target OS stream. The implementation includes CEL validation to prevent downgrading from rhel-10 to rhel-9. The feature gate is enabled across four configuration profiles: Hypershift Default, Hypershift TechPreviewNoUpgrade, SelfManagedHA Default, and SelfManagedHA TechPreviewNoUpgrade.

Sequence Diagram(s)

sequenceDiagram
  participant ClusterAdmin
  participant FeatureGateConfig
  participant NodePoolAPI
  ClusterAdmin->>FeatureGateConfig: add/enable "OSStreams" entry (profiles)
  ClusterAdmin->>NodePoolAPI: create/update NodePool with osImageStream
  NodePoolAPI->>FeatureGateConfig: read "OSStreams" feature gate
  Note right of NodePoolAPI: apply CEL validation (prevent rhel-10 -> rhel-9)
  NodePoolAPI-->>ClusterAdmin: NodePoolStatus.osImageStream observed or defaulted
Loading
🚥 Pre-merge checks | ✅ 11
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding osImageStream field to NodePool spec and status, which aligns with the core modifications across nodepool_types.go and feature gate configurations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names added are static and deterministic. Ginkgo tests use descriptive names without dynamic values, generated suffixes, timestamps, UUIDs, or pod/node/namespace names.
Test Structure And Quality ✅ Passed Ginkgo tests meet all quality requirements: 6 single-responsibility tests, proper setup/cleanup, timeouts on cluster ops, meaningful assertion messages, consistent DescribeTable pattern.
Topology-Aware Scheduling Compatibility ✅ Passed PR adds only API type definitions (NodePool fields) and feature gate registrations. No deployment manifests, pod specs, scheduling constraints, affinity rules, or topology-dependent logic introduced.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No new Ginkgo e2e tests are added in this PR. The only new test file (nodepool_types_test.go) is a standard Go unit test using the testing package, not Ginkgo. The check is not applicable.
No-Weak-Crypto ✅ Passed PR adds feature gate and NodePool struct fields for OS image streaming; no weak cryptography, custom crypto, or insecure token comparisons detected.
Container-Privileges ✅ Passed PR contains only API type definitions and feature gate configurations; no Kubernetes container manifests with privileged settings were introduced.
No-Sensitive-Data-In-Logs ✅ Passed No logging of sensitive data found. New fields store only non-sensitive enum values (rhel-9/rhel-10); no logging references these fields or other sensitive information.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/documentation Indicates the PR includes changes for documentation and removed do-not-merge/needs-area labels Jun 4, 2026
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8675 June 4, 2026 21:23 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/hypershift/v1beta1/nodepool_types.go`:
- Around line 242-261: The CEL XValidation currently allows removing
osImageStream (self.name unset) to bypass the rhel-10 -> rhel-9 downgrade block;
update the validation expression on the OSImageStream field (OSImageStream in
nodepool_types.go) to require that when oldSelf.name == 'rhel-10' the new
self.name must exist and not equal 'rhel-9' (e.g. replace the rule with:
"!has(oldSelf.name) || oldSelf.name != 'rhel-10' || (has(self.name) && self.name
!= 'rhel-9')") so an unset self.name no longer permits a downgrade escape hatch.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 38e6fd98-eec7-43ca-b03a-e9f00daa8428

📥 Commits

Reviewing files that changed from the base of the PR and between 38f7b17 and 0b89acd.

⛔ Files ignored due to path filters (19)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/OSStreams.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/nodepoolspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/nodepoolstatus.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/osimagestreamreference.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/payload-manifests/featuregates/featureGate-Hypershift-Default.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/payload-manifests/featuregates/featureGate-Hypershift-TechPreviewNoUpgrade.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/payload-manifests/featuregates/featureGate-SelfManagedHA-Default.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/payload-manifests/featuregates/featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/tests/nodepools.hypershift.openshift.io/featuregated.nodepools.osimagestream.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/nodepool_types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests.yaml is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*
📒 Files selected for processing (5)
  • api/hypershift/v1beta1/featuregates/featureGate-Hypershift-Default.yaml
  • api/hypershift/v1beta1/featuregates/featureGate-Hypershift-TechPreviewNoUpgrade.yaml
  • api/hypershift/v1beta1/featuregates/featureGate-SelfManagedHA-Default.yaml
  • api/hypershift/v1beta1/featuregates/featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml
  • api/hypershift/v1beta1/nodepool_types.go

Comment thread api/hypershift/v1beta1/nodepool_types.go Outdated
@codecov

codecov Bot commented Jun 4, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 41.49%. Comparing base (f13c62d) to head (e6eb131).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8675      +/-   ##
==========================================
+ Coverage   41.43%   41.49%   +0.05%     
==========================================
  Files         756      756              
  Lines       93647    93648       +1     
==========================================
+ Hits        38802    38855      +53     
+ Misses      52124    52057      -67     
- Partials     2721     2736      +15     

see 2 files with indirect coverage changes

Flag Coverage Δ
cmd-support 34.87% <ø> (ø)
cpo-hostedcontrolplane 43.50% <ø> (ø)
cpo-other 43.17% <ø> (+0.42%) ⬆️
hypershift-operator 51.57% <ø> (ø)
other 31.64% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown

Test Failure Analysis Complete

Job Information

  • Prow Job: codecov/project (Codecov coverage gate, not a Prow CI job)
  • Build ID: Check Run 79618384841
  • PR: #8675CNTRLPLANE-3022: Add osImageStream to NodePool spec and status
  • Result: failure — project coverage 41.43% (−0.01%) compared to base e25a87a

Test Failure Analysis

Error

codecov/project: 41.43% (-0.01%) compared to e25a87a
Project coverage decreased by 0.01% (41.44% → 41.43%).
Report is 12 commits behind head on main.

Summary

The codecov/project check failed because overall project coverage dropped by 0.01% (from 41.44% to 41.43%). This is not caused by the PR's own changescodecov/patch passed with "Coverage not affected" and "All modified and coverable lines are covered by tests." The PR only adds Go struct/type definitions and comments to nodepool_types.go, which contain no executable code. The coverage drop comes from 4 files with indirect coverage changes — unrelated files whose coverage shifted between the stale base commit (e25a87a, 12 commits behind main HEAD) and the PR head. This is a false positive caused by a stale base comparison, and codecov/project is not a required merge check (PR #8484 was previously merged despite the same failure).

Root Cause

The failure is a stale-base false positive in Codecov's project-level coverage gate. The specific chain:

  1. PR changes are coverage-neutral: The only non-generated Go file modified is api/hypershift/v1beta1/nodepool_types.go, with +37 lines of pure struct definitions (OSImageStreamReference, NodePoolSpec.OSImageStream field), type declarations, CEL validation annotations, and comments. These contain zero executable code paths — Codecov confirms this via the passing codecov/patch check.

  2. Base commit is stale: The Codecov report states it is "12 commits behind head on main." The comparison base e25a87a had 41.44% coverage, but main has since settled to ~41.43% (visible in PR CNTRLPLANE-3271: add External OIDC e2e tests for v2 framework #8674 which shows +0.00% compared to 38f7b17 at 41.43%, and PR CNTRLPLANE-3020: Adopt coreos/stream-metadata-go upstream library #8673 which passed despite -0.02% drop because the "adjusted base" was 41.43%).

  3. Indirect coverage changes cause the drop: The diff shows −5 hits and +10 misses across 93,647 lines in 4 unrelated files. These are coverage fluctuations from commits merged to main between e25a87a and the PR's HEAD — not from this PR's changes.

  4. No coverage: threshold configured in codecov.yml: The repo's codecov.yml has no coverage: section, so Codecov uses its default behavior: the project check fails on any decrease. With no configured threshold tolerance (e.g., threshold: 0.05%), even a 0.01% drop from a stale base triggers failure.

  5. Not a merge blocker: The codecov/project check is not a required status check for merging. PR OCPBUGS-86008: Gate Route watch on management cluster capability #8484 was previously merged with a codecov/project failure, confirming this is informational only.

Recommendations
  1. This failure can be safely ignored for merging — it is not a required check, and the PR's own code introduces zero coverage change (confirmed by the passing codecov/patch).

  2. Rebase the PR onto current main to pick up the latest base commit for Codecov comparison. This will likely resolve the stale-base comparison issue, as main is currently at ~41.43% and the PR is at 41.43%.

  3. Consider adding a coverage threshold to codecov.yml to prevent future false positives from stale-base comparisons:

    coverage:
      status:
        project:
          default:
            threshold: 0.05%

    This would allow up to 0.05% coverage fluctuation before failing, which prevents noise from indirect coverage changes.

Evidence
Evidence Detail
codecov/project result failure — 41.43% (−0.01%) compared to base e25a87a
codecov/patch result success — "Coverage not affected when comparing e25a87a...0b89acd"
All modified lines covered "All modified and coverable lines are covered by tests"
Stale base "Report is 12 commits behind head on main"
Coverage diff −5 hits, +10 misses, +3 lines (93,644 → 93,647) across 4 indirect files
PR file changes Only nodepool_types.go modified (struct/type defs, no executable code)
No threshold configured codecov.yml has no coverage: section — uses default (fail on any decrease)
Not a merge blocker PR #8484 merged with same codecov/project failure
PR #8673 comparison Passed at −0.02% because "adjusted base" was 41.43% — same as this PR's coverage

Add OSImageStreamReference type and osImageStream fields to NodePoolSpec
and NodePoolStatus, gated behind the OSStreams feature gate
(TechPreview). This is Phase 0 of the Dual-Stream RHEL 9/10 NodePool
enhancement, enabling per-pool OS stream selection in HyperShift.

The spec field allows users to override the default OS images for a
NodePool by referencing a known OSImageStream (rhel-9 or rhel-10).
A field-level CEL validation rule prevents downgrades from rhel-10
to rhel-9. The status field reports the observed OS stream on nodes.

Signed-off-by: Salvatore Dario Minonne <sminonne@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
@sdminonne sdminonne force-pushed the CNTRLPLANE-3022/Dual-Stream-NodePool-API-add-osImageStream-to-spec-status branch from 0b89acd to ee12e05 Compare June 5, 2026 10:15
@sdminonne sdminonne marked this pull request as ready for review June 5, 2026 10:16
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 5, 2026
@openshift-ci openshift-ci Bot requested review from devguyio and muraee June 5, 2026 10:17
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8675 June 5, 2026 10:19 Inactive
// When omitted, the pool is using the release version's default OS images.
// +openshift:enable:FeatureGate=OSStreams
// +optional
OSImageStream OSImageStreamReference `json:"osImageStream,omitempty,omitzero"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use only omitzero, omitempty does nothing on structs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks going to do.

@jparrill jparrill left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped some comments. Thanks!!

nit: I'd split the feature gate registration (OSStreams in the 4 featureGate YAMLs) into its own commit, separate from the new API type changes in nodepool_types.go. Easier to review and revert independently.

// in-place OS downgrades are not supported.
//
// +openshift:enable:FeatureGate=OSStreams
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.name) || oldSelf.name != 'rhel-10' || self.name != 'rhel-9'",message="OS stream downgrade from rhel-10 to rhel-9 is not allowed; create a new NodePool instead"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CEL downgrade rule can be bypassed via field removal

The current CEL rule prevents the direct rhel-10 → rhel-9 transition, which is great. But since the field is +optional, a user can bypass it in two steps:

  1. Remove osImageStream entirely (passes — field is optional)
  2. Re-add it with name: rhel-9 (passes — !has(oldSelf.name) is true on the zero-value struct)

This is the "immutable + optional two-step bypass" pattern documented in api/AGENTS.md. The arch field handles this with a parent-level guard:

!has(oldSelf.arch) || has(self.arch)

Could you clarify the design intent? If removal should be blocked once set, we'd need something like this on NodePoolSpec:

+kubebuilder:validation:XValidation:rule="!has(oldSelf.osImageStream) || !has(oldSelf.osImageStream.name) || has(self.osImageStream)",message="osImageStream cannot be removed once set"

If removal is intentionally allowed (revert to release default), that's also a valid design — but it should be documented explicitly and tested.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I think removal should be blocked once set.

// +optional
Arch string `json:"arch,omitempty"`

// osImageStream specifies an OS stream to be used for nodes in this pool.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the "Required Tests" section in api/AGENTS.md, API type changes should include serialization compatibility tests that verify N-1/N+1 round-trip behavior. Could you add a test in nodepool_types_test.go following the existing patterns (e.g., TestNodePoolAutoScalingSerializationCompatibility)? It should verify that a NodePoolSpec with OSImageStream set serializes correctly and deserializes into an N-1 struct without the field (and vice versa).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no need to add this for a new optional field. Those tests are valuable when e.g. changing a filed type pointer <-> value

id: "subnet-any"
type: AWS
osImageStream:
name: "rhel-10"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test coverage looks solid for the core scenarios. One gap I'd suggest adding: a test for removing osImageStream from an existing NodePool that had it set. Something like:

- name: When removing osImageStream from a NodePool that had rhel-10 it should [succeed/fail]

The expected outcome depends on the design intent (see my other comment about the CEL bypass), but either way the behavior should be explicitly tested and documented.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a test case for this. Removal should be prevented once is set

- Remove omitempty from OSImageStream struct fields (spec and status),
  keeping only omitzero. omitempty has no effect on Go structs.
- Tighten the field-level CEL downgrade rule to also require
  has(self.name) when oldSelf.name is rhel-10, closing an escape hatch
  where unsetting self.name could bypass the downgrade check.

A parent-level CEL guard on NodePoolSpec to prevent osImageStream
removal once set was considered but cannot be implemented via CEL.
Because osImageStream is feature-gated (+openshift:enable:FeatureGate=
OSStreams), it only exists in the TechPreview/CustomNoUpgrade CRD
schema. A parent-level rule referencing osImageStream would be
included in the Default CRD variant where the field does not exist,
causing a CEL compilation error (undefined field) at CRD installation
time. This is a fundamental constraint of the feature-gate CRD
generation architecture: non-feature-gated CEL rules on a struct
cannot reference feature-gated fields. The "once set, cannot be
removed" constraint will need to be enforced at the controller level
in a follow-up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sdminonne

sdminonne commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

@jparrill @enxebre

Regarding the "once set, cannot be removed" guard for osImageStream:

I investigated adding a parent-level CEL rule on NodePoolSpec following the same pattern as arch:

// +kubebuilder:validation:XValidation:rule="!has(oldSelf.osImageStream) || has(self.osImageStream)", message="osImageStream cannot be removed once set"

This doesn't work because osImageStream is feature-gated (+openshift:enable:FeatureGate=OSStreams). The codegen includes all NodePoolSpec-level CEL rules in every CRD variant, including Default. In the Default variant, osImageStream is stripped from the schema, so the Kubernetes CEL compiler rejects has(oldSelf.osImageStream) as an undefined field at CRD installation time — it's a compile-time error, not a runtime false.

This is a fundamental constraint: non-feature-gated CEL rules on a parent struct cannot reference feature-gated child fields. The arch pattern works only because arch is always present in the schema.

I've tightened the field-level CEL rule to close the rhel-10 → unset escape hatch (has(self.name) && self.name != 'rhel-9'), but the two-step bypass (remove field, re-add with rhel-9) still exists at the API level. If you concur with this, OK to open a JIRA card to track the work to enforce the "cannot be removed once set" constraint at the controller level?

@enxebre enxebre added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 9, 2026
@enxebre

enxebre commented Jun 9, 2026

Copy link
Copy Markdown
Member

/approve

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre, sdminonne

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 9, 2026
@sdminonne

Copy link
Copy Markdown
Contributor Author

/cc @cblecker @JoelSpeed for approval

@openshift-ci openshift-ci Bot requested review from JoelSpeed and cblecker June 9, 2026 09:47
@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@sdminonne: GitHub didn't allow me to request PR reviews from the following users: for, approval.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @cblecker @JoelSpeed for approval

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jparrill

jparrill commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 9, 2026
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@sdminonne

Copy link
Copy Markdown
Contributor Author

/verified by @sdminonne

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 9, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@sdminonne: This PR has been marked as verified by @sdminonne.

Details

In response to this:

/verified by @sdminonne

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sdminonne

Copy link
Copy Markdown
Contributor Author

new field is working

KUBECONFIG=sdminonne-dev-kubeconfig oc patch nodepool test-osimagestream -n clusters --type=merge -p '{"spec":{"osImageStream":{"name":"rhel-10"}}}'
nodepool.hypershift.openshift.io/test-osimagestream patched
KUBECONFIG=sdminonne-dev-kubeconfig oc get nodepool test-osimagestream -n clusters -o jsonpath='{.spec.osImageStream}' 2>&1)                                                                                              
  ⎿  {                                                                                                                                                                                                                           
       "name": "rhel-10"                                                                                                                                                                                                         
     }

KUBECONFIG=sdminonne-dev-kubeconfig oc patch nodepool test-osimagestream -n clusters --type=merge -p '{"spec":{"osImageStream":{"name":"rhel-9"}}}' 2>&1
The NodePool "test-osimagestream" is invalid: spec.osImageStream: Invalid value: "object": OS stream downgrade from rhel-10 to rhel-9 is not allowed; create a new NodePool instead

@cwbotbot

cwbotbot commented Jun 9, 2026

Copy link
Copy Markdown

Test Results

e2e-aws

e2e-aks

@cblecker

cblecker commented Jun 9, 2026

Copy link
Copy Markdown
Member

/uncc

@openshift-ci openshift-ci Bot removed the request for review from cblecker June 9, 2026 15:31
@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@sdminonne: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/documentation Indicates the PR includes changes for documentation jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants