Skip to content

OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools#8710

Open
chdeshpa-hue wants to merge 1 commit into
openshift:mainfrom
chdeshpa-hue:OCPBUGS-87991-kubevirt-validate-additional-networks
Open

OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools#8710
chdeshpa-hue wants to merge 1 commit into
openshift:mainfrom
chdeshpa-hue:OCPBUGS-87991-kubevirt-validate-additional-networks

Conversation

@chdeshpa-hue

@chdeshpa-hue chdeshpa-hue commented Jun 10, 2026

Copy link
Copy Markdown

Summary

KubeVirt NodePools accept invalid additionalNetworks[].name values without any error at admission or reconcile time. The field requires Multus NAD references in <namespace>/<name> format, but no validation enforces this. Users who omit the namespace prefix get a HostedCluster that appears healthy (ValidPlatformConfig: True "All is well") while VMs silently fail to start with errors buried in the hosted control plane namespace.

This PR adds ValidateAdditionalNetworks() called from both:

  • Admission webhook → instant rejection at oc apply time
  • PlatformValidation() → reconcile-time safety net

Checks added:

  1. Format: exactly one / with non-empty namespace and name segments
  2. Uniqueness: reject duplicate network names
  3. Length: generated KubeVirt interface name must not exceed 63 characters

Problem

Without this fix, the debugging path requires navigating 6 layers before finding the root cause (VM conditions in the HC namespace set by virt-controller). NodePool status actively misleads with "All is well". Time to root cause ranges from 30 minutes (expert) to impossible (app team without HC namespace RBAC).

User Experience After Fix

$ oc apply -f nodepool.yaml
Error from server: admission webhook "nodepool.hypershift.openshift.io" denied the request:
  additionalNetworks[0].name "storage-net" must be in the format <namespace>/<name>

Testing

  • 7 new unit test cases covering invalid format, duplicates, and overlength
  • All 10 pre-existing test cases pass unchanged (no regression)
  • Live cluster validation on OCP 4.22.0 + CNV 4.21.8
  • gofmt and go vet clean

Backward Compatibility

  • No additionalNetworks configured → validation skipped entirely
  • Valid ns/name entries → pass all checks, zero behavior change
  • Existing clusters with valid configs → unaffected

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-87991

Made with Cursor

Summary by CodeRabbit

  • New Features
    • Added validation for KubeVirt additional networks in NodePool configuration, ensuring network names follow proper formatting standards (namespace/name format), preventing duplicate networks, and validating that generated interface names do not exceed length limits.

…dePools

The additionalNetworks[].name field in KubeVirt NodePools requires Multus NAD
references in <namespace>/<name> format, but no validation enforces this. Invalid
names are silently accepted — NodePool reports ValidPlatformConfig=True "All is
well" — while VMs fail to start with errors buried 4 layers deep in the hosted
control plane namespace (VM conditions set by virt-controller).

Add ValidateAdditionalNetworks() called from both the admission webhook (instant
user feedback at oc apply time) and PlatformValidation() (reconcile safety net):
- Format: exactly one slash with non-empty namespace and name segments
- Uniqueness: reject duplicate network names
- Length: generated interface name must not exceed 63 characters

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-87991
Co-authored-by: Cursor <cursoragent@cursor.com>
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 10, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@chdeshpa-hue: This pull request references Jira Issue OCPBUGS-87991, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

KubeVirt NodePools accept invalid additionalNetworks[].name values without any error at admission or reconcile time. The field requires Multus NAD references in <namespace>/<name> format, but no validation enforces this. Users who omit the namespace prefix get a HostedCluster that appears healthy (ValidPlatformConfig: True "All is well") while VMs silently fail to start with errors buried in the hosted control plane namespace.

This PR adds ValidateAdditionalNetworks() called from both:

  • Admission webhook → instant rejection at oc apply time
  • PlatformValidation() → reconcile-time safety net

Checks added:

  1. Format: exactly one / with non-empty namespace and name segments
  2. Uniqueness: reject duplicate network names
  3. Length: generated KubeVirt interface name must not exceed 63 characters

Problem

Without this fix, the debugging path requires navigating 6 layers before finding the root cause (VM conditions in the HC namespace set by virt-controller). NodePool status actively misleads with "All is well". Time to root cause ranges from 30 minutes (expert) to impossible (app team without HC namespace RBAC).

User Experience After Fix

$ oc apply -f nodepool.yaml
Error from server: admission webhook "nodepool.hypershift.openshift.io" denied the request:
 additionalNetworks[0].name "storage-net" must be in the format <namespace>/<name>

Testing

  • 7 new unit test cases covering invalid format, duplicates, and overlength
  • All 10 pre-existing test cases pass unchanged (no regression)
  • Live cluster validation on OCP 4.22.0 + CNV 4.21.8
  • gofmt and go vet clean

Backward Compatibility

  • No additionalNetworks configured → validation skipped entirely
  • Valid ns/name entries → pass all checks, zero behavior change
  • Existing clusters with valid configs → unaffected

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-87991

Made with Cursor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This PR adds validation for KubeVirt NodePool additional networks. A new ValidateAdditionalNetworks function enforces that each network name follows the <namespace>/<name> format with non-empty parts, detects duplicates, and ensures the generated interface name does not exceed 63 characters. The validation integrates into the existing PlatformValidation function and is called by the NodePool webhook during create and update operations. Comprehensive test cases verify all validation constraints and error messages.

🚥 Pre-merge checks | ✅ 11
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically identifies the main change: adding validation for additionalNetworks name format in KubeVirt NodePools.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Tests in this PR use standard Go table-driven testing with t.Run(), not Ginkgo. All test names are static descriptive strings with no dynamic values, timestamps, UUIDs, or generated identifiers.
Test Structure And Quality ✅ Passed Tests use table-driven pattern with Gomega; each case has single responsibility; no cluster operations; meaningful assertion messages; consistent with codebase patterns.
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds validation for KubeVirt NodePool additional networks format—no scheduling constraints, deployment manifests, or topology assumptions introduced.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No Ginkgo e2e tests were added in this PR. The test additions are traditional Go unit tests using the testing package and table-driven patterns, not Ginkgo-based e2e tests.
No-Weak-Crypto ✅ Passed No weak cryptography, custom crypto implementations, or non-constant-time secret comparisons found. PR contains only string validation logic for KubeVirt network names.
Container-Privileges ✅ Passed PR contains only Go source code (webhook validation and KubeVirt network validation), no container/K8s manifests; no privileged settings found.
No-Sensitive-Data-In-Logs ✅ Passed PR contains no logging of sensitive data. Error messages only expose public network configuration metadata (namespace/name pairs) which are not secrets, passwords, tokens, or PII.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/kubevirt PR/issue for KubeVirt (KubevirtPlatform) platform and removed do-not-merge/needs-area labels Jun 10, 2026
@openshift-ci openshift-ci Bot requested review from bryan-cox and enxebre June 10, 2026 07:07
@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: chdeshpa-hue
Once this PR has been reviewed and has the lgtm label, please assign muraee for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 80.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.51%. Comparing base (832b848) to head (53adcee).

Files with missing lines Patch % Lines
...controllers/hostedcluster/hostedcluster_webhook.go 25.00% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8710      +/-   ##
==========================================
+ Coverage   41.50%   41.51%   +0.01%     
==========================================
  Files         758      758              
  Lines       93689    93719      +30     
==========================================
+ Hits        38882    38906      +24     
- Misses      52070    52074       +4     
- Partials     2737     2739       +2     
Files with missing lines Coverage Δ
...operator/controllers/nodepool/kubevirt/kubevirt.go 70.87% <100.00%> (+1.64%) ⬆️
...controllers/hostedcluster/hostedcluster_webhook.go 50.23% <25.00%> (-0.97%) ⬇️
Flag Coverage Δ
cmd-support 34.86% <ø> (ø)
cpo-hostedcontrolplane 43.59% <ø> (ø)
cpo-other 43.17% <ø> (ø)
hypershift-operator 51.60% <80.00%> (+0.03%) ⬆️
other 31.64% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go (1)

142-162: ⚡ Quick win

Consider validating namespace and name parts against Kubernetes DNS subdomain rules.

The current validation checks that parts are non-empty after trimming, but doesn't enforce Kubernetes DNS subdomain naming conventions (RFC 1123). This means names like "my ns/nad" (with space) or "my_ns/nad" (with underscore) pass validation but will cause VM startup failures when KubeVirt attempts to reference a non-existent NetworkAttachmentDefinition.

Kubernetes resource names must be lowercase alphanumeric with - and ., starting and ending with alphanumeric. Adding a regex check like ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ for each part would catch these at admission time with clearer error messages.

♻️ Example validation with DNS subdomain rules
+import (
+	"regexp"
+)
+
+var dnsSubdomainRegex = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)
+
 func ValidateAdditionalNetworks(networks []hyperv1.KubevirtNetwork) error {
 	if len(networks) == 0 {
 		return nil
 	}
 	seen := make(map[string]bool, len(networks))
 	for idx, network := range networks {
 		parts := strings.Split(network.Name, "/")
 		if len(parts) != 2 || strings.TrimSpace(parts[0]) == "" || strings.TrimSpace(parts[1]) == "" {
 			return fmt.Errorf("additionalNetworks[%d].name %q must be in the format <namespace>/<name>", idx, network.Name)
 		}
+		namespace := strings.TrimSpace(parts[0])
+		name := strings.TrimSpace(parts[1])
+		if !dnsSubdomainRegex.MatchString(namespace) {
+			return fmt.Errorf("additionalNetworks[%d].name %q has invalid namespace part %q (must be a valid DNS subdomain)", idx, network.Name, namespace)
+		}
+		if !dnsSubdomainRegex.MatchString(name) {
+			return fmt.Errorf("additionalNetworks[%d].name %q has invalid name part %q (must be a valid DNS subdomain)", idx, network.Name, name)
+		}
 		if seen[network.Name] {
 			return fmt.Errorf("additionalNetworks[%d].name %q is duplicated", idx, network.Name)
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go` around lines
142 - 162, The ValidateAdditionalNetworks function currently only checks for
non-empty namespace/name parts; update it to enforce Kubernetes DNS
subdomain/name rules by validating each part (the namespace and the name from
strings.Split(network.Name, "/")) against the RFC1123 regex (e.g.
^[a-z0-9]([-a-z0-9]*[a-z0-9])?$), and return clear formatted errors like
"additionalNetworks[%d].name %q: namespace %q is invalid" or "…: name %q is
invalid" when a part fails; keep the existing duplicate check and
virtualMachineInterfaceName length check intact, and reference the same symbols
(ValidateAdditionalNetworks, virtualMachineInterfaceName) so the change is
localized to this function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go`:
- Around line 142-162: The ValidateAdditionalNetworks function currently only
checks for non-empty namespace/name parts; update it to enforce Kubernetes DNS
subdomain/name rules by validating each part (the namespace and the name from
strings.Split(network.Name, "/")) against the RFC1123 regex (e.g.
^[a-z0-9]([-a-z0-9]*[a-z0-9])?$), and return clear formatted errors like
"additionalNetworks[%d].name %q: namespace %q is invalid" or "…: name %q is
invalid" when a part fails; keep the existing duplicate check and
virtualMachineInterfaceName length check intact, and reference the same symbols
(ValidateAdditionalNetworks, virtualMachineInterfaceName) so the change is
localized to this function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 098416f3-4a27-47b2-8089-b95cf2b80aec

📥 Commits

Reviewing files that changed from the base of the PR and between 832b848 and 53adcee.

📒 Files selected for processing (3)
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go
  • hypershift-operator/controllers/nodepool/kubevirt/kubevirt_test.go

@hypershift-jira-solve-ci

Copy link
Copy Markdown

I now have all the information needed to produce the final report. The root cause is clear.

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

1: CT1 Title does not start with one of fix, feat, chore, docs, style, refactor, perf, test, revert, ci, build: "OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools"
make: *** [Makefile:614: run-gitlint] Error 1

Summary

The gitlint check failed because the commit message title "OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools" does not follow the Conventional Commits format required by the repository. The .gitlint config enforces contrib-title-conventional-commits (rule CT1), which requires the title to begin with one of: fix, feat, chore, docs, style, refactor, perf, test, revert, ci, build. The title starts with a Jira key (OCPBUGS-87991:) instead.

Root Cause

The commit message title uses a JIRA-KEY: description format (OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools) instead of the required Conventional Commits format (type(scope): description).

The openshift/hypershift repository enforces Conventional Commits via gitlint's built-in contrib-title-conventional-commits rule, configured in .gitlint:

[general]
contrib=contrib-title-conventional-commits

[contrib-title-conventional-commits]
types = fix,feat,chore,docs,style,refactor,perf,test,revert,ci,build

The CT1 rule requires the title to match the pattern <type>(<optional-scope>): <description>, where <type> must be one of the 11 allowed prefixes. The Jira key OCPBUGS-87991 is not a valid conventional commit type, so gitlint rejects it.

This is purely a commit message formatting issue — it is not a code or test problem.

Recommendations

Amend the commit message to use the Conventional Commits format. Since this PR adds validation logic (a new feature), the title should use the feat type. The Jira key can be included in the commit body or as a footer:

feat(kubevirt): validate additionalNetworks name format in NodePools

The additionalNetworks[].name field in KubeVirt NodePools requires Multus NAD
references in <namespace>/<name> format, but no validation enforces this.

Add ValidateAdditionalNetworks() called from both the admission webhook
and PlatformValidation():
- Format: exactly one slash with non-empty namespace and name segments
- Uniqueness: reject duplicate network names
- Length: generated interface name must not exceed 63 characters

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-87991

Alternatively, if this is considered a bug fix (enforcing a contract that should have existed), use fix instead of feat:

fix(kubevirt): validate additionalNetworks name format in NodePools

To amend: git commit --amend and update the title, then force-push.

Evidence
Evidence Detail
Failing step Run make run-gitlint (step 3)
Rule violated CT1contrib-title-conventional-commits
Actual title OCPBUGS-87991: validate additionalNetworks name format in KubeVirt NodePools
Required format <type>(<scope>): <description> where type ∈ {fix, feat, chore, docs, style, refactor, perf, test, revert, ci, build}
Config file .gitlint at repo root
Gitlint version 0.19.1
Title max length 120 characters (would pass)
Body max line length 140 characters (would pass)

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@chdeshpa-hue: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/kubevirt PR/issue for KubeVirt (KubevirtPlatform) platform jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants