Add Windows FIO VM workload support by ebattat · Pull Request #1229 · redhat-performance/benchmark-runner

ebattat · 2026-05-19T13:37:35Z

Summary

Add winfio_vm workload: Windows FIO storage benchmark running on Windows Server 2025 VMs via KubeVirt
Pod: uses PowerShell scripts SCP'd to the VM, FIO pre-installed at C:/tools/fio
Runs 4 block sizes (4k, 8k, 32k, 1M) x 4 IO patterns (read, write, randread, randwrite) for perf_ci
Results parsed from FIO JSON output and uploaded to Elasticsearch per block_size x io_operation
Add delay parameter to _wait_for_virtctl_ssh in virtctl.py to handle Windows UEFI boot timing
Add winfio_vm to Jenkins PerfCI pipeline with WINDOWS_SERVER_2025_URL
Golden files added for all run types (perf_ci, test_ci, func_ci, chaos_ci, release)

Test plan

Single VM (winfio_vm) with ODF - verify JSON results and SCP
Scale VMs - verify parallel execution
Golden file tests pass
Elasticsearch upload verification

🤖 Assisted-by: Claude Code

Summary by CodeRabbit

New Features
- Windows FIO workload with single and scaled multi-VM runs
- Automated FIO orchestration and local results aggregation into a consolidated summary
- Optional upload of benchmark results to Elasticsearch
- Grafana dashboard updated with FIO OS-version filter
Tests
- Added golden-file fixtures for Windows VM benchmark templates
Chores
- Added VM/DataVolume templates, packaging and default workload list updates, README/docs and CI adjustments

coderabbitai · 2026-05-19T13:40:47Z

Warning

Review limit reached

@ebattat, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 51 minutes and 34 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f57a3ca8-abd0-4e33-b422-e44302eef962

📥 Commits

Reviewing files that changed from the base of the PR and between 8b85330 and 80ff2b9.

📒 Files selected for processing (46)

README.md
benchmark_runner/common/template_operations/templates/winfio/__init__.py
benchmark_runner/common/template_operations/templates/winfio/internal_data/__init__.py
benchmark_runner/common/template_operations/templates/winfio/internal_data/parse_fio_results_template.ps1
benchmark_runner/common/template_operations/templates/winfio/internal_data/run_fio_benchmark_template.ps1
benchmark_runner/common/template_operations/templates/winfio/internal_data/windows_dv_template.yaml
benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml
benchmark_runner/common/template_operations/templates/winfio/winfio_data_template.yaml
benchmark_runner/common/virtctl/virtctl.py
benchmark_runner/grafana/perf/jsonnet/main.libsonnet
benchmark_runner/main/environment_variables.py
benchmark_runner/workloads/winfio_vm.py
docs/source/podman.md
jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile
pyproject.toml
tests/unittest/benchmark_runner/common/template_operations/golden_files.py
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml

📝 Walkthrough

Walkthrough

Adds a Windows FIO workload: PowerShell runner and parser, KubeVirt VM and CDI templates, a WinfioVM Python orchestrator that runs benchmarks and uploads results, Jenkins/workload registration and packaging updates, plus golden-file tests for multiple CI scenarios.

Changes

Windows FIO Workload Implementation

Layer / File(s)	Summary
FIO benchmark execution and result aggregation scripts `benchmark_runner/common/template_operations/templates/winfio/internal_data/run_fio_benchmark_template.ps1`, `benchmark_runner/common/template_operations/templates/winfio/internal_data/parse_fio_results_template.ps1`	PowerShell templates that run FIO on Windows producing per-run JSON and stderr, then aggregate per-run JSON files into `fio_summary.json` grouped by block size and `rw` with IOPS, BW, avg and p99 latencies.
Kubernetes VM and DataVolume templates `benchmark_runner/common/template_operations/templates/winfio/internal_data/windows_dv_template.yaml`, `benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml`, `benchmark_runner/common/template_operations/templates/winfio/winfio_data_template.yaml`	CDI DataVolume and KubeVirt VirtualMachine Jinja templates for Windows VMs (cloud-init for Windows SSH keys, Ceph RBD block PVC config) and top-level template metadata with run profiles.
WinfioVM workload orchestration class `benchmark_runner/workloads/winfio_vm.py`	Python class extending BootstormVM that creates/prepares/starts VMs, SCPs scripts (with verify/retries), runs FIO remotely via PowerShell, polls and retrieves `fio_summary.json`, normalizes records, and uploads results to Elasticsearch.
Workload registration and configuration `benchmark_runner/main/environment_variables.py`, `benchmark_runner/common/virtctl/virtctl.py`, `jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile`, `pyproject.toml`, `README.md`, `docs/source/podman.md`, `benchmark_runner/grafana/perf/jsonnet/main.libsonnet`, `tests/unittest/benchmark_runner/common/template_operations/golden_files.py`	Registers `winfio_vm` and maps its namespace; adjusts Virtctl SSH polling to use `OC.DELAY`; sets `WINDOWS_URL` from `WINDOWS_SERVER_MSSQL_2025_URL` for `winfio_vm`/`winfio_vm_scale` in Jenkins; includes winfio template assets in package-data; updates workload docs, Grafana templating, and the golden-files helper workload list.
Distribution packaging `pyproject.toml`	Includes `winfio` templates and PowerShell scripts in package-data.

Test Golden Files and Validation

Layer / File(s)	Summary
Chaos/Func/Perf/Release/Test CI golden files & helper `tests/unittest/benchmark_runner/common/template_operations/golden_files/{chaos,func,perf,release,test}_ci_winfio_vm_ODF_PVC_{False,True}/{namespace,windows_dv,winfio_vm}.yaml`, `tests/unittest/benchmark_runner/common/template_operations/golden_files.py`	Golden-file fixtures validating generated Namespace, DataVolume, and VirtualMachine manifests across multiple CI run types and ODF PVC variants; minor formatting change to golden-files helper workload list.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

redhat-performance/benchmark-runner#1224: Both PRs modify Jenkins Windows URL/scale handling used by Windows workloads.
redhat-performance/benchmark-runner#1214: Overlaps on Virtctl SSH/wait logic changes in benchmark_runner/common/virtctl/virtctl.py.

Suggested reviewers

RobertKrawitz
arpsharm

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title 'Add Windows FIO VM workload support' directly and accurately reflects the primary change: introducing a new winfio_vm workload for Windows FIO storage benchmarking on KubeVirt VMs.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch winfio-workload

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml (1)
1-20: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Differentiate ODF_PVC=True from ODF_PVC=False in this fixture set.

Lines 1-20 are effectively the same as the false variant, so this func_ci golden does not exercise an ODF-specific edge path for DataVolume rendering.

As per coding guidelines, "Ensure test coverage for edge cases. Verify mocks and fixtures are properly scoped."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml`
around lines 1 - 20, The fixture duplicates the non-ODF variant and doesn't
exercise ODF-specific rendering; update the DataVolume fixture (DataVolume kind
with metadata.name "windows-clone-dv" and spec.pvc fields) to reflect
ODF_PVC=True by changing the pvc.storageClassName to an ODF storage class used
in our tests, adjust pvc accessModes/volumeMode to the ODF-expected values if
different (e.g., ensure ReadWriteOnce/Filesystem or the cluster's ODF mode), and
make the metadata.name unique (e.g., append "-odf") or add an ODF-specific
annotation so the test vector triggers the ODF code path during rendering.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml`:
- Around line 69-70: The tpm field is misplaced inside an interface object (near
networkInterfaceMultiqueue) and must be moved to the domain devices section;
remove the tpm: { } entry from the interfaces[] block and add it as
spec.template.spec.domain.devices.tpm (i.e., ensure tpm is a sibling of
interfaces under devices), keeping any empty/object value intact so the VM
manifest validates against KubeVirt's VirtualMachine schema.

In `@benchmark_runner/common/virtctl/virtctl.py`:
- Around line 128-136: The sleep before checking SSH is currently hardcoded to
OC.DELAY inside _wait_for_virtctl_ssh which forces the same boot delay for all
callers; change the method signature of _wait_for_virtctl_ssh to accept a
configurable delay parameter (e.g., initial_delay or delay, defaulting to
OC.DELAY) and replace the unconditional time.sleep(OC.DELAY) and loop increments
to use that parameter; update callers that need the Windows-specific extra wait
to pass the larger delay while leaving others to use the default, and keep
references to vm_name, namespace, key_path, and username unchanged so callers
can locate and update call sites easily.

In `@benchmark_runner/workloads/winfio_vm.py`:
- Around line 57-63: The helper _create_vm currently calls save_error_logs()
inside its try/except which runs in child worker processes; remove the
save_error_logs() call from _create_vm’s except block and any other phase helper
invoked by _run_parallel_phases (the other occurrences near the same helpers),
and let the top-level run() exception handler perform centralized error-log
collection instead; keep the except re-raising the original exception (raise
err) so the main process can catch it and call save_error_logs() there.
- Around line 65-83: The _prepare_vm and _run_fio methods currently swallow all
exceptions (catch Exception and only logger.warning), which allows the pipeline
to continue with missing data; modify these methods so failures are propagated:
in _prepare_vm (method name _prepare_vm and calls to _get_vm_name,
_virtctl.start_vm_sync, _virtctl._wait_for_virtctl_ssh, and
_scp_to_vm_with_verify) and in _run_fio replace the broad except Exception
handlers with either (a) catch specific exceptions you expect, log an error with
context, and re-raise the exception, or (b) log the error and raise a new
RuntimeError including the original exception as the cause, or (c) return a
clear failure value (e.g., False) and ensure the caller checks it and fails the
workload; ensure any logging uses logger.error and includes the vm identifier
and exception details so the pipeline fails fast instead of proceeding silently.

In `@jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile`:
- Around line 159-161: The mapping for the "winfio_vm" workload sets WINDOWS_URL
to WINDOWS_SERVER_MSSQL_2025_URL, causing the wrong Windows image to be used;
update the conditional in the Jenkinsfile so that when workload == "winfio_vm"
it assigns WINDOWS_URL = WINDOWS_SERVER_2025_URL instead of
WINDOWS_SERVER_MSSQL_2025_URL (keep the if (workload == "winfio_vm") block and
only change the assigned constant name).

In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml`:
- Around line 47-53: Move the misplaced tpm entry out of the network interface
object: remove the tpm: { } key under the interfaces list (where name: nic-0,
model: virtio, masquerade etc. are defined) and add it under the VM spec
devices.tpm block (create or update devices.tpm) so TPM is defined at top-level
devices.tpm rather than inside interfaces[]. Ensure the final YAML uses devices:
{ tpm: ... } and no tpm key remains inside interfaces.

In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml`:
- Around line 47-53: The TPM key is incorrectly nested under the network
interface entry (devices.interfaces[] — e.g., the interface named "nic-0"); move
the tpm mapping out of the interface object into the device-level location
spec.template.spec.domain.devices.tpm so the TPM is defined under devices.tpm
(not devices.interfaces[].tpm), preserving its contents and removing the tpm key
from the interface.

In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/winfio_vm.yaml`:
- Around line 47-52: The YAML places device-level fields under an interface
item; move networkInterfaceMultiqueue and tpm out of the interface entry and
into the domain devices section (spec.template.spec.domain.devices) instead of
under the interfaces list entry (the nic-0 interface). Ensure
networkInterfaceMultiqueue is defined at the same level as other device configs
under domain.devices and add a tpm entry under domain.devices (not
per-interface) so KubeVirt validation recognizes them.

In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml`:
- Around line 47-53: The fields networkInterfaceMultiqueue and tpm are
incorrectly nested under interfaces[]; move them up to
spec.template.spec.domain.devices so they live alongside other device settings.
Edit the resource template where interfaces: - name: nic-0 ... currently
contains networkInterfaceMultiqueue: true and tpm: { } — remove those two keys
from the interfaces entry and add them under spec.template.spec.domain.devices
(e.g., devices: { networkInterfaceMultiqueue: true, tpm: {}}) so the KubeVirt
schema is satisfied.

---

Outside diff comments:
In
`@tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml`:
- Around line 1-20: The fixture duplicates the non-ODF variant and doesn't
exercise ODF-specific rendering; update the DataVolume fixture (DataVolume kind
with metadata.name "windows-clone-dv" and spec.pvc fields) to reflect
ODF_PVC=True by changing the pvc.storageClassName to an ODF storage class used
in our tests, adjust pvc accessModes/volumeMode to the ODF-expected values if
different (e.g., ensure ReadWriteOnce/Filesystem or the cluster's ODF mode), and
make the metadata.name unique (e.g., append "-odf") or add an ODF-specific
annotation so the test vector triggers the ODF code path during rendering.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: e5558b97-5b22-460a-9dab-b6a6797b2173

📥 Commits

Reviewing files that changed from the base of the PR and between 7fdb402 and d325b37.

📒 Files selected for processing (43)

benchmark_runner/common/template_operations/templates/winfio/__init__.py
benchmark_runner/common/template_operations/templates/winfio/internal_data/__init__.py
benchmark_runner/common/template_operations/templates/winfio/internal_data/parse_fio_results_template.ps1
benchmark_runner/common/template_operations/templates/winfio/internal_data/run_fio_benchmark_template.ps1
benchmark_runner/common/template_operations/templates/winfio/internal_data/windows_dv_template.yaml
benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml
benchmark_runner/common/template_operations/templates/winfio/winfio_data_template.yaml
benchmark_runner/common/virtctl/virtctl.py
benchmark_runner/main/environment_variables.py
benchmark_runner/workloads/winfio_vm.py
jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile
pyproject.toml
tests/unittest/benchmark_runner/common/template_operations/golden_files.py
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/chaos_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_True/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/namespace.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/windows_dv.yaml
tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_True/winfio_vm.yaml

coderabbitai · 2026-05-19T13:46:47Z

+            networkInterfaceMultiqueue: true
+            tpm: { }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In KubeVirt VirtualMachine schema, where should TPM be defined: under spec.template.spec.domain.devices.tpm or inside spec.template.spec.domain.devices.interfaces[]?

💡 Result:

TPM in the KubeVirt VirtualMachine should be defined at spec.template.spec.domain.devices.tpm (not inside spec.template.spec.domain.devices.interfaces[]). For example, KubeVirt’s vTPM device configuration is shown under domain.devices.tpm in the VM template spec, where spec.template.spec.domain.devices.tpm specifies the vTPM device to add to the VM; optional fields like spec.template.spec.domain.devices.tpm.persistent control TPM state persistence after shutdown.[1]

Citations:

1: https://docs.okd.io/4.14/virt/virtual_machines/virt-using-vtpm-devices.html

🏁 Script executed:

find . -path "*benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml"

Repository: redhat-performance/benchmark-runner

Length of output: 180

🏁 Script executed:

cat -n ./benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml

Repository: redhat-performance/benchmark-runner

Length of output: 5799

tpm is incorrectly nested under interfaces[]; it must be a direct child of devices.

According to the KubeVirt VirtualMachine schema, tpm should be defined at spec.template.spec.domain.devices.tpm, not inside an interface object. This placement will cause VM manifest validation to fail.

Suggested fix

devices: blockMultiQueue: false + tpm: {} disks: @@ interfaces: - masquerade: {} model: virtio name: nic-0 networkInterfaceMultiqueue: true - tpm: { }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark_runner/common/template_operations/templates/winfio/internal_data/winfio_vm_template.yaml` around lines 69 - 70, The tpm field is misplaced inside an interface object (near networkInterfaceMultiqueue) and must be moved to the domain devices section; remove the tpm: { } entry from the interfaces[] block and add it as spec.template.spec.domain.devices.tpm (i.e., ensure tpm is a sibling of interfaces under devices), keeping any empty/object value intact so the VM manifest validates against KubeVirt's VirtualMachine schema.

coderabbitai · 2026-05-19T13:46:47Z

+        # wait before checking for SSH to avoid boot issues
+        time.sleep(OC.DELAY)
+        for i in range(0, timeout, OC.DELAY):
            if self._ssh_ready(vm_name=vm_name, namespace=namespace, key_path=key_path, username=username):
                logger.debug(f'SSH ready on {vm_name} after {i}s')
                return True
            if i > 0 and i % OC.DELAY == 0:
                logger.debug(f'Waiting for SSH on {vm_name}... ({i}s)')
-            time.sleep(self.SLEEP_TIME)
+            time.sleep(OC.DELAY)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The new boot delay is hardcoded for all workloads instead of being caller-configurable.

This adds unconditional latency to every _wait_for_virtctl_ssh call and doesn’t match the goal of adding a Windows-specific delay knob.

💡 Suggested fix

- def _wait_for_virtctl_ssh(self, vm_name: str, namespace: str = '', key_path: str = '', username: str = '', timeout: int = OC.SHORT_TIMEOUT) -> bool: + def _wait_for_virtctl_ssh(self, vm_name: str, namespace: str = '', key_path: str = '', username: str = '', + timeout: int = OC.SHORT_TIMEOUT, delay: int = 0) -> bool: """Wait for SSH to be accessible on a VM via virtctl.""" - # wait before checking for SSH to avoid boot issues - time.sleep(OC.DELAY) + if delay > 0: + time.sleep(delay) for i in range(0, timeout, OC.DELAY):

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark_runner/common/virtctl/virtctl.py` around lines 128 - 136, The sleep before checking SSH is currently hardcoded to OC.DELAY inside _wait_for_virtctl_ssh which forces the same boot delay for all callers; change the method signature of _wait_for_virtctl_ssh to accept a configurable delay parameter (e.g., initial_delay or delay, defaulting to OC.DELAY) and replace the unconditional time.sleep(OC.DELAY) and loop increments to use that parameter; update callers that need the Windows-specific extra wait to pass the larger delay while leaving others to use the default, and keep references to vm_name, namespace, key_path, and username unchanged so callers can locate and update call sites easily.

[Claude Code] The initial time.sleep(OC.DELAY) before the loop is required — without it, Windows VMs get stuck because UEFI/OVMF boot takes time before SSH is reachable. Removing it causes the SSH poll to start too early and hang. The loop interval using OC.DELAY may be relaxed to self.SLEEP_TIME in a follow-up after testing.

@ebattat, understood — the initial time.sleep(OC.DELAY) is intentional to give Windows UEFI/OVMF time to complete its boot sequence before SSH probing begins, and removing it causes early polls that stall. That makes sense for the winfio_vm use case.

The remaining concern is that this delay now applies to all callers of _wait_for_virtctl_ssh, including non-Windows workloads that don't need the extra wait. A delay: int = 0 default parameter would let Windows callers opt-in while keeping existing callers unaffected — but I understand if you'd prefer to keep it simple for now and revisit the signature later.

Regarding the loop interval, noted that relaxing back to self.SLEEP_TIME is on your radar for a follow-up after testing. Would you like me to open a GitHub issue to track that?

(◕‿◕)っ🐇

✏️ Learnings added

Learnt from: ebattat Repo: redhat-performance/benchmark-runner PR: 1229 File: benchmark_runner/common/virtctl/virtctl.py:128-136 Timestamp: 2026-05-19T15:01:02.688Z Learning: In `benchmark_runner/common/virtctl/virtctl.py`, the initial `time.sleep(OC.DELAY)` added before the SSH polling loop in `_wait_for_virtctl_ssh` is intentional and required for Windows VM workloads (`winfio_vm`). Windows VMs using UEFI/OVMF take additional time to boot before SSH becomes reachable; removing this sleep causes the SSH poll to start too early and hang. The loop interval change from `self.SLEEP_TIME` to `OC.DELAY` may be relaxed back to `self.SLEEP_TIME` in a future follow-up after testing.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

You right, I will fix it

@arpsharm,
I have tested it several times with time.sleep(self.SLEEP_TIME), but it got stuck when running scale Win FIO VMs. Therefore, I prefer to keep the longer sleep interval of 30 seconds instead of 3 seconds.
Currently, we are not using this method for bootstorm workload, so the sleep time is not an issue.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

coderabbitai · 2026-05-19T13:46:47Z

+    def _create_vm(self, vm_num: str):
+        try:
+            self._oc.create_async(yaml=self._get_vm_yaml(vm_num))
+            self._oc.wait_for_vm_status(vm_name=self._get_vm_name(vm_num), status=VMStatus.Stopped)
+        except Exception as err:
+            self.save_error_logs()
+            raise err


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Avoid save_error_logs() inside phase helpers that run in parallel worker processes.

These helper methods are executed by _run_parallel_phases; calling save_error_logs() there risks the known multiprocessing SSL crash path. Let the top-level run() exception handler do centralized error-log handling.

💡 Suggested fix

def _create_vm(self, vm_num: str): try: self._oc.create_async(yaml=self._get_vm_yaml(vm_num)) self._oc.wait_for_vm_status(vm_name=self._get_vm_name(vm_num), status=VMStatus.Stopped) - except Exception as err: - self.save_error_logs() - raise err + except Exception: + raise @@ - except Exception as err: - self.save_error_logs() - raise err + except Exception: + raise @@ - except Exception as err: - self.save_error_logs() - raise err + except Exception: + raise

Based on learnings: "avoid calling _save_error_logs() from within multiprocessing child processes ... move/limit _save_error_logs() so it runs only in the main process."

Also applies to: 160-162, 168-170

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark_runner/workloads/winfio_vm.py` around lines 57 - 63, The helper _create_vm currently calls save_error_logs() inside its try/except which runs in child worker processes; remove the save_error_logs() call from _create_vm’s except block and any other phase helper invoked by _run_parallel_phases (the other occurrences near the same helpers), and let the top-level run() exception handler perform centralized error-log collection instead; keep the except re-raising the original exception (raise err) so the main process can catch it and call save_error_logs() there.

coderabbitai · 2026-05-19T13:46:47Z

+    def _prepare_vm(self, vm_num: str):
+        try:
+            vm_name = self._get_vm_name(vm_num)
+            self._virtctl.start_vm_sync(vm_name=vm_name)
+            if not self._virtctl._wait_for_virtctl_ssh(vm_name=vm_name, namespace=self.__namespace,
+                                                       key_path=self.__ssh_key_path, username=self.__username,
+                                                       timeout=OC.SHORT_TIMEOUT):
+                logger.warning(f'SSH never became ready on VM {vm_name}, skipping')
+                return
+
+            scripts = ['run_fio_benchmark.ps1', 'parse_fio_results.ps1']
+            for script in scripts:
+                self._scp_to_vm_with_verify(vm_name=vm_name,
+                                            local_path=os.path.join(self._run_artifacts_path, script),
+                                            remote_path=f'{self.__remote_dir}/{script}')
+        except Exception as err:
+            logger.warning(f'Failed to prepare VM {self._get_vm_name(vm_num)}: {err}')
+
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

_prepare_vm / _run_fio swallow hard failures, so runs can silently succeed with missing data.

Both methods catch exceptions and only log warnings; the pipeline then proceeds to collect/upload, which can end in partial/no results without failing the workload.

💡 Suggested fix

def _prepare_vm(self, vm_num: str): try: @@ - except Exception as err: - logger.warning(f'Failed to prepare VM {self._get_vm_name(vm_num)}: {err}') + except Exception as err: + logger.warning(f'Failed to prepare VM {self._get_vm_name(vm_num)}: {err}') + raise @@ def _run_fio(self, vm_num: str): try: @@ - except Exception as err: - logger.warning(f'FIO failed on VM {self._get_vm_name(vm_num)}: {err}') + except Exception as err: + logger.warning(f'FIO failed on VM {self._get_vm_name(vm_num)}: {err}') + raise

Also applies to: 84-94

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark_runner/workloads/winfio_vm.py` around lines 65 - 83, The _prepare_vm and _run_fio methods currently swallow all exceptions (catch Exception and only logger.warning), which allows the pipeline to continue with missing data; modify these methods so failures are propagated: in _prepare_vm (method name _prepare_vm and calls to _get_vm_name, _virtctl.start_vm_sync, _virtctl._wait_for_virtctl_ssh, and _scp_to_vm_with_verify) and in _run_fio replace the broad except Exception handlers with either (a) catch specific exceptions you expect, log an error with context, and re-raise the exception, or (b) log the error and raise a new RuntimeError including the original exception as the cause, or (c) return a clear failure value (e.g., False) and ensure the caller checks it and fails the workload; ensure any logging uses logger.error and includes the vm identifier and exception details so the pipeline fails fast instead of proceeding silently.

coderabbitai · 2026-05-19T13:46:47Z

+                                    if (workload == "winfio_vm") {
+                                        WINDOWS_URL = WINDOWS_SERVER_MSSQL_2025_URL
+                                    }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the non-MSSQL Windows image URL for winfio_vm

On Line 160, winfio_vm is mapped to WINDOWS_SERVER_MSSQL_2025_URL. For this workload, it should use WINDOWS_SERVER_2025_URL; otherwise scale/perf runs may boot the wrong base image.

Proposed fix

if (workload == "winfio_vm") { - WINDOWS_URL = WINDOWS_SERVER_MSSQL_2025_URL + WINDOWS_URL = WINDOWS_SERVER_2025_URL }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (workload == "winfio_vm") {

WINDOWS_URL = WINDOWS_SERVER_MSSQL_2025_URL

}

if (workload == "winfio_vm") {

WINDOWS_URL = WINDOWS_SERVER_2025_URL

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile` around lines 159 - 161, The mapping for the "winfio_vm" workload sets WINDOWS_URL to WINDOWS_SERVER_MSSQL_2025_URL, causing the wrong Windows image to be used; update the conditional in the Jenkinsfile so that when workload == "winfio_vm" it assigns WINDOWS_URL = WINDOWS_SERVER_2025_URL instead of WINDOWS_SERVER_MSSQL_2025_URL (keep the if (workload == "winfio_vm") block and only change the assigned constant name).

coderabbitai · 2026-05-19T13:46:47Z

+          interfaces:
+          - masquerade: {}
+            model: virtio
+            name: nic-0
+            networkInterfaceMultiqueue: true
+            tpm: { }
+        features:


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In KubeVirt VirtualMachine API (kubevirt.io/v1), is spec.template.spec.domain.devices.interfaces[].tpmvalid, or must TPM be defined atspec.template.spec.domain.devices.tpm?

💡 Result:

In the KubeVirt VirtualMachine API (kubevirt.io/v1), TPM must be defined at spec.template.spec.domain.devices.tpm (not at spec.template.spec.domain.devices.interfaces[].tpm). The devices.interfaces field is for network interface configuration, while TPM is a top-level device under devices.tpm. KubeVirt user docs show a vTPM example using spec.template.spec.domain.devices.tpm (and nested properties like persistent) under domain.devices, not under domain.devices.interfaces[]. For example, OKD’s documentation explicitly states that spec.template.spec.domain.devices.tpm specifies the vTPM device to add to the VM [1] and provides an example manifest with domain.devices.tpm [2]. Separately, KubeVirt’s interface/network docs describe network interfaces as configured in spec.domain.devices.interfaces and list network-interface-related fields, with no mention of TPM belonging there [3]. Additionally, KubeVirt’s own discussion of the TPM API refers to spec.domain.devices.tpm as the VirtualMachineInstanceSpec attribute that can be implicitly disabled via nil (and later improved with an Enabled attribute), again indicating TPM is modeled under devices.tpm rather than inside interfaces[] [4][5]. So spec.template.spec.domain.devices.interfaces[].tpm is not a valid KubeVirt v1 API path; the correct field is spec.template.spec.domain.devices.tpm. Answer: must use spec.template.spec.domain.devices.tpm.

Citations:

1: No way of overriding the preference value of PreferredTPM from within a VirtualMachine kubevirt/kubevirt#14315

2: https://kubevirt.io/api-reference/v0.5.0/definitions.html

3: https://kubevirt.io/user-guide/compute/persistent_tpm_and_uefi_state/

4: https://kubevirt.io/user-guide/network/interfaces_and_networks/

5: https://kubevirt.io/user-guide/user_workloads/templates/

🏁 Script executed:

fd "winfio_vm.yaml" --type f

Repository: redhat-performance/benchmark-runner

Length of output: 1266

🏁 Script executed:

cat -n tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml | sed -n '40,60p'

Repository: redhat-performance/benchmark-runner

Length of output: 774

🏁 Script executed:

cat -n tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml | sed -n '30,65p'

Repository: redhat-performance/benchmark-runner

Length of output: 1229

Move TPM out of the network interface object.

Line 52 places tpm under interfaces[], which is not a valid interface field in KubeVirt VM spec and will cause rendered manifests to fail validation. TPM must be defined at the top-level devices.tpm, not within the interfaces array.

Proposed fix

devices: blockMultiQueue: false + tpm: {} disks: @@ - masquerade: {} model: virtio name: nic-0 networkInterfaceMultiqueue: true - tpm: { }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/benchmark_runner/common/template_operations/golden_files/func_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml` around lines 47 - 53, Move the misplaced tpm entry out of the network interface object: remove the tpm: { } key under the interfaces list (where name: nic-0, model: virtio, masquerade etc. are defined) and add it under the VM spec devices.tpm block (create or update devices.tpm) so TPM is defined at top-level devices.tpm rather than inside interfaces[]. Ensure the final YAML uses devices: { tpm: ... } and no tpm key remains inside interfaces.

coderabbitai · 2026-05-19T13:46:47Z

+          interfaces:
+          - masquerade: {}
+            model: virtio
+            name: nic-0
+            networkInterfaceMultiqueue: true
+            tpm: { }
+        features:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In KubeVirt kubevirt.io/v1 VirtualMachine schema, is spec.template.spec.domain.devices.interfaces[].tpmvalid, or must TPM be set atspec.template.spec.domain.devices.tpm?

💡 Result:

In KubeVirt’s kubevirt.io/v1 VirtualMachine API, TPM is configured under spec.template.spec.domain.devices.tpm (as a single tpm device), not under spec.template.spec.domain.devices.interfaces[].tpm. Authoritative docs show the vTPM device placed at spec.template.spec.domain.devices.tpm (including fields like persistent) in example VM manifests [1][2]. Separately, KubeVirt’s TPM API discussion refers to spec.domain.devices.tpm specifically (and how it can’t previously be disabled by user-provided nil) [3]. So, spec.template.spec.domain.devices.interfaces[].tpm is not the valid schema location for TPM; TPM must be set at spec.template.spec.domain.devices.tpm.

Citations:

1: https://docs.okd.io/4.14/virt/virtual_machines/virt-using-vtpm-devices.html

2: https://docs.okd.io/4.21/virt/managing_vms/virt-using-vtpm-devices.html

3: No way of overriding the preference value of PreferredTPM from within a VirtualMachine kubevirt/kubevirt#14315

Move tpm to the device level, not inside the NIC interface.

spec.template.spec.domain.devices.tpm is a device-level field in the KubeVirt kubevirt.io/v1 schema. At line 52 it is nested under an interface entry, which violates the valid schema placement. TPM must be configured at spec.template.spec.domain.devices.tpm, not under devices.interfaces[].tpm.

Proposed fix

devices: blockMultiQueue: false disks: @@ interfaces: - masquerade: {} model: virtio name: nic-0 networkInterfaceMultiqueue: true - tpm: { } + tpm: {}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/benchmark_runner/common/template_operations/golden_files/perf_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml` around lines 47 - 53, The TPM key is incorrectly nested under the network interface entry (devices.interfaces[] — e.g., the interface named "nic-0"); move the tpm mapping out of the interface object into the device-level location spec.template.spec.domain.devices.tpm so the TPM is defined under devices.tpm (not devices.interfaces[].tpm), preserving its contents and removing the tpm key from the interface.

coderabbitai · 2026-05-19T13:46:47Z

+          interfaces:
+          - masquerade: {}
+            model: virtio
+            name: nic-0
+            networkInterfaceMultiqueue: true
+            tpm: { }


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In KubeVirt kubevirt.io/v1 VirtualMachine schema, are networkInterfaceMultiqueueandtpmdefined underspec.template.spec.domain.devices(not per-interface underinterfaces[])?

💡 Result:

Yes, both networkInterfaceMultiqueue and tpm are defined under spec.template.spec.domain.devices in the KubeVirt VirtualMachine schema, rather than being configured per-interface within the interfaces[] list [1][2]. 1. networkInterfaceMultiqueue: This is a boolean field located directly under spec.template.spec.domain.devices. When set to true, it enables multi-queue functionality for all compatible virtio network interfaces [1][3]. It is not an interface-specific setting [1]. 2. tpm: This is a device object defined under spec.template.spec.domain.devices.tpm [2]. It is a global device configuration for the virtual machine, not a property of individual network interfaces [2].

Citations:

1: https://kubevirt.io/user-guide/network/interfaces_and_networks/

2: https://docs.okd.io/4.21/virt/managing_vms/virt-using-vtpm-devices.html

3: https://kubevirt.io/user-guide/user_workloads/templates/

Move device-level fields out of the interface entry.

Lines 51–52 (networkInterfaceMultiqueue and tpm) are placed under an interface item, but in KubeVirt both are device-level fields that belong under spec.template.spec.domain.devices, not per-interface. networkInterfaceMultiqueue applies globally to all compatible interfaces, and tpm is a global device configuration. This schema placement will cause validation failures or silently ignored fields.

Proposed fix

devices: blockMultiQueue: false + networkInterfaceMultiqueue: true + tpm: {} disks: - disk: bus: virtio name: winfio-vm-root-disk-deadbeef dedicatedIOThread: false - disk: bus: virtio name: cloudinitdisk interfaces: - masquerade: {} model: virtio name: nic-0 - networkInterfaceMultiqueue: true - tpm: { }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/benchmark_runner/common/template_operations/golden_files/release_winfio_vm_ODF_PVC_False/winfio_vm.yaml` around lines 47 - 52, The YAML places device-level fields under an interface item; move networkInterfaceMultiqueue and tpm out of the interface entry and into the domain devices section (spec.template.spec.domain.devices) instead of under the interfaces list entry (the nic-0 interface). Ensure networkInterfaceMultiqueue is defined at the same level as other device configs under domain.devices and add a tpm entry under domain.devices (not per-interface) so KubeVirt validation recognizes them.

coderabbitai · 2026-05-19T13:46:47Z

+          interfaces:
+          - masquerade: {}
+            model: virtio
+            name: nic-0
+            networkInterfaceMultiqueue: true
+            tpm: { }
+        features:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In KubeVirt v1 VirtualMachine schema, are networkInterfaceMultiqueueandtpmvalid underspec.template.spec.domain.devices.interfaces[], or must they be under spec.template.spec.domain.devices?

💡 Result:

Yes, both are valid under spec.template.spec.domain.devices (not under domain.devices.interfaces[]): 1) spec.template.spec.domain.devices.networkInterfaceMultiqueue The KubeVirt user guide shows networkInterfaceMultiqueue configured under spec.template.spec.domain.devices (alongside other “devices” knobs), e.g. spec: template: spec: domain: devices: networkInterfaceMultiqueue: true. [1] Also, the same documentation explains that multi-queue affects “virtio” interfaces configured by name in spec.domain.devices.interfaces, which are separate from the devices.networkInterfaceMultiqueue toggle itself. [2] So networkInterfaceMultiqueue is not meant to be an element field inside spec.template.spec.domain.devices.interfaces[]. [1][2] 2) spec.template.spec.domain.devices.tpm KubeVirt’s vTPM docs show tpm under spec.template.spec.domain.devices: domain: devices: tpm: persistent: true. [3] So tpm is not an entry in spec.template.spec.domain.devices.interfaces[]; it belongs directly under spec.template.spec.domain.devices. [3] Conclusion: networkInterfaceMultiqueue and tpm must be under spec.template.spec.domain.devices; they are not under spec.template.spec.domain.devices.interfaces[].

Citations:

1: https://fabiand.github.io/user-guide/docs/latest/creating-virtual-machines/interfaces-and-networks.html

2: https://kubevirt.io/user-guide/network/interfaces_and_networks/

3: https://docs.okd.io/4.21/virt/managing_vms/virt-using-vtpm-devices.html

🏁 Script executed:

# Read the specific file to verify the issue at lines 47-53 cat -n "tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml" | sed -n '40,60p'

Repository: redhat-performance/benchmark-runner

Length of output: 774

Move networkInterfaceMultiqueue and tpm to spec.template.spec.domain.devices level.

Lines 51–52 place these device-level fields inside interfaces[]; KubeVirt schema requires them at the devices level. This can cause schema validation failure or silently drop intended configuration.

Proposed fix

devices: blockMultiQueue: false + networkInterfaceMultiqueue: true + tpm: {} disks: - disk: bus: virtio name: winfio-vm-root-disk-deadbeef dedicatedIOThread: false - disk: bus: virtio name: cloudinitdisk interfaces: - masquerade: {} model: virtio name: nic-0 - networkInterfaceMultiqueue: true - tpm: { }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/benchmark_runner/common/template_operations/golden_files/test_ci_winfio_vm_ODF_PVC_False/winfio_vm.yaml` around lines 47 - 53, The fields networkInterfaceMultiqueue and tpm are incorrectly nested under interfaces[]; move them up to spec.template.spec.domain.devices so they live alongside other device settings. Edit the resource template where interfaces: - name: nic-0 ... currently contains networkInterfaceMultiqueue: true and tpm: { } — remove those two keys from the interfaces entry and add them under spec.template.spec.domain.devices (e.g., devices: { networkInterfaceMultiqueue: true, tpm: {}}) so the KubeVirt schema is satisfied.

arpsharm

SSH wait loop uses OC.DELAY (30s) step instead of SLEEP_TIME (3s). If SSH is ready at 5s it won't be checked until 30s. _wait_for_virtctl_ssh_password uses SLEEP_TIME — should be consistent.

arpsharm · 2026-05-20T15:13:03Z

/approve

arpsharm · 2026-05-25T19:31:03Z

/test ?

arpsharm · 2026-05-25T19:32:24Z

/test test-step-sysbench-pod

arpsharm · 2026-05-26T05:12:32Z

/test test-step-uperf-vm

coderabbitai

♻️ Duplicate comments (1)

jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile (1)

134-137: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Map winfio_vm/winfio_vm_scale to the non-MSSQL Windows 2025 image.

Both branches currently set WINDOWS_URL to WINDOWS_SERVER_MSSQL_2025_URL, which selects the wrong base image for winfio workloads.

Proposed fix

                                     if (workload == "winfio_vm_scale") {
                                         SCALE = WINFIO_SCALE
-                                        WINDOWS_URL = WINDOWS_SERVER_MSSQL_2025_URL
+                                        WINDOWS_URL = WINDOWS_SERVER_2025_URL
                                     }
@@
                                     if (workload == "winfio_vm") {
-                                        WINDOWS_URL = WINDOWS_SERVER_MSSQL_2025_URL
+                                        WINDOWS_URL = WINDOWS_SERVER_2025_URL
                                     }

Also applies to: 168-170

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile` around lines 134 -
137, The winfio workload branches incorrectly assign WINDOWS_URL to
WINDOWS_SERVER_MSSQL_2025_URL; update both branches where workload ==
"winfio_vm_scale" and workload == "winfio_vm" (also the identical block around
lines 168-170) to set WINDOWS_URL to the non-MSSQL image constant
WINDOWS_SERVER_2025_URL while keeping SCALE = WINFIO_SCALE (or appropriate SCALE
assignment) unchanged so winfio uses the correct Windows 2025 image.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile`:
- Around line 134-137: The winfio workload branches incorrectly assign
WINDOWS_URL to WINDOWS_SERVER_MSSQL_2025_URL; update both branches where
workload == "winfio_vm_scale" and workload == "winfio_vm" (also the identical
block around lines 168-170) to set WINDOWS_URL to the non-MSSQL image constant
WINDOWS_SERVER_2025_URL while keeping SCALE = WINFIO_SCALE (or appropriate SCALE
assignment) unchanged so winfio uses the correct Windows 2025 image.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a87adeca-4776-4e3c-9cab-4dc5a140281e

📥 Commits

Reviewing files that changed from the base of the PR and between 709324a and 8b85330.

📒 Files selected for processing (5)

benchmark_runner/grafana/perf/jsonnet/main.libsonnet
benchmark_runner/main/environment_variables.py
jenkins/PerfCI/03_PerfCI_Workloads_Deployment/Jenkinsfile
pyproject.toml
tests/unittest/benchmark_runner/common/template_operations/golden_files.py

✅ Files skipped from review due to trivial changes (1)

pyproject.toml

arpsharm · 2026-05-26T08:22:27Z

/test ?

arpsharm · 2026-05-26T08:22:55Z

/test test-step-fio-pod

Assisted-by: Claude Code

openshift-ci · 2026-05-26T13:40:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arpsharm, ebattat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [ebattat]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot requested a review from RobertKrawitz May 19, 2026 13:37

openshift-ci Bot added the approved label May 19, 2026

ebattat self-assigned this May 19, 2026

ebattat added the enhancement New feature or request label May 19, 2026

ebattat added this to benchmark-runner May 19, 2026

github-project-automation Bot moved this to In progress in benchmark-runner May 19, 2026

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

arpsharm reviewed May 20, 2026

View reviewed changes

arpsharm approved these changes May 20, 2026

View reviewed changes

openshift-ci Bot assigned arpsharm May 20, 2026

openshift-ci Bot added lgtm needs-rebase and removed lgtm needs-rebase labels May 20, 2026

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

arpsharm approved these changes May 26, 2026

View reviewed changes

openshift-ci Bot added the lgtm label May 26, 2026

ebattat force-pushed the winfio-workload branch from 8b85330 to 6ae4a64 Compare May 26, 2026 12:40

openshift-ci Bot removed the lgtm label May 26, 2026

Add Windows FIO VM workload support

80ff2b9

Assisted-by: Claude Code

ebattat force-pushed the winfio-workload branch from 6ae4a64 to 80ff2b9 Compare May 26, 2026 12:42

arpsharm approved these changes May 26, 2026

View reviewed changes

openshift-ci Bot added the lgtm label May 26, 2026

ebattat merged commit c86835f into main May 27, 2026
10 of 16 checks passed

ebattat deleted the winfio-workload branch May 27, 2026 05:35

github-project-automation Bot moved this from In progress to Done in benchmark-runner May 27, 2026

Conversation

ebattat commented May 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ebattat May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

ebattat May 20, 2026

Choose a reason for hiding this comment

Uh oh!

ebattat May 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

arpsharm left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

arpsharm commented May 20, 2026

Uh oh!

arpsharm commented May 25, 2026

Uh oh!

arpsharm commented May 25, 2026

Uh oh!

arpsharm commented May 26, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

arpsharm commented May 26, 2026

Uh oh!

ebattat commented May 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 19, 2026 •

edited

Loading

coderabbitai Bot May 19, 2026 •

edited

Loading

arpsharm left a comment •

edited

Loading