pci_resource_assignment: add PCI resource assignment crate by jstarks · Pull Request #3570 · microsoft/openvmm

jstarks · 2026-05-27T00:23:04Z

When booting Linux directly (without UEFI), there is no firmware to enumerate the PCI bus, assign bus numbers, probe BAR sizes, or program BAR addresses and bridge windows. Devices behind PCIe root ports are invisible to the guest because their config space and MMIO regions are unconfigured. This crate fills that gap for Linux direct boot.

Longer term, this will also replace UEFI's PCI enumeration for all boot modes. Performing resource assignment in the VMM rather than in guest firmware lets us validate the PCI topology and MMIO layout before the guest ever runs, catching configuration errors (undersized apertures, impossible BAR placements, bus exhaustion) as clear VMM-side errors instead of mysterious guest boot failures.

The algorithm has two phases. Phase 1 walks the bus topology depth-first, assigning secondary and subordinate bus numbers to bridges and probing each device's BAR sizes. SR-IOV VF bus requirements are accounted for when setting subordinate bus numbers.
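A minimal sketch of the Phase 1 bus numbering, assuming a hypothetical in-memory `Node` tree rather than the crate's real config-space walk. `Node`, `assign_buses`, `reserve`, and `BusExhausted` are illustrative names that do not come from the PR. The sketch also assumes that VF buses are reserved directly after the bus the physical function sits on, which is a simplification rather than a statement of how the crate does it.

```rust
/// Hypothetical model of one PCI function (not a type from the crate).
#[derive(Debug, Default, Clone)]
pub struct Node {
    pub is_bridge: bool,
    /// Devices on the secondary bus (bridges only).
    pub children: Vec<Node>,
    /// Extra bus numbers an SR-IOV endpoint needs for its VFs.
    pub vf_buses: u8,
    /// Filled in by `assign_buses` for bridges.
    pub secondary: u8,
    pub subordinate: u8,
}

#[derive(Debug, PartialEq, Eq)]
pub struct BusExhausted;

/// Claims `count` bus numbers starting at `*next`, failing past `last_bus`.
fn reserve(next: &mut u16, count: u16, last_bus: u8) -> Result<(), BusExhausted> {
    if *next + count > u16::from(last_bus) + 1 {
        return Err(BusExhausted);
    }
    *next += count;
    Ok(())
}

/// Depth-first numbering of every bridge reachable from `devices`.
/// `next` is the lowest bus number that has not been handed out yet.
pub fn assign_buses(
    devices: &mut [Node],
    next: &mut u16,
    last_bus: u8,
) -> Result<(), BusExhausted> {
    // Assumption: VF routing IDs spill onto the buses right after this one,
    // so those numbers are reserved before any child bridge is numbered.
    let vf = devices
        .iter()
        .filter(|d| !d.is_bridge)
        .map(|d| u16::from(d.vf_buses))
        .max()
        .unwrap_or(0);
    reserve(next, vf, last_bus)?;

    for dev in devices.iter_mut().filter(|d| d.is_bridge) {
        reserve(next, 1, last_bus)?;
        dev.secondary = (*next - 1) as u8;
        assign_buses(&mut dev.children, next, last_bus)?;
        // The subordinate number is the highest bus used in the subtree,
        // including any buses reserved for VFs.
        dev.subordinate = (*next - 1) as u8;
    }
    Ok(())
}
```

Starting with `next = 1` on root bus 0, a bridge whose only child is an endpoint with `vf_buses = 2` receives secondary bus 1 and subordinate bus 3, so the next sibling bridge starts at bus 4.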

Phase 2 uses hierarchical bottom-up/top-down allocation. Each bridge computes the total aligned resource requirement of its subtree, then the host bridge carves its MMIO aperture among top-level devices, with each bridge subdividing its allocated range among children. This guarantees non-overlapping bridge windows and correct alignment at every level. BARs are split into two pools: non-prefetchable BARs go to low MMIO (32-bit bridge window), while 64-bit prefetchable BARs go to high MMIO (prefetchable bridge window, the only window capable of 64-bit addresses).
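The bottom-up sizing and top-down placement can be sketched for a single pool as follows; a real implementation would run something like this once per pool (low MMIO and high MMIO). Everything here is hypothetical and not taken from the crate: the `Node` enum, `measure`, `place`, `assign`, and `ApertureTooSmall` are illustrative names. The sketch assumes BAR sizes are non-zero powers of two and that bridge windows are 1 MiB granular.

```rust
use std::cmp::Reverse;
use std::ops::Range;

const MIB: u64 = 1 << 20; // bridge memory windows are 1 MiB granular

#[derive(Debug)]
pub enum Node {
    Bar { size: u64, addr: u64 },
    Bridge { children: Vec<Node>, base: u64, size: u64, align: u64 },
}

#[derive(Debug, PartialEq, Eq)]
pub struct ApertureTooSmall;

fn align_up(value: u64, align: u64) -> u64 {
    (value + align - 1) & !(align - 1)
}

impl Node {
    pub fn bar(size: u64) -> Node {
        Node::Bar { size, addr: 0 }
    }
    pub fn bridge(children: Vec<Node>) -> Node {
        Node::Bridge { children, base: 0, size: 0, align: MIB }
    }

    /// Returns (size, alignment); a BAR is naturally aligned to its size.
    fn requirement(&self) -> (u64, u64) {
        match self {
            Node::Bar { size, .. } => (*size, *size),
            Node::Bridge { size, align, .. } => (*size, *align),
        }
    }

    /// Bottom-up: compute each bridge's aligned subtree requirement.
    pub fn measure(&mut self) {
        if let Node::Bridge { children, size, align, .. } = self {
            children.iter_mut().for_each(Node::measure);
            // Largest alignment first keeps padding between children small.
            children.sort_by_key(|c| Reverse(c.requirement().1));
            let mut end = 0;
            *align = MIB;
            for child in children.iter() {
                let (child_size, child_align) = child.requirement();
                end = align_up(end, child_align) + child_size;
                *align = (*align).max(child_align);
            }
            *size = align_up(end, MIB);
        }
    }

    /// Top-down: subdivide the range starting at `start` among children.
    pub fn place(&mut self, start: u64) {
        match self {
            Node::Bar { addr, .. } => *addr = start,
            Node::Bridge { children, base, .. } => {
                *base = start;
                let mut cursor = start;
                for child in children.iter_mut() {
                    let (child_size, child_align) = child.requirement();
                    cursor = align_up(cursor, child_align);
                    child.place(cursor);
                    cursor += child_size;
                }
            }
        }
    }

    /// Collects (address, size) of every BAR in placement order.
    pub fn bars(&self, out: &mut Vec<(u64, u64)>) {
        match self {
            Node::Bar { size, addr } => out.push((*addr, *size)),
            Node::Bridge { children, .. } => children.iter().for_each(|c| c.bars(out)),
        }
    }
}

/// Sizes the tree, then places it inside the host bridge aperture.
pub fn assign(root: &mut Node, aperture: Range<u64>) -> Result<(), ApertureTooSmall> {
    root.measure();
    let (size, align) = root.requirement();
    let start = align_up(aperture.start, align);
    if start.checked_add(size).map_or(true, |end| end > aperture.end) {
        return Err(ApertureTooSmall);
    }
    root.place(start);
    Ok(())
}
```

Because every bridge base is aligned to the largest alignment in its subtree, the offsets computed during sizing stay valid during placement. That is why the windows in this sketch cannot overlap, and why an undersized aperture is reported before any address is programmed.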

The crate is wired into openvmm_core for Linux direct boot: after loading the kernel, state units are temporarily started with VPs held so that config space accesses route through the chipset's MMIO dispatch, the assignment runs, then state units stop again before the guest resumes.

Copilot

Pull request overview

Adds a new pci_resource_assignment crate that performs VMM-side PCI bus enumeration and BAR/bridge-window allocation for Linux direct boot (where no firmware is present to do it). The crate works purely through a PciConfigAccess trait. An ECAM-based implementation is wired into openvmm_core, which routes config-space cycles through Chipset MMIO dispatch and runs the assignment after kernel load (and after reset) for LoadMode::Linux VMs with non-empty PCIe host bridges.
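For illustration, BAR size probing through a config-access abstraction might look like the sketch below. The `ConfigSpace` trait is a hypothetical stand-in: the actual signature of the crate's `PciConfigAccess` trait is not shown in this PR page, and `probe_bar` and `BarInfo` are likewise illustrative names. The sketch follows the standard PCI procedure (write all ones, read back the writable mask, restore the original value). It leaves out disabling memory decode in the command register, which a real implementation would normally do first.

```rust
/// Hypothetical stand-in for a config-space accessor (not the crate's trait).
pub trait ConfigSpace {
    fn read_u32(&mut self, offset: u16) -> u32;
    fn write_u32(&mut self, offset: u16, value: u32);
}

#[derive(Debug, PartialEq, Eq)]
pub struct BarInfo {
    pub size: u64,
    pub is_64bit: bool,
    pub prefetchable: bool,
}

/// Probes the memory BAR at `offset` (0x10 is BAR0).
/// Returns `None` for I/O BARs and unimplemented BARs.
pub fn probe_bar(cfg: &mut impl ConfigSpace, offset: u16) -> Option<BarInfo> {
    let original = cfg.read_u32(offset);
    if original & 1 != 0 {
        return None; // bit 0 set: I/O space BAR, ignored in this sketch
    }
    let is_64bit = (original >> 1) & 0b11 == 0b10;
    let prefetchable = original & 0b1000 != 0;

    // Write all ones, read back which address bits are writable, restore.
    cfg.write_u32(offset, u32::MAX);
    let mut mask = u64::from(cfg.read_u32(offset) & !0xF);
    cfg.write_u32(offset, original);

    if is_64bit {
        // The next dword holds the upper 32 address bits.
        let original_hi = cfg.read_u32(offset + 4);
        cfg.write_u32(offset + 4, u32::MAX);
        mask |= u64::from(cfg.read_u32(offset + 4)) << 32;
        cfg.write_u32(offset + 4, original_hi);
    } else if mask != 0 {
        // Treat the absent upper half of a 32-bit BAR as all ones.
        mask |= 0xFFFF_FFFF_0000_0000;
    }

    if mask == 0 {
        return None; // no writable address bits: BAR not implemented
    }
    // The size is the two's complement of the writable-bit mask.
    Some(BarInfo { size: (!mask).wrapping_add(1), is_64bit, prefetchable })
}
```

A 64-bit BAR consumes two consecutive BAR slots, so a caller iterating over BAR0 through BAR5 would skip the following slot whenever `is_64bit` is set.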

Changes:

- New crate with two-phase algorithm: DFS bus enumeration + BAR-size probing, then bottom-up sizing / top-down address assignment, with SR-IOV VF bus reservation and 32-bit vs 64-bit-prefetchable pool splitting.
- New ecam_config_access module providing an ECAM PciConfigAccess impl via the chipset.
- LoadedVm::assign_pci_resources runs the assignment with state units started and VPs held; integrated into initial boot and reset paths.
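As background for the ECAM-based access mentioned above, the standard ECAM layout gives each function a 4 KiB configuration window at a fixed offset derived from its bus, device, and function numbers. The sketch below computes that offset; `ecam_offset` is a hypothetical helper, not a function from the PR. It assumes the region begins at bus 0, so handling of the base address and of a non-zero starting bus is left out.

```rust
/// Byte offset of a config-space register within an ECAM region that
/// starts at bus 0. Returns `None` if any component is out of range.
pub fn ecam_offset(bus: u8, device: u8, function: u8, register: u16) -> Option<u64> {
    // 32 devices per bus, 8 functions per device, 4 KiB per function.
    if device >= 32 || function >= 8 || register >= 4096 {
        return None;
    }
    Some(
        (u64::from(bus) << 20)
            | (u64::from(device) << 15)
            | (u64::from(function) << 12)
            | u64::from(register),
    )
}
```

For example, BAR0 (register 0x10) of bus 1, device 2, function 3 sits at offset 0x11_3010 from the start of the region.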

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| Cargo.toml | Adds `pci_resource_assignment` workspace member + dependency entry. |
| Cargo.lock | Lockfile entries for the new crate and `openvmm_core`. |
| vm/devices/pci/pci_resource_assignment/Cargo.toml | New crate manifest. |
| vm/devices/pci/pci_resource_assignment/src/lib.rs | Public API: `PciConfigAccess`, `AssignmentParams`, `assign_pci_resources`, errors, and crate-private result types. |
| vm/devices/pci/pci_resource_assignment/src/enumerate.rs | Phase 1: DFS bus enumeration, BAR size probing, and SR-IOV VF bus reservation. |
| vm/devices/pci/pci_resource_assignment/src/assign.rs | Phase 2: bottom-up sizing, top-down address allocation, and bridge/BAR programming. |
| vm/devices/pci/pci_resource_assignment/src/tests.rs | Unit tests with mock config space covering endpoints, bridges, switches, SR-IOV, errors, and alignment edge cases. |
| openvmm/openvmm_core/Cargo.toml | Adds dependency on the new crate. |
| openvmm/openvmm_core/src/worker/dispatch.rs | Stores `Arc<Chipset>` in `LoadedVmInner`; adds `assign_pci_resources` invoked after `load_firmware(false)` on boot and reset. |
| openvmm/openvmm_core/src/worker/dispatch/ecam_config_access.rs | New module implementing `PciConfigAccess` via chipset MMIO at ECAM addresses, and a helper that iterates all host bridges. |

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.

github-actions · 2026-05-27T02:29:16Z

2 Petri tests failed (2 unstable)

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

github-actions · 2026-05-27T03:40:31Z

2 Petri tests failed (2 unstable)

smalis-msft · 2026-05-27T15:45:16Z

+        let stop_guard = self.inner.partition_unit.temporarily_stop_vps().await;
+
+        // Start state units so device config space is accessible.
+        self.state_units.start().await;


All this state unit churn is going to create a lot of tracing, anything we can do to reduce it?

Hmm. I don't have any ideas. Do you?

I wonder if I can change the state unit code to only trace when the time exceeds some threshold.
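One way the threshold idea could look, as a sketch only: time the operation and report the elapsed duration only when it exceeds a limit, so the caller can decide whether to emit a trace event. `timed` is a hypothetical helper and is not part of the state unit code; the logging call itself is left to the caller.

```rust
use std::time::{Duration, Instant};

/// Runs `op` and returns its result, plus the elapsed time only when it
/// exceeded `threshold`. A caller would log only in the `Some` case.
pub fn timed<T>(threshold: Duration, op: impl FnOnce() -> T) -> (T, Option<Duration>) {
    let start = Instant::now();
    let result = op();
    let elapsed = start.elapsed();
    (result, (elapsed > threshold).then_some(elapsed))
}
```

With this shape, fast start/stop cycles produce no output, while slow ones still surface with their duration attached.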

jackschefer-msft · 2026-05-27T15:54:27Z

+///    resource requirement of its subtree.
+/// 2. **Top-down assignment**: The host bridge carves its aperture among
+///    top-level devices/bridges. Each bridge subdivides its allocated
+///    range among children, largest-first.


Will this support leaf devices providing static GPAs for reservation in the future?

I think that will require an alternate algorithm. I was thinking we'd probably say that either all devices in a root complex must have static BAR assignments, or none--then there's a pretty simple bottom-up bridge window assignment algorithm. But a mix is difficult, since there's no obvious greedy best-fit algorithm after you've poked a bunch of holes in the aperture.

Copilot AI review requested due to automatic review settings May 27, 2026 00:23

Copilot started reviewing on behalf of jstarks May 27, 2026 00:23

Copilot AI reviewed May 27, 2026

Comment thread vm/devices/pci/pci_resource_assignment/src/enumerate.rs Outdated

Comment thread vm/devices/pci/pci_resource_assignment/src/tests.rs Outdated

Comment thread openvmm/openvmm_core/src/worker/dispatch/ecam_config_access.rs Outdated

jstarks added 2 commits May 26, 2026 18:00

feedback

25e492a

mor feedback

88ac276

Copilot AI review requested due to automatic review settings May 27, 2026 01:02

Copilot started reviewing on behalf of jstarks May 27, 2026 01:02

jstarks marked this pull request as ready for review May 27, 2026 01:03

jstarks requested a review from a team as a code owner May 27, 2026 01:03

Copilot AI reviewed May 27, 2026

jstarks added 2 commits May 26, 2026 19:20

feedback

f6a64bf

mor feedback

268d478

jstarks requested a review from Copilot May 27, 2026 02:24

Copilot started reviewing on behalf of jstarks May 27, 2026 02:25

Copilot AI reviewed May 27, 2026

jstarks added 2 commits May 26, 2026 22:02

red

b7cbf53

green

24919a8

smalis-msft reviewed May 27, 2026

jackschefer-msft reviewed May 27, 2026

jackschefer-msft approved these changes May 28, 2026

jstarks merged commit 24d0e18 into microsoft:main May 28, 2026
67 checks passed

jstarks deleted the pre-alloc branch May 28, 2026 21:17

4 participants