[vfio-users] VFIO-PCI with AARCH64 QEMU

Laszlo Ersek lersek at redhat.com
Tue Oct 25 21:44:17 UTC 2016

On 10/25/16 22:38, Haynal, Steve wrote:
> Hi All,
> Thanks for the help. I've started using explicit pflash drives
> instead of -bios. The firmware I was using was 15.12 from
> https://releases.linaro.org/components/kernel/uefi-linaro/15.12/release/qemu64/QEMU_EFI.fd.
> This was not producing any interesting debug output, so I built my
> own from git following these instructions
> https://wiki.linaro.org/LEG/UEFIforQEMU . This produces the output
> shown below. Once booted, the lspci output still looks the same as
> before. If I add acpi=force during boot or compile with -D
> PURE_ACPI_BOOT_ENABLE, the boot always hangs at the line " EFI stub:
> Exiting boot services and installing virtual address map..."

Regarding this hang, I first thought that you were running into the
recent guest kernel regression with ACPI boot, caused by commit
7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to
node0"), now fixed by commit baa5567c18d1 ("arm64: kernel: numa: fix
ACPI boot cpu numa node mapping").

However, your guest kernel log states "4.4.0-45-generic", which I believe
counts as ancient in arm64 terms (unless you have a number of backports
on top).

> Boot completes without these options. Any ideas on why the memory
> regions show up as disabled in lspci, and why the large 512MB region
> is ignored?

The firmware log that you pasted contains:

> ProcessPciHost: Config[0x3F000000+0x1000000) Bus[0x0..0xF] Io[0x0+0x10000)@0x3EFF0000 Mem32[0x10000000+0x2EFF0000)@0x0 Mem64[0x8000000000+0x8000000000)@0x0
> RootBridge: PciRoot(0x0)
>   Support/Attr: 70001 / 70001
>     DmaAbove4G: Yes
> NoExtConfSpace: No
>      AllocAttr: 3 (CombineMemPMem Mem64Decode)
>            Bus: 0 - F
>             Io: 0 - FFFF
>            Mem: 10000000 - 3EFEFFFF
>     MemAbove4G: 8000000000 - FFFFFFFFFF
>           PMem: FFFFFFFFFFFFFFFF - 0

So these are the apertures (from QEMU) that the PCI Bus driver can
allocate BARs from. Then, let's see the enumeration:

> PCI Bus First Scanning
> PciBus: Discovered PCI @ [00|00|00]
> PciBus: Discovered PCI @ [00|09|00]
>    BAR[0]: Type =  Mem32; Alignment = 0xFFF;	Length = 0x1000;	Offset = 0x10
>    BAR[1]: Type =  Mem32; Alignment = 0x1FFFFFFF;	Length = 0x20000000;	Offset = 0x14

As Alex pointed out earlier, the 512MB BAR is a 32-bit one, so the 512GB
64-bit MMIO aperture is of no use to it.
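For reference, the firmware learns a BAR's type and size the way the PCI spec prescribes: write all-1s to the BAR register and decode what reads back. A rough sketch of that decoding in Python (the readback value 0xE0000000 is a hypothetical one that matches the 512MB Mem32 BAR in the log above):

```python
def decode_mem_bar(readback: int) -> dict:
    """Decode a memory BAR after all-1s were written to it (32-bit view).

    Bit 0    : 0 = memory BAR (1 would be an I/O BAR)
    Bits 2:1 : 00 = 32-bit, 10 = 64-bit
    Bit 3    : prefetchable
    Size     : invert the writable bits and add one
    """
    assert readback & 0x1 == 0, "not a memory BAR"
    width = 64 if (readback >> 1) & 0x3 == 0x2 else 32
    prefetchable = bool(readback & 0x8)
    size = (~(readback & ~0xF) + 1) & 0xFFFFFFFF
    return {"width": width, "prefetchable": prefetchable, "size": size}

# A 512 MiB, non-prefetchable, 32-bit memory BAR reads back as 0xE0000000:
print(decode_mem_bar(0xE0000000))
# -> {'width': 32, 'prefetchable': False, 'size': 0x20000000}
```

Because bits 2:1 read back as 00 here, the enumerator has no choice but to place this BAR below 4GB, regardless of how much 64-bit space exists.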

> PciHostBridge: SubmitResources for PciRoot(0x0)
>  Mem: Granularity/SpecificFlag = 32 / 00
>       Length/Alignment = 0x20100000 / 0x1FFFFFFF
> PciBus: HostBridge->SubmitResources() - Success
> PciHostBridge: NotifyPhase (AllocateResources)
>  RootBridge: PciRoot(0x0)
>   Mem: Base/Length/Alignment = FFFFFFFFFFFFFFFF/20100000/1FFFFFFF - Out Of Resource!

Here's the error message; the resources cannot be allocated correctly.
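The numbers bear this out: the submitted block (length 0x20100000, alignment mask 0x1FFFFFFF, i.e. a 512MB boundary) must come out of the 32-bit aperture 0x10000000..0x3EFEFFFF, and the only suitably aligned base in that window, 0x20000000, leaves too little room. A quick check of that arithmetic, using the values from the logs above:

```python
def fits(window_base: int, window_end: int, length: int, alignment: int) -> bool:
    """Can a block of `length` bytes, with a PCI-style alignment mask
    (e.g. 0x1FFFFFFF means 512 MiB alignment), fit in the window?"""
    base = (window_base + alignment) & ~alignment  # round base up to alignment
    return base + length - 1 <= window_end

# 32-bit MMIO aperture from the firmware log vs. the submitted block:
print(fits(0x10000000, 0x3EFEFFFF, 0x20100000, 0x1FFFFFFF))  # -> False
```

The aligned base 0x20000000 plus 0x20100000 bytes ends at 0x400FFFFF, past the aperture's 0x3EFEFFFF limit, hence "Out Of Resource".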

> [   62.977982] PCI host bridge /pcie at 10000000 ranges:
> [   62.979210]    IO 0x3eff0000..0x3effffff -> 0x00000000
> [   62.979562]   MEM 0x10000000..0x3efeffff -> 0x10000000
> [   62.979673]   MEM 0x8000000000..0xffffffffff -> 0x8000000000
> [   62.982088] pci-host-generic 3f000000.pcie: PCI host bridge to bus 0000:00
> [   62.982457] pci_bus 0000:00: root bus resource [bus 00-0f]
> [   62.982933] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> [   62.983047] pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
> [   62.983134] pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
> [   62.992533] pci 0000:00:09.0: BAR 1: no space for [mem size 0x20000000]
> [   62.992669] pci 0000:00:09.0: BAR 1: failed to assign [mem size 0x20000000]

The same failure is logged by the guest kernel.

Any reason the BAR can't be 64-bit?

> The 512MB memory region is quite a bit to reserve. We have Google's
> BigE hardware IP (see https://www.webmproject.org/hardware/vp9/bige/)
> running on an FPGA. This IP shares memory with the host and currently
> Google's driver allocates memory from this 512MB region when it must
> be shared between the application and IP on the FPGA. We want to test
> this IP on a virtual aarch64 platform and hence the device pass
> through and interest in vfio. Eventually, we'd like these passed
> through memory regions to appear as platform devices. Is it
> possible/recommended to hack the vfio infrastructure such that a PCI
> device on the host side appears as a platform device in an aarch64
> Qemu machine? We've done something similar with virtual device
> drivers. Should we stick with virtual device drivers?

I'm sure others can advise you on this better than I can, but platform
devices are almost universally disliked. PCI is enumerable and
standardized. QEMU seems to have a similar device called ivshmem
("Inter-VM Shared Memory PCI device"), and it's PCI.
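For what it's worth, a minimal ivshmem-plain invocation looks roughly like the following sketch (the object id, path, and size here are made-up placeholders, not anything from your setup):

```shell
# Hypothetical sketch: expose a 512 MiB host shared-memory region to the
# guest as a PCI device via ivshmem-plain. "bige-shm" and the mem-path
# are invented names for illustration.
qemu-system-aarch64 \
    ... \
    -object memory-backend-file,id=bige-shm,share=on,mem-path=/dev/shm/bige,size=512M \
    -device ivshmem-plain,memdev=bige-shm
```

The guest then sees an ordinary, enumerable PCI device whose BAR 2 maps the shared region, rather than a hand-wired platform device.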

