[vfio-users] VFIO-PCI with AARCH64 QEMU

Laszlo Ersek lersek at redhat.com
Thu Oct 27 07:28:03 UTC 2016

On 10/27/16 02:24, Haynal, Steve wrote:
> Hi All,
> I was able to enable both memory regions but my test program did not
> work on aarch64 as it does on x86. The driver is a UIO driver and it
> fails when it can't find resource0 in
> /sys/bus/pci/devices/0000:00:09.0. In the x86 guest, I see resource0
> and resource1 in that directory. In the aarch64 guest, there is no
> resourceN. Is this related?
> http://stackoverflow.com/questions/38921463/linux-kernel-4-7-arch-arm64-does-not-create-resource0-file-in-sys-bus-pci-d

It seems related, yes. This is the (partial) call stack that creates the
resource%d files:

  pci_create_resource_files() [drivers/pci/pci-sysfs.c]
    pci_create_attr()         [drivers/pci/pci-sysfs.c]

However, if the platform doesn't define HAVE_PCI_MMAP, then
pci_create_resource_files() does nothing.
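
A quick way to confirm this from userspace is to list the numbered
resource files for the device. Below is a minimal Python sketch; the
helper name list_resource_files() and its sysfs_root parameter are my
own inventions for illustration, not part of any kernel or library API.
An empty result would be consistent with HAVE_PCI_MMAP being undefined.

```python
import os
import re

def list_resource_files(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted resourceN file names present for a PCI device.

    bdf is the full device address, e.g. "0000:00:09.0". The plain
    "resource" file (no number) is deliberately ignored.
    sysfs_root is a hypothetical knob so the helper can be pointed at
    a scratch directory for testing.
    """
    dev_dir = os.path.join(sysfs_root, bdf)
    pattern = re.compile(r"^resource(\d+)$")
    found = []
    for name in os.listdir(dev_dir):
        match = pattern.match(name)
        if match:
            found.append((int(match.group(1)), name))
    # Sort numerically so resource10 would come after resource2.
    return [name for _, name in sorted(found)]
```

On the x86 guest described above, list_resource_files("0000:00:09.0")
should return the resource0 and resource1 entries; on the aarch64 guest
I would expect an empty list.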

See also in "Documentation/filesystems/sysfs-pci.txt" (rewrapped here):

> Supporting PCI access on new platforms
> --------------------------------------
> In order to support PCI resource mapping as described above, Linux
> platform code must define HAVE_PCI_MMAP and provide a
> pci_mmap_page_range function. Platforms are free to only support
> subsets of the mmap functionality, but useful return codes should be
> provided.

While "arch/arm/include/asm/pci.h" defines HAVE_PCI_MMAP, and declares
the pci_mmap_page_range() function, "arch/arm64/include/asm/pci.h" does
not.

The patch linked in the stackoverflow question aimed to add this
functionality (it seems), but apparently it hasn't been accepted.

You can find the discussion here:

[PATCH v2] arm64: pci: add support for pci_mmap_page_range

The problem seems to be that defining HAVE_PCI_MMAP exposes two sets of
pseudo-files, one set under sysfs, and another set under /proc/bus/pci/.
The latter is considered legacy / deprecated / ugly, and should be
avoided (apparently), but for that, the generic PCI code will have to be
reworked first. Quoting from that thread:

- From: Arnd Bergmann <arnd at xxxxxxxx>
- Date: Mon, 18 Apr 2016 17:00:49 +0200

> The problem is that once we allow mmap() on proc/bus/pci/*/*,
> it becomes much harder to prove that we are able to remove it
> again without breaking stuff that worked.
> We have to decouple the sysfs interface from the procfs interface
> before we allow the former.

On 10/27/16 02:24, Haynal, Steve wrote:
> Any ideas on how I can have resource0 and resource1 populated? 

The kernel feature looks incomplete ATM in arm64.

> The memory regions still show up as disabled and I must enable them with setpci.

I think that's because neither the firmware nor the kernel has a driver
for this device. Minimally in UEFI, it is the given UEFI_DRIVER's
responsibility to toggle mem/io decoding in the command register of the
device when the driver binds the device. No driver -- no decoding
enabled. I assume it's the same for the probe functions of kernel device
drivers.
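
To see which decoding bits are currently set, one can read the command
register (offset 0x04 in config space) through the sysfs "config" file,
which does not depend on HAVE_PCI_MMAP. A minimal Python sketch follows;
the function names and the sysfs_root parameter are hypothetical, while
the bit values are the standard ones from the PCI spec (the same bit 1
that "setpci COMMAND=2:2" sets).

```python
import struct

PCI_COMMAND_OFFSET = 0x04   # 16-bit command register in config space
PCI_COMMAND_IO = 0x1        # bit 0: I/O space decoding
PCI_COMMAND_MEMORY = 0x2    # bit 1: memory space decoding
PCI_COMMAND_MASTER = 0x4    # bit 2: bus mastering

def decode_command(config_bytes):
    """Decode the enable bits of the command register.

    config_bytes must hold at least the first 6 bytes of the device's
    config space; PCI config space is little-endian.
    """
    (command,) = struct.unpack_from("<H", config_bytes, PCI_COMMAND_OFFSET)
    return {
        "io": bool(command & PCI_COMMAND_IO),
        "mem": bool(command & PCI_COMMAND_MEMORY),
        "bus_master": bool(command & PCI_COMMAND_MASTER),
    }

def read_command(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read and decode the command register of device bdf via sysfs."""
    with open(f"{sysfs_root}/{bdf}/config", "rb") as f:
        return decode_command(f.read(64))
```

With "Mem-" in the lspci Control line, read_command("0000:00:09.0")
should report "mem" as False until a driver (or setpci) enables it.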

> The pci-related portion of the kernel log is below as well as lspci
> output. I updated to kernel 4.8.4-040804. I see the same behavior
> with 4.4.0 or 4.8.4.
> [    5.473692] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> [    5.473848] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
> [    5.475878] OF: PCI: host bridge /pcie at 10000000 ranges:
> [    5.476320] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
> [    5.476616] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
> [    5.476678] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
> [    5.477429] pci-host-generic 3f000000.pcie: ECAM at [mem 0x3f000000-0x3fffffff] for [bus 00-0f]
> [    5.479081] pci-host-generic 3f000000.pcie: PCI host bridge to bus 0000:00
> [    5.479354] pci_bus 0000:00: root bus resource [bus 00-0f]
> [    5.479460] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> [    5.479496] pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
> [    5.479524] pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
> [    5.480416] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000
> [    5.483773] pci 0000:00:09.0: [10ee:7022] type 00 class 0x058000
> [    5.484100] pci 0000:00:09.0: reg 0x10: [mem 0x10800000-0x10800fff]
> [    5.484163] pci 0000:00:09.0: reg 0x14: [mem 0x10000000-0x107fffff]
> [    5.488027] pci 0000:00:09.0: BAR 1: assigned [mem 0x10000000-0x107fffff]
> [    5.488274] pci 0000:00:09.0: BAR 0: assigned [mem 0x10800000-0x10800fff]

Looks good to me.
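
As a sanity check, the assigned ranges can be compared against the
32-bit MMIO window of the root bus and against the sizes that lspci
reports. The Python sketch below does just that arithmetic; the helper
names are made up for illustration, and the addresses are copied from
the log above.

```python
def range_size(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1

def contains(window, rng):
    """True if the inclusive range rng lies entirely inside window."""
    return window[0] <= rng[0] and rng[1] <= window[1]

# Values copied from the kernel log quoted above.
MMIO32_WINDOW = (0x10000000, 0x3EFEFFFF)
BAR0 = (0x10800000, 0x10800FFF)
BAR1 = (0x10000000, 0x107FFFFF)

for name, bar in (("BAR 0", BAR0), ("BAR 1", BAR1)):
    print(name, hex(range_size(*bar)), contains(MMIO32_WINDOW, bar))
# prints:
# BAR 0 0x1000 True
# BAR 1 0x800000 True
```

0x1000 is 4K and 0x800000 is 8M, which matches the [size=4K] and
[size=8M] region sizes in the lspci output.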

> lspci -vvv
> 00:00.0 Host bridge: Red Hat, Inc. Device 0008
> 	Subsystem: Red Hat, Inc Device 1100
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 00:09.0 Memory controller: Xilinx Corporation Device 7022
> 	Subsystem: Xilinx Corporation Device 0007
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Interrupt: pin A routed to IRQ 47
> 	Region 0: Memory at 10800000 (32-bit, non-prefetchable) [disabled] [size=4K]
> 	Region 1: Memory at 10000000 (32-bit, non-prefetchable) [disabled] [size=8M]
> 	Capabilities: [80] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit+
> 		Address: 0000000000000000  Data: 0000
> 	Capabilities: [c0] Express (v2) Root Complex Integrated Endpoint, MSI 00
> 		DevCap:	MaxPayload 512 bytes, PhantFunc 0
> 			ExtTag- RBE+
> 		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> 			MaxPayload 256 bytes, MaxReadReq 512 bytes
> 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> 		DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported
> 		DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
> 	Capabilities: [100 v2] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> 		CEMsk:	RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
> 		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-

If there was an active matching kernel driver / module, I think it would
be listed here.
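
The same information is available from sysfs: when a driver is bound,
the device directory contains a "driver" symlink pointing at that
driver. A minimal Python sketch, assuming the usual sysfs layout (the
function name and the sysfs_root parameter are hypothetical):

```python
import os

def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to device bdf, or None.

    Relies on the "driver" symlink that sysfs creates in the device
    directory while a driver is bound to the device.
    """
    link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(link):
        return None
    # The link target ends in the driver's name.
    return os.path.basename(os.readlink(link))
```

For the device above I would expect bound_driver("0000:00:09.0") to
return None, given that lspci shows no driver for it.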


> -----Original Message-----
> From: Haynal, Steve 
> Sent: Tuesday, October 25, 2016 5:16 PM
> To: 'Laszlo Ersek'; Ard Biesheuvel
> Cc: Alex Williamson; vfio-users at redhat.com; Eric Auger
> Subject: RE: [vfio-users] VFIO-PCI with AARCH64 QEMU
> Hi All,
> I can enable the memory region with the "setpci -s 00:09.0 COMMAND=2:2" command. For proof of concept tests, I can get by with a shared memory size of 8MB, which should fit. I can also switch to 64-bit BARs. Both of these changes require resynthesizing the FPGA design overnight and may cause other problems, so I will report back if it works tomorrow.
> I am using the stock default kernel in the current Xenial aarch64 cloud image from Ubuntu (4.4.0-45). I will build a newer kernel.
> I also prefer the enumeration and standardization of a PCI device over a platform device, but some of our customers want the virtual environment to more closely match their final hardware target environment. I will take a look at ivshmem. 
> Thanks again to all for the help.
> Best Regards,
> Steve Haynal
