[vfio-users] vfio not working with vanilla kernel 5.4.22

Bronek Kozicki brok at spamcop.net
Thu Feb 27 11:35:21 UTC 2020


On Mon, 24 Feb 2020, at 6:58 PM, Bronek Kozicki wrote:
> On Mon, 24 Feb 2020, at 5:23 PM, Alex Williamson wrote:
> > On Mon, 24 Feb 2020 10:40:39 +0000
> > "Bronek Kozicki" <brok at spamcop.net> wrote:
> > 
> > > Heads up to anyone running the latest vanilla kernels - after upgrading
> > > from 5.4.21 to 5.4.22, one of my VMs lost access to a vfio
> > > passed-through GPU. Access was restored when I downgraded to 5.4.21, so
> > > the problem seems related to some patch in version 5.4.22.
> > > 
> > > Also, when starting the VM, I noticed the hypervisor log flooded with
> > > messages "BAR 3: can't reserve" like:
> > > 
> > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e at 0x258
> > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19 at 0x900
> > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: No more image in the PCI ROM
> > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > 
> > > journalctl -b-2 | grep "vfio-pci 0000:03:00.0: BAR 3: can't reserve" | wc -l
> > > 2609
> > > 
> > > Finally, when shutting down the VM I observed kernel panic on the
> > > hypervisor:
> > > 
> > > [  873.831301] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
> > > [  874.874008] Shutting down cpus with NMI
> > > [  874.888189] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > [  875.074319] Rebooting in 30 seconds..
> > 
> > Tried v5.4.22, not getting anything similar.  Potentially there's a
> > driver activated in this kernel that wasn't previously on your system
> > and it's attached itself to part of your device.  Look in /proc/iomem
> > to see what it might be and disable it.  Thanks,
> > 
> > Alex 
> 
> 
> Thank you Alex. One more thing which might be relevant: my system has 
> two identical GPUs (Quadro M5000), each in its own IOMMU group, and 
> two VMs, each using one of these GPUs. One of the VMs is Windows 10 and 
> I think it is configured for MSI-X; the other is Ubuntu Bionic with 
> stable nvidia drivers.
> 
> I will try to find more debugging information when I get home, but 
> perhaps the above will allow you to reproduce. 

Some more information:

My system has two Xeon E5-2667 v2 CPUs, each with 8 cores and 16 threads (32 threads total over 2 sockets). The motherboard is a Supermicro X9DA7. Despite the two GPUs attached, the machine is headless and controlled over ttyS0 - both GPUs are dedicated to virtual machines.

There is 128GB of ECC RAM, shared between a small number of VMs and ZFS filesystems: 80GB is reserved in hugepages for the VMs and 20GB is reserved for the ZFS cache. The kernel options are my own and unlikely to be very good (happy to take feedback); I use the same kernel package for both the hypervisor and one of the virtual machines, so some of the enabled kernel options only make sense in a VM. I need CONFIG_PREEMPT_VOLUNTARY and CONFIG_TREE_RCU for ZFS, and I do not care about Xen, legacy hardware or some kernel debugging options (although perhaps I should).
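For reference, the reservations can be double-checked like this (a sketch; the 2MB default hugepage size and the zfs_arc_max parameter are my assumptions, not confirmed from the running system):

```shell
# 80GB of 2MB hugepages would show up as HugePages_Total: 40960
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo || true
# The 20GB ZFS cache cap would be the ARC limit, set as a module parameter
cat /sys/module/zfs/parameters/zfs_arc_max 2>/dev/null || true
```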

I did gather more of that data on my main computer (including dmesg logs, PCIe topology etc.), but because of a kernel panic (the same as seen earlier, hit while trying to reproduce the bug) its root filesystem is currently not in a good state, and I am unfortunately too busy at the moment to fix it and access this data. I will send more over the weekend, assuming that fixing my computer won't take very long.
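In the meantime, following Alex's suggestion, this is roughly what I plan to run on the hypervisor to see what claimed the device's BAR 3 range (a sketch, not yet tested; the 0000:03:00.0 address and the 0xc0000000 range are taken from the logs above):

```shell
# Which /proc/iomem entry (with the owning driver indented below it)
# covers the failing BAR 3 range 0xc0000000-0xc1ffffff?
grep -i -B1 -A2 'c0000000' /proc/iomem || true
# Which driver, if any, is currently bound to the GPU?
readlink /sys/bus/pci/devices/0000:03:00.0/driver 2>/dev/null || true
```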


B.

-- 
  Bronek Kozicki
  brok at spamcop.net




