[vfio-users] vfio not working with vanilla kernel 5.4.22

Wed Mar 18 13:34:06 UTC 2020

On Thu, 27 Feb 2020, at 11:35 AM, Bronek Kozicki wrote:
> On Mon, 24 Feb 2020, at 6:58 PM, Bronek Kozicki wrote:
> > On Mon, 24 Feb 2020, at 5:23 PM, Alex Williamson wrote:
> > > On Mon, 24 Feb 2020 10:40:39 +0000
> > > "Bronek Kozicki" <brok at spamcop.net> wrote:
> > > 
> > > > Heads up to anyone running the latest vanilla kernels - after upgrade
> > > > from 5.4.21 to 5.4.22 one of my VMs lost access to a vfio
> > > > passed-through GPU. This was restored when I downgraded to 5.4.21 so
> > > > the problem seems related to some patch in version 5.4.22
> > > > 
> > > > Also, when starting the VM, I noticed the hypervisor log flooded with
> > > > messages "BAR 3: can't reserve" like:
> > > > 
> > > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e at 0x258
> > > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19 at 0x900
> > > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: No more image in the PCI ROM
> > > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > > > 
> > > > journalctl -b-2 | grep "vfio-pci 0000:03:00.0: BAR 3: can't reserve" | wc -l
> > > > 2609
> > > > 
> > > > Finally, when shutting down the VM I observed kernel panic on the
> > > > hypervisor:
> > > > 
> > > > [  873.831301] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
> > > > [  874.874008] Shutting down cpus with NMI
> > > > [  874.888189] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > > [  875.074319] Rebooting in 30 seconds..
> > > 
> > > Tried v5.4.22, not getting anything similar.  Potentially there's a
> > > driver activated in this kernel that wasn't previously on your system
> > > and it's attached itself to part of your device.  Look in /proc/iomem
> > > to see what it might be and disable it.  Thanks,
> > > 
> > > Alex 
> > 
> > 
> > Thank you Alex. One more thing which might be relevant: my system has 
> > two identical GPUs (Quadro M5000), each in its own IOMMU group, and
> > two VMs each using one of these GPUs. One of the VMs is Windows 10 and
> > I think it is configured for MSI-X, the other is Ubuntu Bionic with
> > stable nvidia drivers.
> > 
> > I will try to find more debugging information when I get home, but
> > perhaps the above will allow you to reproduce.
> 
> Some more information
> 
> My system has 2 Xeon E5-2667 v2 CPUs, each with 8 cores and 16 threads
> (total 32 threads over 2 sockets). The motherboard is Supermicro X9DA7. 
> Despite 2 GPUs attached, the machine is headless with ttyS0 for control 
> - both GPUs are dedicated for virtual machines.
> 
> There is 128GB of ECC RAM, shared between a small number of VMs and ZFS
> filesystems. 80GB is reserved in hugepages for the VMs, 20GB is 
> reserved for ZFS cache. The kernel options are my own and unlikely to 
> be very good (happy to take feedback); I use the same kernel package 
> for both hypervisor and for one of the virtual machines, so some kernel 
> options enabled only make sense in a VM. I need 
> CONFIG_PREEMPT_VOLUNTARY and CONFIG_TREE_RCU for ZFS, I do not care 
> about Xen, legacy hardware or some kernel debugging options (although 
> perhaps I should).
> 
> I did get more of that on my main computer (including dmesg logs, pcie 
> topology etc), but because of a kernel panic (same as seen earlier, 
> while trying to reproduce the bug) its root filesystem is currently not 
> in a good state and I am unfortunately too busy at the moment to fix it 
> and access this data. Will send more over the weekend, assuming that 
> fixing my computer won't take very long.

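For anyone landing on this thread with the same "BAR 3: can't reserve" messages, the /proc/iomem check Alex suggests above can be sketched roughly as follows. This is only an illustration in Python, not something taken from the thread: the function name overlapping_claims and the sample text are hypothetical (the sample is not from the machine discussed here), and the address range is the one from the quoted log.

```python
import re

# One /proc/iomem entry: optional indentation, "start-end : owner".
IOMEM_LINE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def overlapping_claims(iomem_text, start, end):
    """Return (depth, lo, hi, owner) for each entry overlapping [start, end]."""
    hits = []
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if not m:
            continue
        indent, lo, hi, owner = m.groups()
        lo, hi = int(lo, 16), int(hi, 16)
        if lo <= end and hi >= start:
            # /proc/iomem nests child ranges with two spaces per level.
            hits.append((len(indent) // 2, lo, hi, owner.strip()))
    return hits


# Hypothetical sample text, for illustration only. On a real system,
# read /proc/iomem as root instead; unprivileged reads show all
# addresses as zero.
SAMPLE = """\
c0000000-c7ffffff : PCI Bus 0000:00
  c0000000-c1ffffff : 0000:03:00.0
    c0000000-c02fffff : efifb
d0000000-d0ffffff : some-other-device
"""

# Range taken from the "BAR 3: can't reserve" messages quoted above.
for depth, lo, hi, owner in overlapping_claims(SAMPLE, 0xC0000000, 0xC1FFFFFF):
    print(f"{'  ' * depth}{lo:08x}-{hi:08x} : {owner}")
```

Any owner listed beneath the device's own BAR range, other than vfio-pci, is a candidate for the driver that has attached itself to part of the device.
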
A follow-up after a long pause (long story short, my computer would not boot anymore, not even to BIOS, so I had to rebuild it with a new motherboard). I found the following reported by nvidia-smi on one of the cards:

WARNING: infoROM is corrupted at gpu 0000:06:00.0

According to NVIDIA, that's either bad drivers (I am using LTS version 440) or, more likely, a bad card. I guess it is the latter in my case.

B.

-- 
  Bronek Kozicki
  brok at incorrekt.com