[vfio-users] vfio not working with vanilla kernel 5.4.22

Bronek Kozicki brok at spamcop.net
Mon Feb 24 18:58:01 UTC 2020


On Mon, 24 Feb 2020, at 5:23 PM, Alex Williamson wrote:
> On Mon, 24 Feb 2020 10:40:39 +0000
> "Bronek Kozicki" <brok at spamcop.net> wrote:
> 
> > Heads up to anyone running the latest vanilla kernels - after upgrade
> > from 5.4.21 to 5.4.22 one of my VMs lost access to a vfio
> > passed-through GPU. This was restored when I downgraded to 5.4.21, so
> > the problem seems related to some patch in version 5.4.22.
> > 
> > Also, when starting the VM, I noticed the hypervisor log flooded with
> > messages "BAR 3: can't reserve" like:
> > 
> > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e at 0x258
> > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19 at 0x900
> > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > Feb 24 09:49:38 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: No more image in the PCI ROM
> > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > Feb 24 09:51:43 gdansk.lan.incorrekt.net kernel: vfio-pci 0000:03:00.0: BAR 3: can't reserve [mem 0xc0000000-0xc1ffffff 64bit pref]
> > 
> > journalctl -b-2 | grep "vfio-pci 0000:03:00.0: BAR 3: can't reserve" | wc -l
> > 2609
> > 
> > Finally, when shutting down the VM I observed a kernel panic on the
> > hypervisor:
> > 
> > [  873.831301] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
> > [  874.874008] Shutting down cpus with NMI
> > [  874.888189] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [  875.074319] Rebooting in 30 seconds..
> 
> Tried v5.4.22, not getting anything similar.  Potentially there's a
> driver activated in this kernel that wasn't previously on your system
> and it's attached itself to part of your device.  Look in /proc/iomem
> to see what it might be and disable it.  Thanks,
> 
> Alex 


Thank you Alex. One more thing which might be relevant: my system has two identical GPUs (Quadro M5000), each in its own IOMMU group, and two VMs, each using one of these GPUs. One of the VMs is Windows 10 and I think it is configured for MSI-X; the other is Ubuntu Bionic with the stable NVIDIA drivers.

I will try to find more debugging information when I get home, but perhaps the above will allow you to reproduce the problem.
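
A minimal sketch of the /proc/iomem check Alex suggests above, in Python. It lists every entry that overlaps the BAR 3 range from the "can't reserve" messages, so an unexpected owner nested under the device would stand out. The function names are hypothetical, and it assumes the usual "start-end : name" layout with two spaces of indentation per nesting level. It has to be run as root, since unprivileged reads of /proc/iomem show all addresses as zeros.

```python
import re

# One /proc/iomem line looks like "  c0000000-c1ffffff : 0000:03:00.0".
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def overlapping_regions(iomem_text, start, end):
    """Return (depth, start, end, owner) for each entry overlapping [start, end]."""
    hits = []
    for line in iomem_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, lo, hi, owner = m.groups()
        lo, hi = int(lo, 16), int(hi, 16)
        # Two inclusive ranges overlap when each one starts before the other ends.
        if lo <= end and hi >= start:
            # Assumption: two spaces of indentation per nesting level.
            hits.append((len(indent) // 2, lo, hi, owner.strip()))
    return hits


def report(path="/proc/iomem", start=0xC0000000, end=0xC1FFFFFF):
    """Print the overlapping entries, indented the way /proc/iomem nests them."""
    with open(path) as f:
        text = f.read()
    for depth, lo, hi, owner in overlapping_regions(text, start, end):
        print(f"{'  ' * depth}{lo:08x}-{hi:08x} : {owner}")
```

Calling report() as root on the hypervisor, while the VM is stopped, prints the bridge windows, the device itself, and whatever else has claimed part of 0xc0000000-0xc1ffffff.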


B.

-- 
  Bronek Kozicki
  brok at spamcop.net




