[vfio-users] UEFI GOP regression in kernel 4.1.13 onwards when sequestering GPU

Joseph East eastyjr at gmail.com
Fri Jan 8 04:50:27 UTC 2016


Hi all,

I recently installed a kernel upgrade package to 4.1.13, which seems to have broken GOP output switching (and thus Xorg) when I sequester my PCI-E GPU during initrd. This happens with either pci-stub or vfio-pci, and persists through at least kernel 4.4-rc7 (I haven't tried later versions). I've narrowed the problem down to a single kernel patch, but before reporting upstream I'd like to know whether anyone else sees this, as the failure mode likely only affects vfio users.

This is the patch in question:

https://github.com/torvalds/linux/commit/bd69119

My normal boot sequence, with kernel 4.1.12 or with 4.1.13 built without the patch, is as follows. Separate monitors are connected to the integrated GPU and the PCI-E GPU before boot.
* POST appears on both monitors, PCI-E GPU first then iGPU (despite UEFI setting...)
* GRUB appears on PCI-E GPU, linux kernel selected and loads initrd (iGPU blank at this point)
* GRUB prompt freezes on PCI-E GPU, boot splash screen appears on iGPU output
* KDE login appears on iGPU output, system acts as normal from this point
* xorg log shows successful binding to i915, no mention of radeon or attempts to bind to radeon
* dmesg indicates fbcon inteldrmfb is set as the primary device

Broken boot sequence, same setup as above but with an unmodified 4.1.13+ kernel:
* POST appears on both monitors, PCI-E GPU first then iGPU (despite UEFI setting...)
* GRUB appears on PCI-E GPU, loads initrd
* PCI-E GPU goes blank, can see iGPU monitor light up but no output
* After waiting, must SSH into box to do anything
* Xorg logs show it trying to bind to radeon (the sequestered GPU), which is why it died
* dmesg shows no indication of fbcon switching; fb0 stays bound to an arbitrary EFI VGA frame buffer device (which could be anything?)
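
A quick way to compare the two cases on another machine is to read /proc/fb on the host, which lists each registered frame buffer as an index followed by a name. The following is only a minimal sketch: the helper names parse_proc_fb and primary_fb are hypothetical, and it was not part of the original report. The names to look for are the ones mentioned in the dmesg observations above ("inteldrmfb" in the working case, "EFI VGA" in the broken one).

```python
from pathlib import Path


def parse_proc_fb(text):
    """Parse /proc/fb content into a list of (index, name) tuples."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<index> <name>"; the name itself may contain spaces.
        index, _, name = line.partition(" ")
        devices.append((int(index), name.strip()))
    return devices


def primary_fb(text):
    """Return the name registered as fb0, or None if fb0 is not listed."""
    for index, name in parse_proc_fb(text):
        if index == 0:
            return name
    return None


if __name__ == "__main__":
    proc_fb = Path("/proc/fb")
    if proc_fb.exists():
        for index, name in parse_proc_fb(proc_fb.read_text()):
            print(f"fb{index}: {name}")
```

Running this over SSH after a broken boot and again after a working boot should make the difference in fb0 visible without digging through dmesg.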

I've identified three workarounds which restore functionality:
1) Disconnect the monitor connected to the PCI-E GPU during boot, plug it in after the host has booted
2) Recompile the kernel without the above patch (works on at least 4.1.13; I haven't tried on others, but there have been no commits to eboot.c since)
3) Custom xorg.conf to pin the primary monitor to the integrated GPU
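
For workaround 3, the usual approach is a "Device" section with a BusID pointing at the integrated GPU. Xorg expects the BusID in decimal ("PCI:bus:device:function"), while lspci prints the address in hex, so a small conversion helps avoid mistakes. The sketch below is an illustration rather than the exact xorg.conf used here: the address 00:02.0, the "intel" driver name, and the "iGPU" identifier are all assumptions that should be checked against lspci output and the installed Xorg drivers.

```python
def lspci_to_xorg_busid(address):
    """Convert an lspci address such as '00:02.0' (hex) to 'PCI:0:2:0' (decimal)."""
    parts = address.strip().split(":")
    # Drop an optional PCI domain prefix such as '0000:'.
    if len(parts) == 3:
        parts = parts[1:]
    if len(parts) != 2:
        raise ValueError(f"unexpected PCI address: {address!r}")
    bus_hex, dev_func = parts
    dev_hex, _, func_hex = dev_func.partition(".")
    if not func_hex:
        raise ValueError(f"missing function number in: {address!r}")
    return f"PCI:{int(bus_hex, 16)}:{int(dev_hex, 16)}:{int(func_hex, 16)}"


def device_section(identifier, driver, address):
    """Render a minimal xorg.conf Device section pinned to one PCI device."""
    return (
        'Section "Device"\n'
        f'    Identifier "{identifier}"\n'
        f'    Driver     "{driver}"\n'
        f'    BusID      "{lspci_to_xorg_busid(address)}"\n'
        'EndSection\n'
    )


if __name__ == "__main__":
    # Assumed values: verify the address with `lspci` before using.
    print(device_section("iGPU", "intel", "00:02.0"))
```

The printed section can be placed in xorg.conf or a file under xorg.conf.d so that Xorg no longer tries to pick the sequestered card.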

My platform:

* Core i5 2500 w/ Intel HD2000
* Gigabyte Z77X-UD5H w/ firmware F14, iGPU set as primary adapter, VT-d enabled etc.
* Radeon 7750 used for PCI-E pass-through using vfio-pci
* OpenSUSE 42.1 booting from UEFI
* OpenSUSE KVM pattern (via YaST):
  - libvirt 1.2.18.1
  - virt-manager 1.2.1
  - QEMU 2.3.1
* Kernel command line: "resume=/dev/system/swap splash=silent quiet showopts intel_iommu=on rd.driver.pre=vfio-pci"
* PCI IDs for the HD7750 are static entries in a conf file under modprobe.d
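
For readers reproducing this setup, a modprobe.d entry of this kind is a single "options" line listing vendor:device IDs for the stub driver. The sketch below builds such a line and validates the ID format. The function name vfio_options_line is hypothetical, and the IDs in the example are placeholders (1002 is the AMD/ATI vendor ID, but the device halves should be taken from `lspci -nn` on the actual machine); the real conf file contents are not shown in this post.

```python
import re

# A PCI ID is two 4-digit hex numbers: vendor:device.
_PCI_ID = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{4}$")


def vfio_options_line(ids, module="vfio-pci"):
    """Build a modprobe.d 'options' line binding the given PCI IDs to a module."""
    normalized = []
    for pci_id in ids:
        pci_id = pci_id.strip().lower()
        if not _PCI_ID.match(pci_id):
            raise ValueError(f"not a vendor:device ID: {pci_id!r}")
        normalized.append(pci_id)
    if not normalized:
        raise ValueError("at least one PCI ID is required")
    return f"options {module} ids={','.join(normalized)}"


if __name__ == "__main__":
    # Placeholder IDs for a GPU and its HDMI audio function.
    print(vfio_options_line(["1002:683f", "1002:aab0"]))
```

Passing module="pci-stub" produces the equivalent line for pci-stub, which accepts the same ids parameter.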

There's also the possibility that my hardware is suspect. My motherboard hasn't had the best track record with regard to firmware quality, and there are existing ACPI issues (automatic reboot on shutdown, for example). My GPU also uses a patched firmware in order to enable GOP. That said, everything works as it should when the kernel patch is removed.

Regards,
Joseph



