[vfio-users] GPU passthrough errors with linux 5.1 and newer

Zoltán Kővágó dirty.ice.hu at gmail.com
Sun Jul 21 18:59:00 UTC 2019


Hi,

Recently my previously perfectly working GPU passthrough setup (with a 
win8.1 x64 guest with OVMF) started to malfunction in various ways: 
the screen randomly turning off for a few seconds, BSODs with 
VIDEO_TDR_FAILURE, 3D apps randomly crashing, window contents not being 
drawn, and graphical glitches (for example, in furmark the OSD text 
flickers).

After fiddling around with various qemu versions and nvidia driver 
versions in the guest, I figured out that with a linux 5.0 kernel it 
works fine, but with 5.1 it randomly fails. I bisected it and it looks 
like the culprit is the commit 4e103134b862 "KVM: x86/mmu: Zap only the 
relevant pages when removing a memslot"[1]. I tried to revert it on top 
of 5.2.1, but too many things have changed in the meantime. Anyway, if I 
replace the body of kvm_mmu_invalidate_zap_pages_in_memslot with 
kvm_mmu_zap_all(kvm); it works again (probably with horrible performance 
degradation).
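
For anyone who wants to repeat the bisect, here is a minimal sketch of 
the workflow. It is not necessarily the exact sequence of commands used 
above. It assumes a clone of the mainline kernel tree with the v5.0 and 
v5.1 tags available; the helper names (start_bisect, mark_result, 
finish_bisect) are made up for illustration, and building, installing 
and booting each candidate kernel is left out.

```shell
#!/bin/sh
# Sketch only: helper functions for bisecting between v5.0 (good) and
# v5.1 (bad). Run them from inside a mainline kernel checkout.

start_bisect() {
    # First argument to "bisect start" is the bad revision, the rest are good.
    git bisect start v5.1 v5.0
}

# Call after booting the candidate kernel and exercising the guest.
mark_result() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: mark_result good|bad" >&2; return 2 ;;
    esac
}

finish_bisect() {
    git bisect log    # print the steps taken, useful for a bug report
    git bisect reset  # go back to the original checkout
}
```

Because the failure is random, a kernel should probably only be marked 
good after a long enough test run in the guest; a wrongly marked good 
step sends the bisect in the wrong direction.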

Has anyone experienced anything like this? I'm using Alex's ACS override 
patch; maybe it violates some assumption that the new code makes?

Thanks,
Zoltan

# lspci
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller (rev 06)
00:02.0 Display controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 05)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
00:1c.3 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 (rev d5)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation Z87 Express LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)

My qemu commandline:
qemu-system-x86_64 -M q35 -machine kernel_irqchip=on -enable-kvm -m 4096 \
  -cpu host,kvm=off,hv_time,hv_vendor_id=fuck_nvidia,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
  -smp cores=3,threads=2,sockets=1 -nodefaults -rtc base=localtime \
  --device virtio-balloon \
  -realtime  -device qemu-xhci \
  -device piix4-usb-uhci,id=uhci \
  -qmp unix:/tmp/qemu-passthrough,server,nowait \
  -drive file=/dev/nullptr-vg/win81_tmp,id=disk,format=raw,discard=unmap,cache=unsafe,if=none \
  -device virtio-scsi-pci,id=scsi \
  -device scsi-hd,drive=disk,id=scsi-disk \
  -drive if=pflash,format=raw,readonly,file=/home/dirty_ice/OVMF_CODE.fd \
  -drive if=pflash,format=raw,file=/home/dirty_ice/OVMF_VARS81.fd \
  -netdev bridge,id=mynet \
  -device virtio-net,netdev=mynet,id=mynic,mac=52:d0:91:a8:08:0e \
  -display none -monitor stdio \
  -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
  -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,id=vga \
  -device vfio-pci,host=02:00.1,bus=root.1,addr=00.1,id=vga-hdmi \
  -object input-linux,id=kbd,grab_all=on,evdev=/dev/input/by-id/ckb-Corsair_Gaming_K95_RGB_PLATINUM_Keyboard_vKB_-event \
  -object input-linux,id=mouse,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_0C6738683935-event-mouse \
  -audiodev alsa,id=foo,out.mixeng=off,out.try-poll=off,threshold=15000,timer-period=5000,out.dev=swap \
  -device usb-audio,audiodev=foo,multi=on,buffer=10752,id=audio \
  -global isa-pcspk.audiodev=foo

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e103134b862314dc2f2f18f2fb0ab972adc3f5f



