[vfio-users] GPU passthrough errors with linux 5.1 and newer
Zoltán Kővágó
dirty.ice.hu at gmail.com
Sun Jul 21 18:59:00 UTC 2019
Hi,
Recently my previously perfectly working GPU passthrough setup (a
Windows 8.1 x64 guest with OVMF) started to malfunction in various ways:
the screen randomly turns off for a few seconds, the guest BSODs with
VIDEO_TDR_FAILURE, 3D apps randomly crash or do not draw their windows'
content, and there are graphical glitches (for example, in FurMark the
OSD text flickers).
After fiddling around with various QEMU versions and NVIDIA driver
versions on the guest, I figured out that with a Linux 5.0 kernel it
works fine, but with 5.1 it randomly fails. I bisected it and it looks
like the culprit is commit 4e103134b862 "KVM: x86/mmu: Zap only the
relevant pages when removing a memslot"[1]. I tried to revert it on top
of 5.2.1, but too many things changed in the meantime. Anyway, if I
replace the body of kvm_mmu_invalidate_zap_pages_in_memslot with
kvm_mmu_zap_all(kvm); it works again (probably with horrible performance
degradation).
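To make the workaround concrete, the change is roughly the following.
This is only a sketch that compiles in user space: the struct
definitions and the kvm_mmu_zap_all stub are stand-ins invented for
illustration (the real ones live in the kernel tree), and the function
signature is written from memory of arch/x86/kvm/mmu.c, so it may not
match every kernel version.

```c
#include <stddef.h>

/*
 * Hypothetical user-space stand-ins for the kernel types. In the real
 * tree these come from the KVM headers and look nothing like this.
 */
struct kvm {
	int zap_all_calls;	/* illustration only: counts full zaps */
};

struct kvm_memory_slot {
	unsigned long base_gfn;
	unsigned long npages;
};

struct kvm_page_track_notifier_node {
	int unused;
};

/*
 * Stub. The real kvm_mmu_zap_all() drops every shadow page of the VM;
 * here it only records that it was called.
 */
static void kvm_mmu_zap_all(struct kvm *kvm)
{
	kvm->zap_all_calls++;
}

/*
 * Workaround: ignore the memslot being removed and zap everything,
 * instead of zapping only the pages that belong to the slot.
 */
static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
			struct kvm_memory_slot *slot,
			struct kvm_page_track_notifier_node *node)
{
	(void)slot;
	(void)node;
	kvm_mmu_zap_all(kvm);
}
```

In the kernel itself only the one-line function body matters; the rest
exists so the fragment can be compiled and checked on its own.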
Has anyone experienced anything like this? I'm using Alex's ACS override
patch; maybe it violates some assumption that the new code makes?
Thanks,
Zoltan
# lspci
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller (rev 06)
00:02.0 Display controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 05)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
00:1c.3 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 (rev d5)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation Z87 Express LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
My qemu commandline:
qemu-system-x86_64 -M q35 -machine kernel_irqchip=on -enable-kvm -m 4096 \
  -cpu host,kvm=off,hv_time,hv_vendor_id=fuck_nvidia,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
  -smp cores=3,threads=2,sockets=1 -nodefaults -rtc base=localtime \
  --device virtio-balloon -realtime -device qemu-xhci \
  -device piix4-usb-uhci,id=uhci \
  -qmp unix:/tmp/qemu-passthrough,server,nowait \
  -drive file=/dev/nullptr-vg/win81_tmp,id=disk,format=raw,discard=unmap,cache=unsafe,if=none \
  -device virtio-scsi-pci,id=scsi -device scsi-hd,drive=disk,id=scsi-disk \
  -drive if=pflash,format=raw,readonly,file=/home/dirty_ice/OVMF_CODE.fd \
  -drive if=pflash,format=raw,file=/home/dirty_ice/OVMF_VARS81.fd \
  -netdev bridge,id=mynet \
  -device virtio-net,netdev=mynet,id=mynic,mac=52:d0:91:a8:08:0e \
  -display none -monitor stdio \
  -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
  -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,id=vga \
  -device vfio-pci,host=02:00.1,bus=root.1,addr=00.1,id=vga-hdmi \
  -object input-linux,id=kbd,grab_all=on,evdev=/dev/input/by-id/ckb-Corsair_Gaming_K95_RGB_PLATINUM_Keyboard_vKB_-event \
  -object input-linux,id=mouse,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_0C6738683935-event-mouse \
  -audiodev alsa,id=foo,out.mixeng=off,out.try-poll=off,threshold=15000,timer-period=5000,out.dev=swap \
  -device usb-audio,audiodev=foo,multi=on,buffer=10752,id=audio \
  -global isa-pcspk.audiodev=foo
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e103134b862314dc2f2f18f2fb0ab972adc3f5f