[vfio-users] Host hard lockups

vfio vfio at taintedbit.com
Fri Aug 5 21:11:19 UTC 2016


Hello everyone,

I've been running VGA passthrough with a Debian unstable host and a
Windows 10 guest for months now. Everything works perfectly, except that
the entire machine randomly freezes when the guest is running.

When a freeze happens, the guest immediately locks up. Sometimes, if
audio is playing, it gets stuck in a short loop. Strangely, the host does
not usually freeze immediately; it takes a few seconds after the guest
has frozen. For example, my CPU monitor on the host will usually perform
a few more measurements before completely freezing, and the mouse cursor
on the host machine will continue working for a few seconds as well.

When the host freezes, even the physical reset button on the machine
usually does not work, and I have to force a power-off by holding the
power button. The reset button has worked a few times. However, in one
of these instances, the host refused to boot after the reset, claiming
to be unable to initialize one of the USB buses.

Sometimes the issue does not occur for several days despite multi-hour
sessions. Sometimes it happens several times per day, occasionally only
a few minutes after booting the guest.

No freeze leaves any traces in syslog.

These issues are very similar to those reported by Colin Godsey on this
list in May. While the conclusion of that thread seemed to be BIOS
firmware problems or "the Skylake freeze", I am using a Core i7-4960X
(Ivy Bridge-E) and an ASUS Rampage IV Extreme with the final BIOS
revision (4901 from 2014-06-18).

I am using libvirt with virt-manager. My "normal" configuration is to
plug my keyboard and mouse into a USB switch, and connect this USB
switch to two different USB buses. I then use PCI passthrough to pass
one of the buses to the guest. I have had the freeze occur while playing
games, while doing nothing at all, and a few times even without ever
connecting input devices to the guest. My host typically doesn't have
any load on it while running the guest.

I have tried a large variety of configurations while trying to track
down the problem. Since the only clue I ever received was the USB bus
failure after a reset, I have focused a lot on USB and I/O settings. I
have experienced the crash in all of these situations:
- Reserving hugepages or not
- Using hugepages or not
- Using "USB Host Device" to add the keyboard and mouse to the guest
- Using a completely different physical keyboard and mouse for the guest
  and host (including "basic" non-gaming Microsoft devices)
- Passing through a built-in USB2 bus for the guest
- Passing through a built-in USB3 bus for the guest
- Passing through a dedicated USB3 PCIe card for the guest
- Using a physical SSD for the guest with virtio
- Using a physical SSD for the guest with "SATA" mode
- Using a raw disk image for the guest HDD with virtio
- Using a raw disk image for the guest HDD with "SATA" mode
- Virtio drivers installed or not installed in the guest
- Using "host-passthrough" or "IvyBridge" for the guest CPU
- Pinning the guest cores or not
- Giving the guest 2, 6, 8, or 10 logical threads (the host has 12)
- Using Hyper-V extensions or not
- Using UEFI (OVMF) or BIOS (SeaBIOS) for the guest

My current host cmdline:
BOOT_IMAGE=/vmlinuz-4.6.0-1-amd64 root=/dev/mapper/vg--ssd-lv--ssd--1 ro
quiet intel_iommu=on usbhid.quirks=0x1B1C:0x1B11:0x20000408
transparent_hugepage=never
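
For anyone comparing boot parameters, here is a minimal Python sketch
that splits the command line above into key/value pairs and decodes the
usbhid.quirks entry. It assumes the usual kernel convention of
comma-separated vendor:product:flags entries in hexadecimal. The
function names parse_cmdline and parse_usbhid_quirks are hypothetical
helpers, not part of any tool.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def parse_usbhid_quirks(value):
    """Decode 'VID:PID:FLAGS[,VID:PID:FLAGS...]' into tuples of integers."""
    entries = []
    for entry in value.split(","):
        vendor, product, flags = (int(field, 16) for field in entry.split(":"))
        entries.append((vendor, product, flags))
    return entries


# The command line exactly as quoted in this message, joined onto one line.
CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-4.6.0-1-amd64 root=/dev/mapper/vg--ssd-lv--ssd--1 ro "
    "quiet intel_iommu=on usbhid.quirks=0x1B1C:0x1B11:0x20000408 "
    "transparent_hugepage=never"
)

params = parse_cmdline(CMDLINE)

if __name__ == "__main__":
    print("intel_iommu:", params.get("intel_iommu"))
    print("transparent_hugepage:", params.get("transparent_hugepage"))
    for vendor, product, flags in parse_usbhid_quirks(params["usbhid.quirks"]):
        print(f"quirk for {vendor:04x}:{product:04x} flags={flags:#010x}")
```

On a running host, the same functions could be fed the contents of
/proc/cmdline instead of the hard-coded string.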

My current configuration:
Linux debian 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64
ASUS Rampage IV Extreme motherboard
Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz (stepping 4, microcode 0x416)
Slot 01:00: GeForce GTX 580 (host)
Slot 02:00: GeForce 210 (host)
Slot 03:00: GeForce GTX TITAN X (guest passthrough, pci-stubbed)
Slot 04:00: Fresco Logic FL1100 USB 3.0 (guest passthrough)
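
Since two of these slots are passed through, the IOMMU grouping of the
devices is relevant when comparing setups. The following is a minimal
Python sketch that lists the groups, assuming the host exposes them
under /sys/kernel/iommu_groups (the usual sysfs location when the IOMMU
is enabled). The function name list_iommu_groups is a hypothetical
helper, and no output from my machine is shown here.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [device_address, ...]} read from a sysfs-style tree."""
    base = Path(root)
    groups = {}
    if not base.is_dir():
        # IOMMU disabled or path not present: report no groups.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                device.name for device in devices_dir.iterdir()
            )
    return groups


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        print(f"group {number}: {', '.join(devices)}")
```

A passed-through device that shares a group with a device still used by
the host would show up as two addresses on the same output line.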

At this point, I don't know what else to try. I would greatly appreciate
any assistance or suggestions!
Thanks!



