[vfio-users] (good) working GPU for passthrough?

Kyle Marek psppsn96 at gmail.com
Mon Feb 11 01:25:00 UTC 2019


On 1/30/19 3:25 PM, Kash Pande wrote:
> On 2019-01-30 8:04 a.m., Tobias Geiger wrote:
>> So it seems I still suffer from the reset bug, despite the VM being
>> 100% UEFI and Q35 now.
>
> I probably spoke too early when I said before that the reset issue is
> fixed by proper PCIe layout.
>
>
> The issue seems to be hardware-related, in that there is no PCI quirk in
> QEMU/vfio_pci to fully reset the GPU using PSP mode1 reset.
>
>
> There is allegedly code in AMDGPU.ko to reset the GPU, but I believe
> this is related to GPU hang reset, not full GPU / VM reboot reset.
> Additionally, this code didn't work as advertised for me last night
> when testing in a Linux guest.
>
>
> There is an option to vfio_pci called disable_idle_d3 and when set to Y
> it will prevent the vfio-bound PCI devices from entering the D3 idle
> state that the card then gets stuck in. Of course, your card at idle
> will be consuming about 50-60W more than needed.
>
>
> If I simply start a VFIO guest, the GPU idles at 3 watts. I can reboot
> as much as I want.
>
>
> Kash

Hmmm... I still can't seem to get my RX Vega 64 to reboot, even with
disable_idle_d3 set.

kmarek at kyle.internal.gigabyteproductions.net ~
$ uname -a
Linux kyle.internal.gigabyteproductions.net 4.20.6-200.fc29.x86_64 #1 SMP Thu Jan 31 15:50:43 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

kmarek at kyle.internal.gigabyteproductions.net ~
$ rpm -q qemu-system-x86 edk2-ovmf
qemu-system-x86-3.0.0-3.fc29.x86_64
edk2-ovmf-20180815gitcb5f4f45ce-4.fc29.noarch

kmarek at kyle.internal.gigabyteproductions.net ~
$ cat /sys/module/vfio_pci/parameters/disable_idle_d3
Y
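
For reference, a minimal sketch of how this parameter is typically checked
and made persistent. The helper name show_d3_param and the file name
/etc/modprobe.d/vfio.conf are illustrative assumptions, not taken from the
thread, and the sketch assumes vfio_pci is built as a module:

```shell
# Hypothetical helper: print the current value of disable_idle_d3.
# The optional argument overrides the sysfs path (handy for testing).
show_d3_param() {
    param="${1:-/sys/module/vfio_pci/parameters/disable_idle_d3}"
    if [ -r "$param" ]; then
        echo "disable_idle_d3=$(cat "$param")"
    else
        echo "disable_idle_d3=unknown (vfio_pci not loaded?)"
    fi
}

show_d3_param

# To make the setting persistent (as root; the file name is an assumption):
#   echo 'options vfio-pci disable_idle_d3=1' > /etc/modprobe.d/vfio.conf
# If vfio-pci is built into the kernel, the kernel command line form is:
#   vfio-pci.disable_idle_d3=1
```

A value of Y, as in the output above, means the module was loaded with the
option enabled.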

My test case:

qemu-system-x86_64 \
  -nodefaults \
  -nodefconfig \
  -no-user-config \
  -display none \
  -accel kvm \
  -machine q35 \
  -cpu host \
  -smp cores=$(nproc) \
  -m 2G \
  -drive file=/usr/share/edk2/ovmf/OVMF_CODE.fd,format=raw,if=pflash,readonly=on \
  -device ioh3420,bus=pcie.0,multifunction=on,port=1,chassis=1,id=root1 \
  -device vfio-pci,host=07:00.0,bus=root1,multifunction=on \
  -device vfio-pci,host=07:00.1,bus=root1 \
  -monitor stdio \
  ;

Upon first boot of the card/VM, host dmesg shows:

[  204.728816] vfio-pci 0000:07:00.0: enabling device (0000 -> 0003)
[  204.735267] vfio_ecap_init: 0000:07:00.0 hiding ecap 0x19 at 0x270
[  204.741241] vfio_ecap_init: 0000:07:00.0 hiding ecap 0x1b at 0x2d0
[  204.749996] vfio-pci 0000:07:00.1: enabling device (0000 -> 0002)

When I quit in the QEMU monitor, the image stays on the screen, and no
further host dmesg output is produced.

When I run the test qemu command again, the following host dmesg output
is produced, and the image disappears:

[  284.792957] vfio_ecap_init: 0000:07:00.0 hiding ecap 0x19 at 0x270
[  284.798886] vfio_ecap_init: 0000:07:00.0 hiding ecap 0x1b at 0x2d0

The card doesn't display anything again until host reboot.
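
Since the discussion centres on whether the card can be reset at all, a
rough way to see what the host kernel believes about the device is sketched
below. The helper name has_pci_reset is hypothetical, and the addresses are
the ones from the test case above. Note that the presence of the sysfs
reset file only means the kernel found some reset method (possibly just a
bus reset); it does not show that the reset actually recovers the GPU:

```shell
# Hypothetical helper: report whether the kernel exposes a reset hook
# for a PCI device. The second argument overrides the sysfs base directory.
has_pci_reset() {
    dev_dir="${2:-/sys/bus/pci/devices}/$1"
    if [ -e "$dev_dir/reset" ]; then
        echo "$1: reset method exposed"
    else
        echo "$1: no reset method exposed"
    fi
}

has_pci_reset 0000:07:00.0
has_pci_reset 0000:07:00.1

# Function Level Reset support and the current power state can be read
# from the capability listing (as root):
#   lspci -vv -s 07:00.0 | grep -E 'FLReset|Status: D'
```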
