[vfio-users] Need help with GPU Passthrough on Ryzen C6H + GTX 980 Ti + GTX 1060 6G

Thiago Ramon thiagoramon at gmail.com
Thu Jul 6 04:23:02 UTC 2017


On Thu, Jul 6, 2017 at 12:54 AM, Alex Williamson <
alex.l.williamson at gmail.com> wrote:

> On Wed, Jul 5, 2017 at 9:10 PM, Thiago Ramon <thiagoramon at gmail.com>
> wrote:
>
>> I'm having a quite unique problem, and have exhausted all possibilities I
>> have found so far, after a couple weeks of attempts, so I've decided to
>> bother you guys with this.
>>
>> My setup: Ryzen 7 1800X, Asus Crosshair VI Hero, NVidia GTX 980 Ti and
>> NVidia GTX 1060 6G
>> OS: Arch Linux, latest updates, mainline kernel (4.11.7-1-ARCH),
>> QEMU 2.9.0, libvirt 3.4.0
>>
>> The problem, as far as I can tell, is that the GPU is getting corrupted
>> somehow at or before reaching the BIOS/UEFI, then gets reset and stuck in
>> the D3 power state, no matter which GPU I pass through, or which boot
>> options, firmware (SeaBIOS/OVMF), chipset or PCI/PCIe bus connection I use.
>>
>> Both GPUs are healthy and working perfectly under Linux, using the
>> proprietary NVidia drivers.
>>
>> Things tried: Disabled D3 mode in vfio_pci, used pci_stub instead,
>> disabled NVidia driver and ran the VM from the console, multiple boot
>> options involving the IOMMU and KVM (but hey, any new ideas help)
>>
>> I know the motherboard (in general, not this one specifically) can work
>> with GPU passthrough, as I already had contact with someone passing a GTX
>> 1070 with it (though his other GPU is AMD).
>>
>> Unless there's something I've overlooked, I probably need to gather more
>> in-depth information on what's going on with the GPU in the first moments
>> of the boot process, so if anyone knows of a good set of debug options for
>> QEMU, or if kernel tracing is better, please let me know.
>>
>> Thanks for any help, and let me know if there's any extra info that could
>> help solve this puzzle.
>>
>>
>> Relevant logs and more details:
>> https://www.reddit.com/r/VFIO/comments/6khu5i/need_help_with_gpu_passthrough_on_ryzen_c6h_gtx/
>>
>
> Wow, formatting in reddit is nearly impossible to decipher... pastebin?  I
> can spot one issue:
>
>  pci 0000:29:00.0: vgaarb: setting as boot VGA device
>
> Generally you want to assign the non-boot device.  And probably related:
>
> Failed to mmap 0000:29:00.0 BAR 3. Performance may be slow
>
> This is really suggesting something much more wrong than performance may
> be slow.  Check /proc/iomem, find what driver is claiming resources on the
> device, disable it.  This probably means that some other driver besides
> vesafb or efifb is blocking the device.  The kernel will try pretty hard to
> attach a driver to the primary graphics, which is one of the complications
> of trying to assign primary graphics.  Thanks,
>
> Alex
>
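For reference, the /proc/iomem check suggested above can be sketched in Python. This is only a rough illustration: it assumes the usual /proc/iomem layout, where each line reads "start-end : name" and nested claims are indented under the resource they sit in. The function name claimants is made up for this example, and the device address is the one from the quoted log.

```python
import re

# One /proc/iomem line: optional indentation, "start-end : name".
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def claimants(iomem_text, pci_addr):
    """Return the names of entries nested under pci_addr's ranges.

    iomem_text is the content of /proc/iomem; pci_addr is a full PCI
    address such as "0000:29:00.0". Anything indented deeper than a
    line named after the device is treated as a claim on its resources.
    """
    found = []
    device_indent = None  # indentation of the device line we are under, if any
    for line in iomem_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, name = len(m.group(1)), m.group(4).strip()
        if device_indent is not None and indent > device_indent:
            # Still inside the device's range: record who claimed it.
            found.append(name)
            continue
        # Left the previous device block; check whether a new one starts here.
        device_indent = indent if name == pci_addr else None
    return found
```

On a live host this could be called as claimants(open("/proc/iomem").read(), "0000:29:00.0"). Anything in the result other than vfio-pci would be a candidate for the driver that is blocking the device. The addresses read as zeros without root, but the names are still shown.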

Here, dropped the raw message in pastebin: https://pastebin.com/hfJ6ryJg

That particular run was trying to pass the 980 Ti, which is the boot
device, and which probably had something else prodding at it (I'll try it
again and check what else was attaching to it). I've mostly focused on
passing the 1060, though, which doesn't get touched by anything but
vfio-pci and doesn't show any mmap issues. Here's the last QEMU run with
SeaBIOS:

https://pastebin.com/DEPpewCH

And the last one from OVMF:

https://pastebin.com/L7gkrm36

In the kernel log, I only get the vfio_bar_restore messages. One
interesting and consistent pattern is that SeaBIOS always generates 2 pairs
of warnings (one for the GPU, one for audio), while OVMF generates quite a
few (a dozen or more; I don't have a log handy). Probably not relevant, as
the failure apparently happens before the first message anyway.
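
To compare the warning counts between firmware runs more precisely, a small Python sketch along these lines could tally the vfio_bar_restore lines per device from saved dmesg output. It assumes only that each such line contains the string vfio_bar_restore and a full PCI address somewhere in it; the exact message wording varies between kernel versions, and the function name count_bar_restores is invented for this example.

```python
import re
from collections import Counter

# Full PCI address: domain:bus:device.function, e.g. 0000:29:00.0
PCI_ADDR_RE = re.compile(
    r"\b[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]\b"
)


def count_bar_restores(log_text):
    """Count vfio_bar_restore messages per PCI device in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        if "vfio_bar_restore" not in line:
            continue
        m = PCI_ADDR_RE.search(line)
        # Lines without a recognizable address are grouped under "unknown".
        counts[m.group(0) if m else "unknown"] += 1
    return counts
```

Feeding it the output of dmesg saved after a SeaBIOS run and again after an OVMF run would give the per-function counts for each case side by side.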

Another detail that may be relevant: whenever I try a passthrough (and
fail), the kernel then fails to soft restart. It reaches the last stage,
where it would do the soft reset, but the console just sits there. Could
this just be vfio_pci trying to do something with the unresponsive card, or
is it something else that may be a clue to what's going on?

Thanks for the help