<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 6, 2017 at 1:46 PM, Thiago Ramon <span dir="ltr"><<a href="mailto:thiagoramon@gmail.com" target="_blank">thiagoramon@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div class="gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 6, 2017 at 2:20 AM, Alex Williamson <span dir="ltr"><<a href="mailto:alex.l.williamson@gmail.com" target="_blank">alex.l.williamson@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="gmail-m_3851253673060972461gmail-">On Wed, Jul 5, 2017 at 10:23 PM, Thiago Ramon <span dir="ltr"><<a href="mailto:thiagoramon@gmail.com" target="_blank">thiagoramon@gmail.com</a>></span> wrote:<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div class="gmail-m_3851253673060972461gmail-m_8458038860250725867gmail-h5"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> </blockquote></div><br></div></div></div><div class="gmail_extra">Here, dropped the raw message in pastebin: <a href="https://pastebin.com/hfJ6ryJg" target="_blank">https://pastebin.com<wbr>/hfJ6ryJg</a></div><div class="gmail_extra"><br></div><div class="gmail_extra">That particular run was trying to pass the 980 Ti, which is the boot device, and which probably had something else prodding at it (I'll give it a try again and check what else was attaching to it). 
I've mostly focused on passing the 1060 though, which doesn't get touched by anything but vfio-pci, and also doesn't show any mmap issues. Here's the last QEMU run with SeaBIOS:</div><div class="gmail_extra"><br></div><div class="gmail_extra"><a href="https://pastebin.com/DEPpewCH" target="_blank">https://pastebin.com/DEPpewCH</a><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">And the last one from OVMF:</div><div class="gmail_extra"><br></div><div class="gmail_extra"><a href="https://pastebin.com/L7gkrm36" target="_blank">https://pastebin.com/L7gkrm36</a><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">In the kernel log, I only get the vfio_bar_restore messages. One interesting and consistent pattern is that SeaBIOS always generates 2 pairs of warnings (one for GPU, one audio), while OVMF generates quite a bit (dozen+, don't have a log handy). Probably not relevant, as apparently the failure happens before the first message anyway.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Another detail that may be relevant: Whenever I try a passthrough (and fail), the kernel fails to soft restart. It gets to the last stage where it would do a soft reset but the console just sits there. Could this just be vfio_pci trying to do something with the unresponsive card, or something else that may be a clue to what's going on?</div></div></blockquote><div><br></div></span><div>Yep, here's what I suspected about the D3 warning:</div><div><br></div><div><div>>PCI state after passthrough attempt:</div><div>> 29:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] [10de:17c8] (rev ff) (prog-if ff)</div><div>> !!! Unknown header type 7f</div><div>> Kernel driver in use: vfio-pci</div><div>> Kernel modules: nouveau, nvidia_drm, nvidia</div><div>></div><div>> 29:00.1 Audio device [0403]: NVIDIA Corporation GM200 High Definition Audio [10de:0fb0] (rev ff) (prog-if ff)</div><div>> !!! 
Unknown header type 7f</div><div>> Kernel driver in use: vfio-pci</div><div>> Kernel modules: snd_hda_intel</div></div><div><br></div><div>The card isn't actually stuck in D3; it has basically disappeared from the bus and all reads from config space are returning -1, which is indistinguishable from the D3 power state for the bits that tell us the power state. This is probably the result of doing a bus reset, but that's also our only way of putting the device back to a known state before starting it in the VM. You might try to see if you can reproduce this result manually with setpci. We do a bus reset by finding the bridge upstream of the device; lspci -t is handy for this, with a tree view of the PCI topology. As an example:</div><div><br></div><div><a href="https://pastebin.com/c3URT6vx" target="_blank">https://pastebin.com/c3URT6vx</a><br></div><div><br></div><div>Bus numbers are shown in brackets, so if I want the parent bridge of device 01:00.0, look to the left of [01]--00.0 to find 01.0. This is attached to the root bus at [0000:00], so the full address of the parent bridge is 0000:00:01.0.</div><div><br></div><div>We can access the bridge control register using</div><div><br></div><div># setpci -s 0000:00:01.0 BRIDGE_CONTROL</div><div><br></div><div>The secondary bus reset bit is 0x40. We want to set this bit:</div><div><br></div><div># setpci -s 0000:00:01.0 BRIDGE_CONTROL=40:40<br></div><div><br></div><div>Then clear it:</div><div><br></div><div><div># setpci -s 0000:00:01.0 BRIDGE_CONTROL=00:40</div></div><div><br></div><div>Then run lspci on the bus to see if the device is still present. In your case it would be bus 29, so you'd run</div><div><br></div><div># lspci -vvv -s 0000:29:</div><div><br></div><div>Do you get output like above with the 'Unknown header type 7f' or a complete listing of the device? 
Be sure to reboot the system after running this test, regardless of the result the device will be re-initialized, and clearly nothing should be using the device while doing this. If the graphics card doesn't recover from a bus reset, then something about this system setup is not compatible with this use case. Thanks,</div><div><br></div><div>Alex</div></div></div></div> </blockquote></div><br></div></div></div><div class="gmail_extra">Ok, did some more testing. First, starting with both of my cards bound to the NVidia driver, I shut down X, ran rmmod nvidia, bound my secondary card to vfio-pci and tried to reset the bus. It indeed failed to reset properly and got stuck.</div><div class="gmail_extra">Then I tried switching over to my primary passthrough setup, to see what was grabbing the card memory, which turned out to be vesafb, even though I'd disabled it.</div><div class="gmail_extra">After adding a bunch more options to the boot command line, I managed to properly block the card from anything else, and proceeded to test the bus reset, which worked this time.</div><div class="gmail_extra">Then I tried running the VM (without external BIOS), which failed, but complained about not being able to access the BIOS.</div><div class="gmail_extra">Rebooted again and tried with a pre-dumped BIOS, and it still failed in the same way as before.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Returning to my secondary card, I tried to reset the bus again, this time from a fresh boot, which seems to have worked fine. Here are the logs:</div><div class="gmail_extra"><br></div><div class="gmail_extra"><a href="https://pastebin.com/94F5wURY" target="_blank">https://pastebin.com/94F5wURY</a><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">I proceeded to reset the bus a few more times to see if repeated resets were a problem, but at least half a dozen resets don't seem to have caused any problems. 
Any other ideas?</div></div> </blockquote></div><br></div><div class="gmail_extra">Progress!(?)</div><div class="gmail_extra"><br></div><div class="gmail_extra">Decided to try PCI passthrough with VirtualBox to see if I could get anything out of it, as it does it quite differently from QEMU. And to my surprise, it seems it actually managed to pass through my GTX 1060, though due to the nature of NVidia's drivers I got stuck with a code 43. I'm not sure if the virtualization showing through is the only issue though, and I couldn't get the card to actually start on a Linux guest either.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Anyway, the interesting news is that, at least to a cursory lspci in the guest, the card seems good, and the card doesn't get corrupted (at least until I try to reset it manually afterwards, but it does that if I used it on the host before).</div><div class="gmail_extra"><br></div><div class="gmail_extra">Here's a copy of the dmesg output during a whole run, which might help clear up what's going on here: <a href="https://pastebin.com/U6Qvu0Wh">https://pastebin.com/U6Qvu0Wh</a></div><div class="gmail_extra"><br></div><div class="gmail_extra">I've failed so far to locate anyone running PCI passthrough on a modern NVidia GPU with VirtualBox, so I don't think this is going to be viable (but I'll try a bit more), but maybe by comparing the different approaches we can figure out which part of the process is going bad with the vfio/QEMU option.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Thanks for the help so far.</div></div>