[vfio-users] VM doesn't boot if I use GPU passthrough

Ryan Flagler ryan.flagler at gmail.com
Thu Jan 28 19:24:02 UTC 2016


I did some more testing, and removing my static vcpu placement and all
vcpupin entries has greatly increased my stability. So far (knock on wood)
the server hasn't crashed or hung due to GPU driver crashes.
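
For anyone who wants to check a guest for the same settings, here is a minimal sketch. It assumes the domain XML arrives on standard input (for example from `virsh dumpxml`); the function name `list_pinning` and the domain name `win10` are hypothetical.

```shell
#!/bin/sh
# list_pinning: read a libvirt domain XML on stdin and print any lines that
# request static vCPU placement or pin a vCPU to specific host CPUs.
# The function name is illustrative and not part of libvirt.
list_pinning() {
    grep -E "<vcpupin |placement=.static." || true
}

# Example (assumes a domain called "win10"; adjust to your own):
#   virsh dumpxml win10 | list_pinning
# The entries themselves can then be removed with: virsh edit win10
```

No output means the domain has neither static placement nor vcpupin entries.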

On Thu, Jan 28, 2016 at 1:18 PM Nicolas Roy-Renaud <
nicolas.roy-renaud.1 at ens.etsmtl.ca> wrote:

> Have you checked what "dmesg -w" shows while you start up a VM with your
> GPU passed to it? I was having similar issues before, and it turned out
> that I had to get libvirt to append the GPU to the VM after it was done
> booting because of some invalid rom issues. Could be worth a shot.
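
A rough sketch of the approach described above. The PCI address 01:00.0, the domain name `win10`, the file path, and the function name `gpu_hostdev_xml` are assumptions for illustration; `dmesg -w` and `virsh attach-device ... --live` are the standard commands for following kernel messages and hot-adding a device described in an XML file.

```shell
#!/bin/sh
# gpu_hostdev_xml BUS SLOT FUNCTION
# Print a libvirt <hostdev> element for a host PCI device in PCI domain 0000.
gpu_hostdev_xml() {
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x$1' slot='0x$2' function='0x$3'/>
  </source>
</hostdev>
EOF
}

# Example, run on the host (names and paths are placeholders):
#   dmesg -w                                  # in one terminal: follow kernel messages
#   virsh start win10                         # boot the guest without the GPU
#   gpu_hostdev_xml 01 00 0 > /tmp/gpu.xml
#   virsh attach-device win10 /tmp/gpu.xml --live
```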
>
>
> On 2016-01-27 11:13, Ryan Flagler wrote:
>
> I pulled some different hardware from an unused machine and did some
> testing. So far, on my initial test, I am no longer seeing the driver
> crashes. It must have something to do with my motherboard. That's
> frustrating, because it was a Xeon E5 platform which I think Alex
> recommended in general.
>
> Thanks for all the advice everyone.
>
> On Tue, Jan 26, 2016 at 11:56 AM Ryan Flagler <ryan.flagler at gmail.com>
> wrote:
>
>> Thanks for the encouragement guys. I think I'm going to try to scrounge
>> up some other hardware just to make sure my GPU isn't the problem. The
>> only other cards I have are AMD, which, apart from rebooting, actually
>> work solidly.
>>
>> On Tue, Jan 26, 2016 at 11:36 AM Ruben Felgenhauer <
>> 4felgenh at informatik.uni-hamburg.de> wrote:
>>
>>> Hi, Ryan!
>>>
>>> Installing an older Kernel is probably easier than you might think.
>>> On Ubuntu you should be able to find out which kernels are in the repos
>>> with apt-cache,
>>> but I sadly don't know the params, so maybe take a look at the manpage.
>>> And afterwards you should be able to install a specific version with
>>> 'apt-get install packagename=version'
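
For the apt-cache step, a sketch under the assumption that Ubuntu ships each kernel as its own package named `linux-image-<version>-generic`, so the version is part of the package name rather than a `=version` suffix. The helper name `kernel_packages` is hypothetical.

```shell
#!/bin/sh
# kernel_packages: read `apt-cache search` output ("name - description")
# on stdin and print only the versioned generic kernel image package names.
kernel_packages() {
    awk '$1 ~ /^linux-image-[0-9].*-generic$/ { print $1 }' | sort -u
}

# Example on the host (<version> is a placeholder, pick one from the list):
#   apt-cache search --names-only '^linux-image-' | kernel_packages
#   apt-cache policy linux-image-generic      # installed and candidate versions
#   sudo apt-get install linux-image-<version>-generic
```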
>>>
>>> On Debian there is simply http://snapshot.debian.org/package/linux/
>>> which is how I downgraded from 4.3 to 4.1 on my Debian testing.
>>> You can just download the deb files there and install them with dpkg.
>>> Maybe if you search for a testing system that is similar to Ubuntu, you
>>> could give that a try.
>>>
>>> But keep in mind that this doesn't uninstall the old kernel, so you will
>>> have a fallback.
>>> You might need to select the right kernel at GRUB though.
>>>
>>> Best regards,
>>> Ruben
>>>
>>>
>>> On 26.01.2016 at 18:11, Will Marler wrote:
>>>
>>> Well, you run Linux and you're experimenting with VGA passthrough ...
>>> you're resourceful! What about picking up a 16GB SSD for $15
>>> <http://www.amazon.com/Samsung-16GB-Solid-State-Drive/dp/B003YMJPE8/ref=sr_1_3?ie=UTF8&qid=1453827934&sr=8-3&keywords=16GB+SSD> and
>>> installing Arch (or Fedora, or Gentoo... whatever suits) side by side with
>>> Ubuntu? Presumably your VM can be launched either way without any
>>> configuration changes ... when you get tired/frustrated of the
>>> Arch/Fedora/Gentoo way you reboot back. If it works, you've found the
>>> answer, if it doesn't, you've improved your Linux-fu for not much
>>> (monetary) cost.
>>>
>>>
>>> On Tue, Jan 26, 2016 at 10:03 AM, Ryan Flagler <ryan.flagler at gmail.com>
>>> wrote:
>>>
>>>> Yea, that's just a major jump. Wish I had a dedicated test system to
>>>> try more things. ;)
>>>>
>>>> On Tue, Jan 26, 2016 at 10:34 AM Will Marler <will at wmarler.com> wrote:
>>>>
>>>>> Next up would be Kernel, it sounds like...
>>>>>
>>>>> On Tue, Jan 26, 2016 at 8:27 AM, Ryan Flagler <ryan.flagler at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks for this info Will. Tried matching your qemu/libvirt versions
>>>>>> and I still get the driver crashes. I'm not sure what else to try.
>>>>>>
>>>>>> On Mon, Jan 25, 2016 at 9:20 PM Will Marler <will at wmarler.com> wrote:
>>>>>>
>>>>>>> Hey Ryan,
>>>>>>>
>>>>>>> Here are the answers to your questions:
>>>>>>>
>>>>>>> 20:06:27 will ~% uname -a
>>>>>>> Linux haze 4.3.3-2-ARCH #1 SMP PREEMPT Wed Dec 23 20:09:18 CET 2015
>>>>>>> x86_64 GNU/Linux
>>>>>>> 20:07:01 will ~% pacman -Q | egrep '^linux|^libvirt|^qemu'
>>>>>>> libvirt 1.3.1-1
>>>>>>> libvirt-glib 0.2.2-1
>>>>>>> libvirt-python 1.3.1-1
>>>>>>> linux 4.3.3-2
>>>>>>> linux-api-headers 4.1.4-1
>>>>>>> linux-firmware 20151207.bbe4917-1
>>>>>>> qemu 2.4.1-2
>>>>>>>
>>>>>>> And here is the pastebin to my XML file:
>>>>>>> http://pastebin.com/nB3DPkEr
>>>>>>>
>>>>>>> As far as the guest drivers are concerned, they're the "GeForce Game
>>>>>>> Ready Driver" version 361.43.
>>>>>>>
>>>>>>> HTH!
>>>>>>>
>>>>>>> On Mon, Jan 25, 2016 at 10:12 AM, Ryan Flagler <
>>>>>>> ryan.flagler at gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks Will. Here is my info with the guest that crashes.
>>>>>>>>
>>>>>>>> Host OS Info
>>>>>>>>  ubuntu - 14.04.03
>>>>>>>>  kernel - 3.19.0-47
>>>>>>>>
>>>>>>>> virsh version
>>>>>>>>  Compiled against library: libvirt 1.2.18
>>>>>>>>  Using library: libvirt 1.2.18
>>>>>>>>  Using API: QEMU 1.2.18
>>>>>>>>  Running hypervisor: QEMU 2.5.0
>>>>>>>>
>>>>>>>> patches
>>>>>>>>  I did not manually apply any patches to Qemu. Built directly from
>>>>>>>> source.
>>>>>>>>
>>>>>>>> Guest Info
>>>>>>>>  Windows 10
>>>>>>>>  nVidia Graphics Driver 361.43
>>>>>>>>
>>>>>>>> Guest Event Viewer Entry On Driver Crash
>>>>>>>>  Source - nvlddmkm
>>>>>>>>  Event ID - 14
>>>>>>>>  Info - \Device\Video3  CMDre 00000004 0000011c bad0011f 00000000
>>>>>>>> 00d0011f
>>>>>>>>
>>>>>>>> Guest XML - Attached
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 25, 2016 at 10:18 AM Will Marler <will at wmarler.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Mon, Jan 25, 2016 at 9:07 AM, Ryan Flagler <
>>>>>>>>> ryan.flagler at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Will, could you tell us the following?
>>>>>>>>>>
>>>>>>>>>> What Linux distribution on host?
>>>>>>>>>>
>>>>>>>>> Arch
>>>>>>>>>
>>>>>>>>>> What kernel are you using on host?
>>>>>>>>>> What libvirt version on host?
>>>>>>>>>> What qemu version on host?
>>>>>>>>>>
>>>>>>>>> Will have to check when I'm home from work & the kids are asnooze,
>>>>>>>>> but it's whatever's latest (and I'm not using the linux-vfio-lts kernel)
>>>>>>>>>
>>>>>>>>>> What OS on guest?
>>>>>>>>>>
>>>>>>>>> Windows 10.
>>>>>>>>>
>>>>>>>>>> What nvidia graphics driver version on guest?
>>>>>>>>>>
>>>>>>>>> Again, I'll have to check. But the latest or nearly latest.
>>>>>>>>>
>>>>>>>>>> My machine's GPU driver crashes constantly and I'm trying to
>>>>>>>>>> narrow down why. Thanks!
>>>>>>>>>>
>>>>>>>>> How frustrating : (. I'll also get a pastebin of my XML for you,
>>>>>>>>> in case that will help. I've been running "stable" since mid 2015. I use
>>>>>>>>> the quotes because some things tripped me up (guest machine can't "sleep,"
>>>>>>>>> can only power on & power off; when host machine goes to sleep with guest
>>>>>>>>> running, on host wake-up the guest is non-responsive and 100% CPU).
>>>>>>>>>
>>>>>>>>> Will
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jan 25, 2016, 10:02 AM Will Marler <will at wmarler.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> This is discussed in
>>>>>>>>>>> http://vfio.blogspot.com/2015/05/vfio-gpu-how-to-series-part-4-our-first.html.
>>>>>>>>>>> You have to do more than <kvm><hidden state='on'/></kvm>:
>>>>>>>>>>>
>>>>>>>>>>> "The GeForce card is nearly as easy, but we first need to work
>>>>>>>>>>> around some of the roadblocks Nvidia has put in place to prevent you from
>>>>>>>>>>> using the hardware you've purchased in the way that you desire (and by my
>>>>>>>>>>> reading conforms to the EULA for their software, but IANAL).  For this step
>>>>>>>>>>> we again need to run virsh edit on the VM.  Within the <features> section,
>>>>>>>>>>> remove everything between the <hyperv> tags, including the tags
>>>>>>>>>>> themselves.  In their place add the following tags:
>>>>>>>>>>>
>>>>>>>>>>>     <kvm>
>>>>>>>>>>>       <hidden state='on'/>
>>>>>>>>>>>     </kvm>
>>>>>>>>>>>
>>>>>>>>>>> Additionally, within the <clock> tag, find the timer named
>>>>>>>>>>> hypervclock, remove the line containing this tag completely.  Save and exit
>>>>>>>>>>> the edit session."
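
One way to sanity-check an edited domain against those three changes, as a sketch: the function `check_nvidia_workaround` is hypothetical, and it uses plain text matching on the XML rather than real XML parsing, which assumes libvirt's usual one-element-per-line formatting.

```shell
#!/bin/sh
# check_nvidia_workaround: read a libvirt domain XML on stdin and report
# anything that differs from the three changes in the quoted blog post.
# Returns 0 if all three checks pass, 1 otherwise.
check_nvidia_workaround() {
    xml=$(cat)
    status=0
    # 1. The <kvm><hidden state='on'/></kvm> block must be present.
    printf '%s\n' "$xml" | grep -q "<hidden state=.on./>" \
        || { echo "missing: <kvm><hidden state='on'/></kvm>"; status=1; }
    # 2. The <hyperv> section must be gone.
    printf '%s\n' "$xml" | grep -q "<hyperv>" \
        && { echo "still present: <hyperv> section"; status=1; }
    # 3. The hypervclock timer must be gone.
    printf '%s\n' "$xml" | grep -q "hypervclock" \
        && { echo "still present: hypervclock timer"; status=1; }
    return "$status"
}

# Example (assumes a domain called "win10"):
#   virsh dumpxml win10 | check_nvidia_workaround && echo "looks fine"
```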
>>>>>>>>>>>
>>>>>>>>>>> I can confirm it works, I've been getting a lot of mileage from
>>>>>>>>>>> my passed-through 750Ti lately since getting a Steam Link :-D.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Jan 24, 2016 at 7:32 AM, Ruben Felgenhauer <
>>>>>>>>>>> 4felgenh at informatik.uni-hamburg.de> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I finally had time to look at this again. I tried out virt-manager and
>>>>>>>>>>>> after a bit of playing around with it, it /somewhat/ worked:
>>>>>>>>>>>>
>>>>>>>>>>>> The machine is at least booting. I still have a standard vga
>>>>>>>>>>>> card enabled in the virt-manager config window.
>>>>>>>>>>>> After the machine has booted, I can see that the device gets
>>>>>>>>>>>> recognized as 750ti.
>>>>>>>>>>>> However, the gpu doesn't get used, because of 'Code 43'.
>>>>>>>>>>>> Code 43 is a generic error, so any idea what it could mean in
>>>>>>>>>>>> this case?
>>>>>>>>>>>>
>>>>>>>>>>>> Of course I added the <kvm><hidden state='on'/></kvm> lines at
>>>>>>>>>>>> the associated position.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Ruben
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 18.01.2016 at 22:27, Will Marler wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure what correct command-line syntax is. Have you
>>>>>>>>>>>> tried using libvirt and VirtManager to handle your VM rather than command
>>>>>>>>>>>> line, and modifying the XML rather than the command line? I think that's
>>>>>>>>>>>> generally the preferred method these days (it's certainly easier from my
>>>>>>>>>>>> point of view, and the way I got my 750 Ti to pass through).
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jan 18, 2016 at 11:04 AM, Ruben Felgenhauer <
>>>>>>>>>>>> 4felgenh at informatik.uni-hamburg.de> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, Alex!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your reply!
>>>>>>>>>>>>> My GPU indeed has a separate audio device located at 01:00.1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, just adding -device vfio-pci,host=01:00.1 doesn't
>>>>>>>>>>>>> seem to do the trick.
>>>>>>>>>>>>> Of course the corresponding device is already blacklisted and
>>>>>>>>>>>>> bound to vfio.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The Debian Wiki entry about VGA passthrough (
>>>>>>>>>>>>> https://wiki.debian.org/VGAPassthrough) mentions QEMU
>>>>>>>>>>>>> arguments like "-device
>>>>>>>>>>>>> vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,romfile=...
>>>>>>>>>>>>> -device vfio-pci,host=01:00.1,bus=pcie.0" which seems to address GPUs with
>>>>>>>>>>>>> audio devices, but if I try to do something similar, the buses 'root' and
>>>>>>>>>>>>> 'pcie' couldn't be found. Maybe I missed something very important?
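
One possible explanation, offered as an assumption rather than a diagnosis: `pcie.0` is the root bus of QEMU's q35 machine type, and `root.1` is only the `id` given to a PCIe root port created earlier on the same command line, so neither exists on the default machine type. The sketch below prints (it does not run) a command line in that style. The `ioh3420` root port arguments follow commonly posted q35 examples, the disk and CPU options are carried over from the original command, and the function name `q35_passthrough_cmd` is hypothetical.

```shell
#!/bin/sh
# q35_passthrough_cmd GPU_ADDR AUDIO_ADDR
# Print a QEMU command line that creates the buses before referring to them:
# -M q35 provides pcie.0, and the ioh3420 root port provides root.1.
q35_passthrough_cmd() {
    cat <<EOF
qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host,kvm=off \\
  -hda vm.ovl -boot c -vga none \\
  -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \\
  -device vfio-pci,host=$1,bus=root.1,addr=00.0,multifunction=on,x-vga=on \\
  -device vfio-pci,host=$2,bus=pcie.0
EOF
}

# Print the command for the GPU and its HDMI audio function.
q35_passthrough_cmd 01:00.0 01:00.1
```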
>>>>>>>>>>>>>
>>>>>>>>>>>>> On the same article, it says that the "HDMI soundcard [...]
>>>>>>>>>>>>> needs to be unbound from its driver":
>>>>>>>>>>>>> # echo '0000:01:00.1' | sudo tee
>>>>>>>>>>>>> /sys/bus/pci/devices/0000:01:00.1/driver/unbind
>>>>>>>>>>>>> I figured the vfio-bind script from the Arch Linux Forum
>>>>>>>>>>>>> thread (https://bbs.archlinux.org/viewtopic.php?id=162768)
>>>>>>>>>>>>> would do exactly this thing, so I didn't explicitly do so for the audio
>>>>>>>>>>>>> device. Is that okay?
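
A quick way to settle that kind of question is to look at which driver sysfs reports for the device. The sketch below assumes the standard sysfs layout, where `/sys/bus/pci/devices/<address>/driver` is a symlink to the bound driver. The function name `bound_driver` and its optional second argument (an alternative sysfs root, there only so the function can be tried against a scratch directory) are hypothetical.

```shell
#!/bin/sh
# bound_driver ADDRESS [SYSFS_ROOT]
# Print the name of the kernel driver bound to a PCI device, or "none"
# if no driver is currently bound.
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example on the host; for passthrough, both functions of the card
# would be expected to report vfio-pci:
#   bound_driver 0000:01:00.0
#   bound_driver 0000:01:00.1
```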
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Ruben
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 18.01.2016 at 08:31, Alexander Petrenz wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ruben,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess your 750ti also has some audio device. You should pass
>>>>>>>>>>>>> through this too. It should be something like 01:00.1. There are many
>>>>>>>>>>>>> command line examples you can find about that.
>>>>>>>>>>>>> Also, I'm not quite sure whether you should remove the x-vga=on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Jan 17, 2016 at 11:12 PM, Ruben Felgenhauer <
>>>>>>>>>>>>> 4felgenh at informatik.uni-hamburg.de> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am trying to pass my nVidia GTX 750ti to my QEMU guest.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Problem is: After the QEMU monitor pops up, nothing happens.
>>>>>>>>>>>>>> The GPU's output is dead, and the vm won't be accessible via SSH anymore,
>>>>>>>>>>>>>> so it's very likely that the VM isn't booting up at all. Also, there are no
>>>>>>>>>>>>>> error messages from QEMU on the console whatsoever which makes debugging it
>>>>>>>>>>>>>> especially hard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is how I start the vm with normal vga emulation:
>>>>>>>>>>>>>> qemu-system-x86_64 -hda vm.ovl -boot c -enable-kvm -m 1024
>>>>>>>>>>>>>> -cpu host,kvm=off -smp cores=4,threads=2 -redir tcp:5022::22
>>>>>>>>>>>>>> Everything runs fine in this case. To do the passthrough, I
>>>>>>>>>>>>>> add this:
>>>>>>>>>>>>>> -device vfio-pci,host=01:00.0,multifunction=on,x-vga=on -vga
>>>>>>>>>>>>>> none
>>>>>>>>>>>>>> This brings said problems with it. I also tried out multiple
>>>>>>>>>>>>>> different combinations of -device's arguments or even adding a romfile for
>>>>>>>>>>>>>> the GPU, but none of these steps changed anything at all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Obviously, I am using a BIOS installation and I'm well aware
>>>>>>>>>>>>>> of this bug:
>>>>>>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=107561, but
>>>>>>>>>>>>>> neither using less RAM (as you can see I am using 1GB now) nor switching to
>>>>>>>>>>>>>> an older Kernel changed anything about the problem. I have tried Kernel
>>>>>>>>>>>>>> 4.1.0 and 4.3.0.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Host is Debian testing with QEMU 2.5.0.
>>>>>>>>>>>>>> I tried both Debian and Windows 7 as a guest, but both are
>>>>>>>>>>>>>> showing exactly the same behaviour.
>>>>>>>>>>>>>> Mainboard is an ASUS Z87-PLUS. The 750ti is produced by ASUS
>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any idea how I could get passthrough running?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> vfio-users mailing list
>>>>>>>>>>>>>> vfio-users at redhat.com
>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/vfio-users
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>
>
>
>