[vfio-users] VM doesn't boot if I use GPU passthrough

Nicolas Roy-Renaud nicolas.roy-renaud.1 at ens.etsmtl.ca
Thu Jan 28 19:18:43 UTC 2016


Have you checked what "dmesg -w" shows while you start up a VM with your 
GPU passed to it? I was having similar issues before, and it turned out 
that I had to get libvirt to attach the GPU to the VM after it was done 
booting because of some invalid ROM issues. Could be worth a shot.
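
For reference, a minimal sketch of one way to do that late attach, assuming 
libvirt's "virsh attach-device" with a <hostdev> snippet. The domain name 
"win10" is a placeholder, the address 01:00.0 is just the one mentioned 
further down the thread, and the helper names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: hot-add a host PCI device to a libvirt guest that is
# already running. Helper names and the domain name are hypothetical.

# Print a libvirt <hostdev> element for a host PCI address "BB:SS.F".
hostdev_xml() {
    addr=$1
    bus=${addr%%:*}     # part before the colon, e.g. "01"
    rest=${addr#*:}     # part after the colon, e.g. "00.0"
    slot=${rest%%.*}    # part before the dot, e.g. "00"
    func=${rest#*.}     # part after the dot, e.g. "0"
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}

# Attach the device to the running domain only (saved config untouched).
attach_gpu() {
    domain=$1
    addr=$2
    tmp=$(mktemp) || return 1
    hostdev_xml "$addr" > "$tmp"
    virsh attach-device "$domain" "$tmp" --live
    status=$?
    rm -f "$tmp"
    return $status
}
```

Once the guest has finished booting, something like "attach_gpu win10 01:00.0" 
would do the attach; keeping "dmesg -w" open on the host at the same time 
shows whether the ROM complaints still appear.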

On 2016-01-27 11:13, Ryan Flagler wrote:
> I pulled some different hardware from an unused machine and did some 
> testing. So far, on my initial test, I am no longer seeing the driver 
> crashes. It must have something to do with my motherboard. That's 
> frustrating, because it was a Xeon E5 platform which I think Alex 
> recommended in general.
>
> Thanks for all the advice everyone.
>
> On Tue, Jan 26, 2016 at 11:56 AM Ryan Flagler <ryan.flagler at gmail.com> wrote:
>
>     Thanks for the encouragement, guys. I think I'm going to try to
>     scrounge up some other hardware just to make sure my GPU isn't
>     the problem. The only other cards I have are AMD, which, apart
>     from rebooting, actually work solidly.
>
>     On Tue, Jan 26, 2016 at 11:36 AM Ruben Felgenhauer
>     <4felgenh at informatik.uni-hamburg.de> wrote:
>
>         Hi, Ryan!
>
>         Installing an older Kernel is probably easier than you might
>         think.
>         On Ubuntu you should be able to find out which kernels are in
>         the repos with apt-cache,
>         but I sadly don't know the params, so maybe take a look at the
>         manpage.
>         And afterwards you should be able to install a specific
>         version with 'apt-get install packagename=version'
>
>         On Debian there is simply
>         http://snapshot.debian.org/package/linux/ which is how I
>         downgraded from 4.3 to 4.1 on my Debian testing.
>         You can just download the deb files there and install them
>         with dpkg.
>         Maybe if you search for a testing system that is similar to
>         Ubuntu, you could give that a try.
>
>         But keep in mind that this doesn't uninstall the old kernel,
>         so you will have a fallback.
>         You might need to select the right kernel at GRUB though.
>
>         Best regards,
>         Ruben
>
>
>         On 26.01.2016 at 18:11, Will Marler wrote:
>>         Well, you run Linux and you're experimenting with VGA
>>         passthrough ... you're resourceful! What about picking up a
>>         16GB SSD for $15
>>         <http://www.amazon.com/Samsung-16GB-Solid-State-Drive/dp/B003YMJPE8/ref=sr_1_3?ie=UTF8&qid=1453827934&sr=8-3&keywords=16GB+SSD> and
>>         installing Arch (or Fedora, or Gentoo... whatever suits) side
>>         by side with Ubuntu? Presumably your VM can be launched
>>         either way without any configuration changes ... when you get
>>         tired/frustrated of the Arch/Fedora/Gentoo way you reboot
>>         back. If it works, you've found the answer, if it doesn't,
>>         you've improved your Linux-fu for not much (monetary) cost.
>>
>>
>>         On Tue, Jan 26, 2016 at 10:03 AM, Ryan Flagler
>>         <ryan.flagler at gmail.com> wrote:
>>
>>             Yea, that's just a major jump. Wish I had a dedicated
>>             test system to try more things. ;)
>>
>>             On Tue, Jan 26, 2016 at 10:34 AM Will Marler
>>             <will at wmarler.com> wrote:
>>
>>                 Next up would be Kernel, it sounds like...
>>
>>                 On Tue, Jan 26, 2016 at 8:27 AM, Ryan Flagler
>>                 <ryan.flagler at gmail.com> wrote:
>>
>>                     Thanks for this info Will. Tried matching your
>>                     qemu/libvirt versions and I still get the driver
>>                     crashes. I'm not sure what else to try.
>>
>>                     On Mon, Jan 25, 2016 at 9:20 PM Will Marler
>>                     <will at wmarler.com> wrote:
>>
>>                         Hey Ryan,
>>
>>                         Here are the answers to your questions:
>>
>>                         20:06:27 will~% uname -a
>>                         Linux haze 4.3.3-2-ARCH #1 SMP PREEMPT Wed Dec 23 20:09:18 CET 2015 x86_64 GNU/Linux
>>                         20:07:01 will~% pacman -Q | egrep '^linux|^libvirt|^qemu'
>>                         libvirt 1.3.1-1
>>                         libvirt-glib 0.2.2-1
>>                         libvirt-python 1.3.1-1
>>                         linux 4.3.3-2
>>                         linux-api-headers 4.1.4-1
>>                         linux-firmware 20151207.bbe4917-1
>>                         qemu 2.4.1-2
>>
>>                         And here is the pastebin to my XML file:
>>                         http://pastebin.com/nB3DPkEr
>>
>>                         As far as the guest drivers are concerned,
>>                         they're the "GeForce Game Ready Driver"
>>                         version 361.43.
>>
>>                         HTH!
>>
>>                         On Mon, Jan 25, 2016 at 10:12 AM, Ryan
>>                         Flagler <ryan.flagler at gmail.com> wrote:
>>
>>                             Thanks Will. Here is my info with the
>>                             guest that crashes.
>>
>>                             Host OS Info
>>                              ubuntu - 14.04.03
>>                              kernel - 3.19.0-47
>>
>>                             virsh version
>>                              Compiled against library: libvirt 1.2.18
>>                              Using library: libvirt 1.2.18
>>                              Using API: QEMU 1.2.18
>>                              Running hypervisor: QEMU 2.5.0
>>
>>                             patches
>>                              I did not manually apply any patches to
>>                             Qemu. Built directly from source.
>>
>>                             Guest Info
>>                              Windows 10
>>                              nVidia Graphics Driver 361.43
>>
>>                             Guest Event Viewer Entry On Driver Crash
>>                              Source - nvlddmkm
>>                              Event ID - 14
>>                              Info - \Device\Video3  CMDre 00000004
>>                             0000011c bad0011f 00000000 00d0011f
>>
>>                             Guest XML - Attached
>>
>>
>>                             On Mon, Jan 25, 2016 at 10:18 AM Will
>>                             Marler <will at wmarler.com> wrote:
>>
>>                                 On Mon, Jan 25, 2016 at 9:07 AM, Ryan
>>                                 Flagler <ryan.flagler at gmail.com> wrote:
>>
>>                                     Will, could you tell us the
>>                                     following?
>>
>>                                     What Linux distribution on host?
>>
>>                                 Arch
>>
>>                                     What kernel are you using on host?
>>                                     What libvirt version on host?
>>                                     What qemu version on host?
>>
>>                                 Will have to check when I'm home from
>>                                 work & the kids are asnooze, but it's
>>                                 whatever's latest (and I'm not using
>>                                 the linux-vfio-lts kernel)
>>
>>                                     What OS on guest?
>>
>>                                 Windows 10.
>>
>>                                     What nvidia graphics driver
>>                                     version on guest?
>>
>>                                 Again, I'll have to check. But the
>>                                 latest or nearly latest.
>>
>>                                     My machine's GPU driver crashes
>>                                     constantly and I'm trying to
>>                                     narrow down why. Thanks!
>>
>>                                 How frustrating : (. I'll also get a
>>                                 pastebin of my XML for you, in case
>>                                 that will help. I've been running
>>                                 "stable" since mid 2015. I use the
>>                                 quotes because some things tripped me
>>                                 up (guest machine can't "sleep," can
>>                                 only power on & power off; when host
>>                                 machine goes to sleep with guest
>>                                 running, on host wake-up the guest is
>>                                 non-responsive and 100% CPU).
>>
>>                                 Will
>>
>>
>>                                     On Mon, Jan 25, 2016, 10:02
>>                                     AM Will Marler <will at wmarler.com>
>>                                     wrote:
>>
>>                                         This is discussed in
>>                                         http://vfio.blogspot.com/2015/05/vfio-gpu-how-to-series-part-4-our-first.html.
>>                                         You have to do more than
>>                                         <kvm><hidden state='on'/></kvm>:
>>
>>                                         "The GeForce card is nearly
>>                                         as easy, but we first need to
>>                                         work around some of the
>>                                         roadblocks Nvidia has put in
>>                                         place to prevent you from
>>                                         using the hardware you've
>>                                         purchased in the way that you
>>                                         desire (and by my reading
>>                                         conforms to the EULA for
>>                                         their software, but IANAL). 
>>                                         For this step we again need
>>                                         to run virsh edit on the VM.
>>                                         Within the <features>
>>                                         section, remove everything
>>                                         between the <hyperv> tags,
>>                                         including the tags
>>                                         themselves. In their place
>>                                         add the following tags:
>>
>>                                         <kvm>
>>                                         <hidden state='on'/>
>>                                         </kvm>
>>
>>                                         Additionally, within the
>>                                         <clock> tag, find the timer
>>                                         named hypervclock, remove the
>>                                         line containing this tag
>>                                         completely. Save and exit the
>>                                         edit session."
>>
>>                                         I can confirm it works, I've
>>                                         been getting a lot of mileage
>>                                         from my passed-through 750Ti
>>                                         lately since getting a Steam
>>                                         Link :-D.
>>
>>                                         On Sun, Jan 24, 2016 at 7:32
>>                                         AM, Ruben Felgenhauer
>>                                         <4felgenh at informatik.uni-hamburg.de>
>>                                         wrote:
>>
>>                                             Hi,
>>
>>                                             finally I had time to look at
>>                                             this again. I tried out
>>                                             virt-manager and after a
>>                                             bit of playing around
>>                                             with it, it /somewhat/
>>                                             worked:
>>
>>                                             The machine is at least
>>                                             booting. I still have a
>>                                             standard vga card enabled
>>                                             in the virt-manager
>>                                             config window.
>>                                             After the machine has
>>                                             booted, I can see that
>>                                             the device gets
>>                                             recognized as 750ti.
>>                                             However, the gpu doesn't
>>                                             get used, because of
>>                                             'Code 43'.
>>                                             Code 43 is a generic
>>                                             error, so any idea what
>>                                             it could mean in this case?
>>
>>                                             Of course I added the
>>                                             <kvm><hidden
>>                                             state='on'/></kvm> lines
>>                                             at the associated position.
>>
>>                                             Best regards,
>>                                             Ruben
>>
>>
>>                                             On 18.01.2016 at 22:27,
>>                                             Will Marler wrote:
>>>                                             I'm not sure what the
>>>                                             correct command-line
>>>                                             syntax is. Have you
>>>                                             tried using libvirt and
>>>                                             VirtManager to handle
>>>                                             your VM rather than
>>>                                             command line, and
>>>                                             modifying the XML rather
>>>                                             than the command line? I
>>>                                             think that's generally
>>>                                             the preferred method
>>>                                             these days (it's
>>>                                             certainly easier from my
>>>                                             point of view, and the
>>>                                             way I got my 750 Ti to
>>>                                             pass through).
>>>
>>>                                             On Mon, Jan 18, 2016 at
>>>                                             11:04 AM, Ruben
>>>                                             Felgenhauer
>>>                                             <4felgenh at informatik.uni-hamburg.de>
>>>                                             wrote:
>>>
>>>                                                 Hi, Alex!
>>>
>>>                                                 Thanks for your reply!
>>>                                                 My GPU indeed has a
>>>                                                 separate audio
>>>                                                 device located at
>>>                                                 01:00.1.
>>>
>>>                                                 However, just adding
>>>                                                 -device
>>>                                                 vfio-pci,host=01:00.1 doesn't
>>>                                                 seem to do the trick.
>>>                                                 Of course the
>>>                                                 corresponding device
>>>                                                 is already
>>>                                                 blacklisted and
>>>                                                 bound to vfio.
>>>
>>>                                                 The Debian Wiki
>>>                                                 entry about VGA
>>>                                                 passthrough
>>>                                                 (https://wiki.debian.org/VGAPassthrough)
>>>                                                 mentions QEMU
>>>                                                 arguments like
>>>                                                 "-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,romfile=... -device vfio-pci,host=01:00.1,bus=pcie.0"
>>>                                                 which seems to
>>>                                                 address GPUs with
>>>                                                 audio devices, but
>>>                                                 if I try to do
>>>                                                 something similar,
>>>                                                 the buses 'root' and
>>>                                                 'pcie' can't be
>>>                                                 found. Maybe I
>>>                                                 missed something
>>>                                                 very important?
>>>
>>>                                                 In the same article,
>>>                                                 it says that the
>>>                                                 "HDMI soundcard
>>>                                                 [...] needs to be
>>>                                                 unbound from its
>>>                                                 driver":
>>>                                                 # echo '0000:01:00.1' | sudo tee /sys/bus/pci/devices/0000:01:00.1/driver/unbind
>>>                                                 I figured the
>>>                                                 vfio-bind script
>>>                                                 from the Arch Linux
>>>                                                 Forum thread
>>>                                                 (https://bbs.archlinux.org/viewtopic.php?id=162768)
>>>                                                 would do exactly
>>>                                                 this thing, so I
>>>                                                 didn't explicitly do
>>>                                                 so for the audio
>>>                                                 device. Is that okay?
>>>
>>>                                                 Best regards,
>>>                                                 Ruben
>>>
>>>
>>>                                                 On 18.01.2016 at
>>>                                                 08:31, Alexander
>>>                                                 Petrenz wrote:
>>>>                                                 Hi Ruben,
>>>>
>>>>                                                 I guess your 750ti
>>>>                                                 also has some audio
>>>>                                                 device. You should
>>>>                                                 pass through this
>>>>                                                 too. It should be
>>>>                                                 something like
>>>>                                                 01:00.1. There are
>>>>                                                 many command line
>>>>                                                 examples you can
>>>>                                                 find about that.
>>>>                                                 Also, I'm not quite
>>>>                                                 sure whether you should
>>>>                                                 remove the x-vga=on.
>>>>
>>>>                                                 Regards
>>>>                                                 Alex
>>>>
>>>>                                                 On Sun, Jan 17,
>>>>                                                 2016 at 11:12 PM,
>>>>                                                 Ruben Felgenhauer
>>>>                                                 <4felgenh at informatik.uni-hamburg.de>
>>>>                                                 wrote:
>>>>
>>>>                                                     Hi,
>>>>
>>>>                                                     I am trying to
>>>>                                                     pass my nVidia
>>>>                                                     GTX 750ti to my
>>>>                                                     QEMU guest.
>>>>
>>>>                                                     Problem is:
>>>>                                                     After the QEMU
>>>>                                                     monitor pops
>>>>                                                     up, nothing
>>>>                                                     happens. The
>>>>                                                     GPU's output is
>>>>                                                     dead, and the
>>>>                                                     vm won't be
>>>>                                                     accessible via
>>>>                                                     SSH anymore, so
>>>>                                                     it's very
>>>>                                                     likely that the
>>>>                                                     VM isn't
>>>>                                                     booting up at
>>>>                                                     all. Also,
>>>>                                                     there are no
>>>>                                                     error messages
>>>>                                                     from QEMU on
>>>>                                                     the console
>>>>                                                     whatsoever
>>>>                                                     which makes
>>>>                                                     debugging it
>>>>                                                     especially hard.
>>>>
>>>>                                                     This is how I
>>>>                                                     start the vm
>>>>                                                     with normal vga
>>>>                                                     emulation:
>>>>                                                     qemu-system-x86_64 -hda vm.ovl -boot c -enable-kvm -m 1024 -cpu host,kvm=off -smp cores=4,threads=2 -redir tcp:5022::22
>>>>                                                     Everything runs
>>>>                                                     fine in this
>>>>                                                     case. To do the
>>>>                                                     passthrough, I
>>>>                                                     add this:
>>>>                                                     -device vfio-pci,host=01:00.0,multifunction=on,x-vga=on -vga none
>>>>                                                     This brings
>>>>                                                     said problems
>>>>                                                     with it. I also
>>>>                                                     tried out
>>>>                                                     multiple
>>>>                                                     different
>>>>                                                     combinations of
>>>>                                                     -device's
>>>>                                                     arguments or
>>>>                                                     even adding a
>>>>                                                     romfile for the
>>>>                                                     GPU, but none
>>>>                                                     of these steps
>>>>                                                     changed
>>>>                                                     anything at all.
>>>>
>>>>                                                     Obviously, I am
>>>>                                                     using a BIOS
>>>>                                                     installation
>>>>                                                     and I'm
>>>>                                                     well aware of
>>>>                                                     this bug:
>>>>                                                     https://bugzilla.kernel.org/show_bug.cgi?id=107561,
>>>>                                                     but neither
>>>>                                                     using less RAM
>>>>                                                     (as you can see
>>>>                                                     I am using 1GB
>>>>                                                     now) nor
>>>>                                                     switching to an
>>>>                                                     older Kernel
>>>>                                                     changed
>>>>                                                     anything about
>>>>                                                     the problem. I
>>>>                                                     have tried
>>>>                                                     Kernel 4.1.0
>>>>                                                     and 4.3.0.
>>>>
>>>>                                                     Host is Debian
>>>>                                                     testing with
>>>>                                                     QEMU 2.5.0.
>>>>                                                     I tried both
>>>>                                                     Debian and
>>>>                                                     Windows 7 as a
>>>>                                                     guest, but both
>>>>                                                     are showing
>>>>                                                     exactly the
>>>>                                                     same behaviour.
>>>>                                                     Mainboard is an
>>>>                                                     ASUS Z87-PLUS.
>>>>                                                     The 750ti is
>>>>                                                     produced by
>>>>                                                     ASUS as well.
>>>>
>>>>                                                     Any idea how I
>>>>                                                     could get
>>>>                                                     passthrough
>>>>                                                     running?
>>>>
>>>>                                                     _______________________________________________
>>>>                                                     vfio-users
>>>>                                                     mailing list
>>>>                                                     vfio-users at redhat.com
>>>>                                                     https://www.redhat.com/mailman/listinfo/vfio-users
