[vfio-users] GPU pass through freezing on Juno OpenStack KVM guests

Alex Williamson alex.l.williamson at gmail.com
Thu Sep 24 23:20:32 UTC 2015


Hi James,

On Thu, Sep 24, 2015 at 5:02 PM, James McEvoy <jmcevoy at penguincomputing.com>
wrote:

> Problem Statement:  Cannot start a server virtual machine instance using
> the Juno version of OpenStack when an nVidia K2 Grid card is attached to
> the libvirt.xml configuration.
>
> Software version and configuration:
> OpenStack version Juno, using KVM for the hypervisor
> OS: Ubuntu 14.04
> Linux kernel: 3.16.0-49-generic
> libvirt version: 1.2.2
>
> The configuration includes 3 servers:
> 1 - OpenStack controller
> 2 - OpenStack compute servers with K2 GRID cards.
>
> We have run into a problem getting the GPU passed through to a KVM guest
> started by OpenStack Juno using the vfio method.
>
> OpenStack will launch an instance of the CentOS 6.7 OS image successfully
> if we choose a flavor that does not include a GPU.  If we try the same
> image on the same server with a flavor that adds a GPU the instance will
> not boot.
>
> However, if we configure the GPU using the virt-manager GUI on the same
> KVM hypervisor host, we do get access to the GPU, which is how we did most
> of our testing.
>
> Our working hypothesis is that the root cause is related to the
> libvirt.xml file generated by OpenStack, which is used to define the guest
> instance on the hypervisor host. The XML generated by the virt-manager GUI
> includes additional configuration directives beyond what OpenStack
> provides.
>
> The configuration generated by OpenStack does not boot.
>
> I believe that nVidia should provide a plugin for OpenStack that will
> correctly configure the GPU for pass through.
>

You're assigning a device that happens to be an nvidia GPU, but what's
unique about it that nvidia should be providing a plugin here?  If you're
assigning a GRID card, it really is just another device as far as device
assignment is concerned.  The tricks and workarounds documented in most of
the blogs are only relevant for consumer-grade GeForce cards and low-end
Quadros.
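
For reference, the workaround those blogs describe is hiding the KVM
hypervisor signature, which the GeForce driver checks for before failing in
the guest.  A minimal sketch of that domain XML feature, assuming libvirt
1.2.8 or newer; a GRID card should not need it:

    <features>
      <kvm>
        <!-- hide the KVM CPUID signature from the guest driver -->
        <hidden state='on'/>
      </kvm>
    </features>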


> We are experimenting with updating the nova source to write the
> additional lines into the libvirt.xml file that OpenStack writes to
> /var/lib/nova/instances/<UUID>/libvirt.xml, to see if that will work...
>
> I highlighted the differences below with a + at the beginning of the line:
>
> Config that works... GPU added via virt-manager (running on my CentOS 7
> laptop) to vmhost3-b.
>
>     <hostdev mode='subsystem' type='pci' managed='yes'>
> +     <driver name='vfio'/>
>

This is likely a problem with your underlying OS.  Some distributions, like
RHEL7, disable legacy PCI device assignment, so that only VFIO-based device
assignment is available.  If you're relying on features that are unique to
VFIO, then you somehow need to enforce that preference.  According to the
libvirt documentation (https://libvirt.org/formatdomain.html#elementsHostDev),
vfio has been the default since libvirt 1.1.3.  Perhaps you just need to set
up your system to make sure that the vfio modules are loaded in order for
VFIO to be selected as the preference.
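
For example, on the Ubuntu 14.04 host described above, one way to make sure
those modules are present at boot is to list them in /etc/modules; a sketch,
assuming the module names from a 3.16-era kernel and that the IOMMU is
already enabled on the kernel command line (e.g. intel_iommu=on):

    # /etc/modules -- load the VFIO modules at boot so libvirt can use VFIO
    vfio
    vfio_pci
    vfio_iommu_type1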


>       <source>
>         <address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
>       </source>
> +     <alias name='hostdev0'/>
> +     <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
>

I expect the alias is not visible to QEMU, but failing to set a fixed
address for a device appears to be a general failure on the openstack
side.  It should do this regardless of whether the device is a physical
assigned device or an emulated device.  Without a fixed address, existing
devices might move to different addresses when new devices are added or
others removed.  A Windows guest might then re-detect devices at their new
addresses, requiring additional reboots, and a Linux guest might lose any
device persistence that is based on PCI address, including BusID options in
Xorg config files for GPUs.
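
As a concrete illustration of the Xorg case, here is a hypothetical
xorg.conf Device section pinned to the fixed guest address above (slot
0x06, which Xorg writes in decimal):

    Section "Device"
        Identifier "GPU0"
        Driver     "nvidia"
        # PCI:bus:device:function in decimal; breaks if the device moves
        BusID      "PCI:0:6:0"
    EndSection

Thanks,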

Alex