[libvirt] RFC: Creating mediated devices with libvirt

Thu Jun 15 08:33:01 UTC 2017

On Thu, Jun 15, 2017 at 12:06:43AM +0200, Erik Skultety wrote:
> Hi all,
> 
> so there's been an off-list discussion about finally implementing creation of
> mediated devices with libvirt and it's more than desired to get as many opinions
> on that as possible, so please do share your ideas. This did come up already as
> part of some older threads ([1] for example), so this will be a respin of the
> discussions. Long story short, we decided to put device creation off and focus
> on the introduction of the framework as such first and build upon that later,
> i.e. now.
> 
> [1] https://www.redhat.com/archives/libvir-list/2017-February/msg00177.html
> 
> ========================================
> PART 1: NODEDEV-DRIVER
> ========================================
> 
> API-wise, device creation through the nodedev driver should be pretty
> straightforward and without any issues, since virNodeDeviceCreateXML takes an
> XML document and does support flags. Looking at the current device XML:
> 
> <device>
>   <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name>
>   <path>/sys/devices/pci0000:00/.../0cce8709-0640-46ef-bd14-962c7f73cc6f</path>
>   <parent>pci_0000_03_00_0</parent>
>   <driver>
>     <name>vfio_mdev</name>
>   </driver>
>   <capability type='mdev'>
>     <type id='nvidia-11'/>
>     <iommuGroup number='13'/>
>     <uuid>UUID</uuid> <!-- optional enhancement, see below -->
>   </capability>
> </device>
> 
> We can ignore the <path>, <driver> and <iommuGroup> elements, since these are
> useless during creation. We also cannot use <name>, since we don't support
> arbitrary names and we also can't rely on users providing a name in the correct
> form, which we would need to parse further in order to get the UUID.
> So since the only thing missing to successfully create an mdev using XML is
> the UUID (if the user doesn't want it to be generated automatically), how about
> having a <uuid> subelement under <capability>, just like PCI devices have
> <domain> and friends, USB devices have <bus> & <device>, and interfaces have
> <address>, to uniquely identify the device even if the name itself is unique?
> Removal of a device should work as well, although we might want to
> consider creating a *Flags version of the API.
> 
> =============================================================
> PART 2: DOMAIN XML & DEVICE AUTO-CREATION, NO POLICY INVOLVED!
> =============================================================
> 
> There were some doubts about auto-creation mentioned in [1], although they
> weren't specified further. So hopefully, we'll get further in the discussion
> this time.
> 
> From my perspective there are two main reasons/benefits to that:
> 
> 1) Convenience
> For apps like virt-manager, the user will want to add a host device
> transparently: "hey libvirt, I want an mdev assigned to my VM, can you do
> that". Even higher-level management apps, like oVirt, might not care about the
> parent device at all times, and considering that they would need to enumerate
> the parents, pick one, create the device XML and pass it to the nodedev driver,
> IMHO it would actually be easier and faster to just do it directly through
> sysfs, bypassing libvirt once again....

The convenience only works if the policy we've provided in libvirt actually
matches the policy the application wants. I think it is quite likely that with
cloud deployments the mdevs will be created out of band from the domain startup
process. It is possible the app will just have a fixed set of mdevs pre-created
when the host starts up. Or the mgmt app may want the domain startup process to
be a two-phase setup, where it first allocates the resources needed, and only
later tries to start the guest. This is why I keep saying that putting this kind
of "convenient" policy in libvirt is a bad idea - it is essentially just putting
a bit of virt-manager code into libvirt - more advanced apps will need more
flexibility in this area.
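
To make the "allocate first, start later" flow concrete, here is a minimal
Python sketch of the allocation phase done out of band through the kernel's
mdev sysfs interface (writing a UUID to the "create" file of a supported type).
It assumes a PCI parent device; the function names and the sysfs_root parameter
are hypothetical, introduced only for illustration and testing, and nothing
here is libvirt code.

```python
import os
import uuid


def mdev_create_path(parent_pci_addr, type_id, sysfs_root="/sys"):
    """Return the sysfs 'create' file for one mdev type of a PCI parent.

    Assumption: the parent is a PCI device, e.g. "0000:03:00.0".
    """
    return os.path.join(sysfs_root, "bus", "pci", "devices", parent_pci_addr,
                        "mdev_supported_types", type_id, "create")


def allocate_mdev(parent_pci_addr, type_id, dev_uuid=None, sysfs_root="/sys"):
    """Phase one: create the mdev now, independently of any domain startup.

    Returns the UUID of the new device so the app can reference it later
    in the domain XML when it decides to start the guest.
    """
    # Validate a caller-supplied UUID, or generate a random one.
    dev_uuid = str(uuid.UUID(dev_uuid)) if dev_uuid else str(uuid.uuid4())
    with open(mdev_create_path(parent_pci_addr, type_id, sysfs_root), "w") as f:
        f.write(dev_uuid)
    return dev_uuid
```

On a real host this needs root and a parent that actually advertises the given
type; the app would keep the returned UUID and start the guest whenever its own
policy says so.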

> 2) Future domain migration
> Suppose now that the mdev backing physical devices support state dump and
> reload. Chances are that the corresponding mdev doesn't even exist or has a
> different UUID on the destination, so libvirt would do its best to handle this
> before the domain could be resumed.

This is not an unusual scenario - there are already many other parts of the
device backend config that need to change prior to migration, especially for
anything related to host devices, so apps already have support for doing
this, which is more flexible & convenient because it doesn't tie creation of
the mdevs to running of the migrate command.

IOW, I'm still against adding any kind of automatic creation policy for
mdevs in libvirt. Just provide the node device API support.
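
For illustration, a minimal Python sketch of what an app would hand to the node
device API under the proposal quoted above: a <device> with <parent> and an
'mdev' <capability> carrying <type> and an optional <uuid>. The element layout
follows the proposal in this thread and is an assumption, not a confirmed
schema; build_mdev_xml is a hypothetical helper name.

```python
import uuid
import xml.etree.ElementTree as ET


def build_mdev_xml(parent, type_id, dev_uuid=None):
    """Build node device XML for creating an mdev (proposed format).

    <path>, <driver> and <iommuGroup> are omitted since they are not
    useful at creation time; <uuid> is optional and, if absent, is
    assumed to be generated automatically.
    """
    device = ET.Element("device")
    ET.SubElement(device, "parent").text = parent
    cap = ET.SubElement(device, "capability", {"type": "mdev"})
    ET.SubElement(cap, "type", {"id": type_id})
    if dev_uuid is not None:
        # uuid.UUID() raises ValueError on a malformed UUID string.
        ET.SubElement(cap, "uuid").text = str(uuid.UUID(dev_uuid))
    return ET.tostring(device, encoding="unicode")


xml_desc = build_mdev_xml("pci_0000_03_00_0", "nvidia-11",
                          "0cce8709-0640-46ef-bd14-962c7f73cc6f")
print(xml_desc)
```

The resulting string is what would be passed to virNodeDeviceCreateXML, with
the app itself choosing the parent and the moment of creation.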

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|