[libvirt] RFC: Creating mediated devices with libvirt

Daniel P. Berrange berrange at redhat.com
Fri Jun 23 09:25:38 UTC 2017


On Thu, Jun 22, 2017 at 05:57:34PM -0400, John Ferlan wrote:
> 
> 
> On 06/14/2017 06:06 PM, Erik Skultety wrote:
> > Hi all,
> > 
> > so there's been an off-list discussion about finally implementing creation of
> > mediated devices with libvirt and it's more than desired to get as many opinions
> > on that as possible, so please do share your ideas. This did come up already as
> > part of some older threads ([1] for example), so this will be a respin of the
> > discussions. Long story short, we decided to put device creation off and focus
> > on the introduction of the framework as such first and build upon that later,
> > i.e. now.
> > 
> > [1] https://www.redhat.com/archives/libvir-list/2017-February/msg00177.html
> > 
> > ========================================
> > PART 1: NODEDEV-DRIVER
> > ========================================
> > 
> > API-wise, device creation through the nodedev driver should be pretty
> > straightforward and without any issues, since virNodeDeviceCreateXML takes an
> > XML description and does support flags. Looking at the current device XML:
> > 
> > <device>
> >   <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name>
> >   <path>/sys/devices/pci0000:00/.../0cce8709-0640-46ef-bd14-962c7f73cc6f</path>
> >   <parent>pci_0000_03_00_0</parent>
> >   <driver>
> >     <name>vfio_mdev</name>
> >   </driver>
> >   <capability type='mdev'>
> >     <type id='nvidia-11'/>
> >     <iommuGroup number='13'/>
> >     <uuid>UUID</uuid> <!-- optional enhancement, see below -->
> >   </capability>
> > </device>
> > 
> > We can ignore <path>, <driver>, <iommuGroup> elements, since these are useless
> > during creation. We also cannot use <name> since we don't support arbitrary
> > names and we also can't rely on users providing a name in correct form which we
> > would need to further parse in order to get the UUID.
> > So since the only thing missing to successfully create an mdev using XML is
> > the UUID (if user doesn't want it to be generated automatically), how about
> > having a <uuid> subelement under <capability> just like PCIs have <domain> and
> > friends, USBs have <bus> & <device>, interfaces have <address> to uniquely
> > identify the device even if the name itself is unique.
> > Removal of a device should work as well, although we might want to
> > consider creating a *Flags version of the API.
> 
> 
> Has any thought been put towards creating an mdev pool modeled after the
> Storage Pool? Similar to how vHBAs are created from a Storage Pool XML
> definition.
> 
> That way XML could be defined to keep track of a lot of different things
> that you may need and would require only starting the pool in order to
> access.
> 
> Placed "appropriately" - the mdevs could already be available by the
> time node device state initialization occurs too since the pool would
> conceivably have been created/defined using data from the physical device and
> the calls to create the virtual devices would have occurred. Much easier
> to add logic to a new driver/pool mgmt to handle whatever considerations
> there are than adding logic into the existing node device driver.

All those things you describe are possible with the node device API,
once we add the inactive object concept that other APIs have. It is
also more flexible to use the node device concept, because it seamlessly
integrates with the physical PCI device management. We've already seen
with SRIOV NICs that mgmt apps needed the flexibility to choose between
assigning the physical NIC, vs assigning individual functions. I expect
the same to be true of mdevs, where you choose between assigning the
GPU PCI device, vs one of the mdev vGPUs.  In OpenStack what I'm expecting
is that the existing PCI device / SRIOV device mgmt code (that is based
on the node device APIs) is genericised to cover arbitrary types of node
device, not simply those with the pci capability. Thus we'd expect mdev
mgmt to be part of the node device APIs framework, not split off in a
separate set of pool APIs. 
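
As a rough illustration of what "genericised" could mean on the mgmt app
side, here is a minimal Python sketch that groups node devices by their
capability type instead of assuming PCI. The helper names describe() and
assignable_candidates() are hypothetical, and the PCI XML is a trimmed,
made-up description of the parent GPU; a real app would fetch the XML via
virNodeDeviceGetXMLDesc rather than hard-code it.

```python
import xml.etree.ElementTree as ET

# Node device XML modelled on the mdev example quoted earlier in the thread.
MDEV_XML = """
<device>
  <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name>
  <parent>pci_0000_03_00_0</parent>
  <capability type='mdev'>
    <type id='nvidia-11'/>
    <iommuGroup number='13'/>
  </capability>
</device>
"""

# Assumption: a heavily trimmed description of the parent GPU PCI device.
PCI_XML = """
<device>
  <name>pci_0000_03_00_0</name>
  <capability type='pci'>
    <domain>0</domain>
    <bus>3</bus>
    <slot>0</slot>
    <function>0</function>
  </capability>
</device>
"""


def describe(xml_desc):
    """Extract the fields a mgmt app needs, whatever the capability type."""
    root = ET.fromstring(xml_desc)
    cap = root.find("capability")
    return {
        "name": root.findtext("name"),
        "parent": root.findtext("parent"),  # None if no <parent> element
        "cap_type": cap.get("type") if cap is not None else None,
    }


def assignable_candidates(xml_descs, wanted=("pci", "mdev")):
    """Group device names by capability type, keeping only wanted types."""
    by_type = {}
    for xml_desc in xml_descs:
        info = describe(xml_desc)
        if info["cap_type"] in wanted:
            by_type.setdefault(info["cap_type"], []).append(info["name"])
    return by_type


candidates = assignable_candidates([PCI_XML, MDEV_XML])
print(candidates)
```

With both kinds of device in one mapping, the choice between the whole GPU
and one of its vGPUs becomes a policy decision on top of the same API,
which is the point of keeping mdevs in the node device framework.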

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



