[PATCH] util: basic support for vendor-specific vfio drivers

Erik Skultety eskultet at redhat.com
Mon Aug 1 13:58:42 UTC 2022


On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> On 8/1/22 7:58 AM, Erik Skultety wrote:
> > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
> > > Before a PCI device can be assigned to a guest with VFIO, that device
> > > must be bound to the vfio-pci driver rather than to the device's
> > > normal driver. The vfio-pci driver provides APIs that permit QEMU to
> > > perform all the necessary operations to make the device accessible to
> > > the guest.
> > > 
> > > There has been kernel work recently to support vendor/device-specific
> > > VFIO drivers that provide the basic vfio-pci driver functionality
> > > while adding support for device-specific operations (for example these
> > > device-specific drivers are planned to support live migration of
> > > certain devices). All that will be needed to make this functionality
> > > available will be to bind the new vendor-specific driver to the device
> > > (rather than the generic vfio-pci driver, which will continue to work
> > > just without the extra functionality).
> > > 
> > > But until now libvirt has required that all PCI devices being assigned
> > > to a guest with VFIO specifically have the "vfio-pci" driver bound to
> > > the device. So even if the user manually binds a shiny new
> > > vendor-specific driver to the device (and puts "managed='no'" in the
> > > config to prevent libvirt from changing that), libvirt will just fail
> > > during startup of the guest (or during hotplug) because the driver
> > > bound to the device isn't named exactly "vfio-pci".
> > > 
> > > Fortunately these new vendor/device-specific drivers can be easily
> > > identified as being "vfio-pci + extra stuff" - all that's needed is to
> > > look at the output of the "modinfo $driver_name" command to see if
> > > "vfio_pci" is in the alias list for the driver.
> > > 
> > > That's what this patch does. When libvirt checks the driver bound to a
> > > device (either to decide if it needs to bind to a different driver or
> > > perform some other operation, or if the current driver is acceptable
> > > as-is), if the driver isn't specifically "vfio-pci", then it will look
> > > at the output of modinfo for the driver that *is* bound to the device;
> > > if modinfo shows vfio_pci as an alias for that device, then we'll
> > > behave as if the driver was exactly "vfio-pci".
> > 
> > Since you say that the vendor/device-specific drivers does each of such drivers
> > implement the base vfio-pci functionality or they simply call into the base
> > driver? The reason why I'm asking is that if each of the vendor-specific
> > drivers depend on the vfio-pci module to be loaded as well, then reading
> > /proc/modules should suffice as vfio-pci should be listed right next to the
> > vendor-specific one. What am I missing?
> I don't know the definitive answer to that, as I have no example of a
> working vendor-specific driver to look at and only know about the kernel
> work going on second-hand from Alex. It looks like even the vfio_pci driver
> itself depends on other presumably lower level vfio-* modules (it directly
> uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd), so possibly
> these new drivers would be depending on one or more of those lower level
> modules rather than vfio_pci. Also I would imagine it would be possible for
> other drivers to also depend on the vfio-pci driver while not themselves
> being a vfio driver.
> 
> > 
> > The 'alias' field is optional so do we have any support guarantees from the
> > vendors that the it will always be filled in correctly? I mean you surely
> > handle that case in the code, but once we start supporting this there's no way
> > back and we already know how painful it can be to convince the vendors to
> > follow some kind of standard so that we don't need to maintain several code
> > paths based on a vendor-matrix.
> 
> The aliases are what is used to determine the "best" vfio driver for a
> particular device, so I don't think it would be possible for a driver to not
> implement it, and the method I've used here to determine if a driver is a
> vfio driver was recommended by Alex after a couple of discussions on the
> subject.
> 
> > 
> > ...
> > 
> > > +int
> > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
> > > +                                 char **drvName,
> > > +                                 virPCIStubDriver *drvType)
> > > +{
> > > +    g_autofree char *drvPath = NULL;
> > > +    g_autoptr(virCommand) cmd = NULL;
> > > +    g_autofree char *output = NULL;
> > > +    g_autoptr(GRegex) regex = NULL;
> > > +    g_autoptr(GError) err = NULL;
> > > +    g_autoptr(GMatchInfo) info = NULL;
> > > +    int exit;
> > > +    int tmpType;
> > > +
> > > +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
> > > +        return -1;
> > > +
> > > +    if (!*drvName) {
> > > +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
> > > +        return 0;
> > > +    }
> > > +
> > > +    tmpType = virPCIStubDriverTypeFromString(*drvName);
> > > +
> > > +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
> > > +        *drvType = tmpType;
> > > +        return 0; /* exact match of a known driver name (or no name) */
> > > +    }
> > > +
> > > +    /* Check the output of "modinfo $drvName" to see if it has
> > > +     * "vfio_pci" as an alias. If it does, then this driver should
> > > +     * also be considered as a vfio-pci driver, because it implements
> > > +     * all the functionality of the basic vfio-pci (plus additional
> > > +     * device-specific stuff).
> > > +     */
> > 
> > Instead of calling an external program and then grepping its output which
> > technically could change in the future, wouldn't it be better if we read
> > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > vfio-pci' substring and compared the module name with the user-provided device
> > driver?
> 
> Again, although I was hesistant about calling an external command, and asked
> if there was something simpler, Alex still suggested modinfo, so I'll let
> him answer that. Alex?
> 
> (Also, although the format of the output of "uname -r" is pretty much
> written in stone, you're still running an external command :-))

Well, that was just my attempt to express whatever is the running kernel in
that path, but of course I didn't mean to run the command verbatim as it
contradicts what I'm trying to suggest. We can get the version from e.g. QEMU
caps (<kernelVersion>), it's also available in /proc/version. Grepping the
codebase we don't seem to be using the kernelVersion attribute from QEMU caps
at the moment.

Erik



More information about the libvir-list mailing list