[libvirt] question about PCI new_id sysfs interface

Alex Williamson alex.williamson at redhat.com
Tue Jun 28 19:17:08 UTC 2016


On Tue, 28 Jun 2016 14:39:22 -0400
Laine Stump <laine at laine.org> wrote:

> On 06/28/2016 01:39 PM, Jim Fehlig wrote:
> > After updating the dom0 kernel on one of my Xen test hosts, I noticed 
> > problems with PCI hostdev management. E.g
> >
> > # virsh nodedev-detach pci_0000_07_10_1
> > error: Failed to detach device pci_0000_07_10_1
> > error: Failed to add PCI device ID '8086 1520' to pciback: File exists
> >
> > It turns out there was a small interface change to new_id with the 
> > following commit to 3.16 kernel
> >
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.16&id=8895d3bcb8ba960b1b83f95d772b641352ea8e51 
> >
> >
> > which now causes xen_pciback to fail writes of "vendorid productid" to 
> > new_id. e.g.
> >
> > # echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id
> > -bash: echo: write error: File exists
> >
> > Interestingly, vfio doesn't encounter the same error
> >
> > # echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id
> > # echo $?
> > 0
> >
> > vfio-pci has:
> > static struct pci_driver vfio_pci_driver = {
> >         .name           = "vfio-pci",
> >         .id_table       = NULL, /* only dynamic ids */
> >
> > while xen-pciback has:
> > static const struct pci_device_id pcistub_ids[] = {
> >         {
> >          .vendor = PCI_ANY_ID,
> >          .device = PCI_ANY_ID,
> >          .subvendor = PCI_ANY_ID,
> >          .subdevice = PCI_ANY_ID,
> >          },
> >         {0,},
> > };
> > static struct pci_driver xen_pcibk_pci_driver = {
> >         .name = "pciback",
> >         .id_table = pcistub_ids,
> >
> > So any vendor/device pair will match for xen-pciback, while none will 
> > match for vfio-pci.
> >
> > But after reading that commit and the associated thread, it is not 
> > clear to me how to best fix this. Options are
> >
> > 1. set .id_table to NULL for xen-pciback
> > 2. drop using the new_id interface from libvirt
> > 3. pass more values (subvendor, subdevice, class, etc) to the new_id 
> > interface
> >
> > I'm not sure what problems, if any, options 1 and 2 might cause. 
> > Option 2 seems the best approach since new_id seems to be a rather 
> > unsafe interface.  
> 
> Regardless of your current problem (as Dan says in his reply, this is 
> kernel breakage and should be fixed)...
> 
> "Unsafe" was *one* of the words that came to my mind when I first saw 
> the new_id interface. These days there is a sysfs interface called 
> driver_override that seems much more thoughtfully designed - you just 
> write the name of the desired driver to /sys/devices/[rest of path to 
> device]/driver_override. I didn't check if this is the version of the 
> patch that was pushed upstream, but the commit log message does give a 
> nice synopsis of its use:
> 
> https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
> 
> It would be nice to completely get rid of new_id in libvirt, but 
> driver_override doesn't exist in 2.6 kernels, so we have to keep it 
> around for compatibility with RHEL6/CentOS6. In the meantime, I wouldn't 
> complain at all if someone added support for driver_override that would 
> fallback to new_id if the driver_override node wasn't found. (A nice 
> side effect would be that your problem would be solved even when the 
> kernel wasn't fixed - driver_override is present at least as far back as 
> kernel 3.10, and you say your problem doesn't occur until 3.16).

+1  The new_id interface has several issues and for meta drivers like
vfio-pci, pci-stub, and xen-pciback it should probably be considered
deprecated.  driver_override is the preferred interface.  In addition
to the fix made by the referenced commit, which resolved some really
difficult to debug issues, new_id is racy.  Any time we add and remove
an ID to a driver there's a window where any device matching that
specification can bind to the driver.  driver_override avoids these
sorts of issues.  There are also bus types which do not support dynamic
IDs, but have added support for driver_override, ex. vfio-platform.
Migrating to driver_override makes that support easier.  xen-pciback
seems like it was always used in a questionable way if the driver
already binds to PCI_ANY_ID, but userspace persists in trying to add a
specific ID anyway.  -EEXIST seems like an actual correct response
rather than silently building up overlapping and conflicting dynamic ID
entries.

Also it looks to me like both the new_id behavior and driver_override
were both introduced in 3.16, the driver_override commit is:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.16&id=782a985d7af26db39e86070d28f987cad21313c0

Thanks,
Alex




More information about the libvir-list mailing list