[libvirt] [PATCH 1/2] virpci: simplify virPCIDeviceBindToStub

Laine Stump laine at laine.org
Fri Jul 29 00:01:03 UTC 2016


On 07/11/2016 02:23 PM, Jim Fehlig wrote:
> Early in virPCIDeviceBindToStub, there is a check to see if the
> stub is already bound to the device, returning success with no
> further actions if that is the case.
>
> The same condition is unnecessarily checked later in the function.

Looking at the original code, the condition (whether the device is bound 
to the stub driver) is checked after writing the PCI ID of the device to 
the stub driver's "new_id" node. If you look here:

https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci

It says that when you write a PCI ID to a driver's "new_id", if there 
are any devices with the given PCI ID currently not bound to any driver, 
they will be immediately bound to the given driver. So I don't think the 
check is unnecessary - if the device wasn't bound to any driver to begin 
with (e.g. if the host driver for the device was blacklisted), it will 
be bound to the stub driver *immediately* after writing the PCI ID to 
new_id (ie before getting to the 2nd check).

So the condition isn't unnecessarily checked. It really can happen that 
we weren't bound to the stub in the first case, but were by the time we 
get to the 2nd.

On the other hand, this points out how utterly atrocious the new_id 
interface is - when I tested this by blacklisting the igbvf driver (so 
all the VFs of my 82576 card would have no driver bound to them), then 
started a domain that used a single VF, *all* of the VFs were suddenly 
bound to vfio-pci !!!

I also made a bunch of comments below before I noticed this part that 
I've put at the top, but in the end I think that especially since this 
whole bind/unbind/new_id interface is a dying thing, it would be better 
to leave it untouched (to avoid unexpected regressions) unless it's 
going to make a significant difference in how the driver_override stuff 
is added in.

> Remove the unneeded checks to simplify the logic a bit.
>
> Signed-off-by: Jim Fehlig <jfehlig at suse.com>
> ---
>   src/util/virpci.c | 68 +++++++++++++++++++++++--------------------------------
>   1 file changed, 28 insertions(+), 40 deletions(-)
>
> diff --git a/src/util/virpci.c b/src/util/virpci.c
> index 132948d..127b3b6 100644
> --- a/src/util/virpci.c
> +++ b/src/util/virpci.c
> @@ -1196,7 +1196,6 @@ static int
>   virPCIDeviceBindToStub(virPCIDevicePtr dev)
>   {
>       int result = -1;
> -    bool reprobe = false;
>       char *stubDriverPath = NULL;
>       char *driverLink = NULL;
>       char *path = NULL; /* reused for different purposes */
> @@ -1225,10 +1224,16 @@ virPCIDeviceBindToStub(virPCIDevicePtr dev)
>               /* The device is already bound to the correct driver */
>               VIR_DEBUG("Device %s is already bound to %s",
>                         dev->name, stubDriverName);
> +            dev->unbind_from_stub = true;
> +            dev->remove_slot = true;

I understand where you got these two lines from (they were part of the 
2nd "if (virFileLinkPointsTo(driverLink, stubDriverPath))" clause that 
you removed), but I don't think they should be put here - if we start 
out with the device already bound to the stub driver, then when we're 
finished we *won't* want to unbind_from_stub *or* remove_slot. To test 
this I commented them out and did several tests with three different builds:

A) a build without this patch
B) a build with this patch as-is
C) a build with this patch, but with the two lines above commented out.

With each libvirt build, I started then destroyed a domain with a 
<hostdev managed='yes'>, looking at the driver link in the devices sysfs 
directory before, during, and after the test. I did this with the 
following starting states:

1) device is bound to host driver before we start
2) device is bound to stub driver before we start
3) device isn't bound to *any* driver before we start

In all those (9) cases the device ends up bound to the same driver at 
the end as it was bound to at the beginning.

(I also tried starting the domain, then restarting libvirtd and then 
destroying the domain - both the old and new code have the same bug: 
even if the device was bound to the stub driver in the beginning, it 
ends up being bound to the host driver (or nothing, if the host driver 
has been blacklisted in modprobe.d) at the end.)

Result: I still believe those two lines above shouldn't be added.

>               result = 0;
>               goto cleanup;
>           }
> -        reprobe = true;
> +        /*
> +         * If the device is bound to a driver that is not the stub,  we'll
> +         * need to reprobe later
> +         */


Maybe instead of just "later" you could say "later when we unbind from 
the stub".


> +        dev->reprobe = true;
>       }
>   
>       /* Add the PCI device ID to the stub's dynamic ID table;
> @@ -1249,51 +1254,34 @@ virPCIDeviceBindToStub(virPCIDevicePtr dev)
>           goto cleanup;
>       }
>   
> -    /* check whether the device is bound to pci-stub when we write dev->id to
> -     * ${stubDriver}/new_id.
> -     */
> -    if (virFileLinkPointsTo(driverLink, stubDriverPath)) {
> -        dev->unbind_from_stub = true;
> -        dev->remove_slot = true;
> -        result = 0;
> -        goto remove_id;
> -    }
> -

As mentioned above, this really *could* happen -writing to new_id could 
immediately bind the device to the stub driver.

And if it wasn't immediately bound to the stub, that means it was 
already bound to something else, so we need to write the device's PCI 
address to the original driver's "unbind" (and *that* will cause it to be
>       if (virPCIDeviceUnbind(dev) < 0)
>           goto remove_id;
>   
> -    /* If the device was bound to a driver we'll need to reprobe later */
> -    dev->reprobe = reprobe;
> +    /* Xen's pciback.ko wants you to use new_slot first */
> +    VIR_FREE(path);
> +    if (!(path = virPCIDriverFile(stubDriverName, "new_slot")))
> +        goto remove_id;
>   
> -    /* If the device isn't already bound to pci-stub, try binding it now.
> -     */
> -    if (!virFileLinkPointsTo(driverLink, stubDriverPath)) {

I think the above conditional also needs to stay - apparently it's 
possible that unbinding from the host driver still wouldn't be enough to 
get the device bound to the stuf, in which case you'd need to write the 
device's PCI address to the stub driver's "new_slot" in order to make it 
bind. But you don't want to do that unconditionally - I'd wager that 
it's more expenside to do the remove_slot later, and that's why the code 
is trying to avoid it.

> -        /* Xen's pciback.ko wants you to use new_slot first */
> -        VIR_FREE(path);
> -        if (!(path = virPCIDriverFile(stubDriverName, "new_slot")))
> -            goto remove_id;
> -
> -        if (virFileExists(path) && virFileWriteStr(path, dev->name, 0) < 0) {
> -            virReportSystemError(errno,
> -                                 _("Failed to add slot for "
> -                                   "PCI device '%s' to %s"),
> -                                 dev->name, stubDriverName);
> -            goto remove_id;
> -        }
> -        dev->remove_slot = true;
> +    if (virFileExists(path) && virFileWriteStr(path, dev->name, 0) < 0) {
> +        virReportSystemError(errno,
> +                             _("Failed to add slot for "
> +                               "PCI device '%s' to %s"),
> +                             dev->name, stubDriverName);
> +        goto remove_id;
> +    }
> +    dev->remove_slot = true;
>   
> -        VIR_FREE(path);
> -        if (!(path = virPCIDriverFile(stubDriverName, "bind")))
> -            goto remove_id;
> +    VIR_FREE(path);
> +    if (!(path = virPCIDriverFile(stubDriverName, "bind")))
> +        goto remove_id;
>   
> -        if (virFileWriteStr(path, dev->name, 0) < 0) {
> -            virReportSystemError(errno,
> -                                 _("Failed to bind PCI device '%s' to %s"),
> -                                 dev->name, stubDriverName);
> -            goto remove_id;
> -        }
> -        dev->unbind_from_stub = true;
> +    if (virFileWriteStr(path, dev->name, 0) < 0) {
> +        virReportSystemError(errno,
> +                             _("Failed to bind PCI device '%s' to %s"),
> +                             dev->name, stubDriverName);
> +        goto remove_id;
>       }
> +    dev->unbind_from_stub = true;
>   
>       result = 0;
>   


In the end, I think this function should remain as it is.




More information about the libvir-list mailing list