[libvirt] [PATCH] hostdev: fix net config restore error

Laine Stump laine at laine.org
Fri Apr 24 16:41:45 UTC 2015


On 04/22/2015 08:02 AM, John Ferlan wrote:
>
> On 04/15/2015 01:29 PM, Huanle Han wrote:
>> Fix for such a case:
>> 1. Domain A and B xml contain the same SRIOV net hostdev(<interface
>> type='hostdev' /> with same pci address).
>> 2. virsh start A (Successfully, and configure the SRIOV net with
>> custom mac)
>> 3. virsh start B (Fail because of the hostdev used by domain A or other
>> reason.)
>> In step 3, 'virHostdevNetConfigRestore' is called for the hostdev
>> which is still used by domain A. It makes the mac/vlan of the SRIOV net
>> change.
>>
>> Code Change in this fix:
>> 1. As the pci used by other domain have been removed from
>> 'pcidevs' in previous loop, we only restore the nic config for
>> the hostdev still in 'pcidevs'(used by this domain)
>> 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the
>> hostdev is a pci net device or not.
>> 3. update the comments to make it more clear
>>
>> Signed-off-by: Huanle Han <hanxueluo at gmail.com>
>> ---
>>  src/util/virhostdev.c | 70 ++++++++++++++++++++++++++++++++++-----------------
>>  1 file changed, 47 insertions(+), 23 deletions(-)
>>
> This seems reasonable - although it should have been two patches - the
> first just to create virHostdevIsPCINetDevice and the second to fix the
> issue and update the comments...

I agree. That change obscures the functional change.

>   Something I can take care of for you
> before pushing...
>
> Hopefully Laine can also take a quick look at it and verify the actions
> since he's most familiar with the SRIOV code path...
>
> John
>> diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
>> index f583e54..a2719d3 100644
>> --- a/src/util/virhostdev.c
>> +++ b/src/util/virhostdev.c
>> @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev,
>>      return ret;
>>  }
>>  
>> +static int
>> +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev)
>> +{
>> +    return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS &&
>> +        hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
>> +        hostdev->parent.type == VIR_DOMAIN_DEVICE_NET &&
>> +        hostdev->parent.data.net;
>> +}
>>  
>>  static int
>>  virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf,
>> @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev,
>>      /* This is only needed for PCI devices that have been defined
>>       * using <interface type='hostdev'>. For all others, it is a NOP.
>>       */
>> -    if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS ||
>> -        hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI ||
>> -        hostdev->parent.type != VIR_DOMAIN_DEVICE_NET ||
>> -        !hostdev->parent.data.net)
>> +    if (!virHostdevIsPCINetDevice(hostdev))
>>         return 0;
>>  
>>      isvf = virHostdevIsVirtualFunction(hostdev);
>> @@ -604,16 +609,11 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr hostdev_mgr,
>>       * the network device, set the netdev config */
>>      for (i = 0; i < nhostdevs; i++) {
>>           virDomainHostdevDefPtr hostdev = hostdevs[i];
>> -         if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS)
>> +         if (!virHostdevIsPCINetDevice(hostdev))
>>               continue;

I think the above call should be moved from here into
virHostdevNetConfigReplace(). This will made it more of a mirror of
virHostdevNetConfigRestore() (which is what it is supposed to be).
(either that or *remove* the check from virHostdevNetConfigRestore() and
add it to its caller in the one place where it is appropriate - this
would be slightly more efficient, since it's already checked in one of
two places that virHostdevNetConfigRestore() is called). One or the
other of those can be a followup patch, though...

>> -         if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI)
>> -             continue;
>> -         if (hostdev->parent.type == VIR_DOMAIN_DEVICE_NET &&
>> -             hostdev->parent.data.net) {
>> -             if (virHostdevNetConfigReplace(hostdev, uuid,
>> -                                            hostdev_mgr->stateDir) < 0) {
>> -                 goto resetvfnetconfig;
>> -             }
>> +         if (virHostdevNetConfigReplace(hostdev, uuid,
>> +                                        hostdev_mgr->stateDir) < 0) {
>> +             goto resetvfnetconfig;
>>           }
>>           last_processed_hostdev_vf = i;

This *slightly* changes what is done, since the old code would have
updated last_processed_hostdev_vf when encountering any hostdev device
that was PCI but *not* the child of a NetDef. That's purely cosmetic
though, because the restore operation done up to
last_processed_hostdev_vf (on failure) is a NOP for any hostdev that
*isn't* the child of a NetDef. tl;dr - this change is OK.


>>      }
>> @@ -781,19 +781,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr,
>>          goto cleanup;
>>      }
>>  
>> -    /* Again 4 loops; mark all devices as inactive before reset
>> +    /* Here are 4 loops; mark all devices as inactive before reset

I don't see the value of changing "Again" to "Here are". Both are
equally pointless additions to the text of the comments. Instead, maybe
take this chance to make the comments more understandable:

"Loop through the assigned devices 4 times: 1) delete them all from
activePCIHostdevs, 2) restore network config of SRIOV netdevs, 3) Do a
PCI reset on each device, 4) reattach the devices to their host drivers
(managed) or add them to inactivePCIHostdevs (!managed)."

>>       * them and reset all the devices before re-attach.
>>       * Attach mac and port profile parameters to devices
>>       */
>> +
>> +    /* Loop 1: delete the copy of the dev from pcidevs if it's used by
>> +     * other domain. Or delete it from activePCIHostDevs if it had
>> +     * been used by this domain.
>> +     */

"Loop 1: verify that each device in the hostdevs list really was in use
by this domain, and remove them all from the activePCIHostdevs list."


(Another side note on cleanups that should be done in a later patch:
Looking at this code, I think virHostdevGetActivePCIHostDeviceList()
(called just above to create the pcidevs list) should just be given the
driver and domain name, then it could do the entire job of this loop,
just moving each item from the activePCIHostdevs list onto the newly
created list. That function is only called once anyway.)


>>      i = 0;
>>      while (i < virPCIDeviceListCount(pcidevs)) {
>>          virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
>>          virPCIDevicePtr activeDev = NULL;
>>  
>> -        /* delete the copy of the dev from pcidevs if it's used by
>> -         * other domain. Or delete it from activePCIHostDevs if it had
>> -         * been used by this domain.
>> -         */
>>          activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev);
>>          if (activeDev) {
>>              const char *usedby_drvname;
>> @@ -815,13 +816,33 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr,
>>       */
>>  
>>      /*
>> -     * For SRIOV net host devices, unset mac and port profile before
>> -     * reset and reattach device
>> +     * Loop 2: For SRIOV net host devices used by this domain, unset mac and port profile before
>> +     * resetting and reattaching device

Line is too long. Anyway, a better comment would be:

"Loop 2: restore original network config of hostdevs that used
<interface type='hostdev'>"

>>       */
>> -    for (i = 0; i < nhostdevs; i++)
>> -        virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir,
>> -                                   oldStateDir);
>> +    for (i = 0; i < nhostdevs; i++) {
>> +        virDomainHostdevDefPtr hostdev = hostdevs[i];
>>  
>> +        if (virHostdevIsPCINetDevice(hostdev)) {
>> +            virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci;
>> +            virPCIDevicePtr dev = NULL;
>> +            dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus,
>> +                                  pcisrc->addr.slot, pcisrc->addr.function);
>> +            if (dev) {


(This whole "virPCIDeviceNew()" thing would be completely unnecessary if
virPCIDeviceListFind() just used virPCIDeviceAddress instead of
virPCIDevice, and virDevicePCIAddress (what's in the hostdev entry) also
used virPCIDeviceAddress instead of adding the domain/bus/slot/function
separately. But again, that's not the fault of this patch; just nagging
about the general mess...)


>> +                if (virPCIDeviceListFind(pcidevs, dev)) {
>> +                    virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir,
>> +                                               oldStateDir);
>> +                }
>> +            } else {
>> +                virErrorPtr err = virGetLastError();
>> +                VIR_ERROR(_("Failed to new PCI device: %s"),
>> +                          err ? err->message : _("unknown error"));
>> +                virResetError(err);

I guess you're adding this VIR_ERROR() because that's what was done in
Loop 3 in case of error. I think both are unnecessary and should be
removed/not added. There are times when this function is being called as
the result of some other error that already occurred, and we don't want
to mask that in any way (which leads us to discussing the idea that
whoever calls us to clean up after an error need to save their error
before calling us, and restore it after we're done, but that's *another*
story.

I would say remove the else clause here, and we can remove it from loop
3 in a later cleanup patch.

> +            }
> +            virPCIDeviceFree(dev);
> +        }
> +    }


This is getting more and more confusing. First we matched up the domains
"hostdevs" list to the HostdevMgr's activePCIHostDevs list to get a list
of the devices we want to do something with ("pcidevs"). This means that
we know for a fact that everything on "pcidevs" is a device that was on
the hostdevs list *and* that it was listed as actively in use by "some
domain". Then we further narrow down that list by verifying that each
device in pcidevs is listed as in use by *this* domain (and this driver,
in case two drivers use the same domain name).

So now we have a list of devices that we know are in use by our domain.
And what does this new code do? It goes through every item in the
hostdevs list of the domain to verify that it is on the pcidevs list,
then performs an operation on it. But we already know that the pcidevs
list only contains devices that are listed in hostdevs, so why not do
the loop based on the pcidevs list?

Oh, I get it - it's because we need the info from the hostdev entry
(which is our only pointer to the parent NetDef) in order to perform the
necessary NetConfigRestore). Bah. Okay, I accept defeat (for now :-P)


>> +
>> +    /* Loop 3: reset pci device used by this domain */

"Loop 3: perform a PCI Reset on all devices"

(it's been pretty well established by now that we're only operating on
devices that were used by this domain)

(By the way, I don't see the point of the extra VIR_ERROR in this loop
either, although again that isn't the problem of this patch...)

>>      for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) {
>>          virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
>>  
>> @@ -834,6 +855,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr,
>>          }
>>      }
>>  
>> +    /* Loop 4: reattach pci devices used by this domain
>> +     * and steal all the devices from pcidevs
>> +     */

"Loop 4: reattach devices to their host drivers (if managed) or place
them on the inactive list (if not managed)"

>>      while (virPCIDeviceListCount(pcidevs) > 0) {
>>          virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0);
>>          virHostdevReattachPCIDevice(dev, hostdev_mgr);

At the end, it looks like this patch will do what was intended. I would
say ACK on the following conditions:

1) splitting into 2 patches as John suggests.

2) changing the comments as suggested above

3) removing the VIR_ERROR() from loop 2.

(At some point in the future, all of the code surrounding this patch
should be rewritten, but today is not that day)




More information about the libvir-list mailing list