[libvirt] [PATCHv2] pci: properly handle out-of-order SRIOV virtual functions

Niilona niilona at gmail.com
Thu Nov 7 14:06:45 UTC 2013


Hi forget this as approach works using kernel-patch.
Correctly was mentioned that the approach using leading-zeroes was a
try to fix something happened in other place.
Haven't get intention to libvirt side after that, wrote untested patch
for this symptom.
It's is not complete - no more error checking's.
But it should pre-select only "virtfn" entries to be handled in
while-loop, and after that scan through them in increasing numerical
order.

Still, here it is :

< /* routine to select only "virtfn" -entries */
< static int
< virtfn_select(const struct dirent *entry)
< {
<     return (strncmp(entry->d_name,"virtfn", 6) == 0) ? 1 : 0;
< }
<
2401a2395,2396
>     DIR *dir = NULL;
>     struct dirent *entry = NULL;
2404,2406d2398
<     struct dirent **namelist;
<     int entry_count = 0;
<     int current_entry = 0;
2414,2415c2406,2407
<     struct stat sb;
<     if ((stat(sysfs_path, &sb) == -1) || ((sb.st_mode & S_IFMT) != S_IFDIR)) {
---
>     dir = opendir(sysfs_path);
>     if (dir == NULL) {
2423,2425c2415,2416
<     entry_count = scandir(sysfs_path, &namelist, virtfn_select, alphasort);
<
<     while ( current_entry < entry_count ) {
---
>     while ((entry = readdir(dir))) {
>         if (STRPREFIX(entry->d_name, "virtfn")) {
2428c2419
<             if (virBuildPath(&device_link, sysfs_path, namelist[
current_entry ]->d_name ) == -1) {
---
>             if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) {
2432c2423
<             current_entry ++;
---
>
2453a2445
>         }
2460,2462c2452,2453
<     for ( i = 0; i < entry_count; i ++)
<       free( namelist[ i ] );
<     free( namelist );
---
>     if (dir)
>         closedir(dir);

On Thu, Nov 7, 2013 at 2:32 PM, Ján Tomko <jtomko at redhat.com> wrote:
> On 11/07/2013 12:10 PM, Laine Stump wrote:
>> This resolves:
>>
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1025397
>>
>> When libvirt creates the list of an SRIOV Physical Function's (PF)
>> Virtual Functions (VF), it assumes that the order of "virtfn*" links
>> returned by readdir() from the PF's sysfs directory is already in the
>> correct order. Experience has shown that this is not always the case.
>>
>> This results in 1) incorrect assumptions made by consumers of the
>> output of the virt_functions list of virsh nodedev-dumpxml, and 2)
>> setting MAC address and vlan tag on the wrong VF (since libvirt uses
>> netlink to set mac address and vlan tag, netlink requires the VF#, and
>> the function virPCIGetVirtualFunctionIndex() returns the wrong index
>> due to the improperly ordered VF list). See the bugzilla report for an
>> example of this improper behavior.
>>
>> The solution provided by this patch is for virPCIGetVirtualFunctions
>> to first gather all the "virtfn*" names from the PFs sysfs directory,
>> then allocate a virtual_functions array large enough to hold all
>> entries, and finally to create a device entry for each virtfn* name
>> and place it into the virtual_functions array at the proper index
>> (based on interpreting the value following "virtfn" in the name).
>>
>> Checks are introduced to ensure that 1) the virtfn list given in sysfs
>> is not sparse (ie, there are no indexes larger than the length of the
>> list), and that no two virtfn* entries decode to the same index.
>> ---
>>
>> Difference from V1: Based on jtomko's review, truncate trailing NULLs
>> in virtual_functions array, and actually check for NULLs in the middle
>> (a sparse array) and generate an error in that case, as we already
>> claimed to do in the commit message.
>>
>> (Sorry for the lack of inter-diff, but the only changes from V1 are
>> the addition of the lines starting with "truncate any trailing nulls"
>> and continuing until the next VIR_DEBUG.)
>>
>>  src/util/virpci.c | 115 ++++++++++++++++++++++++++++++++++++++++++------------
>>  1 file changed, 90 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/util/virpci.c b/src/util/virpci.c
>> index 148631f..f744c8b 100644
>> --- a/src/util/virpci.c
>> +++ b/src/util/virpci.c
>> @@ -2379,6 +2379,8 @@ virPCIGetPhysicalFunction(const char *vf_sysfs_path,
>>      return ret;
>>  }
>>
>> +static const char *virtfnPrefix = "virtfn";
>> +
>>  /*
>>   * Returns virtual functions of a physical function
>>   */
>> @@ -2393,6 +2395,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path,
>>      struct dirent *entry = NULL;
>>      char *device_link = NULL;
>>      char errbuf[64];
>> +    char **virtfnNames = NULL;
>> +    size_t nVirtfnNames = 0;
>>
>>      VIR_DEBUG("Attempting to get SR IOV virtual functions for device"
>>                "with sysfs path '%s'", sysfs_path);
>> @@ -2411,51 +2415,112 @@ virPCIGetVirtualFunctions(const char *sysfs_path,
>>
>>      while ((entry = readdir(dir))) {
>>          if (STRPREFIX(entry->d_name, "virtfn")) {
>> -            virPCIDeviceAddress *config_addr = NULL;
>> +            char *tempName;
>>
>> -            if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) {
>> -                virReportOOMError();
>> +            /* save these to sort into virtual_functions array */
>> +            if (VIR_STRDUP(tempName, entry->d_name) < 0)
>> +                goto error;
>> +            if (VIR_APPEND_ELEMENT(virtfnNames, nVirtfnNames, tempName) < 0) {
>> +                VIR_FREE(tempName);
>>                  goto error;
>>              }
>> +        }
>> +    }
>
> Wouldn't it be easier to just count the entries beginning with "virtfn" and
> then go from 0 to n? (This would need a special return value for
> virPCIGetDeviceAddressFromSysfsLink if the link does not exist, if the entries
> in the directory can truly be sparse, or a virtfn-poorly-chosen-name appears)
>
>>
>> -            VIR_DEBUG("Number of virtual functions: %d",
>> -                      *num_virtual_functions);
>> +    /* pre-allocate because we will be filling in out of order */
>> +    if (nVirtfnNames && VIR_ALLOC_N(*virtual_functions, nVirtfnNames) < 0)
>> +        goto error;
>> +    *num_virtual_functions = nVirtfnNames;
>>
>> -            if (virPCIGetDeviceAddressFromSysfsLink(device_link,
>> -                                                    &config_addr) !=
>> -                SRIOV_FOUND) {
>> -                VIR_WARN("Failed to get SRIOV function from device "
>> -                         "link '%s'", device_link);
>> -                VIR_FREE(device_link);
>> -                continue;
>> -            }
>> +    for (i = 0; i < nVirtfnNames; i++) {
>> +        virPCIDeviceAddress *config_addr = NULL;
>> +        unsigned int virtfnIndex;
>>
>> -            if (VIR_REALLOC_N(*virtual_functions,
>> -                              *num_virtual_functions + 1) < 0) {
>> -                VIR_FREE(config_addr);
>> -                goto error;
>> -            }
>> +        if (virBuildPath(&device_link, sysfs_path, virtfnNames[i]) < 0) {
>> +            virReportOOMError();
>> +            goto error;
>> +        }
>> +
>> +        if (virStrToLong_ui(virtfnNames[i] + strlen(virtfnPrefix),
>> +                            0, 10, &virtfnIndex) < 0) {
>> +            virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                           _("Invalid virtual function link name '%s' "
>> +                             "in physical function directory '%s'"),
>> +                           virtfnNames[i], sysfs_path);
>> +            goto error;
>> +        }
>
> This rejects "virtfn-poor-name-choice" (which is fine by me, it's really a
> poor choice).
>
>> +
>> +        if (virtfnIndex >= nVirtfnNames) {
>> +            virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                           _("Virtual function link name '%s' larger than "
>> +                             "total count of virtual functions %zu "
>> +                             "in physical function directory '%s'"),
>> +                           virtfnNames[i], nVirtfnNames, sysfs_path);
>> +            goto error;
>> +        }
>> +
>
>> +        if ((*virtual_functions)[virtfnIndex]) {
>> +            virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                           _("Virtual function link name '%s' "
>> +                             "generates duplicate index %zu "
>> +                             "in physical function directory '%s'"),
>> +                           virtfnNames[i], nVirtfnNames, sysfs_path);
>> +            goto error;
>> +        }
>
> Can there really be two entries in a directory with the same name?
>
>>
>> -            (*virtual_functions)[*num_virtual_functions] = config_addr;
>> -            (*num_virtual_functions)++;
>> +        if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) !=
>> +            SRIOV_FOUND) {
>
> This also ignores OOM errors.
>
>> +            VIR_WARN("Failed to get SRIOV function from device link '%s'",
>> +                     device_link);
>>              VIR_FREE(device_link);
>> +            continue;
>>          }
>
>> +
>> +        VIR_DEBUG("Found virtual function %d", virtfnIndex);
>> +        (*virtual_functions)[virtfnIndex] = config_addr;
>> +        VIR_FREE(device_link);
>>      }
>
>
>
>> +    /* truncate any trailing nulls due to files having names starting
>> +     * with "virtfn" which weren't actually a link to a virtual
>> +     * function.
>> +     */
>> +    while (*num_virtual_functions &&
>> +           !(*virtual_functions)[(*num_virtual_functions) - 1])
>> +        (*num_virtual_functions)--;
>> +
>> +    /* failure of realloc when shrinking is safe */
>> +    ignore_value(VIR_REALLOC_N(*virtual_functions, *num_virtual_functions));
>> +
>
> This treats contiguous missing entries at the end differently than the missing
> ones at the beginning, are the virtfns with larger indexes less important? :)
>
>> +    /* if there are any NULLs left in the array, fail and log an error
>> +     * - the virtual function array cannot be sparse.
>> +     */
>> +    for (i = 0; i < *num_virtual_functions; i++) {
>> +        if (!(*virtual_functions)[*num_virtual_functions]) {
>> +            virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                           _("Missing virtual function link for VF %zu "
>> +                             "in physical function directory '%s'"),
>> +                           i, sysfs_path);
>
> In other words, virPCIGetDeviceAddressFromSysfsLink failed. If we really need
> to ignore entries that aren't a symlink with a PCI address, the array has to
> be sparse and we need to fix all the users :(
>
>> +            goto error;
>> +        }
>> +    }
>> +
>> +    VIR_DEBUG("Number of virtual functions: %d", *num_virtual_functions);
>
> Jan
>
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list



-- 
---------------------------------
Niilo Minkkinen
niilo.minkkinen at iki.fi

“Brittiläisen tutkimuksen mukaan pyöräily pidentää elinajan odotetta
peräti 20 kertaa enemmän kuin onnettomuusriskin kohoaminen sitä
lyhentää! (BMA, British Medical Association. Cycling: towards health &
safety. Oxford University Press, 1992).
---------------------------------




More information about the libvir-list mailing list