[libvirt] [PATCHv2] pci: properly handle out-of-order SRIOV virtual functions
Niilona
niilona at gmail.com
Thu Nov 7 14:06:45 UTC 2013
Hi forget this as approach works using kernel-patch.
Correctly was mentioned that the approach using leading-zeroes was a
try to fix something happened in other place.
Haven't get intention to libvirt side after that, wrote untested patch
for this symptom.
It's is not complete - no more error checking's.
But it should pre-select only "virtfn" entries to be handled in
while-loop, and after that scan through them in increasing numerical
order.
Still, here it is :
< /* routine to select only "virtfn" -entries */
< static int
< virtfn_select(const struct dirent *entry)
< {
< return (strncmp(entry->d_name,"virtfn", 6) == 0) ? 1 : 0;
< }
<
2401a2395,2396
> DIR *dir = NULL;
> struct dirent *entry = NULL;
2404,2406d2398
< struct dirent **namelist;
< int entry_count = 0;
< int current_entry = 0;
2414,2415c2406,2407
< struct stat sb;
< if ((stat(sysfs_path, &sb) == -1) || ((sb.st_mode & S_IFMT) != S_IFDIR)) {
---
> dir = opendir(sysfs_path);
> if (dir == NULL) {
2423,2425c2415,2416
< entry_count = scandir(sysfs_path, &namelist, virtfn_select, alphasort);
<
< while ( current_entry < entry_count ) {
---
> while ((entry = readdir(dir))) {
> if (STRPREFIX(entry->d_name, "virtfn")) {
2428c2419
< if (virBuildPath(&device_link, sysfs_path, namelist[
current_entry ]->d_name ) == -1) {
---
> if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) {
2432c2423
< current_entry ++;
---
>
2453a2445
> }
2460,2462c2452,2453
< for ( i = 0; i < entry_count; i ++)
< free( namelist[ i ] );
< free( namelist );
---
> if (dir)
> closedir(dir);
On Thu, Nov 7, 2013 at 2:32 PM, Ján Tomko <jtomko at redhat.com> wrote:
> On 11/07/2013 12:10 PM, Laine Stump wrote:
>> This resolves:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1025397
>>
>> When libvirt creates the list of an SRIOV Physical Function's (PF)
>> Virtual Functions (VF), it assumes that the order of "virtfn*" links
>> returned by readdir() from the PF's sysfs directory is already in the
>> correct order. Experience has shown that this is not always the case.
>>
>> This results in 1) incorrect assumptions made by consumers of the
>> output of the virt_functions list of virsh nodedev-dumpxml, and 2)
>> setting MAC address and vlan tag on the wrong VF (since libvirt uses
>> netlink to set mac address and vlan tag, netlink requires the VF#, and
>> the function virPCIGetVirtualFunctionIndex() returns the wrong index
>> due to the improperly ordered VF list). See the bugzilla report for an
>> example of this improper behavior.
>>
>> The solution provided by this patch is for virPCIGetVirtualFunctions
>> to first gather all the "virtfn*" names from the PFs sysfs directory,
>> then allocate a virtual_functions array large enough to hold all
>> entries, and finally to create a device entry for each virtfn* name
>> and place it into the virtual_functions array at the proper index
>> (based on interpreting the value following "virtfn" in the name).
>>
>> Checks are introduced to ensure that 1) the virtfn list given in sysfs
>> is not sparse (ie, there are no indexes larger than the length of the
>> list), and that no two virtfn* entries decode to the same index.
>> ---
>>
>> Difference from V1: Based on jtomko's review, truncate trailing NULLs
>> in virtual_functions array, and actually check for NULLs in the middle
>> (a sparse array) and generate an error in that case, as we already
>> claimed to do in the commit message.
>>
>> (Sorry for the lack of inter-diff, but the only changes from V1 are
>> the addition of the lines starting with "truncate any trailing nulls"
>> and continuing until the next VIR_DEBUG.)
>>
>> src/util/virpci.c | 115 ++++++++++++++++++++++++++++++++++++++++++------------
>> 1 file changed, 90 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/util/virpci.c b/src/util/virpci.c
>> index 148631f..f744c8b 100644
>> --- a/src/util/virpci.c
>> +++ b/src/util/virpci.c
>> @@ -2379,6 +2379,8 @@ virPCIGetPhysicalFunction(const char *vf_sysfs_path,
>> return ret;
>> }
>>
>> +static const char *virtfnPrefix = "virtfn";
>> +
>> /*
>> * Returns virtual functions of a physical function
>> */
>> @@ -2393,6 +2395,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path,
>> struct dirent *entry = NULL;
>> char *device_link = NULL;
>> char errbuf[64];
>> + char **virtfnNames = NULL;
>> + size_t nVirtfnNames = 0;
>>
>> VIR_DEBUG("Attempting to get SR IOV virtual functions for device"
>> "with sysfs path '%s'", sysfs_path);
>> @@ -2411,51 +2415,112 @@ virPCIGetVirtualFunctions(const char *sysfs_path,
>>
>> while ((entry = readdir(dir))) {
>> if (STRPREFIX(entry->d_name, "virtfn")) {
>> - virPCIDeviceAddress *config_addr = NULL;
>> + char *tempName;
>>
>> - if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) {
>> - virReportOOMError();
>> + /* save these to sort into virtual_functions array */
>> + if (VIR_STRDUP(tempName, entry->d_name) < 0)
>> + goto error;
>> + if (VIR_APPEND_ELEMENT(virtfnNames, nVirtfnNames, tempName) < 0) {
>> + VIR_FREE(tempName);
>> goto error;
>> }
>> + }
>> + }
>
> Wouldn't it be easier to just count the entries beginning with "virtfn" and
> then go from 0 to n? (This would need a special return value for
> virPCIGetDeviceAddressFromSysfsLink if the link does not exist, if the entries
> in the directory can truly be sparse, or a virtfn-poorly-chosen-name appears)
>
>>
>> - VIR_DEBUG("Number of virtual functions: %d",
>> - *num_virtual_functions);
>> + /* pre-allocate because we will be filling in out of order */
>> + if (nVirtfnNames && VIR_ALLOC_N(*virtual_functions, nVirtfnNames) < 0)
>> + goto error;
>> + *num_virtual_functions = nVirtfnNames;
>>
>> - if (virPCIGetDeviceAddressFromSysfsLink(device_link,
>> - &config_addr) !=
>> - SRIOV_FOUND) {
>> - VIR_WARN("Failed to get SRIOV function from device "
>> - "link '%s'", device_link);
>> - VIR_FREE(device_link);
>> - continue;
>> - }
>> + for (i = 0; i < nVirtfnNames; i++) {
>> + virPCIDeviceAddress *config_addr = NULL;
>> + unsigned int virtfnIndex;
>>
>> - if (VIR_REALLOC_N(*virtual_functions,
>> - *num_virtual_functions + 1) < 0) {
>> - VIR_FREE(config_addr);
>> - goto error;
>> - }
>> + if (virBuildPath(&device_link, sysfs_path, virtfnNames[i]) < 0) {
>> + virReportOOMError();
>> + goto error;
>> + }
>> +
>> + if (virStrToLong_ui(virtfnNames[i] + strlen(virtfnPrefix),
>> + 0, 10, &virtfnIndex) < 0) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + _("Invalid virtual function link name '%s' "
>> + "in physical function directory '%s'"),
>> + virtfnNames[i], sysfs_path);
>> + goto error;
>> + }
>
> This rejects "virtfn-poor-name-choice" (which is fine by me, it's really a
> poor choice).
>
>> +
>> + if (virtfnIndex >= nVirtfnNames) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + _("Virtual function link name '%s' larger than "
>> + "total count of virtual functions %zu "
>> + "in physical function directory '%s'"),
>> + virtfnNames[i], nVirtfnNames, sysfs_path);
>> + goto error;
>> + }
>> +
>
>> + if ((*virtual_functions)[virtfnIndex]) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + _("Virtual function link name '%s' "
>> + "generates duplicate index %zu "
>> + "in physical function directory '%s'"),
>> + virtfnNames[i], nVirtfnNames, sysfs_path);
>> + goto error;
>> + }
>
> Can there really be two entries in a directory with the same name?
>
>>
>> - (*virtual_functions)[*num_virtual_functions] = config_addr;
>> - (*num_virtual_functions)++;
>> + if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) !=
>> + SRIOV_FOUND) {
>
> This also ignores OOM errors.
>
>> + VIR_WARN("Failed to get SRIOV function from device link '%s'",
>> + device_link);
>> VIR_FREE(device_link);
>> + continue;
>> }
>
>> +
>> + VIR_DEBUG("Found virtual function %d", virtfnIndex);
>> + (*virtual_functions)[virtfnIndex] = config_addr;
>> + VIR_FREE(device_link);
>> }
>
>
>
>> + /* truncate any trailing nulls due to files having names starting
>> + * with "virtfn" which weren't actually a link to a virtual
>> + * function.
>> + */
>> + while (*num_virtual_functions &&
>> + !(*virtual_functions)[(*num_virtual_functions) - 1])
>> + (*num_virtual_functions)--;
>> +
>> + /* failure of realloc when shrinking is safe */
>> + ignore_value(VIR_REALLOC_N(*virtual_functions, *num_virtual_functions));
>> +
>
> This treats contiguous missing entries at the end differently than the missing
> ones at the beginning, are the virtfns with larger indexes less important? :)
>
>> + /* if there are any NULLs left in the array, fail and log an error
>> + * - the virtual function array cannot be sparse.
>> + */
>> + for (i = 0; i < *num_virtual_functions; i++) {
>> + if (!(*virtual_functions)[*num_virtual_functions]) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + _("Missing virtual function link for VF %zu "
>> + "in physical function directory '%s'"),
>> + i, sysfs_path);
>
> In other words, virPCIGetDeviceAddressFromSysfsLink failed. If we really need
> to ignore entries that aren't a symlink with a PCI address, the array has to
> be sparse and we need to fix all the users :(
>
>> + goto error;
>> + }
>> + }
>> +
>> + VIR_DEBUG("Number of virtual functions: %d", *num_virtual_functions);
>
> Jan
>
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
--
---------------------------------
Niilo Minkkinen
niilo.minkkinen at iki.fi
“Brittiläisen tutkimuksen mukaan pyöräily pidentää elinajan odotetta
peräti 20 kertaa enemmän kuin onnettomuusriskin kohoaminen sitä
lyhentää! (BMA, British Medical Association. Cycling: towards health &
safety. Oxford University Press, 1992).
---------------------------------
More information about the libvir-list
mailing list