[libvirt] [PATCH V2] nodedev: Fix failing to parse PCI address for non-PCI network devices

Erik Skultety eskultet at redhat.com
Tue Jan 9 11:35:43 UTC 2018


On Mon, Jan 08, 2018 at 10:08:59AM -0700, Jim Fehlig wrote:
> Based loosely on a patch from Fei Li <fli at suse.com>.
>
> Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
> device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
> to retrieve the PCI device associated with the network device, ignoring
> non-PCI devices. It does so via the following call chain
>
>   virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
>   virPCIGetDeviceAddressFromSysfsLink()
>
> For non-PCI network devices (qeth, Xen vif, etc),
> virPCIGetDeviceAddressFromSysfsLink() will report an error when
> virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
> logs an error. After commit 8708ca01c there are now two errors reported
> for each non-PCI network device even though the errors are harmless.
>
> To avoid changing virPCIGetDeviceAddressFromSysfsLink(),
> virPCIDeviceAddressParse() and related code that has quite a few
> call sites, add a function to check if a network device is a
> PCI device and use it in virNetDevSwitchdevFeature() before attempting
> to retrieve the PCI device.
> ---
>
> Suggestions welcome on a better name for virPCIIsPCIDevice, or even
> a better approach to solving this issue. Currently virPCIIsPCIDevice
> assumes the device is PCI if its subsystem starts with 'pci'. I'd be
> happy to hear if there is a better way to determine if a network
> device is PCI.

Overall I like this approach, but see my comments below. I'm also curious about
some points I raised.

>
>  src/libvirt_private.syms |  1 +
>  src/util/virnetdev.c     | 25 ++++++++++++++++++++++---
>  src/util/virpci.c        | 32 ++++++++++++++++++++++++++++++++
>  src/util/virpci.h        |  3 +++
>  4 files changed, 58 insertions(+), 3 deletions(-)
>
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index a705fa846..bdf98ded1 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2458,6 +2458,7 @@ virPCIGetVirtualFunctionInfo;
>  virPCIGetVirtualFunctions;
>  virPCIHeaderTypeFromString;
>  virPCIHeaderTypeToString;
> +virPCIIsPCIDevice;
>  virPCIIsVirtualFunction;
>  virPCIStubDriverTypeFromString;
>  virPCIStubDriverTypeToString;
> diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
> index eb2d119bf..a9af08797 100644
> --- a/src/util/virnetdev.c
> +++ b/src/util/virnetdev.c
> @@ -1148,6 +1148,21 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname,
>  }
>
>
> +static bool
> +virNetDevIsPCIDevice(const char *devName)
> +{
> +    char *vfSysfsDevicePath = NULL;
> +    bool ret;
> +
> +    if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device/subsystem") < 0)
> +        return false;

See below for ^this.

> +
> +    ret = virPCIIsPCIDevice(vfSysfsDevicePath);
> +    VIR_FREE(vfSysfsDevicePath);
> +    return ret;
> +}
> +
> +
>  static virPCIDevicePtr
>  virNetDevGetPCIDevice(const char *devName)
>  {
> @@ -3236,14 +3251,18 @@ virNetDevSwitchdevFeature(const char *ifname,
>      if (is_vf == 1 && virNetDevGetPhysicalFunction(ifname, &pfname) < 0)
>          goto cleanup;
>
> -    pci_device_ptr = pfname ? virNetDevGetPCIDevice(pfname) :
> -                              virNetDevGetPCIDevice(ifname);

I'd probably leave this as is, virNetDevGetPCIDevice already calls the
virNetDevSysfsFile, then you'd have something like virNetDevIsPCIDevice instead
of virPCIisPCIDevice (since those methods already assume a PCI device which we
don't know at this point). Something like this:

virNetDevIsPCIDevice(const char *device)
{
    char *subsystem_link;

    virAsprintf(&subsystem_link, "%s/subsystem", device);

    <body of your virPCIIsPCIDevice>
}

virNetDevGetPCIDevice
{
    if (virNetDevSysfsFile(&vfSysfsDevicePath, devname, "device") < 0)
        goto cleanup;

    if (virNetDevIsPCIDevice(vfSysfsDevicePath) < 0)
        goto cleanup;

        ....
}


> +    if (pfname == NULL && VIR_STRDUP(pfname, ifname) < 0)
> +        goto cleanup;
> +

^This hunk would be unnecessary then.

>      /* No PCI device, then no feature bit to check/add */
> -    if (pci_device_ptr == NULL) {
> +    if (!virNetDevIsPCIDevice(pfname)) {

^This one too...

>          ret = 0;
>          goto cleanup;
>      }
>
> +    if ((pci_device_ptr = virNetDevGetPCIDevice(pfname)) == NULL)
> +        goto cleanup;
> +
>      if (!(nl_msg = nlmsg_alloc_simple(family_id,
>                                        NLM_F_REQUEST | NLM_F_ACK))) {
>          virReportOOMError();
> diff --git a/src/util/virpci.c b/src/util/virpci.c
> index fe57bef32..f22d89cd7 100644
> --- a/src/util/virpci.c
> +++ b/src/util/virpci.c
> @@ -2651,6 +2651,38 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link)
>      return bdf;
>  }
>
> +/**
> + * virPCIIsPCIDevice:
> + * @device_link: sysfs path for the device
> + *
> + * Returns true if the device specified in @device_link sysfs
> + * path is a PCI device, otherwise returns false.
> + */
> +bool
> +virPCIIsPCIDevice(const char *device_link)

This could still be part of the virnetdev module and be called
virNetDevIsPCIDevice.

> +{
> +    char *device_path = NULL;
> +    char *subsys = NULL;
> +    bool ret;
> +
> +    if (!virFileExists(device_link)) {
> +        VIR_DEBUG("'%s' does not exist", device_link);

This indicates a failure so it should be and error, but we don't care about the
failure, changing the level of the message to debug looks odd, because you just
don't want it to show up in the log, yet when debugging is ON, this would hint
that we failed to do something for some reason, rather than indicate the
codeflow we took. I don't know, I wouldn't abuse the log levels like this, as I
said, we don't care about the existence of the link, so no message is fine.

> +        return false;
> +    }
> +
> +    device_path = canonicalize_file_name(device_link);
> +    if (device_path == NULL) {
> +        VIR_DEBUG("Failed to resolve device link '%s'", device_link);

^This is an error, we have a link, but we failed to resolve, we do it in a few
places already. Also, you should use virFileResolveLink for these purposes to
get the absolute path.

> +        return false;
> +    }
> +
> +    subsys = last_component(device_path);
> +    ret = STRPREFIX(subsys, "pci");

So, I see 2 options, one would be to somehow call back to the nodedev driver
and ask for the parent device and its capabilities, from which you'd check PCI.
I checked and we don't do anything like that yet and I do believe it would be a
significant non-trivial amount of work to do. Therefore I think that this
approach is acceptable, I'd do it as well, however, I also found this:
https://www.kernel.org/doc/html/v4.13/admin-guide/sysfs-rules.html

If I understand the "device"-link section correctly, no application should rely
on the 'device' link existence once the device itself can be accessed via
/sys/devices, which in my case and probably yours as well, you can. So if I'm
not mistaken, there is a chance that some platforms might have the pci device
under /sys/devices and so the 'device' link within <x>/net/<iface> wouldn't
exist which would cause the logic of your patch to be flawed. Now, if kernel
assumes that every application should build the whole directory chain starting
at /sys/devices ending with your device just to get to this device's parent is
IMHO madness and I do see a usecase in the backwards pointing symlinks.

> +
> +    VIR_FREE(device_path);
> +    return ret;
> +}
> +




More information about the libvir-list mailing list