[libvirt] [PATCH v4 1/2] qemu_domain: NVLink2 bridge detection function for PPC64

Erik Skultety eskultet at redhat.com
Tue Apr 2 08:34:27 UTC 2019


On Tue, Mar 12, 2019 at 06:55:49PM -0300, Daniel Henrique Barboza wrote:
> The NVLink2 support in QEMU implements the detection of NVLink2
> capable devices by verifying the attributes of the VFIO mem region
> QEMU allocates for the NVIDIA GPUs. To properly allocate an
> adequate amount of memLock, Libvirt needs this information before
> a QEMU instance is even created, thus querying QEMU is not
> possible and opening a VFIO window is too much.
>
> An alternative is presented in this patch. Making the following
> assumptions:
>
> - if we want GPU RAM to be available in the guest, an NVLink2 bridge
> must be passed through;
>
> - an unknown PCI device can be classified as a NVLink2 bridge
> if its device tree node has 'ibm,gpu', 'ibm,nvlink',
> 'ibm,nvlink-speed' and 'memory-region'.

Alexey mentioned that it should be enough to check for the properties ^above.
I'm just wondering, knowing this is IBM's PPC8/9 if the assumptions we have
made are going to stay with further revisions of PPC, NVLink and GPUs, IOW
we need to be sure "ibm,nvlink" won't be renamed with further revisions, e.g.
other cards than V100 in the future, because then compatibility and revision
selection comes into the picture.

>
> This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
> that checks the device tree node of a given PCI device and
> check if it meets the criteria to be a NVLink2 bridge. This

Just out of curiosity, what about NVLink 1.0? Apart from performance, I wasn't
able to find something useful in terms of compatibility, is there something to
consider, since we're only relying on NVLink 2.0?

> new function will be used in a follow-up patch that, using the
> first assumption, will set up the rlimits of the guest
> accordingly.
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
> ---
>  src/qemu/qemu_domain.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 1659e88478..dcc92d253c 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -10398,6 +10398,35 @@ qemuDomainUpdateCurrentMemorySize(virDomainObjPtr vm)
>  }
>
>
> +/**
> + * ppc64VFIODeviceIsNV2Bridge:
> + * @device: string with the PCI device address
> + *
> + * This function receives a string that represents a PCI device,
> + * such as '0004:04:00.0', and tells if the device is a NVLink2
> + * bridge.
> + */
> +static bool
> +ppc64VFIODeviceIsNV2Bridge(const char *device)
> +{
> +    const char *nvlink2Files[] = {"ibm,gpu", "ibm,nvlink",
> +                                  "ibm,nvlink-speed", "memory-region"};

Like I said above, if ^this is not to change, then I'm okay in principle, still
I feel like this needs David's ACK (putting him on CC)

Codewise, Peter already provided you with comments.

Erik


> +    char *file;
> +    size_t i;
> +
> +    for (i = 0; i < 4; i++) {
> +        if ((virAsprintf(&file, "/sys/bus/pci/devices/%s/of_node/%s",
> +                         device, nvlink2Files[i])) < 0)
> +            return false;
> +
> +        if (!virFileExists(file))
> +            return false;
> +    }
> +
> +    return true;
> +}
> +
> +
>  /**
>   * getPPC64MemLockLimitBytes:
>   * @def: domain definition
> --
> 2.20.1
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list




More information about the libvir-list mailing list