[libvirt] [PATCH v4 1/2] qemu_domain: NVLink2 bridge detection function for PPC64

David Gibson dgibson at redhat.com
Wed Apr 3 23:25:27 UTC 2019


On Tue, 2 Apr 2019 10:34:27 +0200
Erik Skultety <eskultet at redhat.com> wrote:

> On Tue, Mar 12, 2019 at 06:55:49PM -0300, Daniel Henrique Barboza wrote:
> > The NVLink2 support in QEMU implements the detection of NVLink2
> > capable devices by verifying the attributes of the VFIO mem region
> > QEMU allocates for the NVIDIA GPUs. To properly allocate an
> > adequate amount of memLock, Libvirt needs this information before
> > a QEMU instance is even created, thus querying QEMU is not
> > possible and opening a VFIO window is too much.
> >
> > An alternative is presented in this patch. Making the following
> > assumptions:
> >
> > - if we want GPU RAM to be available in the guest, an NVLink2 bridge
> > must be passed through;
> >
> > - an unknown PCI device can be classified as a NVLink2 bridge
> > if its device tree node has 'ibm,gpu', 'ibm,nvlink',
> > 'ibm,nvlink-speed' and 'memory-region'.  
> 
> Alexey mentioned that it should be enough to check for the properties ^above.
> I'm just wondering, knowing this is IBM's PPC8/9 if the assumptions we have
> made are going to stay with further revisions of PPC, NVLink and GPUs, IOW
> we need to be sure "ibm,nvlink" won't be renamed with further revisions, e.g.
> other cards than V100 in the future, because then compatibility and revision
> selection comes into the picture.
> 
> >
> > This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
> > that checks the device tree node of a given PCI device and
> > check if it meets the criteria to be a NVLink2 bridge. This  
> 
> Just out of curiosity, what about NVLink 1.0? Apart from performance, I wasn't
> able to find something useful in terms of compatibility, is there something to
> consider, since we're only relying on NVLink 2.0?

Probably not.  NVLink 1 is entirely incompatible, and AFAIK support for
it has been postponed until probably never.

> > new function will be used in a follow-up patch that, using the
> > first assumption, will set up the rlimits of the guest
> > accordingly.
> >
> > Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
> > ---
> >  src/qemu/qemu_domain.c | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> >
> > diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> > index 1659e88478..dcc92d253c 100644
> > --- a/src/qemu/qemu_domain.c
> > +++ b/src/qemu/qemu_domain.c
> > @@ -10398,6 +10398,35 @@ qemuDomainUpdateCurrentMemorySize(virDomainObjPtr vm)
> >  }
> >
> >
> > +/**
> > + * ppc64VFIODeviceIsNV2Bridge:
> > + * @device: string with the PCI device address
> > + *
> > + * This function receives a string that represents a PCI device,
> > + * such as '0004:04:00.0', and tells if the device is a NVLink2
> > + * bridge.
> > + */
> > +static bool
> > +ppc64VFIODeviceIsNV2Bridge(const char *device)
> > +{
> > +    const char *nvlink2Files[] = {"ibm,gpu", "ibm,nvlink",
> > +                                  "ibm,nvlink-speed", "memory-region"};  
> 
> Like I said above, if ^this is not to change, then I'm okay in principle, still
> I feel like this needs David's ACK (putting him on CC)
> 
> Codewise, Peter already provided you with comments.

Hm, I don't feel I can answer this definitively.  My guess would be to
ask Alexey, and AFAIK you already have information from him.

-- 
David Gibson <dgibson at redhat.com>
Principal Software Engineer, Virtualization, Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20190404/03b24462/attachment-0001.sig>


More information about the libvir-list mailing list