[libvirt] [PATCH 1/1] qemu: host NUMA hugepage policy without guest NUMA

Peter Krempa pkrempa at redhat.com
Wed Oct 12 07:48:07 UTC 2016


On Wed, Oct 12, 2016 at 15:04:53 +1100, Sam Bobroff wrote:
> At the moment, guests that are backed by hugepages in the host are
> only able to use policy to control the placement of those hugepages
> on a per-(guest-)CPU basis. Policy applied globally is ignored.
> 
> Such guests would use <memoryBacking><hugepages/></memoryBacking> and
> a <numatune> block with <memory mode=... nodeset=.../> but no <memnode
> .../> elements.
> 
> This patch corrects this by, in this specific case, changing the QEMU
> command line from "-mem-prealloc -mem-path=..." (which cannot
> specify NUMA policy) to "-object memory-backend-file ..." (which can).
> 
> Note: This is not visible to the guest and does not appear to create
> a migration incompatibility.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff at au1.ibm.com>
> ---
> There was some discussion leading up to this patch, here:
> 
> https://www.redhat.com/archives/libvir-list/2016-October/msg00033.html
> 
> During that discussion it seemed that fixing this issue would cause
> migration incompatibility but after some careful testing, it appears
> that it only does when a memory-backend object is attached to a guest
> NUMA node and that is not the case here. If only one is created, and
> used globally (not attached via mem=<id>), the migration data does not
> seem to be changed and so it seems reasonable to try something like
> this patch.
> 
> This patch does work for my test cases but I don't claim a deep
> understanding of the libvirt code so this is at least partly a RFC.
> Comments welcome :-)
> 
> Cheers,
> Sam.
> 
>  src/qemu/qemu_command.c | 55 ++++++++++++++++++++++++++++++++++++++++---------
>  src/qemu/qemu_command.h |  2 +-
>  2 files changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 0804133..c28c8f2 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3143,7 +3143,7 @@ qemuBuildMemoryBackendStr(unsigned long long size,
>                            int guestNode,
>                            virBitmapPtr userNodeset,
>                            virBitmapPtr autoNodeset,
> -                          virDomainDefPtr def,
> +                          const virDomainDefPtr def,

This does not work as expected. Using 'const' with types that hide the
pointer makes the pointer const and not the structure that the pointer
points to.

>                            virQEMUCapsPtr qemuCaps,
>                            virQEMUDriverConfigPtr cfg,
>                            const char **backendType,
> @@ -7129,12 +7129,18 @@ qemuBuildSmpCommandLine(virCommandPtr cmd,
>  
>  static int
>  qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
> -                    const virDomainDef *def,
> +                    const virDomainDefPtr def,

Same as above. This change is wrong.

>                      virQEMUCapsPtr qemuCaps,
> -                    virCommandPtr cmd)
> +                    virCommandPtr cmd,
> +                    virBitmapPtr auto_nodeset)
>  {
>      const long system_page_size = virGetSystemPageSizeKB();
>      char *mem_path = NULL;
> +    virBitmapPtr nodemask = NULL;
> +    const char *backendType = NULL;
> +    char *backendStr = NULL;
> +    virJSONValuePtr props;
> +    int rv = -1;
>  
>      /*
>       *  No-op if hugepages were not requested.
> @@ -7159,18 +7165,47 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
>      if (qemuGetHupageMemPath(cfg, def->mem.hugepages[0].size, &mem_path) < 0)
>          return -1;
>  
> -    virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
> +    if (virDomainNumatuneMaybeGetNodeset(def->numa, auto_nodeset,
> +                                         &nodemask, -1) < 0)
> +        return -1;
> +    if (nodemask && virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
> +        props = virJSONValueNewObject();

This is leaked, qemuBuildMemoryBackendStr allocates one itself.

> +        if (qemuBuildMemoryBackendStr(virDomainDefGetMemoryInitial(def),
> +                                      0, -1, NULL, auto_nodeset,
> +                                      def, qemuCaps, cfg, &backendType,
> +                                      &props, false) < 0)
> +            goto cleanup;
> +        if (!(backendStr = virQEMUBuildObjectCommandlineFromJSON(backendType,
> +                                                                 "mem",
> +                                                                 props)))
> +            goto cleanup;

So you create a memory object that does not get used ...

> +        virCommandAddArgList(cmd, "-object", backendStr, NULL);

So does qemu use memory objects that are not used magically as it's
memory? That seems odd. Could you please elaborate on this part and how
it's supposed to work? Without a proper elaboration how this is supposed
to work with qemu this patch is not acceptable.

> +        rv = 0;
> +cleanup:

This won't pass syntax check. Labels have to be spaced out.

> +        VIR_FREE(backendStr);
> +        VIR_FREE(props);

This is a JSON value object, you need to free it with the proper
function.

> +    }
> +    else {

This breaks coding style.

> +        if (nodemask || 1)

Erm, a tautology?

> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +                           _("Memory file backend objects are "
> +                             "unsupported by QEMU binary. Global NUMA "
> +                             "hugepage policy will be ignored."));

Reporting an error but not terminating the startup is not acceptable.
Errors should not be reported unless something fails.

> +        virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
> +        rv = 0;
> +    }
>      VIR_FREE(mem_path);
>  
> -    return 0;
> +    return rv;
>  }
>  

Missing a change to the test files. Please make sure that you run 'make
check' and 'make syntax-check' and send only patches that pass those.

Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20161012/dd05053f/attachment-0001.sig>


More information about the libvir-list mailing list