[libvirt] [PATCH 1/1] qemu: host NUMA hugepage policy without guest NUMA
Peter Krempa
pkrempa at redhat.com
Wed Oct 12 07:48:07 UTC 2016
On Wed, Oct 12, 2016 at 15:04:53 +1100, Sam Bobroff wrote:
> At the moment, guests that are backed by hugepages in the host are
> only able to use policy to control the placement of those hugepages
> on a per-(guest-)CPU basis. Policy applied globally is ignored.
>
> Such guests would use <memoryBacking><hugepages/></memoryBacking> and
> a <numatune> block with <memory mode=... nodeset=.../> but no <memnode
> .../> elements.
>
> This patch corrects this by, in this specific case, changing the QEMU
> command line from "-mem-prealloc -mem-path=..." (which cannot
> specify NUMA policy) to "-object memory-backend-file ..." (which can).
>
> Note: This is not visible to the guest and does not appear to create
> a migration incompatibility.
>
> Signed-off-by: Sam Bobroff <sam.bobroff at au1.ibm.com>
> ---
> There was some discussion leading up to this patch, here:
>
> https://www.redhat.com/archives/libvir-list/2016-October/msg00033.html
>
> During that discussion it seemed that fixing this issue would cause
> migration incompatibility but after some careful testing, it appears
> that it only does when a memory-backend object is attached to a guest
> NUMA node and that is not the case here. If only one is created, and
> used globally (not attached via mem=<id>), the migration data does not
> seem to be changed and so it seems reasonable to try something like
> this patch.
>
> This patch does work for my test cases but I don't claim a deep
> understanding of the libvirt code so this is at least partly a RFC.
> Comments welcome :-)
>
> Cheers,
> Sam.
>
> src/qemu/qemu_command.c | 55 ++++++++++++++++++++++++++++++++++++++++---------
> src/qemu/qemu_command.h | 2 +-
> 2 files changed, 46 insertions(+), 11 deletions(-)
>
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 0804133..c28c8f2 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3143,7 +3143,7 @@ qemuBuildMemoryBackendStr(unsigned long long size,
> int guestNode,
> virBitmapPtr userNodeset,
> virBitmapPtr autoNodeset,
> - virDomainDefPtr def,
> + const virDomainDefPtr def,
This does not work as expected. Using 'const' with types that hide the
pointer makes the pointer const and not the structure that the pointer
points to.
> virQEMUCapsPtr qemuCaps,
> virQEMUDriverConfigPtr cfg,
> const char **backendType,
> @@ -7129,12 +7129,18 @@ qemuBuildSmpCommandLine(virCommandPtr cmd,
>
> static int
> qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
> - const virDomainDef *def,
> + const virDomainDefPtr def,
Same as above. This change is wrong.
> virQEMUCapsPtr qemuCaps,
> - virCommandPtr cmd)
> + virCommandPtr cmd,
> + virBitmapPtr auto_nodeset)
> {
> const long system_page_size = virGetSystemPageSizeKB();
> char *mem_path = NULL;
> + virBitmapPtr nodemask = NULL;
> + const char *backendType = NULL;
> + char *backendStr = NULL;
> + virJSONValuePtr props;
> + int rv = -1;
>
> /*
> * No-op if hugepages were not requested.
> @@ -7159,18 +7165,47 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
> if (qemuGetHupageMemPath(cfg, def->mem.hugepages[0].size, &mem_path) < 0)
> return -1;
>
> - virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
> + if (virDomainNumatuneMaybeGetNodeset(def->numa, auto_nodeset,
> + &nodemask, -1) < 0)
> + return -1;
> + if (nodemask && virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
> + props = virJSONValueNewObject();
This is leaked, qemuBuildMemoryBackendStr allocates one itself.
> + if (qemuBuildMemoryBackendStr(virDomainDefGetMemoryInitial(def),
> + 0, -1, NULL, auto_nodeset,
> + def, qemuCaps, cfg, &backendType,
> + &props, false) < 0)
> + goto cleanup;
> + if (!(backendStr = virQEMUBuildObjectCommandlineFromJSON(backendType,
> + "mem",
> + props)))
> + goto cleanup;
So you create a memory object that does not get used ...
> + virCommandAddArgList(cmd, "-object", backendStr, NULL);
So does qemu use memory objects that are not used magically as it's
memory? That seems odd. Could you please elaborate on this part and how
it's supposed to work? Without a proper elaboration how this is supposed
to work with qemu this patch is not acceptable.
> + rv = 0;
> +cleanup:
This won't pass syntax check. Labels have to be spaced out.
> + VIR_FREE(backendStr);
> + VIR_FREE(props);
This is a JSON value object, you need to free it with the proper
function.
> + }
> + else {
This breaks coding style.
> + if (nodemask || 1)
Erm, a tautology?
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Memory file backend objects are "
> + "unsupported by QEMU binary. Global NUMA "
> + "hugepage policy will be ignored."));
Reporting an error but not terminating the startup is not acceptable.
Errors should not be reported unless something fails.
> + virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
> + rv = 0;
> + }
> VIR_FREE(mem_path);
>
> - return 0;
> + return rv;
> }
>
Missing a change to the test files. Please make sure that you run 'make
check' and 'make syntax-check' and send only patches that pass those.
Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20161012/dd05053f/attachment-0001.sig>
More information about the libvir-list
mailing list