[libvirt][PATCH v1 1/1] introduce an attribute "migratable" to numatune memory element

Peter Krempa pkrempa at redhat.com
Tue Oct 6 07:15:29 UTC 2020


On Tue, Oct 06, 2020 at 13:47:25 +0800, Luyao Zhong wrote:
>   <numatune>
>     <memory mode="strict" nodeset="1-4,^3" migratable="yes/no" />
>     <memnode cellid="0" mode="strict" nodeset="1"/>
>     <memnode cellid="2" mode="preferred" nodeset="2"/>
>   </numatune>
> 
> Attribute ``migratable`` will be 'no' by default, and 'yes' indicates
> that it allows operating system or hypervisor migrating the memory
> pages between different memory nodes, that also means we will not
> rely on hypervisor to set the memory policy or memory affinity, we only
> use cgroups to restrict the memory nodes, so if ``migratable`` is 'yes',
> the ``mode`` which indicates memory policy will be ignored.

Note that I'm not reviewing whether this is justified and makes sense.
I'm commenting purely on the code.

> ---
We require that patches are split into logical blocks. You need to split
this commit at least into following patches:
>  docs/formatdomain.rst                         |  8 +++-
>  docs/schemas/domaincommon.rng                 |  5 +++
>  src/conf/numa_conf.c                          | 45 +++++++++++++++++++
>  src/conf/numa_conf.h                          |  3 ++
>  src/libvirt_private.syms                      |  1 +

Docs, schema and parser changes should be separated

>  src/qemu/qemu_command.c                       |  6 ++-
>  src/qemu/qemu_process.c                       | 21 +++++++++
>  .../numatune-memory-migratable.args           | 34 ++++++++++++++
>  .../numatune-memory-migratable.err            |  1 +
>  .../numatune-memory-migratable.xml            | 33 ++++++++++++++
>  tests/qemuxml2argvtest.c                      |  5 +++

Implementation in qemu can be all together with tests. Please also add a
qemuxml2xmltest case.

I've provided some more feedback in a recent review:

https://www.redhat.com/archives/libvir-list/2020-October/msg00231.html

specifically some guidance on tests:

https://www.redhat.com/archives/libvir-list/2020-October/msg00395.html


>  11 files changed, 160 insertions(+), 2 deletions(-)
>  create mode 100644 tests/qemuxml2argvdata/numatune-memory-migratable.args
>  create mode 100644 tests/qemuxml2argvdata/numatune-memory-migratable.err
>  create mode 100644 tests/qemuxml2argvdata/numatune-memory-migratable.xml
> 
> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index cc4f91d4ea..4e386a0c3d 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst

[...]

> @@ -1097,6 +1097,12 @@ NUMA Node Tuning
>     will be ignored if it's specified. If ``placement`` of ``vcpu`` is 'auto',
>     and ``numatune`` is not specified, a default ``numatune`` with ``placement``
>     'auto' and ``mode`` 'strict' will be added implicitly. :since:`Since 0.9.3`
> +   Attribute ``migratable`` is 'no' by default, and 'yes' indicates that it
> +   allows operating system or hypervisor migrating the memory pages between
> +   different memory nodes, that also means we will not rely on hypervisor to
> +   set the memory policy or memory affinity, we only use cgroups to restrict
> +   the memory nodes, so if ``migratable`` is 'yes', the ``mode`` which indicates
> +   memory policy will be ignored.

So ... does this make us ignore the per NUMA-node set 'mode'? If yes,
the code should actually reject any configs where we ignore some
settings rather than just relying on the docs.


>  ``memnode``
>     Optional ``memnode`` elements can specify memory allocation policies per each
>     guest NUMA node. For those nodes having no corresponding ``memnode`` element,

[...]

> diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
> index 6653ba05a6..c14ba1295c 100644
> --- a/src/conf/numa_conf.c
> +++ b/src/conf/numa_conf.c

[...]

> @@ -292,6 +294,15 @@ virDomainNumatuneParseXML(virDomainNumaPtr numa,
>          placement = VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO;
>  
>      if (node) {
> +        if ((tmp = virXMLPropString(node, "migratable")) &&
> +            (migratable = virTristateBoolTypeFromString(tmp)) <= 0) {
> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,

VIR_ERR_XML_ERROR

> +                           _("Invalid 'migratable' attribute value '%s'"), tmp);
> +            goto cleanup;
> +        }
> +        numa->memory.migratable = migratable;
> +        VIR_FREE(tmp);
> +
>          if ((tmp = virXMLPropString(node, "mode")) &&
>              (mode = virDomainNumatuneMemModeTypeFromString(tmp)) < 0) {
>              virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> @@ -369,6 +380,11 @@ virDomainNumatuneFormatXML(virBufferPtr buf,
>          tmp = virDomainNumatuneMemModeTypeToString(numatune->memory.mode);
>          virBufferAsprintf(buf, "<memory mode='%s' ", tmp);
>  
> +        if (numatune->memory.migratable) {

Use an explicit condition here:

if (numatune->memory.migratable != VIR_TRISTATE_BOOL_ABSENT)

> +            tmp = virTristateBoolTypeToString(numatune->memory.migratable);
> +            virBufferAsprintf(buf, "migratable='%s' ", tmp);
> +        }
> +
>          if (numatune->memory.placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC) {
>              if (!(nodeset = virBitmapFormat(numatune->memory.nodeset)))
>                  return -1;
> @@ -427,6 +443,35 @@ virDomainNumaFree(virDomainNumaPtr numa)
>      VIR_FREE(numa);
>  }
>  
> +/**
> + * virDomainNumatuneGetMigratable:
> + * @numatune: pointer to numatune definition
> + * @migratable: where to store the result

This argument is not used by the only place calling it. Please remove
it.

> + *
> + * Get the migratable attribute for domain's memory. It's safe to
> + * pass NULL to @migratable if the return value is the only info needed.

> + *
> + * Returns: migratable value
> + */
> +int virDomainNumatuneGetMigratable(virDomainNumaPtr numatune,
> +                                   virTristateBool *migratable)
> +{
> +    virTristateBool tmp_migratable = 0;
> +
> +    if (!numatune)
> +        return tmp_migratable;

Just directly return VIR_TRISTATE_BOOL_ABSENT here ...

> +
> +    if (numatune->memory.specified)
> +        tmp_migratable = numatune->memory.migratable;
> +    else
> +        return tmp_migratable;
> +
> +    if (migratable != NULL)
> +        *migratable = tmp_migratable;


Also none of this is really needed, just directly return the value.

> +
> +    return tmp_migratable;
> +}
> +
>  /**
>   * virDomainNumatuneGetMode:
>   * @numatune: pointer to numatune definition

[...]

> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 476cf6972e..882e7e6ba2 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3189,7 +3189,11 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>              return -1;
>      }
>  
> -    if (nodemask) {
> +    // If migratable attribute is yes, we should only use cgroups setting
> +    // memory affinity, and skip passing the host-nodes and policy parameters
> +    // to QEMU command line.

We don't use line comments ('//') in our code base. Please use block
comments.

> +    if (nodemask &&
> +        virDomainNumatuneGetMigratable(def->numa, NULL) != VIR_TRISTATE_BOOL_YES) {
>          if (!virNumaNodesetIsAvailable(nodemask))
>              return -1;
>          if (virJSONValueObjectAdd(props,
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 6b5de29fdb..9c1116064b 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c

[...]

> @@ -2694,6 +2695,26 @@ qemuProcessSetupPid(virDomainObjPtr vm,
>                                                  &mem_mask, -1) < 0)
>              goto cleanup;
>  
> +        // For vCPU threads, mem_mask is different among cells
> +        if (nameval == VIR_CGROUP_THREAD_VCPU) {
> +            virDomainNumaPtr numatune = vm->def->numa;
> +            virBitmapPtr numanode_cpumask = NULL;
> +            for (i = 0; i < virDomainNumaGetNodeCount(numatune); i++) {
> +                numanode_cpumask = virDomainNumaGetNodeCpumask(numatune, i);
> +                // 'i' indicates the cell id, if the vCPU id is in this cell,
> +                // we need get the corresonding nodeset
> +                if (virBitmapIsBitSet(numanode_cpumask, id)) {
> +                    if (virDomainNumatuneMaybeFormatNodeset(numatune,
> +                                                            priv->autoNodeset,
> +                                                            &mem_mask, i) < 0) {
> +                        goto cleanup;
> +                    } else {
> +                        break;
> +                    }
> +                }
> +            }
> +        }

This hunk doesn't seem to belong to this patch. The comments (apart from
using wrong syntax) don't seem to justify why or how this relates to
memory migrability.

> +
>          if (virCgroupNewThread(priv->cgroup, nameval, id, true, &cgroup) < 0)
>              goto cleanup;
>  

[...]

> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
> index 49567326d4..e6554518d7 100644
> --- a/tests/qemuxml2argvtest.c
> +++ b/tests/qemuxml2argvtest.c
> @@ -1950,6 +1950,11 @@ mymain(void)
>  
>      DO_TEST("numatune-memory", NONE);
>      DO_TEST_PARSE_ERROR("numatune-memory-invalid-nodeset", NONE);
> +    DO_TEST("numatune-memory-migratable",
> +            QEMU_CAPS_NUMA,
> +            QEMU_CAPS_OBJECT_MEMORY_RAM);
> +    DO_TEST_PARSE_ERROR("numatune-memory-migratable", NONE);

Pleased don't use DO_TEST any more. Use DO_TEST_CAPS_* (such as
DO_TEST_CAPS_LATEST) instead for new tests.

The negative test doesn't seem to make sense here since it's not testing
a negative case in your patch. This is visible from the error message it
records:

 > +++ b/tests/qemuxml2argvdata/numatune-memory-migratable.err
 > @@ -0,0 +1 @@
 > +unsupported configuration: Per-node memory binding is not supported with this QEMU




More information about the libvir-list mailing list