[libvirt] [PATCH v2 1/2] add nodeset='all' for interleave mode

John Ferlan jferlan at redhat.com
Thu Oct 11 16:03:44 UTC 2018



On 9/29/18 6:58 AM, Peng Hao wrote:
> sometimes we hoped that the memory of vm can be evenly distributed in
> all nodes according to interleave mode. But different hosts
> has different node number. So we add nodeset='all' for interleave mode.
> 
> Signed-off-by: Peng Hao <peng.hao2 at zte.com.cn>
> ---
>  src/conf/numa_conf.c          | 77 ++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 64 insertions(+), 13 deletions(-)
> 

Not so sure you followed Michal's v1 review comments with respect to
needing an RNG update and test coverage (xml2xml and xml2argv). For the
RNG, look at docs/schemas/*.rng and search on "nodeset" and "cpuset".
Eventually you will find where nodeset gets defined for the schema and
perhaps be able to figure out a way to adjust the pattern that's
allowable for the schema. The XML file you create that would have "all"
would need to pass a virt-xml-validate type command. For the xml2xml and
xml2argv you will need to make sure your grammar (RNG) change works for
XML processing and it would create the command line you expect.

Next, do you think the very narrow band of numatune / memory / nodeset
would be the only "set" to benefit from having being able to specify "in
some manner" all the for a set?  As noted above nodeset and cpuset are
reusable throughout libvirt in various forms.

Dig deeper - think of how that string is presented and parsed (hint: see
virBitmap*). Knowing how many "nodes" there are is part of the <numa>
child of <cpu>, right? While knowing that "all" means one thing "today"
it could mean something different on some subsequent change and if the
numa topology is changed there's lots of code currently around that
would seemingly need to handle "all".

Anyway, back to the cpuset/nodeset presentation. If today it's "A,C,E"
or "A-E,^B,^D", then why not devise a way to use "0,~0" meaning 0 and
the one's complement of 0 (e.g. everything else). That way if I wanted
to, I could "0,~0,^4" meaning everything except 4. Of course the
downside is that I need to know what the guest maximum is. The only way
to know that is in post parse processing. So doing this may involve
moving the translation of @nodeset and @cpuset...

Whether any of this is of use to anyone - who knows. I just think using
"all" is not the best idea. If you've defined your guest to have 4 vCPUs
and/or use 4 NUMA nodes, then it should be very simple to say 0-3 for
the cpuset or nodeset. Allowing "all" is in a way the lazy way and may
not be true at some point in the future. Maybe someone adds a new node
or vCPU to the guest and wants it to perform some other specific task.
If adding that "misses" someone setting "all" for some nodeset, then
that usage model would be unexpected.

As an aside, are there hugepage implications to this? Or other nodeset
configuration options related to NUMA that I'm not as aware of...


> diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
> index 97a3ca4..a336a62 100644
> --- a/src/conf/numa_conf.c
> +++ b/src/conf/numa_conf.c
> @@ -29,6 +29,9 @@
>  #include "virnuma.h"
>  #include "virstring.h"
>  
> +#if WITH_NUMACTL
> +#include <numa.h>
> +#endif

Personally, probably should stick to only having src/util/virnuma.c do this.

>  /*
>   * Distance definitions defined Conform ACPI 2.0 SLIT.
>   * See include/linux/topology.h
> @@ -66,6 +69,7 @@ typedef virDomainNumaNode *virDomainNumaNodePtr;
>  struct _virDomainNuma {
>      struct {
>          bool specified;
> +        bool allnode;
>          virBitmapPtr nodeset;
>          virDomainNumatuneMemMode mode;
>          virDomainNumatunePlacement placement;
> @@ -259,13 +263,21 @@ virDomainNumatuneParseXML(virDomainNumaPtr numa,
>  
>          tmp = virXMLPropString(node, "nodeset");
>          if (tmp) {
> -            if (virBitmapParse(tmp, &nodeset, VIR_DOMAIN_CPUMASK_LEN) < 0)
> -                goto cleanup;
> -
> -            if (virBitmapIsAllClear(nodeset)) {
> +            if (STREQ(tmp, "all") && !virNumaIsAvailable()) {

What does it matter? Seems to me there'd already be some check related
to NUMA availability in the event some nodeset range was defined.

Doing this in grammar checking doesn't feel right.

I took a cursory look, but you can dig deeper yourself to understand
what happens if nodeset is defined and NUMA isn't available otherwise.

>                  virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> -                               _("Invalid value of 'nodeset': %s"), tmp);
> +                              _("Invalid nodeset=%s when numactl is not supported"), tmp);
                                 ^
The alignment here is wrong.

>                  goto cleanup;
> +            } else if (STREQ(tmp, "all") && virNumaIsAvailable()) {
> +                numa->memory.allnode = true;

So "all" would mean we wait until some time in the future to determine
what that means as opposed to setting those up now. What happens to
various checks that may expect that numa->memory.nodeset contains
something? What other checks are out there that will be missed?

> +            } else {
> +                if (virBitmapParse(tmp, &nodeset, VIR_DOMAIN_CPUMASK_LEN) < 0)
> +                    goto cleanup;
> +
> +                if (virBitmapIsAllClear(nodeset)) {
> +                    virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> +                                   _("Invalid value of 'nodeset': %s"), tmp);
> +                    goto cleanup;
> +                }
>              }
>  
>              VIR_FREE(tmp);
> @@ -319,10 +331,14 @@ virDomainNumatuneFormatXML(virBufferPtr buf,
>          virBufferAsprintf(buf, "<memory mode='%s' ", tmp);
>  
>          if (numatune->memory.placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC) {
> -            if (!(nodeset = virBitmapFormat(numatune->memory.nodeset)))
> -                return -1;
> -            virBufferAsprintf(buf, "nodeset='%s'/>\n", nodeset);
> -            VIR_FREE(nodeset);
> +            if (numatune->memory.allnode == true) {

There would be no need for the == true comparison, it's a boolean operation.

> +                virBufferAddLit(buf, "nodeset='all'/>\n");
> +            } else {
> +                if (!(nodeset = virBitmapFormat(numatune->memory.nodeset)))
> +                    return -1;
> +                virBufferAsprintf(buf, "nodeset='%s'/>\n", nodeset);
> +                VIR_FREE(nodeset);
> +            }
>          } else if (numatune->memory.placement) {
>              tmp = virDomainNumatunePlacementTypeToString(numatune->memory.placement);
>              virBufferAsprintf(buf, "placement='%s'/>\n", tmp);
> @@ -489,6 +505,37 @@ virDomainNumatuneMaybeFormatNodeset(virDomainNumaPtr numatune,
>      return 0;
>  }
>  
> +#if WITH_NUMACTL
> +static int
> +makeAllnodeBitmap(virDomainNumaPtr numa)

Method name doesn't adhere to standards in https://libvirt.org/hacking.html

> +{
> +    size_t i = 0, maxnode = 0;
> +    virBitmapPtr bitmap = NULL;
> +
> +    if ((bitmap = virBitmapNew(VIR_DOMAIN_CPUMASK_LEN)) == NULL)

Consider,

    if (!(bitmap = xxx())))

> +        return -1;
> +    virBitmapClearAll(bitmap);
> +    maxnode = numa_max_node();

I see a virNumaGetMaxNode...

So are we apply "all" to the guest or host configuration?  I guess that
part just isn't clear enough to me. Configuring NUMA isn't in my normal
list of things to look at, but since the patch has been sitting for a
bit I've looked.

> +    for (i = 0; i <= maxnode; i++) {

So what is the value of @maxnode in relation to the value of
VIR_DOMAIN_CPUMASK_LEN (1024)?   Less I hope...

Furthermore if counting starts at 0, then it should be < not <=, right?

> +        if (virBitmapSetBit(bitmap, i) < 0) {
> +            virBitmapFree(bitmap);
> +                return -1;
> +        }
> +    }
> +
> +    virBitmapFree(numa->memory.nodeset);
> +    numa->memory.nodeset = bitmap;
> +
> +    return 0;
> +}
> +#else
> +static int

^^^^ This wouldn't be static in a src/util/vir*.c module...

> +makeAllnodeBitmap(virDomainNumaPtr numa)

And @numa would be ATTRIBUTE_UNUSED; otherwise, there'd be compiler issues.

> +{
> +    return -1;

This would be wrong because there'd be a failure that some parsing
failing for some unknown reason.  You'd need to add a real error message.

> +}
> +#endif
> +
>  int
>  virDomainNumatuneSet(virDomainNumaPtr numa,
>                       bool placement_static,
> @@ -538,20 +585,28 @@ virDomainNumatuneSet(virDomainNumaPtr numa,
>      }
>  
>      if (placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_DEFAULT) {
> -        if (numa->memory.nodeset || placement_static)
> +        if (numa->memory.nodeset || placement_static || numa->memory.allnode)
>              placement = VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC;
>          else
>              placement = VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO;
>      }
>  
>      if (placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC &&
> -        !numa->memory.nodeset) {
> +        !numa->memory.nodeset && !numa->memory.allnode) {

So if the next hunk would theoretically fill in numa->memory.nodeset for
you, then why modify this check now? Why wouldn't the next hunk go above
this hunk?

>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>                         _("nodeset for NUMA memory tuning must be set "
>                           "if 'placement' is 'static'"));
>          goto cleanup;
>      }
>  
> +    if (placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC &&
> +        mode == VIR_DOMAIN_NUMATUNE_MEM_INTERLEAVE &&
> +        numa->memory.allnode && virNumaIsAvailable()) {

Seems to me XML post parse processing would have needed to check that if
you had this allnode set, then you better also have static and
interleave (and of course that Numa is available). That way, all you'd
have to do is determine that allnode was set and call your function to
do that.

Those are "my" thoughts on the matter. Still not clear as to the need.
I'm 50/50 on it. I see benefit, but there are concerns especially as how
things relate to migration and domain save/restore.

John

> +
> +        if (makeAllnodeBitmap(numa) < 0)
> +            goto cleanup;
> +    }
> +
>      /* setting nodeset when placement auto is invalid */
>      if (placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO &&
>          numa->memory.nodeset) {
> 




More information about the libvir-list mailing list