[libvirt] [PATCH v5 1/4] numa: describe siblings distances within cells

Jim Fehlig jfehlig at suse.com
Wed Oct 25 22:12:25 UTC 2017


On 10/12/2017 01:31 PM, Wim Ten Have wrote:
> From: Wim ten Have <wim.ten.have at oracle.com>
> 
> Add support for describing NUMA distances in a domain's <numa> <cell>
> XML description.
> 
> Below is an example of a 4 node setup:
> 
>    <cpu>
>      <numa>
>        <cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
>          <distances>
>            <sibling id='0' value='10'/>
>            <sibling id='1' value='21'/>
>            <sibling id='2' value='31'/>
>            <sibling id='3' value='21'/>
>          </distances>
>        </cell>
>        <cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
>          <distances>
>            <sibling id='0' value='21'/>
>            <sibling id='1' value='10'/>
>            <sibling id='2' value='21'/>
>            <sibling id='3' value='31'/>
>          </distances>
>        </cell>
>        <cell id='2' cpus='8-11' memory='2097152' unit='KiB'>
>          <distances>
>            <sibling id='0' value='31'/>
>            <sibling id='1' value='21'/>
>            <sibling id='2' value='10'/>
>            <sibling id='3' value='21'/>
>          </distances>
>        <cell id='3' cpus='12-15' memory='2097152' unit='KiB'>
>          <distances>
>            <sibling id='0' value='21'/>
>            <sibling id='1' value='31'/>
>            <sibling id='2' value='21'/>
>            <sibling id='3' value='10'/>
>          </distances>
>        </cell>
>      </numa>
>    </cpu>
> 
> A <cell> defines a NUMA node. <distances> describes the NUMA distance
> from the <cell> to the other NUMA nodes (the <sibling>s).  For example,
> in above XML description, the distance between NUMA node0 <cell id='0'
> ...> and NUMA node2 <sibling id='2' ...> is 31.
> 
> Valid distance value are '10 <= value <= 255'.  A distance value of 10
> represents the distance to the node itself.  A distance value of 20
> represents the default value for remote nodes but other values are
> possible depending on the physical topology of the system.
> 
> When distances are not fully described, any missing sibling distance
> values will default to 10 for local nodes and 20 for remote nodes.
> 
> Signed-off-by: Wim ten Have <wim.ten.have at oracle.com>
> ---
> Changes on v1:
> - Add changes to docs/formatdomain.html.in describing schema update.
> Changes on v2:
> - Automatically apply distance symmetry maintaining cell <-> sibling.
> - Check for maximum '255' on numaDistanceValue.
> - Automatically complete empty distance ranges.
> - Check that sibling_id's are in range with cell identifiers.
> - Allow non-contiguous ranges, starting from any node id.
> - Respect parameters as ATTRIBUTE_NONNULL fix functions and callers.
> - Add and apply topology for LOCAL_DISTANCE=10 and REMOTE_DISTANCE=20.
> Changes on v3
> - Add UNREACHABLE if one locality is unreachable from another.
> - Add code cleanup aligning function naming in a separated patch.
> - Add numa related driver code in a separated patch.
> - Remove <choice> from numaDistanceValue schema/basictypes.rng
> - Correct doc changes.
> Changes on v4
> - Fix symmetry error under virDomainNumaDefNodeDistanceParseXML()
> ---
>   docs/formatdomain.html.in   |  63 ++++++++++++-
>   docs/schemas/basictypes.rng |   7 ++
>   docs/schemas/cputypes.rng   |  18 ++++
>   src/conf/numa_conf.c        | 221 +++++++++++++++++++++++++++++++++++++++++++-
>   4 files changed, 305 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index c0e3c2221..ab2a89c6a 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -1529,7 +1529,68 @@
>       </p>
>   
>       <p>
> -      This guest NUMA specification is currently available only for QEMU/KVM.
> +      This guest NUMA specification is currently available only for
> +      QEMU/KVM and Xen.  Whereas Xen driver also allows for a distinct
> +      description of NUMA arranged <code>sibling</code> <code>cell</code>
> +      <code>distances</code> <span class="since">Since 3.8.0</span>.

Now 3.9.0.

> +    </p>
> +
> +    <p>
> +      Under NUMA h/w architecture, distinct resources such as memory

For the website docs, I think it is better to write "hardware" instead of "h/w".

> +      create a designated distance between <code>cell</code> and
> +      <code>siblings</code> that now can be described with the help of
> +      <code>distances</code>. A detailed description can be found within
> +      the ACPI (Advanced Configuration and Power Interface Specification)
> +      within the chapter explaining the system's SLIT (System Locality
> +      Distance Information Table).
> +    </p>
> +
> +<pre>
> +...
> +<cpu>
> +  ...
> +  <numa>
> +    <cell id='0' cpus='0,4-7' memory='512000' unit='KiB'>
> +      <distances>
> +        <sibling id='0' value='10'/>
> +        <sibling id='1' value='21'/>
> +        <sibling id='2' value='31'/>
> +        <sibling id='3' value='41'/>
> +      </distances>
> +    </cell>
> +    <cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' memAccess='shared'>
> +      <distances>
> +        <sibling id='0' value='21'/>
> +        <sibling id='1' value='10'/>
> +        <sibling id='2' value='21'/>
> +        <sibling id='3' value='31'/>
> +      </distances>
> +    </cell>
> +    <cell id='2' cpus='2,11' memory='512000' unit='KiB' memAccess='shared'>
> +      <distances>
> +        <sibling id='0' value='31'/>
> +        <sibling id='1' value='21'/>
> +        <sibling id='2' value='10'/>
> +        <sibling id='3' value='21'/>
> +      </distances>
> +    </cell>
> +    <cell id='3' cpus='3' memory='512000' unit='KiB'>
> +      <distances>
> +        <sibling id='0' value='41'/>
> +        <sibling id='1' value='31'/>
> +        <sibling id='2' value='21'/>
> +        <sibling id='3' value='10'/>
> +      </distances>
> +    </cell>
> +  </numa>
> +  ...
> +</cpu>
> +...</pre>
> +
> +    <p>
> +      Under Xen driver, if no <code>distances</code> are given to describe
> +      the SLIT data between different cells, it will default to a scheme
> +      using 10 for local and 20 for remote distances.
>       </p>
>   
>       <h3><a id="elementsEvents">Events configuration</a></h3>
> diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
> index 1ea667cdf..1a18cd31b 100644
> --- a/docs/schemas/basictypes.rng
> +++ b/docs/schemas/basictypes.rng
> @@ -77,6 +77,13 @@
>       </choice>
>     </define>
>   
> +  <define name="numaDistanceValue">
> +    <data type="unsignedInt">
> +      <param name="minInclusive">10</param>
> +      <param name="maxInclusive">255</param>
> +    </data>
> +  </define>
> +
>     <define name="pciaddress">
>       <optional>
>         <attribute name="domain">
> diff --git a/docs/schemas/cputypes.rng b/docs/schemas/cputypes.rng
> index 3eef16abc..c45b6dfb2 100644
> --- a/docs/schemas/cputypes.rng
> +++ b/docs/schemas/cputypes.rng
> @@ -129,6 +129,24 @@
>             </choice>
>           </attribute>
>         </optional>
> +      <optional>
> +        <element name="distances">
> +          <oneOrMore>
> +            <ref name="numaDistance"/>
> +          </oneOrMore>
> +        </element>
> +      </optional>
> +    </element>
> +  </define>
> +
> +  <define name="numaDistance">
> +    <element name="sibling">
> +      <attribute name="id">
> +        <ref name="unsignedInt"/>
> +      </attribute>
> +      <attribute name="value">
> +        <ref name="numaDistanceValue"/>
> +      </attribute>
>       </element>
>     </define>
>   
> diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
> index b71dc012c..00a3343fb 100644
> --- a/src/conf/numa_conf.c
> +++ b/src/conf/numa_conf.c
> @@ -29,6 +29,15 @@
>   #include "virnuma.h"
>   #include "virstring.h"
>   
> +/*
> + * Distance definitions defined Conform ACPI 2.0 SLIT.
> + * See include/linux/topology.h
> + */
> +#define LOCAL_DISTANCE          10
> +#define REMOTE_DISTANCE         20
> +/* SLIT entry value is a one-byte unsigned integer. */
> +#define UNREACHABLE            255
> +
>   #define VIR_FROM_THIS VIR_FROM_DOMAIN
>   
>   VIR_ENUM_IMPL(virDomainNumatuneMemMode,
> @@ -48,6 +57,8 @@ VIR_ENUM_IMPL(virDomainMemoryAccess, VIR_DOMAIN_MEMORY_ACCESS_LAST,
>                 "shared",
>                 "private")
>   
> +typedef struct _virDomainNumaDistance virDomainNumaDistance;
> +typedef virDomainNumaDistance *virDomainNumaDistancePtr;
>   
>   typedef struct _virDomainNumaNode virDomainNumaNode;
>   typedef virDomainNumaNode *virDomainNumaNodePtr;
> @@ -66,6 +77,12 @@ struct _virDomainNuma {
>           virBitmapPtr nodeset;   /* host memory nodes where this guest node resides */
>           virDomainNumatuneMemMode mode;  /* memory mode selection */
>           virDomainMemoryAccess memAccess; /* shared memory access configuration */
> +
> +        struct _virDomainNumaDistance {
> +            unsigned int value; /* locality value for node i->j or j->i */
> +            unsigned int cellid;
> +        } *distances;           /* remote node distances */
> +        size_t ndistances;
>       } *mem_nodes;           /* guest node configuration */
>       size_t nmem_nodes;
>   
> @@ -686,6 +703,174 @@ virDomainNumatuneNodesetIsAvailable(virDomainNumaPtr numatune,
>   }
>   
>   
> +static int
> +virDomainNumaDefNodeDistanceParseXML(virDomainNumaPtr def,
> +                                     xmlXPathContextPtr ctxt,
> +                                     unsigned int cur_cell)
> +{
> +    int ret = -1;
> +    int sibling;
> +    char *tmp = NULL;
> +    xmlNodePtr *nodes = NULL;
> +    size_t i, ndistances = def->nmem_nodes;
> +
> +    if (!ndistances)
> +        return 0;
> +
> +    /* check if NUMA distances definition is present */
> +    if (!virXPathNode("./distances[1]", ctxt))
> +        return 0;
> +
> +    if ((sibling = virXPathNodeSet("./distances[1]/sibling", ctxt, &nodes)) <= 0) {
> +        virReportError(VIR_ERR_XML_ERROR, "%s",
> +                       _("NUMA distances defined without siblings"));
> +        goto cleanup;
> +    }
> +
> +    for (i = 0; i < sibling; i++) {
> +        virDomainNumaDistancePtr ldist, rdist;
> +        unsigned int sibling_id, sibling_value;
> +
> +        /* siblings are in order of parsing or explicitly numbered */
> +        if (!(tmp = virXMLPropString(nodes[i], "id"))) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Missing 'id' attribute in NUMA "
> +                             "distances under 'cell id %d'"),
> +                           cur_cell);
> +            goto cleanup;
> +        }
> +
> +        /* The "id" needs to be applicable */
> +        if (virStrToLong_uip(tmp, NULL, 10, &sibling_id) < 0) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Invalid 'id' attribute in NUMA "
> +                             "distances for sibling: '%s'"),
> +                           tmp);
> +            goto cleanup;
> +        }
> +        VIR_FREE(tmp);
> +
> +        /* The "id" needs to be within numa/cell range */
> +        if (sibling_id >= ndistances) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("There is no cell administrated matching "
> +                             "'sibling_id %d' under NUMA 'cell id %d' "),
> +                           sibling_id, cur_cell);

I think this is a little awkward and could be problematic for translation. A 
better wording might be

'sibling_id %d' does not refer to a valid cell within NUMA 'cell id %d'

> +            goto cleanup;
> +        }
> +
> +        /* We need a locality value. Check and correct
> +         * distance to local and distance to remote node.
> +         */
> +        if (!(tmp = virXMLPropString(nodes[i], "value"))) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Missing 'value' attribute in NUMA distances "
> +                             "under 'cell id %d' for 'sibling id %d'"),
> +                           cur_cell, sibling_id);
> +            goto cleanup;
> +        }
> +
> +        /* The "value" needs to be applicable */
> +        if (virStrToLong_uip(tmp, NULL, 10, &sibling_value) < 0) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Invalid 'value' attribute in NUMA "
> +                             "distances for value: '%s'"),

I think it would be better if this error message was similar to the next two. 
E.g. "'value %s' is invalid for 'sibling id %d' under NUMA 'cell id %d'". What 
do you think?

> +                           tmp);
> +            goto cleanup;
> +        }
> +        VIR_FREE(tmp);
> +
> +        /* LOCAL_DISTANCE <= "value" <= UNREACHABLE */
> +        if (sibling_value < LOCAL_DISTANCE ||
> +            sibling_value > UNREACHABLE) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Out of range value '%d' set for "
> +                             "'sibling id %d' under NUMA 'cell id %d' "),

Extra space at the end of this error message.

> +                           sibling_value, sibling_id, cur_cell);
> +            goto cleanup;
> +        }
> +
> +        /* Assure correct LOCAL_DISTANCE setting if given */
> +        if (sibling_id == cur_cell &&
> +            sibling_value != LOCAL_DISTANCE) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Invalid value '%d !LOCAL_DISTANCE' set for "
> +                             "'sibling id %d' under NUMA 'cell id %d' "),

Extra space at the end of this one too. Also, I think we can drop the 
'!LOCAL_DISTANCE'.

> +                           sibling_value, sibling_id, cur_cell);
> +            goto cleanup;
> +        }
> +
> +        /* Assure correct LOCAL_DISTANCE setting if given */
> +        if (sibling_id != cur_cell &&
> +            sibling_value == LOCAL_DISTANCE) {
> +            virReportError(VIR_ERR_XML_ERROR,
> +                           _("Invalid value '%d LOCAL_DISTANCE' set for "
> +                             "'sibling id %d' under NUMA 'cell id %d' "),

Same comments here.

> +                           sibling_value, sibling_id, cur_cell);
> +            goto cleanup;
> +        }
> +
> +        /* Apply the local / remote distance */
> +        ldist = def->mem_nodes[cur_cell].distances;
> +        if (!ldist) {
> +            if (def->mem_nodes[cur_cell].ndistances) {
> +                virReportError(VIR_ERR_XML_ERROR,
> +                               _("Invalid 'ndistances' set in NUMA "
> +                                 "distances for sibling id: '%d'"),
> +                               cur_cell);
> +                goto cleanup;
> +            }

A value in def->mem_nodes[cur_cell].ndistances seems more like a coding error. 
It's definitely not an XML error since ndistances isn't in the schema.

> +
> +            if (VIR_ALLOC_N(ldist, ndistances) < 0)
> +                goto cleanup;
> +
> +            ldist[cur_cell].value = LOCAL_DISTANCE;
> +            ldist[cur_cell].cellid = cur_cell;
> +            def->mem_nodes[cur_cell].ndistances = ndistances;
> +        }
> +
> +        ldist[sibling_id].cellid = sibling_id;
> +        ldist[sibling_id].value = sibling_value;
> +        def->mem_nodes[cur_cell].distances = ldist;
> +
> +        /* Apply symmetry if none given */
> +        rdist = def->mem_nodes[sibling_id].distances;
> +        if (!rdist) {
> +            if (def->mem_nodes[sibling_id].ndistances) {
> +                virReportError(VIR_ERR_XML_ERROR,
> +                               _("Invalid 'ndistances' set in NUMA "
> +                                 "distances for sibling id: '%d'"),
> +                               sibling_id);
> +                goto cleanup;
> +            }

Same comment/question here.

> +
> +            if (VIR_ALLOC_N(rdist, ndistances) < 0)
> +                goto cleanup;
> +
> +            rdist[sibling_id].value = LOCAL_DISTANCE;
> +            rdist[sibling_id].cellid = sibling_id;
> +            def->mem_nodes[sibling_id].ndistances = ndistances;
> +        }
> +
> +        rdist[cur_cell].cellid = cur_cell;
> +        if (!rdist[cur_cell].value)
> +            rdist[cur_cell].value = sibling_value;
> +        def->mem_nodes[sibling_id].distances = rdist;
> +    }
> +
> +    ret = 0;
> +
> + cleanup:
> +    if (ret) {
> +        for (i = 0; i < ndistances; i++)
> +            VIR_FREE(def->mem_nodes[i].distances);
> +    }
> +    VIR_FREE(nodes);
> +    VIR_FREE(tmp);
> +
> +    return ret;
> +}
> +
>   int
>   virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
>                               xmlXPathContextPtr ctxt)
> @@ -694,7 +879,7 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
>       xmlNodePtr oldNode = ctxt->node;
>       char *tmp = NULL;
>       int n;
> -    size_t i;
> +    size_t i, j;
>       int ret = -1;
>   
>       /* check if NUMA definition is present */
> @@ -712,7 +897,6 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
>       def->nmem_nodes = n;
>   
>       for (i = 0; i < n; i++) {
> -        size_t j;
>           int rc;
>           unsigned int cur_cell = i;
>   
> @@ -788,6 +972,10 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
>               def->mem_nodes[cur_cell].memAccess = rc;
>               VIR_FREE(tmp);
>           }
> +
> +        /* Parse NUMA distances info */
> +        if (virDomainNumaDefNodeDistanceParseXML(def, ctxt, cur_cell) < 0)
> +                goto cleanup;
>       }
>   
>       ret = 0;
> @@ -815,6 +1003,8 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
>       virBufferAddLit(buf, "<numa>\n");
>       virBufferAdjustIndent(buf, 2);
>       for (i = 0; i < ncells; i++) {
> +        int ndistances;
> +
>           memAccess = virDomainNumaGetNodeMemoryAccessMode(def, i);
>   
>           if (!(cpustr = virBitmapFormat(virDomainNumaGetNodeCpumask(def, i))))
> @@ -829,7 +1019,32 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
>           if (memAccess)
>               virBufferAsprintf(buf, " memAccess='%s'",
>                                 virDomainMemoryAccessTypeToString(memAccess));
> -        virBufferAddLit(buf, "/>\n");
> +
> +        ndistances = def->mem_nodes[i].ndistances;
> +        if (!ndistances) {
> +            virBufferAddLit(buf, "/>\n");
> +        } else {
> +            size_t j;
> +            virDomainNumaDistancePtr distances = def->mem_nodes[i].distances;
> +
> +            virBufferAddLit(buf, ">\n");
> +            virBufferAdjustIndent(buf, 2);
> +            virBufferAddLit(buf, "<distances>\n");
> +            virBufferAdjustIndent(buf, 2);
> +            for (j = 0; j < ndistances; j++) {
> +                if (distances[j].value) {
> +                    virBufferAddLit(buf, "<sibling");
> +                    virBufferAsprintf(buf, " id='%d'", distances[j].cellid);
> +                    virBufferAsprintf(buf, " value='%d'", distances[j].value);
> +                    virBufferAddLit(buf, "/>\n");
> +                }
> +            }
> +            virBufferAdjustIndent(buf, -2);
> +            virBufferAddLit(buf, "</distances>\n");
> +            virBufferAdjustIndent(buf, -2);
> +            virBufferAddLit(buf, "</cell>\n");
> +        }
> +
>           VIR_FREE(cpustr);
>       }
>       virBufferAdjustIndent(buf, -2);

I could have missed them, but I didn't notice any existing tests for all this 
NUMA parsing/formatting. If I did miss them, then your new settings should to be 
added to those tests. If no tests exist, we should consider adding some as a 
follow-up.

Regards,
Jim




More information about the libvir-list mailing list