[libvirt] [PATCH 8/9] conf: Introduce cputune/memorytune to support memory bandwidth allocation

bing.niu bing.niu at intel.com
Fri Jul 27 04:49:45 UTC 2018



On 2018年07月27日 04:36, John Ferlan wrote:
> 
> subj:
> 
> Add support for memorytune XML processing for resctrl MBA

OK!
> 
> On 07/18/2018 03:57 AM, bing.niu at intel.com wrote:
>> From: Bing Niu <bing.niu at intel.com>
>>
>> Introduce a new section memorytune to support memory bandwidth allocation.
>> This is consistent with existing cachetune. As the example:
>> below:
>>    <cputune>
>>      ......
>>      <memorytune vcpus='0'>
>>        <node id='0' bandwidth='30'/>
>>      </memorytune>
>>    </cputune>
>>
>> vpus      --- vpus subjected to this memory bandwidth.
>> id        --- on which node memory bandwidth to be set.
>> bandwidth --- the memory bandwidth percent to set.
>>
>> Signed-off-by: Bing Niu <bing.niu at intel.com>
>> ---
>>   docs/formatdomain.html.in                          |  35 ++++
>>   docs/schemas/domaincommon.rng                      |  17 ++
>>   src/conf/domain_conf.c                             | 199 +++++++++++++++++++++
>>   .../memorytune-colliding-allocs.xml                |  30 ++++
>>   .../memorytune-colliding-cachetune.xml             |  32 ++++
>>   tests/genericxml2xmlindata/memorytune.xml          |  33 ++++
>>   tests/genericxml2xmltest.c                         |   5 +
>>   7 files changed, 351 insertions(+)
>>   create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml
>>   create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml
>>   create mode 100644 tests/genericxml2xmlindata/memorytune.xml
>>
> 
> Strangely I expected to actually have a merge conflict here with
> previous changes, but I didn't!
> 

Lucky~
>> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
>> index a3afe13..4e38446 100644
>> --- a/docs/formatdomain.html.in
>> +++ b/docs/formatdomain.html.in
>> @@ -757,6 +757,10 @@
>>         <cache id='0' level='3' type='both' size='3' unit='MiB'/>
>>         <cache id='1' level='3' type='both' size='3' unit='MiB'/>
>>       </cachetune>
>> +    <memorytune vcpus='0-3'>
>> +      <node id='0' bandwidth='60'/>
>> +    </memorytune>
>> +
>>     </cputune>
>>     ...
>>   </domain>
>> @@ -950,7 +954,38 @@
>>               </dl>
>>             </dd>
>>           </dl>
>> +      </dd>
>>   
>> +      <dt><code>memorytune</code></dt>
>> +      <dd>
>> +        Optional <code>memorytune</code> element can control allocations for
> 
> s/Optional/<span class="since">Since 4.7.0</span>, the optional/
> 
> e.g. Since 4.7.0, the optional...

OK! I added that before, but deleted since I am not sure which release 
this series can be accepted
> 
> 
>> +        memory bandwidth using the resctrl on the host. Whether or not is this
>> +        supported can be gathered from capabilities where some limitations like
>> +        minimum bandwidth and required granularity are reported as well. The
>> +        required attribute <code>vcpus</code> specifies to which vCPUs this
>> +        allocation applies. A vCPU can only be member of one
>> +        <code>memorytune</code> element allocations. <code>vcpus</code> specified
> 
> s/./. The/

OK!
> 
>> +        by <code>memorytune</code> can be identical to those specified by
>> +        <code>cachetune</code>. However they are not allowed to overlap each other.
> 
> The last 2 sentences are "confusing".  We didn't add similar wording to
> cachetune... It's one of those visual things I suppose, but it seems
> like there's a relationship between cachetune and memtune, but only if
> they intersect vCPU values.  Thus if cachetune "joins" vcpus='0-3', then
> memtune must also join them - using other combinations of 0-3 isn't
> allowed.

Yes! Let's update cachetune part also.
> 
> I have to only imagine that would get more complicated with RDT
> 
>> +        Supported subelements are:
>> +        <dl>
>> +          <dt><code>node</code></dt>
>> +          <dd>
>> +            This element controls the allocation of CPU memory bandwidth and has the
>> +            following attributes:
>> +            <dl>
>> +              <dt><code>id</code></dt>
>> +              <dd>
>> +                Host node id from which to allocate memory bandwidth.
>> +              </dd>
>> +              <dt><code>bandwidth</code></dt>
>> +              <dd>
>> +                The memory bandwidth to allocate from this node. The value by default
>> +                is in percentage.
> 
> Should we indicate that it must be between 1 and 100 inclusive?

Yes. However the minimum allowed memory bandwidth is defined by 
min_bandwidth of RDT. This is reported in host capability XML.
> 
>> +              </dd>
>> +            </dl>
>> +          </dd>
>> +        </dl>
>>         </dd>
>>       </dl>
>>   
>> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
>> index f24a563..b4cd96b 100644
>> --- a/docs/schemas/domaincommon.rng
>> +++ b/docs/schemas/domaincommon.rng
>> @@ -983,6 +983,23 @@
>>               </oneOrMore>
>>             </element>
>>           </zeroOrMore>
>> +        <zeroOrMore>
>> +          <element nit's trying to clear up in my mind whether the vcpus used byame="memorytune">
>> +            <attribute name="vcpus">
>> +              <ref name='cpuset'/>
>> +            </attribute>
>> +            <oneOrMore>
>> +              <element name="node">
>> +                <attribute name="id">
>> +                  <ref name='unsignedInt'/>
>> +                </attribute>
>> +                <attribute name="bandwidth">
>> +                  <ref name='unsignedInt'/>
>> +                </attribute>
>> +              </element>
>> +            </oneOrMore>
>> +          </element>
>> +        </zeroOrMore>
>>         </interleave>
>>       </element>
>>     </define>
>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>> index 695981c..ea9276f 100644
>> --- a/src/conf/domain_conf.c
>> +++ b/src/conf/domain_conf.c
>> @@ -19139,6 +19139,128 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
>>   }
>>   
>>   
>> +static int
>> +virDomainMemorytuneDefParseMemory(xmlXPathContextPtr ctxt,
>> +                                  xmlNodePtr node,
>> +                                  virResctrlAllocPtr alloc)
>> +{
>> +    xmlNodePtr oldnode = ctxt->node;
>> +    unsigned int id;
>> +    unsigned int bandwidth;
>> +    char *tmp = NULL;
>> +    int ret = -1;
>> +
>> +    ctxt->node = node;
>> +
>> +    tmp = virXMLPropString(node, "id");
>> +    if (!tmp) {
>> +        virReportError(VIR_ERR_XML_ERROR, "%s",
>> +                       _("Missing memorytune attribute 'id'"));
>> +        goto cleanup;
>> +    }
>> +    if (virStrToLong_uip(tmp, NULL, 10, &id) < 0) {
>> +        virReportError(VIR_ERR_XML_ERROR,
>> +                       _("Invalid memorytune attribute 'id' value '%s'"),
>> +                       tmp);
>> +        goto cleanup;
>> +    }
>> +    VIR_FREE(tmp);
>> +
>> +    tmp = virXMLPropString(node, "bandwidth");
>> +    if (!tmp) {
>> +        virReportError(VIR_ERR_XML_ERROR, "%s",
>> +                       _("Missing memorytune attribute 'bandwidth'"));
>> +        goto cleanup;
>> +    }
>> +    if (virStrToLong_uip(tmp, NULL, 10, &bandwidth) < 0) {
>> +        virReportError(VIR_ERR_XML_ERROR,
>> +                       _("Invalid memorytune attribute 'bandwidth' value '%s'"),
>> +                       tmp);
>> +        goto cleanup;
>> +    }
> 
> Should we check that bandwidth is between 1 and 100 inclusive?  eg < 1
> || > 100, then error.  Or leave that to SetMemoryBandwidth (IDC).

Let's do it in SetMemoryBandwidth. Make XML parse parameters only. :)
> 
>> +    VIR_FREE(tmp);
>> +    if (virResctrlSetMemoryBandwidth(alloc, id, bandwidth) < 0)
>> +        goto cleanup;
>> +
>> +    ret = 0;
>> + cleanup:
>> +    ctxt->node = oldnode;
>> +    VIR_FREE(tmp);
>> +    return ret;
>> +}
>> +
>> +
>> +static int
>> +virDomainMemorytuneDefParse(virDomainDefPtr def,
>> +                            xmlXPathContextPtr ctxt,
>> +                            xmlNodePtr node,
>> +                            unsigned int flags)
>> +{
>> +    xmlNodePtr oldnode = ctxt->node;
>> +    xmlNodePtr *nodes = NULL;
>> +    virBitmapPtr vcpus = NULL;
>> +    virResctrlAllocPtr alloc = NULL;
>> +    ssize_t i = 0;
>> +    int n;
>> +    int ret = -1;
>> +    bool new_alloc = false;
>> +
>> +    ctxt->node = node;
>> +
>> +    if (virDomainRestuneParseVcpus(def, node, &vcpus) < 0)
>> +        goto cleanup;
>> +
>> +    if (virBitmapIsAllClear(vcpus)) {
>> +        ret = 0;
>> +        goto cleanup;
>> +    }
> 
> Just realized that for here and for cachetune we never parse the
> subelements if BitmapIsAllClear.
> 
> That would mean anything provided is lost - of course it seems silly to
> have an empty bitmap too.  Do we even test for that? Is it even feasible?

Above checking is actually for XML define a vcpus that is not exist.
If domain define vcpu number as 2. However XML describe cachetune/ memtune
<cachetune vcpus='10'>
   ......
<cachetune>
or
<memtune vcpus='10'>
   ......
<memtune>

Domain can still be created, but cachetune or memtune is ignored. Since 
vcpus element is controlling which vcpu thread will be added to 
corresponding resctrl group.
This is existing cachetune behavior. memtune just follow.
> 
>> +
>> +    if ((n = virXPathNodeSet("./node", ctxt, &nodes)) < 0) {
>> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                       _("Cannot extract memory nodes under memorytune"));
>> +        goto cleanup;
>> +    }
>> +
>> +    if (virDomainFindResctrlAlloc(def, vcpus, &alloc) < 0)
>> +        goto cleanup;
>> +
>> +    if (!alloc) {
>> +        alloc = virResctrlAllocNew();
>> +        if (!alloc)
>> +            goto cleanup;
>> +        new_alloc = true;
>> +    } else {
>> +        alloc = virObjectRef(alloc);
> 
> 
> Ahh... so this is where I had some confusion initially, but makes more
> sense now given one of the failure xml examples.
> 
> Still, does this cause issues:
> 
> <memtune vcpus='3-5'>
>    <node id='0' bandwidth='60'/>
> </memtune
> <memtune vcpus='3-5'>
>    <node id='0' bandwidth='60'/>
> </memtune>

This is valid in virResctrlSetMemoryBandwidth and throw an error. Since 
you already set memory bandwidth at node 0.
> 
> or would the second one be <node id='1' bandwidth='60'/>

This is good, since you are setting memory bandwidth on different node.
> 
> Yes, they should put <node id='1' with the first group, but QE likes to
> stretch the boundaries of imagination once they read the code.
> 
>> +    }
>> +
>> +    for (i = 0; i < n; i++) {
>> +        if (virDomainMemorytuneDefParseMemory(ctxt, nodes[i], alloc) < 0)
>> +            goto cleanup;
>> +    }
>> +    if (virResctrlAllocIsEmpty(alloc)) {
>> +        ret = 0;
>> +        goto cleanup;
>> +    }
>> +    /*
>> +     * If this is a new allocation, format ID and append to restune, otherwise
>> +     * just update the existing alloc information */
> 
> The following code doesn't do what the "otherwise" comment suggests -
> although I perhaps still confused over the @alloc usage above.

If alloc is existing already. Memory bandwidth information is updated in 
  virDomainMemorytuneDefParseMemory and alloc structure is already 
stored in domain->restunes(will be resallocs in next version :) ). 
virDomainFindResctrlAlloc is used to find a existing alloc and return 
its pointer.

If alloc is new, we need append new alloc structure to domain->restunes.
> 
>> +    if (new_alloc) {
>> +        if (virDomainUpdateRestune(def, node, alloc, vcpus, flags) < 0)
>> +            goto cleanup;
>> +        vcpus = NULL;
>> +        alloc = NULL;
>> +    }
>> +
>> +    ret = 0;
>> + cleanup:
>> +    ctxt->node = oldnode;
>> +    virObjectUnref(alloc);
>> +    virBitmapFree(vcpus);
>> +    VIR_FREE(nodes);
>> +    return ret;
>> +}
>> +
>> +
>>   static virDomainDefPtr
>>   virDomainDefParseXML(xmlDocPtr xml,
>>                        xmlNodePtr root,
>> @@ -19734,6 +19856,18 @@ virDomainDefParseXML(xmlDocPtr xml,
>>       }
>>       VIR_FREE(nodes);
>>   
>> +    if ((n = virXPathNodeSet("./cputune/memorytune", ctxt, &nodes)) < 0) {
>> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                       _("cannot extract memorytune nodes"));
>> +        goto error;
>> +    }
>> +
>> +    for (i = 0; i < n; i++) {
>> +        if (virDomainMemorytuneDefParse(def, ctxt, nodes[i], flags) < 0)
>> +            goto error;
>> +    }
>> +    VIR_FREE(nodes);
>> +
>>       if (virCPUDefParseXML(ctxt, "./cpu[1]", VIR_CPU_TYPE_GUEST, &def->cpu) < 0)
>>           goto error;
>>   
>> @@ -27028,6 +27162,68 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
>>   
>>   
>>   static int
>> +virDomainMemorytuneDefFormatHelper(unsigned int id,
>> +                                   unsigned int bandwidth,
>> +                                   void *opaque)
>> +{
>> +    virBufferPtr buf = opaque;
>> +
>> +    virBufferAsprintf(buf,
>> +                      "<node id='%u' bandwidth='%u'/>\n",
>> +                      id, bandwidth);
>> +    return 0;
>> +}
>> +
>> +
>> +static int
>> +virDomainMemorytuneDefFormat(virBufferPtr buf,
>> +                            virDomainRestuneDefPtr restune,
>> +                            unsigned int flags)
>> +{
>> +    virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
>> +    char *vcpus = NULL;
>> +    int ret = -1;
>> +
>> +    virBufferSetChildIndent(&childrenBuf, buf);
>> +    virResctrlAllocForeachMemory(restune->alloc,
>> +                                 virDomainMemorytuneDefFormatHelper,
>> +                                 &childrenBuf);
> 
> probably could wrap this in a ignore_value(); since FormatHelper only
> ever returns 0. The error checking is next:

Let's add a return value check, and also for cachetune part. ;)
>> +
>> +
> 
> You can removed 1 of the 2 empty lines above here.
> 

OK
>> +    if (virBufferCheckError(&childrenBuf) < 0)
>> +        goto cleanup;
>> +
>> +    if (!virBufferUse(&childrenBuf)) {
>> +        ret = 0;
>> +        goto cleanup;
>> +    }
>> +
>> +    vcpus = virBitmapFormat(restune->vcpus);
>> +    if (!vcpus)
>> +        goto cleanup;
>> +
>> +    virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus);
>> +
>> +    if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) {
>> +        const char *alloc_id = virResctrlAllocGetID(restune->alloc);
>> +        if (!alloc_id)
>> +            goto cleanup;
>> +
>> +        virBufferAsprintf(buf, " id='%s'", alloc_id);
>> +    }
>> +    virBufferAddLit(buf, ">\n");
>> +
>> +    virBufferAddBuffer(buf, &childrenBuf);
>> +    virBufferAddLit(buf, "</memorytune>\n");
>> +
>> +    ret = 0;
>> + cleanup:
>> +    virBufferFreeAndReset(&childrenBuf);
>> +    VIR_FREE(vcpus);
>> +    return ret;
>> +}
>> +
>> +static int
>>   virDomainCputuneDefFormat(virBufferPtr buf,
>>                             virDomainDefPtr def,
>>                             unsigned int flags)
>> @@ -27132,6 +27328,9 @@ virDomainCputuneDefFormat(virBufferPtr buf,
>>       for (i = 0; i < def->nrestunes; i++)
>>           virDomainCachetuneDefFormat(&childrenBuf, def->restunes[i], flags);
>>   
>> +    for (i = 0; i < def->nrestunes; i++)
>> +        virDomainMemorytuneDefFormat(&childrenBuf, def->restunes[i], flags);
>> +
>>       if (virBufferCheckError(&childrenBuf) < 0)
>>           return -1;
>>   
>> diff --git a/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml b/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml
>> new file mode 100644
>> index 0000000..9b8ebaa
>> --- /dev/null
>> +++ b/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml
>> @@ -0,0 +1,30 @@
>> +<domain type='qemu'>
>> +  <name>QEMUGuest1</name>
>> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>> +  <memory unit='KiB'>219136</memory>
>> +  <currentMemory unit='KiB'>219136</currentMemory>
>> +  <vcpu placement='static'>4</vcpu>
>> +  <cputune>
>> +    <memorytune vcpus='0'>
>> +      <node id='0' bandwidth='50'/>
>> +      <node id='0' bandwidth='50'/>
> 
> This is illegal because node id='#' uses the same number, right?

Yes, illegal
> 
>> +    </memorytune>
>> +  </cputune>
>> +  <os>
>> +    <type arch='i686' machine='pc'>hvm</type>
>> +    <boot dev='hd'/>
>> +  </os>
>> +  <clock offset='utc'/>
>> +  <on_poweroff>destroy</on_poweroff>
>> +  <on_reboot>restart</on_reboot>
>> +  <on_crash>destroy</on_crash>
>> +  <devices>
>> +    <emulator>/usr/bin/qemu-system-i686</emulator>
>> +    <controller type='usb' index='0'/>
>> +    <controller type='ide' index='0'/>
>> +    <controller type='pci' index='0' model='pci-root'/>
>> +    <input type='mouse' bus='ps2'/>
>> +    <input type='keyboard' bus='ps2'/>
>> +    <memballoon model='virtio'/>
>> +  </devices>
>> +</domain>
>> diff --git a/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml b/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml
>> new file mode 100644
>> index 0000000..5416870
>> --- /dev/null
>> +++ b/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml
>> @@ -0,0 +1,32 @@
>> +<domain type='qemu'>
>> +  <name>QEMUGuest1</name>
>> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>> +  <memory unit='KiB'>219136</memory>
>> +  <currentMemory unit='KiB'>219136</currentMemory>
>> +  <vcpu placement='static'>4</vcpu>
>> +  <cputune>
>> +    <cachetune vcpus='0-1'>
>> +      <cache id='0' level='3' type='code' size='12' unit='KiB'/>
>> +    </cachetune>
>> +    <memorytune vcpus='0'>
>> +      <node id='0' bandwidth='50'/>
>> +    </memorytune>
> 
> This is illegal because memory tune needs to be '0-1', right?

Yes, you are right. And thanks a lot~

> 
> A few fixups seem necessary. Mostly minor stuff.
> 
> John
[....]




More information about the libvir-list mailing list