[libvirt] [PATCH 03/10] conf: Add CMT capability to host

Wang, Huaqiang huaqiang.wang at intel.com
Mon Sep 10 19:17:40 UTC 2018



> -----Original Message-----
> From: John Ferlan [mailto:jferlan at redhat.com]
> Sent: Saturday, September 8, 2018 1:11 AM
> To: Wang, Huaqiang <huaqiang.wang at intel.com>; libvir-list at redhat.com
> Cc: Feng, Shaohe <shaohe.feng at intel.com>; Niu, Bing <bing.niu at intel.com>;
> Ding, Jian-feng <jian-feng.ding at intel.com>; Zang, Rui <rui.zang at intel.com>
> Subject: Re: [libvirt] [PATCH 03/10] conf: Add CMT capability to host
> 
> 
> 
> On 09/07/2018 04:37 AM, Wang, Huaqiang wrote:
> >
> >
> >> -----Original Message-----
> >> From: John Ferlan [mailto:jferlan at redhat.com]
> >> Sent: Wednesday, September 5, 2018 7:59 PM
> >> To: Wang, Huaqiang <huaqiang.wang at intel.com>; libvir-list at redhat.com
> >> Cc: Feng, Shaohe <shaohe.feng at intel.com>; Niu, Bing
> >> <bing.niu at intel.com>; Ding, Jian-feng <jian-feng.ding at intel.com>;
> >> Zang, Rui <rui.zang at intel.com>
> >> Subject: Re: [libvirt] [PATCH 03/10] conf: Add CMT capability to host
> >>
> >>
> >>
> >> On 08/27/2018 07:23 AM, Wang Huaqiang wrote:
> >>> CMT capability for each cache bank, includes -. Maximum CMT
> >>> monitoring groups(sharing with MBM) could be created,
> >>>    which reflects the maximum hardware RMID count.
> >>> -. 'cache threshold'.
> >>> -. Statistical information of last level cache, the actual cache
> >>>    occupancy.
> >>>
> >>> cache is splitted into 'bank's, each bank MAY have different cache
> >>> configuration, report cache monitoring capability in unit of cache bank.
> >>>
> >>> cache monitor capability is shown as below:
> >>>
> >>>     <cache>
> >>>       <bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
> >>>         <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
> >>>         <control granularity='768' unit='KiB' type='data'
> >>> maxAllocs='8'/>
> >>> +       <monitor threshold='540672' unit='B' maxAllocs='176'/>
> >>> +         <feature name=llc_occupancy/>
> >>> +       </monitor>
> >>>       </bank>
> >>>       <bank id='1' level='3' type='both' size='15' unit='MiB' cpus='6-11'>
> >>>         <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
> >>>         <control granularity='768' unit='KiB' type='data'
> >>> maxAllocs='8'/>
> >>> +       <monitor threshold='540672' unit='B' maxAllocs='176'/>
> >>> +         <feature name=llc_occupancy/>
> >>> +       </monitor>
> >>>
> >>>       </bank>
> >>>     </cache>
> >>>
> >>> Signed-off-by: Wang Huaqiang <huaqiang.wang at intel.com>
> >>> ---
> >>>  docs/schemas/capability.rng | 28 ++++++++++++++++++++++++++++
> >>>  src/conf/capabilities.c     | 17 +++++++++++++++++
> >>>  2 files changed, 45 insertions(+)
> >>>
> >>
> >> This output would be combined with part of existing patch2.
> >>
> >
> > Will be combined in next version patch.
> >
> >>> diff --git a/docs/schemas/capability.rng
> >>> b/docs/schemas/capability.rng index d61515c..67498f1 100644
> >>> --- a/docs/schemas/capability.rng
> >>> +++ b/docs/schemas/capability.rng
> >>> @@ -314,6 +314,24 @@
> >>>                </attribute>
> >>>              </element>
> >>>            </zeroOrMore>
> >>> +          <zeroOrMore>
> >>> +            <element name='monitor'>
> >>> +              <attribute name='threshold'>
> >>> +                <ref name='unsignedInt'/>
> >>> +              </attribute>
> >>> +              <attribute name='unit'>
> >>> +                <ref name='unit'/>
> >>> +              </attribute>
> >>> +              <attribute name='maxAllocs'>
> >>> +                <ref name='unsignedInt'/>
> >>> +              </attribute>
> >>> +              <zeroOrMore>
> >>> +                <element name='feature'>
> >>> +                  <ref name='monitorFeature'/>
> >>> +                </element>
> >>> +              </zeroOrMore>
> >>> +            </element>
> >>> +          </zeroOrMore>
> >>>          </element>
> >>>        </oneOrMore>
> >>>      </element>
> >>> @@ -329,6 +347,16 @@
> >>>      </attribute>
> >>>    </define>
> >>>
> >>> +  <define name='monitorFeature'>
> >>> +    <attribute name='name'>
> >>> +      <choice>
> >>> +        <value>llc_occupancy</value>
> >>> +        <value>mbm_total_bytes</value>
> >>> +        <value>mbm_local_bytes</value>
> >>
> >> So these are the only 3 values you'll ever expect?  Probably not a
> >> good idea to list them like this or the current algorithm is overkill looking for
> prefixed "llc_"
> >> and "mbm_" values.
> >>
> >> If "llc_somethingnew" shows up some day, then the schema is invalidated.
> >>
> >
> > Disagree.
> > I don't think the schema will be invalidated when new "llc_somethingnew"
> > comes. Libvirt only recognize these three feature names.
> 
> Hmm... maybe not clear enough - see, virt-xml-validate.
> 
> Using <choice> limits the choices or allowed values to that known list.
> So if some kernel some day adds llc_somethingnew, but the libvirt rng file for
> the customer isn't updated to include that, then the XML doesn't validate.
> 
> Trying to think of something else similar that exists today, but nothing springs to
> mind.
> 

Align with you.
Let's pass through all features parsed from 'info/L3_MON/features' now and in
the future.
Then only constrain on feature name is being a printable character. Do you agree?

> 
> 
> >
> > If a new hardware feature name comes, take your example, the
> > 'llc_somethingnew', then without a function enabling, should we let it be
> shown here?
> > I think properly no. It makes sense to say libvirt only supports the
> > enabled hardware feature, not all hardware features. To get the new
> 'llc_somethingnew'
> > supported here, you need to make changes here and submit the patch.
> >
> >> If all you're supporting or care about is the 3 values, then each
> >> should be fetched separately.  Just the names make me wonder if they
> >> come with some associated value that would be in some file.  Perhaps
> answered in later patches.
> >
> > Only cares about the 3 values. Will apply more strict name rule to
> > them in source code in next version patch.
> >
> > " Just the names make me wonder if they come with some associated
> > value that would be in some file.  Perhaps answered in later patches." -- not
> understand.
> >
> >>> +      </choice>
> >>> +    </attribute>
> >>> +  </define>
> >>> +
> >>>    <define name='memory_bandwidth'>
> >>>      <element name='memory_bandwidth'>
> >>>        <oneOrMore>
> >>> diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index
> >>> 5280348..7932088 100644
> >>> --- a/src/conf/capabilities.c
> >>> +++ b/src/conf/capabilities.c
> >>> @@ -942,6 +942,23 @@ virCapabilitiesFormatCaches(virBufferPtr buf,
> >>>                                controls->max_allocation);
> >>>          }
> >>>
> >>> +        if (bank->monitor &&
> >>> +            bank->monitor->nfeatures) {
> >>> +            virBufferAsprintf(&childrenBuf,
> >>> +                              "<monitor threshold='%u' unit='B' "
> >>
> >> Why is "unit='B' " - does it really matter and is it technically right?
> >
> > 'unit' is the unit of 'threshold'. 'threshold' and 'unit' reflect the
> > value reported through resctrl/'max_threshold_occupancy'.
> >
> >> If it's only ever going to be 'B', then easy enough to document that way.
> >>
> >
> > Realized 'unit' shouldn't be 'B', it could be 'KiB', 'MiB' ....
> > Should be dynamically changed accordingly. Will be beautified with
> > 'virFormatIntPretty'.
> >
> 
> Face-palm on my previous response - KiB not MiB <sigh>, it must be Friday.  Oh it
> is!!! yay!

'Unit' will be removed. Now it doesn't matter for KiB or MiB ... :) 

Thanks for review. 
Huaqiang

> 
> John
> 
> >>> +                              "maxAllocs='%u'>\n",
> >>> +                              bank->monitor->cache_threshold,
> >>> +                              bank->monitor->max_allocation);
> >>> +            for (j = 0; j < bank->monitor->nfeatures; j++) {
> >>> +                virBufferAdjustIndent(&childrenBuf, 2);
> >>> +                virBufferAsprintf(&childrenBuf,
> >>> +                                  "<feature name='%s'/>\n",
> >>> +                                  bank->monitor->features[j]);
> >>> +                virBufferAdjustIndent(&childrenBuf, -2);
> >>> +            }
> >>> +            virBufferAddLit(&childrenBuf, "</monitor>\n");
> >>> +        }
> >>> +
> >>
> >> Not clear this data is at the right level, still.
> >>
> >
> > I outlined my considerations for putting cache monitor capability
> > under the data structure of 'cache bank' in my reply of  patch 2. It
> > is a bit long, please go to that email for details.
> > Again welcome suggestions.
> >
> >
> > Thanks for review!
> > Huaqiang
> >
> >> John
> >>
> >>>          if (virBufferCheckError(&childrenBuf) < 0)
> >>>              return -1;
> >>>
> >>>




More information about the libvir-list mailing list