[libvirt] [PATCHv5 00/19] Introduce x86 Cache Monitoring Technology (CMT)

John Ferlan jferlan at redhat.com
Tue Oct 9 16:53:47 UTC 2018



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> This series of patches, together with the series already merged, introduces
> the x86 Cache Monitoring Technology (CMT) to libvirt by interacting
> with the kernel resource control (resctrl) interface. CMT is one of the
> Intel(R) x86 CPU features belonging to the Resource Director
> Technology (RDT). CMT reports the occupancy of the last level cache,
> which is shared by all CPU cores.
> 
> In the v1 series, an original and complete feature for CMT was introduced.
> The v2 and v3 patches address the host capability of CMT.
> v4 addresses monitoring the cache occupancy of a VM vcpu thread set
> and reporting it through a virsh command.
> 
> We have had several discussions about enabling CMT; please refer to the
> following links for the RFCs.
> RFCv3
> https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
> RFCv2
> https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
> https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
> RFCv1
> https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
> 
> The merged commits for the host capability of CMT are listed below.
> 6af8417415508c31f8ce71234b573b4999f35980
> 8f6887998bf63594ae26e3db18d4d5896c5f2cb4
> 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8
> 12093f1feaf8f5023dcd9d65dff111022842183d
> a5d293c18831dcf69ec6195798387fbb70c9f461
> 
> 
> 1. Why is CMT necessary in libvirt?
> The perf events 'CMT, MBML, MBMT' have been phased out since Linux
> kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so libvirt's
> perf-based cmt/mbm will not work with the latest Linux kernel. These
> patches add the CMT feature to libvirt through the kernel resctrl
> filesystem interface.
> 
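For anyone following along, the raw resctrl monitoring interface underneath
is roughly the following (a sketch only; the group name and pid are just
placeholders, not anything this series creates):

    # mount -t resctrl resctrl /sys/fs/resctrl
    # mkdir /sys/fs/resctrl/mon_groups/example-group
    # echo $PID > /sys/fs/resctrl/mon_groups/example-group/tasks
    # cat /sys/fs/resctrl/mon_groups/example-group/mon_data/mon_L3_00/llc_occupancy

The last read returns the LLC occupancy in bytes for cache id 0.
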
> 2. Create a cache monitoring group (cache monitor).
> 
>     The main interface for creating a monitoring group is the domain XML
> file. The proposed configuration looks like:
> 
>       <cputune>
>         <cachetune vcpus='1'>
>           <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
>           <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
>     +     <monitor level='3' vcpus='1'/>

Duplication of vcpus is odd for a child entry, isn't it?  It's not in the
<cache> entry...

>         </cachetune>
>         <cachetune vcpus='4-7'>
>     +     <monitor level='3' vcpus='4-6'/>

... but perhaps that means using 4-6 is OK because it's a subset of the
parent cachetune 4-7?

I'm not sure I can keep track of all the discussions we've had about
this, so this could be something we've already covered, but has moved
out of my short term memory.

>         </cachetune>
>       </cputune>
> 
> The above XML creates 2 cache resctrl allocation groups and 2 resctrl
> monitoring groups.
> Changes to the cache monitor take effect at the next boot of the VM.
> 
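If I'm reading the series right, each <monitor> should end up as a
mon_groups subdirectory under the corresponding allocation group, so the
on-disk layout would be something like (path and naming below are my
guess from the domstats output further down, purely illustrative):

    /sys/fs/resctrl/<allocation-group>/mon_groups/vcpus_1/
    /sys/fs/resctrl/<allocation-group>/mon_groups/vcpus_4-6/

with each of those containing a tasks file for the vcpu threads and
mon_data/mon_L3_<id>/llc_occupancy files feeding the stats below.
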
> 3. Show the CMT result through the 'domstats' command.
> 
> The qemu driver adds an interface to report this information for the
> resource monitoring groups through the command 'virsh domstats --cpu-total'.
> Below is a typical output:
> 
>      # virsh domstats 1 --cpu-total
>      Domain: 'ubuntu16.04-base'
>      ...
>        cpu.cache.monitor.count=2
>        cpu.cache.0.name=vcpus_1
>        cpu.cache.0.vcpus=1
>        cpu.cache.0.bank.count=2
>        cpu.cache.0.bank.0.id=0
>        cpu.cache.0.bank.0.bytes=4505600
>        cpu.cache.0.bank.1.id=1
>        cpu.cache.0.bank.1.bytes=5586944
>        cpu.cache.1.name=vcpus_4-6

So perhaps "this" is more correctly 4-6 (I assume it comes from the
<cachetune> entry)...

>        cpu.cache.1.vcpus=4,5,6

Interesting that a name can be 4-6, but these are each called out. Can
someone have "5,7,9"?  How does that look on the name line and then on
the vcpus line?

>        cpu.cache.1.bank.count=2
>        cpu.cache.1.bank.0.id=0
>        cpu.cache.1.bank.0.bytes=17571840
>        cpu.cache.1.bank.1.id=1
>        cpu.cache.1.bank.1.bytes=29106176

Obviously a different example than the one above with only 1 <monitor>
entry... and the .bytes values for everything don't match up with the KiB
values above.

> 
> 
> Changes in v5:
> - qemu: Setting up vcpu and adding pids to resctrl monitor groups during
> reconnection.
> - Added documentation for the domain configuration related to the resctrl
> monitor.
> 

Probably should have posted a reply to your v4 series to indicate you
were working on a v5 due to whatever reason so that no one started
reviewing it...

It takes a "long time" to set aside the time to review large series...

Also, while it may pass your compiler, patch 18 needed:

-    unsigned int nmonitors = NULL;
+    unsigned int nmonitors = 0;

Something I thought I had pointed out in much earlier reviews...

I'll work through the series over the next day or so with any luck...
It is on my short term radar at least.

John

> Changes in v4:
> v4 addresses monitoring the cache occupancy of a VM vcpu thread set
> and reporting it through a virsh command.
> - Introduced resctrl default allocation
> - Introduced resctrl monitor and default monitor
> 
> Changes in v3:
> - Addressed John Ferlan's review.
> - Typo fixed.
> - Removed VIR_ENUM_DECL(virMonitor);
> 
> Changes in v2:
> - Introduced MBM capability.
> - Capability layout changed
>     * Moved <monitor> from cache <bank> to <cache>
>     * Renamed <Threshold> to <reuseThreshold>
> - Document for 'reuseThreshold' changed.
> - Introduced API virResctrlInfoGetMonitorPrefix
> - Added more tests, covering standalone CMT, fake new
>   feature.
> - Creating the CMT resource control group will be a
>   subsequent job.
> 
> 
> Wang Huaqiang (19):
>   docs: Refactor schemas to support default allocation
>   util: Introduce resctrl monitor for CMT
>   util: Refactor code for adding PID to the resource group
>   util: Add interface for adding PID to monitor
>   util: Refactor code for determining allocation path
>   util: Add monitor interface to determine path
>   util: Refactor code for creating resctrl group
>   util: Add interface for creating monitor group
>   util: Add more interfaces for resctrl monitor
>   util: Introduce default monitor
>   conf: Refactor code for matching existing resctrls
>   conf: Refactor virDomainResctrlAppend
>   conf: Add resctrl monitor configuration
>   Util: Add function for checking if monitor is running
>   qemu: enable resctrl monitor in qemu
>   conf: Add a 'id' to virDomainResctrlDef
>   qemu: refactor qemuDomainGetStatsCpu
>   qemu: Report cache occupancy (CMT) with domstats
>   qemu: Setting up vcpu and adding pids to resctrl monitor groups during
>     reconnection
> 
>  docs/formatdomain.html.in                          |  30 +-
>  docs/schemas/domaincommon.rng                      |  14 +-
>  src/conf/domain_conf.c                             | 327 ++++++++++--
>  src/conf/domain_conf.h                             |  12 +
>  src/libvirt-domain.c                               |   9 +
>  src/libvirt_private.syms                           |  12 +
>  src/qemu/qemu_driver.c                             | 271 +++++++++-
>  src/qemu/qemu_process.c                            |  52 +-
>  src/util/virresctrl.c                              | 562 ++++++++++++++++++++-
>  src/util/virresctrl.h                              |  49 ++
>  tests/genericxml2xmlindata/cachetune-cdp.xml       |   3 +
>  .../cachetune-colliding-monitor.xml                |  30 ++
>  tests/genericxml2xmlindata/cachetune-small.xml     |   7 +
>  tests/genericxml2xmltest.c                         |   2 +
>  14 files changed, 1277 insertions(+), 103 deletions(-)
>  create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
> 



