[libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring Technology (CMT)

Wang, Huaqiang huaqiang.wang at intel.com
Tue Jul 17 07:19:41 UTC 2018


Hi Martin,

Thanks for your comments. Please see my reply inline.

> -----Original Message-----
> From: Martin Kletzander [mailto:mkletzan at redhat.com]
> Sent: Tuesday, July 17, 2018 2:27 PM
> To: Wang, Huaqiang <huaqiang.wang at intel.com>
> Cc: libvir-list at redhat.com; Feng, Shaohe <shaohe.feng at intel.com>; Niu, 
> Bing <bing.niu at intel.com>; Ding, Jian-feng <jian-feng.ding at intel.com>; 
> Zang, Rui <rui.zang at intel.com>
> Subject: Re: [libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring 
> Technology (CMT)
> 
> On Mon, Jul 09, 2018 at 03:00:48PM +0800, Wang Huaqiang wrote:
> >
> >This is the V2 of RFC and the POC source code for introducing x86 RDT 
> >CMT feature, thanks Martin Kletzander for his review and constructive 
> >suggestion for V1.
> >
> >This series is trying to provide the similar functions of the perf 
> >event based CMT, MBMT and MBML features in reporting cache occupancy, 
> >total memory bandwidth utilization and local memory bandwidth 
> >utilization information in libvirt. Firstly we focus on CMT.
> >
> >x86 RDT Cache Monitoring Technology (CMT) provides a method to track 
> >the cache occupancy information per CPU thread. We are leveraging the 
> >implementation of kernel resctrl filesystem and create our patches on 
> >top of that.
> >
> >Describing the functionality from a high level:
> >
> >1. Extend the output of 'domstats' and report CMT information.
> >
> >Comparing with perf event based CMT implementation in libvirt, this 
> >series extends the output of command 'domstats' and reports cache 
> >occupancy information like these:
> ><pre>
> >[root at dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
> >Domain: 'vm3'
> >  cpu.cacheoccupancy.vcpus_2.value=4415488
> >  cpu.cacheoccupancy.vcpus_2.vcpus=2
> >  cpu.cacheoccupancy.vcpus_1.value=7839744
> >  cpu.cacheoccupancy.vcpus_1.vcpus=1
> >  cpu.cacheoccupancy.vcpus_0,3.value=53796864
> >  cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
> ></pre>
> >The vcpus have been arranged into three monitoring groups, covering
> >vcpu 1, vcpu 2 and vcpus 0,3 respectively. For example,
> >'cpu.cacheoccupancy.vcpus_0,3.value' reports the cache occupancy
> >for vcpu 0 and vcpu 3, and 'cpu.cacheoccupancy.vcpus_0,3.vcpus'
> >represents the vcpu group information.
> >
> >To address Martin's suggestion "beware as 1-4 is something else than
> >1,4 so you need to differentiate that.", the content of 'vcpus'
> >(cpu.cacheoccupancy.<groupname>.vcpus=xxx) has been specially 
> >processed, if vcpus is a continuous range, e.g. 0-2, then the output 
> >of cpu.cacheoccupancy.vcpus_0-2.vcpus will be like 
> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
> >instead of
> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
> >Please note that 'vcpus_0-2' is the name of this monitoring group; it
> >can be set to any other name in the XML configuration file or
> >changed at runtime with the command introduced in the following part.
> >
> 
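
For reference, the range expansion described above can be sketched
roughly as follows. This is only an illustration in Python, not the code
in the patches; the function name expand_vcpus is hypothetical.

```python
def expand_vcpus(spec: str) -> str:
    """Expand a vcpu list such as '0-2' into the explicit form '0,1,2'.

    Hypothetical helper for illustration only; it is not part of libvirt.
    """
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like '0-2' covers both end points.
            lo, hi = (int(x) for x in part.split("-", 1))
            if lo > hi:
                raise ValueError(f"invalid vcpu range: {part!r}")
            ids.update(range(lo, hi + 1))
        else:
            ids.add(int(part))
    # Always emit a sorted, comma-separated list so '0-2' and '0,2' differ.
    return ",".join(str(i) for i in sorted(ids))
```

With this, a group covering the range 0-2 would report vcpus=0,1,2,
while a group covering only vcpu 0 and vcpu 2 would report vcpus=0,2.
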
> One small nit according to the naming (but it shouldn't block any 
> reviewers from reviewing, just keep this in mind for next version for 
> example) is that this is still inconsistent.

OK.  I'll try to use words such as 'cache', 'cpu resource' and avoid using
'RDT', 'CMT'.

> The way domstats are structured when there is something like an
> array could shed some light into this.  What you suggested is really
> kind of hard to parse (although looks better).  What would you say to
> something like this:
> 
>   cpu.cacheoccupancy.count = 3
>   cpu.cacheoccupancy.0.value=4415488
>   cpu.cacheoccupancy.0.vcpus=2
>   cpu.cacheoccupancy.0.name=vcpus_2
>   cpu.cacheoccupancy.1.value=7839744
>   cpu.cacheoccupancy.1.vcpus=1
>   cpu.cacheoccupancy.1.name=vcpus_1
>   cpu.cacheoccupancy.2.value=53796864
>   cpu.cacheoccupancy.2.vcpus=0,3
>   cpu.cacheoccupancy.2.name=0,3
> 

Your arrangement looks more reasonable, thanks for the advice.
However, as I mentioned in another email that I sent to libvir-list a
few hours ago, the kernel resctrl interface reports cache occupancy
per cache block for every resource group, so we may need to expose the
occupancy of each cache block as well.
If you agree, the 'domstats' output needs to be refined accordingly.
How about this:

  cpu.cacheoccupancy.count=3
  cpu.cacheoccupancy.0.name=vcpus_2
  cpu.cacheoccupancy.0.vcpus=2
  cpu.cacheoccupancy.0.block.count=2
  cpu.cacheoccupancy.0.block.0.bytes=5488
  cpu.cacheoccupancy.0.block.1.bytes=4410000
  cpu.cacheoccupancy.1.name=vcpus_1
  cpu.cacheoccupancy.1.vcpus=1
  cpu.cacheoccupancy.1.block.count=2
  cpu.cacheoccupancy.1.block.0.bytes=7839744
  cpu.cacheoccupancy.1.block.1.bytes=0
  cpu.cacheoccupancy.2.name=vcpus_0,3
  cpu.cacheoccupancy.2.vcpus=0,3
  cpu.cacheoccupancy.2.block.count=2
  cpu.cacheoccupancy.2.block.0.bytes=53796864
  cpu.cacheoccupancy.2.block.1.bytes=0
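
To show how a consumer might read the layout proposed above, here is a
minimal sketch in Python. It is not part of the patch series; the
function name parse_cache_occupancy and the returned dictionary keys are
only assumptions for illustration, and the key names follow the proposal
rather than any merged libvirt output.

```python
def parse_cache_occupancy(lines):
    """Parse 'cpu.cacheoccupancy.*' key=value lines into a list of groups.

    Hypothetical helper; assumes the count/name/vcpus/block layout
    proposed in this mail.
    """
    prefix = "cpu.cacheoccupancy."
    flat = {}
    for line in lines:
        line = line.strip()
        if not line.startswith(prefix) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop any stray whitespace inside the key before storing it.
        flat["".join(key[len(prefix):].split())] = value.strip()

    groups = []
    for i in range(int(flat.get("count", 0))):
        nblocks = int(flat.get(f"{i}.block.count", 0))
        groups.append({
            "name": flat[f"{i}.name"],
            "vcpus": [int(v) for v in flat[f"{i}.vcpus"].split(",")],
            # One occupancy value (in bytes) per cache block.
            "blocks": [int(flat[f"{i}.block.{j}.bytes"])
                       for j in range(nblocks)],
        })
    return groups


sample = """
cpu.cacheoccupancy.count=1
cpu.cacheoccupancy.0.name=vcpus_2
cpu.cacheoccupancy.0.vcpus=2
cpu.cacheoccupancy.0.block.count=2
cpu.cacheoccupancy.0.block.0.bytes=5488
cpu.cacheoccupancy.0.block.1.bytes=4410000
""".splitlines()

groups = parse_cache_occupancy(sample)
total_bytes = sum(groups[0]["blocks"])
```

The explicit 'count' fields let a parser walk the groups and blocks by
index without guessing at group names, and the per-group total can still
be recovered by summing the per-block values.
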

> Other than that I didn't go through all the patches now, sorry.



