[libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring Technology (CMT)

Martin Kletzander mkletzan at redhat.com
Tue Jul 17 09:10:41 UTC 2018


On Tue, Jul 17, 2018 at 07:19:41AM +0000, Wang, Huaqiang wrote:
>Hi Martin,
>
>Thanks for your comments. Please see my reply inline.
>
>> -----Original Message-----
>> From: Martin Kletzander [mailto:mkletzan at redhat.com]
>> Sent: Tuesday, July 17, 2018 2:27 PM
>> To: Wang, Huaqiang <huaqiang.wang at intel.com>
>> Cc: libvir-list at redhat.com; Feng, Shaohe <shaohe.feng at intel.com>; Niu,
>> Bing <bing.niu at intel.com>; Ding, Jian-feng <jian-feng.ding at intel.com>;
>> Zang, Rui <rui.zang at intel.com>
>> Subject: Re: [libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
>> Technology (CMT)
>>
>> On Mon, Jul 09, 2018 at 03:00:48PM +0800, Wang Huaqiang wrote:
>> >
>> >This is V2 of the RFC and the POC source code for introducing the x86
>> >RDT CMT feature. Thanks to Martin Kletzander for his review and
>> >constructive suggestions on V1.
>> >
>> >This series tries to provide functionality similar to the perf
>> >event based CMT, MBMT and MBML features, reporting cache occupancy,
>> >total memory bandwidth utilization and local memory bandwidth
>> >utilization information in libvirt. We focus on CMT first.
>> >
>> >x86 RDT Cache Monitoring Technology (CMT) provides a method to track
>> >the cache occupancy information per CPU thread. We leverage the
>> >kernel resctrl filesystem implementation and build our patches on
>> >top of it.
>> >
>> >Describing the functionality from a high level:
>> >
>> >1. Extend the output of 'domstats' and report CMT information.
>> >
>> >Compared with the perf event based CMT implementation in libvirt, this
>> >series extends the output of the 'domstats' command and reports cache
>> >occupancy information like this:
>> ><pre>
>> >[root at dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
>> >Domain: 'vm3'
>> >  cpu.cacheoccupancy.vcpus_2.value=4415488
>> >  cpu.cacheoccupancy.vcpus_2.vcpus=2
>> >  cpu.cacheoccupancy.vcpus_1.value=7839744
>> >  cpu.cacheoccupancy.vcpus_1.vcpus=1
>> >  cpu.cacheoccupancy.vcpus_0,3.value=53796864
>> >  cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
>> ></pre>
>> >The vcpus have been arranged into three monitoring groups, which
>> >cover vcpu 1, vcpu 2 and vcpus 0,3 respectively. For example,
>> >'cpu.cacheoccupancy.vcpus_0,3.value' reports the cache occupancy
>> >information for vcpu 0 and vcpu 3, and
>> >'cpu.cacheoccupancy.vcpus_0,3.vcpus' represents the vcpu group
>> >information.
>> >
>> >To address Martin's suggestion "beware as 1-4 is something else than
>> >1,4 so you need to differentiate that.", the content of 'vcpus'
>> >(cpu.cacheoccupancy.<groupname>.vcpus=xxx) has been specially
>> >processed: if vcpus is a continuous range, e.g. 0-2, then the output
>> >of cpu.cacheoccupancy.vcpus_0-2.vcpus will be
>> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
>> >instead of
>> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
>> >Please note that 'vcpus_0-2' is the name of this monitoring group;
>> >it can be set to any other word in the XML configuration file or
>> >changed live with the command introduced in the following part.
>> >
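
The range handling described above (reporting 0-2 as 0,1,2) can be
sketched in a few lines. This is only an illustration in Python, not the
code from the series; the function name expand_vcpus is hypothetical.

```python
def expand_vcpus(spec):
    """Expand a vcpu spec such as '0-2' or '0,3' into an explicit list.

    Hypothetical helper for illustration: '0-2' becomes '0,1,2', while
    '0,3' is returned unchanged, so a range can never be confused with
    a pair of individual vcpus.
    """
    vcpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range 'lo-hi' is inclusive on both ends.
            lo, hi = part.split("-")
            vcpus.update(range(int(lo), int(hi) + 1))
        else:
            vcpus.add(int(part))
    return ",".join(str(v) for v in sorted(vcpus))
```

With this, a consumer of the 'vcpus' field only ever has to split on
commas.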
>>
>> One small nit regarding the naming (it shouldn't block any
>> reviewers from reviewing, just keep this in mind for the next version,
>> for example) is that this is still inconsistent.
>
>OK.  I'll try to use words such as 'cache', 'cpu resource' and avoid using
>'RDT', 'CMT'.
>

Oh, you misunderstood, I meant the naming in the domstats output =)

>> The way domstats are structured when there is something like an
>> array could shed some light on this.  What you suggested is really
>> kind of hard to parse (although it looks better).  What would you say to
>> something like this:
>>
>>   cpu.cacheoccupancy.count=3
>>   cpu.cacheoccupancy.0.value=4415488
>>   cpu.cacheoccupancy.0.vcpus=2
>>   cpu.cacheoccupancy.0.name=vcpus_2
>>   cpu.cacheoccupancy.1.value=7839744
>>   cpu.cacheoccupancy.1.vcpus=1
>>   cpu.cacheoccupancy.1.name=vcpus_1
>>   cpu.cacheoccupancy.2.value=53796864
>>   cpu.cacheoccupancy.2.vcpus=0,3
>>   cpu.cacheoccupancy.2.name=vcpus_0,3
>>
>
>Your arrangement looks more reasonable, thanks for your advice.
>However, as I mentioned in another email that I sent to libvir-list
>hours ago, the kernel resctrl interface provides cache occupancy
>information for each cache block for every resource group.
>Maybe we need to expose the cache occupancy for each cache block.
>If you agree, we need to refine the 'domstats' output.
>How about this:
>
>  cpu.cacheoccupancy.count=3
>  cpu.cacheoccupancy.0.name=vcpus_2
>  cpu.cacheoccupancy.0.vcpus=2
>  cpu.cacheoccupancy.0.block.count=2
>  cpu.cacheoccupancy.0.block.0.bytes=5488
>  cpu.cacheoccupancy.0.block.1.bytes=4410000
>  cpu.cacheoccupancy.1.name=vcpus_1
>  cpu.cacheoccupancy.1.vcpus=1
>  cpu.cacheoccupancy.1.block.count=2
>  cpu.cacheoccupancy.1.block.0.bytes=7839744
>  cpu.cacheoccupancy.1.block.1.bytes=0
>  cpu.cacheoccupancy.2.name=0,3
>  cpu.cacheoccupancy.2.vcpus=0,3
>  cpu.cacheoccupancy.2.block.count=2
>  cpu.cacheoccupancy.2.block.0.bytes=53796864
>  cpu.cacheoccupancy.2.block.1.bytes=0
>

What do you mean by cache block?  Is that (cache_size / granularity)?  In that
case it looks fine, I guess (without putting too much thought into it).
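
To illustrate why the count/index layout is easier to consume than keys
that embed the group name, here is a rough Python sketch of a parser for
it. It is only an assumption about how a client might read the output;
the function name parse_cache_occupancy is made up for this example and
is not part of libvirt.

```python
import re
from collections import defaultdict

# Matches 'cpu.cacheoccupancy.<index>.<field>', where <field> may itself
# contain dots (e.g. 'block.0.bytes' in the nested proposal).
_KEY_RE = re.compile(r"cpu\.cacheoccupancy\.(\d+)\.(.+)")


def parse_cache_occupancy(lines):
    """Group flat 'key=value' domstats lines by monitoring-group index.

    Hypothetical helper: returns a list with one dict per group, in
    index order, using 'cpu.cacheoccupancy.count' for the list length.
    """
    groups = defaultdict(dict)
    count = 0
    for line in lines:
        key, _, value = line.strip().partition("=")
        key, value = key.strip(), value.strip()
        if key == "cpu.cacheoccupancy.count":
            count = int(value)
            continue
        match = _KEY_RE.fullmatch(key)
        if match:
            groups[int(match.group(1))][match.group(2)] = value
    return [groups[i] for i in range(count)]
```

Because the group name is a value and not part of the key, names that
contain commas or dashes need no special handling in the parser.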

Martin