[libvirt] [V3] RFC for support cache tune in libvirt

Martin Kletzander <mkletzan@redhat.com>
Wed Jan 11 10:55:28 UTC 2017

On Wed, Jan 11, 2017 at 10:05:26AM +0000, Daniel P. Berrange wrote:
>On Tue, Jan 10, 2017 at 07:42:59AM +0000, Qiao, Liyong wrote:
>> Add support for cache allocation.
>> Thanks Martin for the comments on the previous version; this is v3 of the RFC.  I have some PoC code [2]; the following changes are partly implemented in the PoC.
>> # Proposed Changes
>> ## virsh command line
>> 1. Extend the output of nodeinfo to expose the size of the L3 (last level) cache.
>> This will expose how much cache a host has available for use.
>> root@s2600wt:~/linux# virsh nodeinfo | grep L3
>> L3 cache size:       56320 KiB
>Ok, as previously discussed, we should include this in the capabilities
>XML instead and have info about all the caches. We likely also want to
>relate which CPUs are associated with which cache in some way.
>eg if we have this topology
>    <topology>
>      <cells num='2'>
>        <cell id='0'>
>          <cpus num='6'>
>            <cpu id='0' socket_id='0' core_id='0' siblings='0'/>
>            <cpu id='1' socket_id='0' core_id='2' siblings='1'/>
>            <cpu id='2' socket_id='0' core_id='4' siblings='2'/>
>            <cpu id='6' socket_id='0' core_id='1' siblings='6'/>
>            <cpu id='7' socket_id='0' core_id='3' siblings='7'/>
>            <cpu id='8' socket_id='0' core_id='5' siblings='8'/>
>          </cpus>
>        </cell>
>        <cell id='1'>
>          <cpus num='6'>
>            <cpu id='3' socket_id='1' core_id='0' siblings='3'/>
>            <cpu id='4' socket_id='1' core_id='2' siblings='4'/>
>            <cpu id='5' socket_id='1' core_id='4' siblings='5'/>
>            <cpu id='9' socket_id='1' core_id='1' siblings='9'/>
>            <cpu id='10' socket_id='1' core_id='3' siblings='10'/>
>            <cpu id='11' socket_id='1' core_id='5' siblings='11'/>
>          </cpus>
>        </cell>
>      </cells>
>    </topology>
>We might have something like this cache info
>    <cache>
>      <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8"/>
>      <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
>      <bank type="l2" size="256" units="KiB" cpus="0"/>
>      <bank type="l2" size="256" units="KiB" cpus="1"/>
>      <bank type="l2" size="256" units="KiB" cpus="2"/>
>      <bank type="l2" size="256" units="KiB" cpus="3"/>
>      <bank type="l2" size="256" units="KiB" cpus="4"/>
>      <bank type="l2" size="256" units="KiB" cpus="5"/>
>      <bank type="l2" size="256" units="KiB" cpus="6"/>
>      <bank type="l2" size="256" units="KiB" cpus="7"/>
>      <bank type="l2" size="256" units="KiB" cpus="8"/>
>      <bank type="l2" size="256" units="KiB" cpus="9"/>
>      <bank type="l2" size="256" units="KiB" cpus="10"/>
>      <bank type="l2" size="256" units="KiB" cpus="11"/>
>      <bank type="l1i" size="256" units="KiB" cpus="0"/>
>      <bank type="l1i" size="256" units="KiB" cpus="1"/>
>      <bank type="l1i" size="256" units="KiB" cpus="2"/>
>      <bank type="l1i" size="256" units="KiB" cpus="3"/>
>      <bank type="l1i" size="256" units="KiB" cpus="4"/>
>      <bank type="l1i" size="256" units="KiB" cpus="5"/>
>      <bank type="l1i" size="256" units="KiB" cpus="6"/>
>      <bank type="l1i" size="256" units="KiB" cpus="7"/>
>      <bank type="l1i" size="256" units="KiB" cpus="8"/>
>      <bank type="l1i" size="256" units="KiB" cpus="9"/>
>      <bank type="l1i" size="256" units="KiB" cpus="10"/>
>      <bank type="l1i" size="256" units="KiB" cpus="11"/>
>      <bank type="l1d" size="256" units="KiB" cpus="0"/>
>      <bank type="l1d" size="256" units="KiB" cpus="1"/>
>      <bank type="l1d" size="256" units="KiB" cpus="2"/>
>      <bank type="l1d" size="256" units="KiB" cpus="3"/>
>      <bank type="l1d" size="256" units="KiB" cpus="4"/>
>      <bank type="l1d" size="256" units="KiB" cpus="5"/>
>      <bank type="l1d" size="256" units="KiB" cpus="6"/>
>      <bank type="l1d" size="256" units="KiB" cpus="7"/>
>      <bank type="l1d" size="256" units="KiB" cpus="8"/>
>      <bank type="l1d" size="256" units="KiB" cpus="9"/>
>      <bank type="l1d" size="256" units="KiB" cpus="10"/>
>      <bank type="l1d" size="256" units="KiB" cpus="11"/>
>    </cache>
>which shows each socket has its own dedicated L3 cache, and each
>core has its own L2 & L1 cache.
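
Just as a side note, all the data needed for such a <cache> element is
already exposed through the kernel's sysfs cacheinfo interface, so
gathering it should be straightforward.  A rough sketch (the index
number of the L3 differs between CPU models, so take the paths as an
example only):

    $ cat /sys/devices/system/cpu/cpu0/cache/index3/level
    3
    $ cat /sys/devices/system/cpu/cpu0/cache/index3/type
    Unified
    $ cat /sys/devices/system/cpu/cpu0/cache/index3/size
    56320K
    $ # shared_cpu_list is what the cpus= attribute would be built from
    $ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
    0-2,6-8
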
>> 2. Extend the capabilities output.
>> virsh capabilities | grep resctrl
>>     <cpu>
>>     ...
>>       <resctrl name='L3' unit='KiB' cache_size='56320' cache_unit='2816'/>
>>     </cpu>
>>     This tells us that the host has resctrl enabled (you can find it in /sys/fs/resctrl)
>> and that it supports allocating 'L3' type cache; the total 'L3' cache size is 56320 KiB, and the minimum unit size of 'L3' cache is 2816 KiB.
>>   P.S. The cache unit is the minimum amount of L3 cache that can be allocated.  It is hardware-dependent and cannot be changed.
>If we already report the cache in the capabilities from step
>one, then it ought to be extensible to cover this reporting:
>    <cache>
>      <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8">
>          <control unit="KiB" min="2816"/>
>      </bank>
>      <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
>          <control unit="KiB" min="2816"/>
>      </bank>
>    </cache>
>note how we report the control info for both l3 caches, since they
>come from separate sockets and thus could conceivably report different
>info if different CPUs were in each socket.
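
For what it's worth, the minimum unit falls out of the resctrl info
directory: the number of bits in the capacity bitmask determines the
granularity.  A sketch, assuming a 20-way 56320 KiB L3 like the one
above:

    $ cat /sys/fs/resctrl/info/L3/cbm_mask
    fffff
    $ cat /sys/fs/resctrl/info/L3/min_cbm_bits
    1
    $ # 0xfffff has 20 bits, so one bit covers 56320 KiB / 20 = 2816 KiB
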
>> 3. Add a new virsh command 'nodecachestats':
>> This API exposes how much cache resource is left on each piece of hardware (CPU socket).
>> It will be formatted as:
>> <resource_type>.<resource_id> : <size left> KiB
>> For example, on a host with 2 CPU sockets and only the cat_l3 feature enabled:
>> root@s2600wt:~/linux# virsh nodecachestats
>> L3.0 : 56320 KiB
>> L3.1 : 56320 KiB
>>   P.S. resource_type can be L3, L3DATA, L3CODE, L2 for now.
>This feels like something we should have in the capabilities XML too,
>rather than in a new command:
>    <cache>
>      <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8">
>          <control unit="KiB" min="2816" avail="56320/>
>      </bank>
>      <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
>          <control unit="KiB" min="2816" avail="56320"/>
>      </bank>
>    </cache>
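
Computing 'avail' would mean walking the schemata of all existing
resctrl groups and counting the ways that no group has claimed yet
(overlapping allocations make this slightly fuzzy).  Roughly, with a
made-up group 'p0':

    $ cat /sys/fs/resctrl/p0/schemata
    L3:0=00003;1=fffff
    $ # p0 holds 2 of 20 ways on cache 0, i.e. 2 * 2816 KiB,
    $ # leaving 18 * 2816 KiB = 50688 KiB available there
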
>> 4. Add a new interface to manage how much cache can be allocated to a domain:
>> root@s2600wt:~/linux# virsh cachetune kvm02 --l3.count 2
>> root@s2600wt:~/linux# virsh cachetune kvm02
>> l3.count       : 2
>> This will allocate 2 units (2816 KiB * 2) of L3 cache for domain kvm02.
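
(For reference, such an allocation maps onto the resctrl filesystem
roughly as below; the group name is made up:)

    $ mkdir /sys/fs/resctrl/kvm02
    $ echo "L3:0=00003;1=fffff" > /sys/fs/resctrl/kvm02/schemata
    $ # 0x3 = 2 bits = 2 * 2816 KiB on cache 0; cache 1 is left untouched
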
>> ## Domain XML changes
>> Cache Tuning
>> <domain>
>>   ...
>>   <cachetune>
>>     <l3_cache_count>2</l3_cache_count>
>>   </cachetune>
>>   ...
>> </domain>
>IIUC, the kernel lets us associate individual PIDs
>with each cache. Since each vCPU is a PID, this means
>we are able to allocate different cache size to
>different CPUs. So we need to be able to represent
>that in the XML. I think we should also represent
>the allocation in a normal size (ie KiB), not in
>count of min unit.
>So e.g. this shows allocating two cache banks, giving
>one to the first 4 CPUs and one to the second 4:
>   <cachetune>
>      <bank type="l3" size="5632" unit="KiB" cpus="0,1,2,3"/>
>      <bank type="l3" size="5632" unit="KiB" cpus="4,5,6,7"/>
>   </cachetune>
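
That maps quite naturally onto resctrl: one group per <bank>, with the
thread IDs of the relevant vCPUs written into its tasks file.  A sketch
(group name and PIDs are made up):

    $ mkdir /sys/fs/resctrl/kvm02-bank0
    $ echo "L3:0=00003;1=fffff" > /sys/fs/resctrl/kvm02-bank0/schemata
    $ echo 12345 > /sys/fs/resctrl/kvm02-bank0/tasks    # vCPU 0 thread
    $ echo 12346 > /sys/fs/resctrl/kvm02-bank0/tasks    # vCPU 1 thread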

I agree with your approach; we just need to keep in mind two more
things.  I/O threads and the main QEMU (emulator) thread can have
allocations as well.  Also, we need to say on which socket the
allocation should be done.
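
To make the socket part concrete, the bank element would probably need
some host-side identifier as well; purely as an illustration (the
attribute names here are invented, not a proposal):

   <cachetune>
      <bank type="l3" size="5632" unit="KiB" host_id="0" vcpus="0-3"/>
      <bank type="l3" size="2816" unit="KiB" host_id="1" emulator="yes"/>
   </cachetune>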

>|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
>|: http://libvirt.org              -o-             http://virt-manager.org :|
>|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|