[libvirt] [V3] RFC for support cache tune in libvirt

乔立勇(Eli Qiao) qiaoliyong at gmail.com
Thu Jan 12 03:15:39 UTC 2017


>
>
>     <cache>
>       <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8"/>
>       <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
>

Yes, I like this too; the cpus attribute shows which CPUs share each cache bank.

Another thought: if the kernel enables CDP, the L3 cache is split into
code and data types:
    <cache>
      <bank type="l3code" size="28160" units="KiB" cpus="0,2,3,6,7,8"/>
      <bank type="l3data" size="28160" units="KiB" cpus="3,4,5,9,10,11"/>

So this information should come not only from
/sys/devices/system/cpu/cpu0/cache/index3/size; it also depends on the
state of Linux resctrl under /sys/fs/resctrl/.
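
As a rough sketch of that check (Python, not libvirt code): resctrl lists one
directory per resource under info/, and as far as I understand, with CDP
enabled L3CODE and L3DATA show up there as separate resources. The function
name and the `root` parameter are made up for illustration.

```python
import os

def l3_bank_types(root="/sys/fs/resctrl"):
    """Return the L3 bank types to report, based on resctrl's info directory.

    Hypothetical helper: returns [] when resctrl is not mounted at `root`.
    """
    info = os.path.join(root, "info")
    if not os.path.isdir(info):
        return []
    entries = set(os.listdir(info))
    # With CDP enabled, code and data are listed as separate resources.
    if {"L3CODE", "L3DATA"} <= entries:
        return ["l3code", "l3data"]
    if "L3" in entries:
        return ["l3"]
    return []
```

The result would decide whether one "l3" bank or an "l3code"/"l3data" pair is
reported for each socket.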



>       <bank type="l2" size="256" units="KiB" cpus="0"/>
>

I think SMT is not enabled on your system. On a system with SMT enabled,
the sibling threads of a core share its L2 cache, so we will have:
      <bank type="l2" size="256" units="KiB" cpus="0, 44"/>
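
The sibling list could be taken from the kernel's shared_cpu_list file for
each cache index (for example .../cpu0/cache/index2/shared_cpu_list). A
minimal sketch in Python of parsing that list format; the function name is
made up, and the sample strings in the comments are illustrative rather than
read from a real host.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0,44" or "0-3,8" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            # A range like "0-3" covers both endpoints.
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```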


>       <bank type="l2" size="256" units="KiB" cpus="1"/>
>       <bank type="l2" size="256" units="KiB" cpus="2"/>
>       <bank type="l2" size="256" units="KiB" cpus="3"/>
>       <bank type="l2" size="256" units="KiB" cpus="4"/>
>       <bank type="l2" size="256" units="KiB" cpus="5"/>
>       <bank type="l2" size="256" units="KiB" cpus="6"/>
>       <bank type="l2" size="256" units="KiB" cpus="7"/>
>       <bank type="l2" size="256" units="KiB" cpus="8"/>
>       <bank type="l2" size="256" units="KiB" cpus="9"/>
>       <bank type="l2" size="256" units="KiB" cpus="10"/>
>       <bank type="l2" size="256" units="KiB" cpus="11"/>
>       <bank type="l1i" size="256" units="KiB" cpus="0"/>
>       <bank type="l1i" size="256" units="KiB" cpus="1"/>
>       <bank type="l1i" size="256" units="KiB" cpus="2"/>
>       <bank type="l1i" size="256" units="KiB" cpus="3"/>
>       <bank type="l1i" size="256" units="KiB" cpus="4"/>
>       <bank type="l1i" size="256" units="KiB" cpus="5"/>
>       <bank type="l1i" size="256" units="KiB" cpus="6"/>
>       <bank type="l1i" size="256" units="KiB" cpus="7"/>
>       <bank type="l1i" size="256" units="KiB" cpus="8"/>
>       <bank type="l1i" size="256" units="KiB" cpus="9"/>
>       <bank type="l1i" size="256" units="KiB" cpus="10"/>
>       <bank type="l1i" size="256" units="KiB" cpus="11"/>
>       <bank type="l1d" size="256" units="KiB" cpus="0"/>
>       <bank type="l1d" size="256" units="KiB" cpus="1"/>
>       <bank type="l1d" size="256" units="KiB" cpus="2"/>
>       <bank type="l1d" size="256" units="KiB" cpus="3"/>
>       <bank type="l1d" size="256" units="KiB" cpus="4"/>
>       <bank type="l1d" size="256" units="KiB" cpus="5"/>
>       <bank type="l1d" size="256" units="KiB" cpus="6"/>
>       <bank type="l1d" size="256" units="KiB" cpus="7"/>
>       <bank type="l1d" size="256" units="KiB" cpus="8"/>
>       <bank type="l1d" size="256" units="KiB" cpus="9"/>
>       <bank type="l1d" size="256" units="KiB" cpus="10"/>
>       <bank type="l1d" size="256" units="KiB" cpus="11"/>
>     </cache>
>
>

Hmm... L2 and L1 caches are per core. I am not sure we really need to
tune the L2 and L1 caches at all; that is too low level.

As I understand it, if we expose these kinds of capabilities, we should
also support managing them. I just wonder whether it is too early to
expose them, since the lower level (the Linux kernel) does not support it yet.



> which shows each socket has its own dedicated L3 cache, and each
> core has its own L2 & L1 cache.
>
> > 2. Extend capabilities outputs.
> >
> > virsh capabilities | grep resctrl
> >     <cpu>
> >     ...
> >       <resctrl name='L3' unit='KiB' cache_size='56320' cache_unit='2816'/>
> >     </cpu>
> >
> >     This tells us that the host has resctrl enabled (you can find it
> >     in /sys/fs/resctrl) and that it supports allocating 'L3' type cache:
> >     the total 'L3' cache size is 56320 KiB, and the minimum unit size
> >     of 'L3' cache is 2816 KiB.
> >   P.S. The L3 cache unit is the minimum amount of L3 cache that can be
> >   allocated. It is hardware related and cannot be changed.
>
> If we're already reporting cache in the capabilities from step
> one, then it ought to be extendable to cover this reporting.
>
>     <cache>
>       <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8">
>           <control unit="KiB" min="2816"/>
>       </bank>
>       <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
>           <control unit="KiB" min="2816"/>
>       </bank>
>     </cache>
>
>
Looks good to me.
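
For what it's worth, a management tool could consume that layout with very
little code. A minimal sketch in Python, assuming the <cache>/<bank>/<control>
element and attribute names from the proposal above (none of this exists in
libvirt yet, and the function name is hypothetical):

```python
import xml.etree.ElementTree as ET

# Sample input copied from the proposed capabilities layout.
CAPS = """
<cache>
  <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8">
    <control unit="KiB" min="2816"/>
  </bank>
  <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
    <control unit="KiB" min="2816"/>
  </bank>
</cache>
"""

def tunable_banks(xml_text):
    """Return (type, size, min_unit, cpus) for every bank that has a <control>."""
    banks = []
    for bank in ET.fromstring(xml_text).findall("bank"):
        control = bank.find("control")
        if control is None:
            continue  # bank is reported but cannot be tuned
        cpus = [int(c) for c in bank.get("cpus").split(",")]
        banks.append((bank.get("type"), int(bank.get("size")),
                      int(control.get("min")), cpus))
    return banks
```

Banks without a <control> child (for example L1/L2 today) are simply skipped,
so the same XML can report caches that cannot be tuned.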


> note how we report the control info for both l3 caches, since they
> come from separate sockets and thus could conceivably report different
> info if different CPUs were in each socket.
>
> > 3. Add new virsh command 'nodecachestats':
> > This API exposes the amount of each cache resource left on each piece
> > of hardware (CPU socket).
> >
> > It will be formatted as:
> >
> > <resource_type>.<resource_id>: left size KiB
> >
> > For example, I have a two-socket host on which I've enabled only the
> > cat_l3 feature:
> >
> > root@s2600wt:~/linux# virsh nodecachestats
> > L3.0 : 56320 KiB
> > L3.1 : 56320 KiB
> >
> >   P.S. resource_type can be L3, L3DATA, L3CODE, L2 for now.
>
> This feels like something we should have in the capabilities XML too
> rather than a new command
>
>     <cache>
>       <bank type="l3" size="56320" units="KiB" cpus="0,2,3,6,7,8">
>           <control unit="KiB" min="2816" avail="56320"/>
>       </bank>
>       <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
>           <control unit="KiB" min="2816" avail="56320"/>
>       </bank>
>     </cache>
>
> > 4. Add new interface to manage how much cache can be allocated for a
> > domain
> >
> > root@s2600wt:~/linux# virsh cachetune kvm02 --l3.count 2
> >
> > root@s2600wt:~/linux# virsh cachetune kvm02
> > l3.count       : 2
> >
> > This will allocate 2 units (2816 KiB * 2) of L3 cache for domain kvm02
> >
> > ## Domain XML changes
> >
> > Cache Tuning
> >
> > <domain>
> >   ...
> >   <cachetune>
> >     <l3_cache_count>2</l3_cache_count>
> >   </cachetune>
> >   ...
> > </domain>
>
> IIUC, the kernel lets us associate individual PIDs
> with each cache. Since each vCPU is a PID, this means
> we are able to allocate different cache size to
> different CPUs. So we need to be able to represent
> that in the XML. I think we should also represent
> the allocation in a normal size (ie KiB), not in
> count of min unit.
>
>
ok
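
The conversion from a size in KiB back to allocation units is simple enough.
A sketch in Python, assuming the 2816 KiB minimum unit from my host and
rejecting sizes that are not a whole number of units (rounding up would be
another option). The function name is hypothetical.

```python
def kib_to_units(size_kib, min_unit_kib=2816):
    """Convert a requested cache size in KiB to a count of allocation units."""
    if size_kib <= 0:
        raise ValueError("cache size must be positive")
    units, remainder = divmod(size_kib, min_unit_kib)
    if remainder:
        # The hardware can only hand out whole units.
        raise ValueError(
            f"{size_kib} KiB is not a multiple of the {min_unit_kib} KiB unit")
    return units
```

With this, size="5632" maps to the same 2 units as --l3.count 2 in my
original proposal.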


> So eg this shows allocating two cache banks and giving
> one to the first 4 cpus, and one to the second 4 cpus
>
>    <cachetune>
>       <bank type="l3" size="5632" unit="KiB" cpus="0,1,2,3"/>
>       <bank type="l3" size="5632" unit="KiB" cpus="4,5,6,7"/>
>

Oh, that depends on the CPU topology, so I don't like adding
cpus="0,1,2,3" here. We cannot guarantee that the VM will run on CPUs 0, 1,
2 and 3, so it may not benefit from the cache bank.


>    </cachetune>
>
>
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/
> :|
> |: http://libvirt.org              -o-             http://virt-manager.org
> :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/
> :|
>



-- 
Best regards
- Eli

天涯无处不重逢
a leaf duckweed belongs to the sea , where not to meet in life