[libvirt] [PATCH 06/10] util: Introduce resctrl monitor for CMT

John Ferlan jferlan at redhat.com
Tue Sep 11 16:23:47 UTC 2018



On 09/10/2018 02:10 PM, Wang, Huaqiang wrote:
> 
> 
>> -----Original Message-----
>> From: John Ferlan [mailto:jferlan at redhat.com]
>> Sent: Wednesday, September 5, 2018 11:00 PM
>> To: Wang, Huaqiang <huaqiang.wang at intel.com>; libvir-list at redhat.com
>> Cc: Feng, Shaohe <shaohe.feng at intel.com>; Niu, Bing <bing.niu at intel.com>;
>> Ding, Jian-feng <jian-feng.ding at intel.com>; Zang, Rui <rui.zang at intel.com>
>> Subject: Re: [libvirt] [PATCH 06/10] util: Introduce resctrl monitor for CMT
>>
>>
>>
>> On 08/27/2018 07:23 AM, Wang Huaqiang wrote:
>>> 'virResctrlAllocMon' denotes a resctrl monitor reporting the resource
>>> consumption information.
>>>
>>> This patch introduced the interfaces for resctrl monitor.
>>>
>>> Relationship of 'resctrl allocation' and 'resctrl monitor':
>>> 1. resctrl monitor monitors resources (cache or memory bandwidth) of
>>> particular allocation.
>>> 2. resctrl allocation may refer to the 'default' allocation if no
>>> dedicated resource 'control' applied to it. The 'default' allocation
>>> enjoys remaining resource that not allocated.
>>> 3. resctrl monitor belongs to 'default' allocation if no 'cachetune'
>>> specified in XML file.
>>> 4. one resctrl allocation may have several monitors. It is also
>>> permitted that there is no resctrl monitor associated with an
>>> allocation.
>>>
>>> Key data structures:
>>>
>>> +    struct _virResctrlAllocMon {
>>> +        char *id;
>>> +        char *path;
>>> +    };
>>>
>>>     struct _virResctrlAlloc {
>>>       virObject parent;
>>>
>>>       @@ -276,6 +289,12 @@ struct _virResctrlAlloc {
>>>                    virResctrlAllocMemBWPtr mem_bw;
>>> +                   virResctrlAllocMonPtr *monitors;
>>> +                   size_t nmonitors;
>>>       }
>>>
>>> Signed-off-by: Wang Huaqiang <huaqiang.wang at intel.com>
>>> ---
>>>  src/libvirt_private.syms |   6 +
>>>  src/util/virresctrl.c    | 361
>> ++++++++++++++++++++++++++++++++++++++++++++++-
>>>  src/util/virresctrl.h    |  31 ++++
>>>  3 files changed, 394 insertions(+), 4 deletions(-)
>>>
>>
>> Similar to the previous patch - there's a bit too much going on for just one patch
>> here.
>>
>> Introducing "default_group" and "monitors".  This needs some separation.
>>
> 
> Will split this patch into serval patches.
> The 'default_group' will be a separate patch.
> 
>> I am not going to look at this patch.  I think you really need to describe this
>> default_group concept quite a bit more as you'd be totally changing the
>> meaning of alloc->path by "removing" everything after the
>> SYSFS_RESCTRL_PATH. What's the purpose of alloc->path then in this
>> environment?
> 
> This is a bug.
> 'default_group' is not allowed to be removed in libvirt. It is created
> by the resource control file system at the  time of mounting, can only be
> removed at the time of file system unmounting by the system.
> In next version virResctrlAllocRemove, a check will be made to determine if
> it is removing root directory /sys/fs/resctrl, if yes, report an error.

Like I've said before - just be sure to properly separate things. Fixing
existing issues in the middle of new code changes is just one of those
areas that makes reviewing difficult.

> 
> Following paragraph explains the 'default_group' and some relevant concepts. 
> I'll also write these content into cover of patch series.

In much less detail I hope...  Not sure I can properly page in the rest.
I read it, but I certainly cannot internalize it.

John

> 
> The resctrl default group is described initially by Kernel document
> 'intel_rdt_ui.txt', with the link
> "https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt".
> 
> The default group is the root directory of the resource control file system,
> that is, the directory of '/sys/fs/resctrl/'. The default group is created
> immediately after the mounting of resctrl file system, and owns all the tasks
> and make full use of all resources.
> 	
> In these patches, the libvirt virresctrl 'default_group' is introduced by
> extending the scope of current 'resource allocation'. virResctrlAlloc is
> presenting a 'resource allocation'. 'default_group' is also called
> 'default allocation'.
> 
> The 'default allocation' represents the root directory of resource control
> file system. The main difference to a non-default allocation is the
> 'default group' is created after fs mounting, and owns all system tasks, while
> non-default allocation need to create through 'mdkir' and explicit operation
> of filling PID into the tasks file.
> 
> The main purpose for introducing 'default allocation' is creating monitors
> that for non-default allocation to save hardware resources.
> 
> Think about the case: a KVM virtual machine with 1 working vcpu, we want to
> monitor the cache utilization of the vcpu but don't apply any resource
> limitaion on vcpu process. We have two possible solutions:
> 
> 1. Create a new "resource allocation" through making a directory under
> /sys/fs/resctrl/, assuming the allocation is /sys/fs/resource/p0,
> and put the host process PID for the vcpu into file
> /fys/fs/resource/p0/tasks. Then Get the cache utilization information through
> file  /fys/fs/resource/p0/mon_data/llc_occupancy.
> 
> 2. Do not create an extra allocation, but create a resource monitor under
> default allocation by creating an sub-directory under
> '/sys/fs/resctrl/mon_group', then put the vcpu PID into the monitor's
> sub-directory's tasks file. Get the cache utilization information through this
> monitor's llc_occupancy file.
> 
> Comparing with slotion1, solution2 uses less hardware resource by saving one
> CLOSID. Normally we have much more RMIDs than CLOSIDs, for CPU E5-2699v4, the
> number of CLOSID is 16, while RMID number is 176.
> 
> To support 'default_group' the domain's xml configuration file need to be
> changed:
> 
> The 'default allocation' has the similar behavior with original libvirt defined
> 'resource allocation' for creating a monitor group, getting resource
> consumption data from a monitor, as well as the task assignment operations.
> The 'default allocation' could be looked as a special 'resource allocation'.
> 
> 
> Libvirt treats virResctrlAlloc (sometimes called resource allocation)
> as the representing of resctrl allocation, each resctrl allocation has the
> capability to report the resource consumption information of involved
> tasks, through files 'llc_occupancy', 'mbm_total_bytes' and
> 'mbm_local_bytes' under the directory of allocation.
> virResctrlAllocMon (sometimes called resource monitor) is introduced to
> represent virResctrlAlloc's role for resource consumption monitoring.
> One or more resource monitors could be created to monitor the
> resource utilization for a small set of tasks of current allocation.
> This explains why the 'monitor' is being put under 'cachetune' element in
> domain's XML configuration file.
> 
>       <cputune>
>         <cachetune vcpus='0-1'>
>           <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
>           <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
>     +     <monitor vcpus='0-1'/>
>     +     <monitor vcpus='0'/>
>         </cachetune>
>       </cputune>
> 
> The 'defautl_group' is resource allocation shared by all system tasks that do
> not have specified resource limitation. In existing libvirt policy no resource
> limitation is allowed to put on it. so I need to generate configuration such as
> 
>         <cachetune vcpus='3'>
>     +     <monitor vcpus='3'/>
>         </cachetune>
> 
> In the implementation, this monitor for monitoring domain's vcpu 3 is
> created under 'default allocation', default allocation is the directory
> /sys/fs/resctrl, which is set up at time of mounting.
> 
> And above is the reason for why element 'cache' is changed being
> optional in RNG file.
> 
>           <element name="cachetune">
>             <attribute name="vcpus">
>               <ref name='cpuset'/>
>             </attribute>
> -           <oneOrMore>
> +           <zeroOrMore>
>               <element name="cache">
>                 <attribute name="id">
>                   <ref name='unsignedInt'/>
>                 </attribute>
>                 <attribute name="level">
>                   <ref name='unsignedInt'/>
>                 </attribute>
>                 <attribute name="type">
>                   <choice>
>                     <value>both</value>
>                     <value>code</value>
>                     <value>data</value>
>                   </choice>
>                 </attribute>
>                 <attribute name="size">
>                   <ref name='unsignedLong'/>
>                 </attribute>
>                 <optional>
>                   <attribute name='unit'>
>                     <ref name='unit'/>
>                   </attribute>
>                 </optional>
>               </element>
> -           </oneOrMore>
> +           </zeroOrMore>
>             <zeroOrMore>
>               <element name="monitor">
>                 <attribute name="vcpus">
>                   <ref name='cpuset'/>
>                 </attribute>
>               </element>
>             </zeroOrMore>
>           </element>
> 
> Thanks for review.
> Huaqiang
> 

[...]




More information about the libvir-list mailing list