[libvirt] [PATCHv5 10/19] cdtil: Introduce default monitor

Wang, Huaqiang huaqiang.wang at intel.com
Mon Oct 15 09:22:55 UTC 2018



On 10/12/2018 11:18 PM, John Ferlan wrote:
>
> On 10/11/18 8:02 AM, Wang, Huaqiang wrote:
>> Answers refined.
>>
>> On 10/11/2018 3:14 AM, John Ferlan wrote:
>>> On 10/9/18 6:30 AM, Wang Huaqiang wrote:
>>>> In resctrl file system, more than one monitoring groups
>>>> could be created within one allocation group, along with
>>>> the creation of allocation group, a monitoring group is
>>>> created at the same, which monitors the resource
>>>> utilization information of whole allocation group.
>>>>
>>>> This patch is introducing the concept of default monitor,
>>>> which represents the particular monitoring group that
>>>> created along with the creation of allocation group.
>>>>
>>>> Default monitor shares the common 'vcpu' list with the
>>>> allocation.
>>>>
>>>> Signed-off-by: Wang Huaqiang <huaqiang.wang at intel.com>
>>>> ---
>>>>   src/libvirt_private.syms |  1 +
>>>>   src/util/virresctrl.c    | 23 +++++++++++++++++++++++
>>>>   src/util/virresctrl.h    |  2 ++
>>>>   3 files changed, 26 insertions(+)
>>>>
> I assume the two responses to my review are very similar, but I'll use
> this second one since it's timewise later...
>
>>>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>>>> index c93d19f..4b22ed4 100644
>>>> --- a/src/libvirt_private.syms
>>>> +++ b/src/libvirt_private.syms
>>>> @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID;
>>>>   virResctrlMonitorNew;
>>>>   virResctrlMonitorRemove;
>>>>   virResctrlMonitorSetCacheLevel;
>>>> +virResctrlMonitorSetDefault;
>>>>   virResctrlMonitorSetID;
>>>>   
> [...]
>
>>>> + * a monitoring group is created at the same, which monitors the resource
>>>> + * utilization information of whole allocation group.
>>> Reword - it's a bit redundant
>> Rewording like these:
>>
>>   * There are two type of monitors, the default monitor and the non-default
>>   * monitor. Within one allocation, up to one default monitor and more than
>>   * one non-default monitors could be created.
>>   *
>>   * The flag 'default_monitor' defined in structure virResctrlMonitor denotes
>>   * if a monitor is the default one or not.
>>
> Starting with "Within one allocation,..." - doesn't make much sense to
> me as a reader, sorry.
>
>>>> + * A virResctrlMonitor with @default_monitor marked as 'true' is representing
>>>> + * the monitoring group created along with the creation of allocation group.
>>> Well I'm a bit lost, but let's see what happens. I'm not sure what
>>> you're trying to delineate here. There is the creation of an
>>> resctrl->alloc when a <monitor> is found by no <cachetune> is found.
>>> Thus, we create an empty <cachetune> (one with no <cache> elements). To
>>> me that's a default environment.

Re-read above paragraph, you have pointed out very clearly about your 
point, and now I finally
catch up with you ...  and the default monitor could be simplified:
Before there is a pointer point to 'allocation', which is 
@resctrl->alloc in dom_conf.c or
@monitor->alloc in virresctrl.c, the 'default' case could be determined 
by checking if
@alloc is empty pointer or not.

Then there is no need for 'virResctrlMonitorSetDefault' and 
'@monitor->default_monitor' variable.

Will remove the concept of 'default monitor' and related data field in 
virResctrlMonitor as well as
the API(s). (mainly virResctrlMonitorSetDefault).

'Default allocation' concept related wording will be removed too. And 
there are not too much
data field and API(s) related to default allocation, and I will check my 
code to remove all stuff for it.

Do we have any gap about this?

>>>
>>> I assume a similar paradigm could exist if there was or wasn't a
>>> <memorytune> element...
>> Make you confused, my bad. I was trying to introducing the situation of
>> '/sys/fs/resctrl' filesystem and what I was done here at the same time.
>>
>> Regarding the XML layout for default monitor, following XML lines represents
>> a default monitor, the key rule is the default monitor's <monitor> entry has
>> the same 'vcpus' setting as <cachetune>.
>>
>> <cachetune vcpus='0-1'>
>>    <monitor level='3' vcpus='0-1'/>
>> </cachetune>
>>
>> and following example illustrates a <cachetune> with two monitors, one of
>> them is a default monitor
>> <cachetune vcpus='0-1'>
>>    <monitor level='3' vcpus='0-1'/>
> This gets tagged as a default monitor just because the vcpus match?

Yes.

>
>>    <monitor level='3' vcpus='0'/>
> and this is not a default monitor because the vcpus don't match

Yes.

>
>> </cachetune>
>>
>> Particularly, following XML layout doesn't define any monitor or allocation.
>> Following XML lines does not have any effect, and does not indicate an
>> error.
>>
>> <cachetune vcpus='0-1'/>
>>
>> or
>>
>> <cachetune vcpus='0-1'>
>> </cachetune>
>>
> I understand that, but that wasn't my point. If someone modified their
> XML and added just that, then add the resctrl->alloc as soon as you see
> it. Then when you see a <cache>, that gets added to some cache list.
> When you see a <monitor> that gets added to some monitor list.
>
> The whole concept of calling some a default something just isn't working
> for me. It's a cache or monitor for either all or some subset of the
> vcpus for the cachetune (or resctrl->alloc). Nothing more, nothing less.

Again, I'll remove the default concepts that I tried introduced before.

No concept for 'default allocation' and 'default monitor' will be 
removed. The
difference between my called 'default monitor' and 'non-default monitor' 
is tiny,
and could be handled with a simple judgement. (For a monitor whether is 
a default one or
not could be found out by checking associated @alloc pointer.)


>>>>    */
>>>>   struct _virResctrlMonitor {
>>>>       virObject parent;
>>>> @@ -355,6 +362,8 @@ struct _virResctrlMonitor {
>>>>       /* libvirt-generated path in /sys/fs/resctrl for this particular
>>>>        * monitor */
>>>>       char *path;
>>>> +    /* Boolean flag for default monitor */
>>>> +    bool default_monitor;
>>>>       /* The cache 'level', special for cache monitor */
>>>>       unsigned int cache_level;
>>>>   };
>>>> @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
>>>>           return -1;
>>>>       }
>>>>   
>>>> +    if (monitor->default_monitor) {
>>>> +        if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
>>> See check below ... at this point monitor->alloc could be NULL and won't
>>> make this STRDUP very happy.
>> Thanks for catching my bug. I did not run the code on my machine with default
>> monitor, because I have trouble in run libvirt with CAT, creating non-default
>> resctrl allocation. I am working on it, and will more tests for monitor.
>>
>> Using following code to fix it: (also changed next lines....)
>>   
>>        if (monitor->alloc)
>>            alloc_path = monitor->alloc->path;
>>        else
>>            alloc_path = SYSFS_RESCTRL_PATH;
>>   
>>        if (monitor->default_monitor) {
>>            if (VIR_STRDUP(monitor->path, alloc_path) < 0)
>>                return -1;
>>   
>>            return 0;
>>        }
>>   
>>>> +            return -1;
>>>> +
>>>> +        return 0;
>>>> +    }
>>>> +
>>>>       if (monitor->alloc)
>>>>           alloc_path = monitor->alloc->path;
>>>>       else
>>>> @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
>>>>       return virResctrlMonitorGetStatistic(monitor, "llc_occupancy",
>>>>                                            nbank, bankids, bankcaches);
>>>>   }
>>>> +
>>>> +
>>>> +void
>>>> +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor)
>>>> +{
>>>> +    monitor->default_monitor = true;
>>>> +}
>>> I really don't see what the value of this is.  Looking later on it seems
>>> there's some sort of check that the vcpus desired for the monitor match
>>> that of some cachetune entry and you then set default.
>>>
>>> To me that could happen multiple times, e.g.:
>>>
>>>     <cachetune vcpus='0-1' ... />
>>>     <cachetune vcpus='2-3' ... />
>>>
>>> and
>>>
>>>      <monitor vcpus='0-1' .../>
>>>      <monitor vcpus='2-3' .../>
>>>
>>> so, then it would seem there would be two defaults.
>> Yes. Two defaults. But I think the <monitor .. > entry should be placed under
>> <cachetune> entry.
>>
>> I'd like to rewrite your configuration in following way:
>>
>>     <cachetune vcpus='0-1'>
>>       <monitor levels='3' vcpus='0-1'/>
>>     <cachetune/>
>>
>>     <cachetune vcpus='2-3'>
>>       <monitor levels='3' vcpus='2-3'/>
>>     <cachetune/>
>>
>> upper configuration creates two allocations and one default monitor for each
>> allocation.
> and a default monitor vs. a non-default monitor has no specific meaning
> afaict.

Agree.

Will remove this 'default' concept and related things.

>
>> By the way
>>
>>      <cachetune vcpus='0-1' />
>>
>> has no effect for resctrl but not be considered as an error.
>>
>> Unlike default allocation, default monitor could be created for multiple
>> times in scope of system, but with a limitation that you can only create one
>> default monitor for one allocation at most.
>> There is only one default allocation in whole system.
>>
>> Every monitor is linked with some specific allocation, either default allocation
>> or non-default allocation (or just 'allocation').
>>
> Simplify the code - default allocation means no <cache> lines true?

Maybe no, or I misinterpreted your words.
'default allocation' groups is specifying the the resctrl group 
'/sys/fs/resctrl', which defines
the cache or memory bandwidth policy in file '/sys/fs/resctrl/schemata'.

It is possible to change the resource allocating policy through changing 
content of
'/sys/fs/resctrl/schemata'.
But In 'libvirt' virresctrl model, I am not sure it is possible to 
change the resource allocation
policy through adding some <cache> entry in some place. I need to 
investigate this for a full
test of CMT.

But, any way, the concept of 'default allocation' should be possible to 
be simplified. I'll do
that in my V6 code.

>
> Is monitor the related to the cache or the cachetune? If you had:
>
> <cachetune 0-4>
>    <cache 0>
>    <cache 1-2>
>    <cache 3>
>    <cache 4>
> ...

Do the numbers in above <cache> element refer to the 'vcpu(s)'?  If 
answer is yes,
then it might be impossible to split vcpus in one <cachetune>.
This is based on the consumption that: one <cachetune> is representing a
directory under '/sys/fs/resctrl' (or itself), and you could create only 
one allocation
and make one kind of cache/memoryBW allocation policy for a bunch of vcpus.

It is impossible to create multiple cache allocating policies in one 
<cachetune>, and it
is not allowed to specify any 'vcpus' in the <cache> entry.
but you can assign different cache policy for cache bank in different 
host node in one
<cachetune>.

    <cachetune>
       <cache id='0' level='3' type='both' size='2816' unit='KiB'/>   
<-- can not specify 'vcpus'
       <cache id='1' level='3' type='both' size='2816' unit='KiB'/>   
<-- the first number is 'id', not vcpu
       <monitor level='3' vcpus='0-2'/>
       <monitor level='3' vcpus='0'/>
       <monitor level='3' vcpus='1'/>
       <monitor level='3' vcpus='2'/>
     </cachetune>

> Then is :
>
>    <monitor 0-4>
>    <monitor 2>
>
> acceptible?  If so, then I'm not seeing the need for default monitor
> and/or default allocation.

Again. I realized this part could be simplified. Will be done in next 
code update for your review.

>
>> The upper manner of monitor is determined by kernel's 'resctrl' file system.
>>
>> If I created an allocation by creating a directory 'p0' under '/sys/fs/resctrl/',
>> and after adding two vcpus' PID to file '/sys/fs/resctrl/p0/tasks' and making
>> some changes to 'sys/fs/resctrl/p0/schemata' file, then
>> the allocation is functional for resource limitation.
>> The libvirt CAT code is doing similar operations for VM in an automatic way.
>>
> The next hunk I'll need to look at later - too many other things cycling
> through my head right now. Nice picture, but correlating this to code is
> not clicking.  My quick look though - I see the same files in both the
> default and non-default pictures - the path to get there is different,
> but that path afaik is generated as part of the processing and shouldn't
> know whether it's default or non-default. It's just a path to data based
> on some other "base" path.
>
> John

Thanks for review.
Huaqiang

>> Let me show the files under '/sys/fs/resctrl/p0':
>>
>> .
>> ├── cpus
>> ├── cpus_list
>> ├── mon_data
>> │   ├── mon_L3_00
>> │   │   ├── llc_occupancy
>> │   │   ├── mbm_local_bytes
>> │   │   └── mbm_total_bytes
>> │   └── mon_L3_01
>> │       ├── llc_occupancy
>> │       ├── mbm_local_bytes
>> │       └── mbm_total_bytes
>> ├── mon_groups
>> ├── schemata
>> └── tasks
>>
>> We can find some files for the monitoring role, the 'llc_occupancy'
>> file, the 'mbm_local_bytes' file and the 'mbm_total_bytes' file.
>> The truth is 'resctrl' fs create a monitoring group for each allocation
>> group along with its creation, and this monitoring group is what I
>> introduced default monitor.
>> The default monitor shares the same 'tasks' file with allocation, so
>> it monitors the resource utilization for all pids existing in 'tasks' file.
>>
>> Since this monitoring group is created whenever creating a libvirt
>> allocation, in the design of libvirt resctrl monitor, I choose to make it
>> shown not shown according to the XML configuration.
>>
>> To create other monitoring groups just making sub-directories under
>> the 'mon_group' directory, and adding corresponding vcpu PID to
>> that sub-directory's 'tasks' file.
>> The virResctrlMonitorCreate function creates this kind of sub-directory
>> under this 'mon_group' directory for non-default monitor.
>> For non-default monitor, you can specify a subset of pids of that in
>> allocation 'tasks' file, and no pid overlap allowed between non-default
>> monitors.
>>
>> For an allocation with one default monitor and a non-default monitor
>> (or just 'monitor' in wording), the files layout are like these:
>> .
>> ├── cpus
>> ├── cpus_list
>> ├── mon_data                        <--- default monitor interface
>> │   ├── mon_L3_00
>> │   │   ├── llc_occupancy
>> │   │   ├── mbm_local_bytes
>> │   │   └── mbm_total_bytes
>> │   └── mon_L3_01
>> │       ├── llc_occupancy
>> │       ├── mbm_local_bytes
>> │       └── mbm_total_bytes
>> ├── mon_groups
>> │   └── mon1
>> │       ├── cpus
>> │       ├── cpus_list
>> │       ├── mon_data             <--- non-default monitor interface
>> │       │   ├── mon_L3_00
>> │       │   │   ├── llc_occupancy
>> │       │   │   ├── mbm_local_bytes
>> │       │   │   └── mbm_total_bytes
>> │       │   └── mon_L3_01
>> │       │       ├── llc_occupancy
>> │       │       ├── mbm_local_bytes
>> │       │       └── mbm_total_bytes
>> │       └── tasks
>> ├── schemata
>> └── tasks
>>
>> This patch is trying to let monitor has the capability to mark a monitor
>> as a default monitor, and the default monitor is 'physically' existed in
>> kernel 'resctrl' file system, and which has a different manner with
>> other monitors.
>>
>>> Is all this being done to save a few steps in
>>> virResctrlMonitorDeterminePath? If so, then I see no value. It only adds
>>> confusion.
>> default monitor has a different role with other monitors, hope I have
>> documented it clearly.
>>
>> Without identifying the default monitor, all monitors will be create under
>> allocation's 'mon_group' directory, the following configuration will not be
>> supported due to overlap between monitors.
>>
>> <cachetune vcpus='0-1'>
>>    <monitor level='3' vcpus='0-1'/>
>>    <monitor level='3' vcpus='0'/>
>> </cachetune>
>>
>> So default monitor is valuable if you get to know the backend mechanisim
>> of kernel resctrl file system.
>>
> [...]




More information about the libvir-list mailing list