[libvirt] [PATCH 2/3] Qemu: add CMT support

Ren, Qiaowei qiaowei.ren at intel.com
Fri Oct 30 07:59:30 UTC 2015

> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange at redhat.com]
> Sent: Thursday, October 29, 2015 10:56 PM
> To: Ren, Qiaowei
> Cc: libvir-list at redhat.com; Feng, Shaohe
> Subject: Re: [PATCH 2/3] Qemu: add CMT support
> On Thu, Oct 29, 2015 at 02:02:29PM +0800, Qiaowei Ren wrote:
> > See the earlier RFC at
> > https://www.redhat.com/archives/libvir-list/2015-June/msg01509.html
> >
> > CMT (Cache Monitoring Technology) can be used to measure the cache usage
> > of VMs running on the host. This patch extends the bulk stats API
> > (virDomainListGetStats) to add this field. Applications based on
> > libvirt can use this API to obtain the cache usage of a VM. Because the
> > CMT implementation in the Linux kernel is based on the perf mechanism,
> > this patch enables the perf event for CMT when a VM is created and
> > disables it when the VM is destroyed.
> >
> > Signed-off-by: Qiaowei Ren <qiaowei.ren at intel.com>
> Thanks for re-sending this patchset, it has reminded me of the concerns /
> questions I had around this previously.
> Just ignoring the code for a minute, IIUC the design is
>  - Open a file handle to the kernel perf system for each running VM
>  - Associate that perf event file handle with the QEMU VM PID
>  - Enable recording of the CMT perf event on that file handle
>  - Report the CMT event values in the virDomainGetStats() API
>    call when VIR_DOMAIN_STATS_CACHE is requested
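
The four design steps above can be sketched roughly as follows. This is only
an illustration, not the code from the patch: the function names are
hypothetical, and a software event (task-clock) stands in for the CMT event,
whose perf type and config are platform specific and are not shown here.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Steps 1+2: open a counting perf event bound to one process (any CPU).
 * Hypothetical helper. Returns the event FD, or -1 with errno set. */
int open_counter_for_pid(pid_t pid)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_SOFTWARE;          /* stand-in, NOT the CMT event */
    attr.config = PERF_COUNT_SW_TASK_CLOCK;
    attr.disabled = 1;                       /* created stopped */
    attr.exclude_kernel = 1;                 /* friendlier to unprivileged use */
    attr.exclude_hv = 1;

    /* glibc provides no wrapper for perf_event_open, so go via syscall(2). */
    return (int)syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0UL);
}

/* Step 3: start counting on an already opened event FD. */
int enable_counter(int fd)
{
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == 0 ? 0 : -1;
}

/* Step 4: fetch the current value, e.g. when stats are requested.
 * With the default read_format the kernel returns a single u64. */
int read_counter(int fd, uint64_t *value)
{
    ssize_t n = read(fd, value, sizeof(*value));

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Teardown when the VM goes away: stop counting and release the FD. */
int disable_and_close(int fd)
{
    int rc = ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == 0 ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Whether the open succeeds for an unprivileged caller depends on the host's
/proc/sys/kernel/perf_event_paranoid setting, so callers should expect
open_counter_for_pid() to fail and treat the statistic as unavailable.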
> My two primary concerns are:
>  1. Do we want to have a perf event FD open for every running
>     VM all the time?
>  2. Is the virDomainGetStats() integration the right API approach?
> For item 1, my concern is that the CMT event is only ever going to be consumed
> by OpenStack, and even then, only OpenStack installs which have the scheduler
> plugin that cares about the CMT event data. It feels undesirable to have this perf
> system enabled for all libvirt VMs, when perhaps < 1 % of libvirt users actually
> want this data. It feels like we need some mechanism to decide when this event
> is enabled.
> For item 2, my concern is first whether virDomainGetStats is the right API. I think it
> probably *is* the right API, since I can't think of a better way.
> Should we, however, be having a very special case VIR_DOMAIN_STATS_CACHE
> group, or should we have something more generic?
> For example, if I run 'perf list' I see
> List of pre-defined events (to be used in -e):
>   branch-instructions OR branches                    [Hardware event]
>   branch-misses                                      [Hardware event]
>   bus-cycles                                         [Hardware event]
>   cache-misses                                       [Hardware event]
>   cache-references                                   [Hardware event]
>   cpu-cycles OR cycles                               [Hardware event]
>   instructions                                       [Hardware event]
>   ref-cycles                                         [Hardware event]
>   stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
>   alignment-faults                                   [Software event]
>   context-switches OR cs                             [Software event]
>   cpu-clock                                          [Software event]
>   cpu-migrations OR migrations                       [Software event]
>   dummy                                              [Software event]
>   emulation-faults                                   [Software event]
>   major-faults                                       [Software event]
>   minor-faults                                       [Software event]
>   ...and many many more...
> Does it make sense to extend the virDomainStats API to *only* deal with
> reporting of 1 specific perf event that you care about right now? It feels like it
> might be better if we did something more general purpose.
> eg what if something wants to get 'major-faults' data in future?
> Would we then add a VIR_DOMAIN_STATS_MAJOR_FAULT enum item, etc.?
> Combining these two concerns, I think we might need 2 things
>  - A new API to turn on/off collection of specific perf events
> This could be something like
>    virDomainGetPerfEvents(virDomainPtr dom,
>                           virTypedParameter params);
> This would fill virTypedParameters with one entry for each perf event, where
> the value indicates whether that event is enabled for the VM. A corresponding
>    virDomainSetPerfEvents(virDomainPtr dom,
>                           virTypedParameter params);
> would enable you to toggle the flag, to enable/disable the particular list of perf
> events you care about.
> With that, we could have a 'VIR_DOMAIN_STATS_PERF_EVENT' enum item for
> virDomainStats which causes reporting of all previously enabled perf events.
> This would avoid us needing to have the perf event enabled for all VMs all the
> time. Only applications using libvirt which actually need the data would turn it on.
> It would also now scale to all types of perf event, instead of just one
> specific event.
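
To make the proposed split concrete, here is a small self-contained model of
the get/set/report behaviour described above. It does not use libvirt at all:
perf_param is a hypothetical stand-in for a typed parameter, the event names
and the "perf.<name>" field naming are assumptions, and the counter values are
placeholders rather than real perf readings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define N_EVENTS 3

/* Hypothetical stand-in for a typed parameter: event name + on/off flag. */
typedef struct {
    const char *name;
    bool enabled;
} perf_param;

/* Hypothetical per-domain state: known events and their last values. */
typedef struct {
    perf_param events[N_EVENTS];
    unsigned long long values[N_EVENTS];
} domain_perf;

/* Every event starts disabled, so nothing is collected by default. */
void domain_perf_init(domain_perf *d)
{
    static const char *const names[N_EVENTS] = {
        "cmt", "major-faults", "context-switches"
    };

    for (size_t i = 0; i < N_EVENTS; i++) {
        d->events[i].name = names[i];
        d->events[i].enabled = false;
        d->values[i] = 0;
    }
}

static int find_event(const domain_perf *d, const char *name)
{
    for (size_t i = 0; i < N_EVENTS; i++)
        if (strcmp(d->events[i].name, name) == 0)
            return (int)i;
    return -1;
}

/* "Get" side: one entry per known event, flag says whether it is enabled.
 * Returns the number of entries written. */
size_t get_perf_events(const domain_perf *d, perf_param *params, size_t max)
{
    size_t n = max < N_EVENTS ? max : N_EVENTS;

    for (size_t i = 0; i < n; i++)
        params[i] = d->events[i];
    return n;
}

/* "Set" side: toggle the listed events. Unknown names are rejected
 * before any state changes. Returns 0 on success, -1 on error. */
int set_perf_events(domain_perf *d, const perf_param *params, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (find_event(d, params[i].name) < 0)
            return -1;
    for (size_t i = 0; i < n; i++)
        d->events[find_event(d, params[i].name)].enabled = params[i].enabled;
    return 0;
}

/* Stats side: report only the events that were previously enabled.
 * Returns the number of events reported, or -1 if buf is too small. */
int report_perf_stats(const domain_perf *d, char *buf, size_t len)
{
    size_t used = 0;
    int reported = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    for (size_t i = 0; i < N_EVENTS; i++) {
        if (!d->events[i].enabled)
            continue;
        int n = snprintf(buf + used, len - used, "perf.%s=%llu\n",
                         d->events[i].name, d->values[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
        reported++;
    }
    return reported;
}
```

The point of the model is the default: a freshly initialised domain reports
nothing, and an application has to opt in per event before the stats call
returns any perf data for that domain.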

Daniel, thanks for your nice feedback. I will re-implement my patch according to your comments.

