[libvirt] [PATCH 2/3] Qemu: add CMT support
qiaowei.ren at intel.com
Fri Oct 30 07:59:30 UTC 2015
> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange at redhat.com]
> Sent: Thursday, October 29, 2015 10:56 PM
> To: Ren, Qiaowei
> Cc: libvir-list at redhat.com; Feng, Shaohe
> Subject: Re: [PATCH 2/3] Qemu: add CMT support
> On Thu, Oct 29, 2015 at 02:02:29PM +0800, Qiaowei Ren wrote:
> > One RFC in
> > https://www.redhat.com/archives/libvir-list/2015-June/msg01509.html
> > CMT (Cache Monitoring Technology) can be used to measure the usage of
> > cache by VM running on the host. This patch will extend the bulk stats
> > API (virDomainListGetStats) to add this field. Applications based on
> > libvirt can use this API to achieve cache usage of VM. Because CMT
> > implementation in Linux kernel is based on perf mechanism, this patch
> > will enable perf event for CMT when VM is created and disable it when
> > VM is destroyed.
> > Signed-off-by: Qiaowei Ren <qiaowei.ren at intel.com>
> Thanks for re-sending this patchset, it has reminded me of the concerns /
> questions I had around this previously.
> Just ignoring the code for a minute, IIUC the design is
> - Open a file handle to the kernel perf system for each running VM
> - Associate that perf event file handle with the QEMU VM PID
> - Enable recording of the CMT perf event on that file handle
> - Report the CMT event values in the virDomainGetStats() API
> call when VIR_DOMAIN_STATS_CACHE is requested
> My two primary concerns are
> 1. Do we want to have a perf event FD open for every running
> VM all the time.
> 2. Is the virDomainGetStats() integration the right API approach
> For item 1, my concern is that the CMT event is only ever going to be consumed
> by OpenStack, and even then, only OpenStack installs which have the schedular
> plugin that cares about the CMT event data. It feels undesirable to have this perf
> system enabled for all libvirt VMs, when perhaps < 1 % of libvirt users actually
> want this data. It feels like we need some mechanism to decide when this event
> is enabled
> For item 2, my concern is first when virDomainGetStats is the right API. I think it
> probably *is* the right API, since I can't think of a better way.
> Should we however, be having a very special case VIR_DOMAIN_STATS_CACHE
> group, or should we have something more generic.
> For example, if I run 'perf event' I see
> List of pre-defined events (to be used in -e):
> branch-instructions OR branches [Hardware event]
> branch-misses [Hardware event]
> bus-cycles [Hardware event]
> cache-misses [Hardware event]
> cache-references [Hardware event]
> cpu-cycles OR cycles [Hardware event]
> instructions [Hardware event]
> ref-cycles [Hardware event]
> stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
> alignment-faults [Software event]
> context-switches OR cs [Software event]
> cpu-clock [Software event]
> cpu-migrations OR migrations [Software event]
> dummy [Software event]
> emulation-faults [Software event]
> major-faults [Software event]
> minor-faults [Software event]
> ...any many many more...
> Does it make sense to extend the virDomainStats API to *only* deal with
> reporting of 1 specific perf event that you care about right now. It feels like it
> might be better if we did something more general purpose.
> eg what if something wants to get 'major-faults' data in future ?
> So we add a VIR_DOMAIN_STATS_MAJOR_FAULT enum item, etc.
> Combining these two concerns, I think we might need 2 things
> - A new API to turn on/off collection of specific perf events
> This could be something like
> virDomainGetPerfEvents(virDOmainPtr dom,
> virTypedParameter params);
> This would fill virTypedParameters with one entry for each perf event, using the
> VIR_TYPED_PARAM_BOOLEAN type. A 'true'
> value would indicate that event is enabled for the VM. A corresponding
> virDomainSetPerfEvents(virDOmainPtr dom,
> virTypedParameter params);
> would enable you to toggle the flag, to enable/disable the particular list of perf
> events you care about.
> With that, we could have a 'VIR_DOMAIN_STATS_PERF_EVENT' enum item for
> virDomainStats which causes reporting of all previously enabled perf events
> This would avoid us needing to have the perf event enabled for all VMs all the
> time. Only applications using libvirt which actually need the data would turn it on.
> It would also be now scalable to all types of perf event, instead of just one
> specific event
Daniel, thanks for your nice feedback. I will re-implement my patch according to your comments.
More information about the libvir-list