[libvirt] Overhead for a default cpu cg placement scheme

Daniel P. Berrange berrange at redhat.com
Thu Jun 18 09:09:55 UTC 2015

On Wed, Jun 17, 2015 at 10:55:35PM +0300, Andrey Korolyov wrote:
> Sorry for a delay, the 'perf numa numa-mem -p 8 -t 2 -P 384 -C 0 -M 0
> -s 200 -zZq --thp 1 --no-data_rand_walk' exposes a difference of value
> 0.96 by 1. The trick I did (and successfully forget) before is in
> setting the value of the cfs_quota in a machine wide group, up one
> level from individual vcpus.
> Right now, libvirt sets values from
> <cputune>
> <period>100000</period>
> <quota>200000</quota>
> </cputune>
> for each vCPU thread cgroup, which is a bit wrong to my understanding, like
> /cgroup/cpu/machine/vmxx/vcpu0: period=100000, quota=200000
> /cgroup/cpu/machine/vmxx/vcpu1: period=100000, quota=200000
> /cgroup/cpu/machine/vmxx/vcpu2: period=100000, quota=200000
> /cgroup/cpu/machine/vmxx/vcpu3: period=100000, quota=200000
> In other words, the user (me) assumed that he had limited total
> consumption of the VM to two cores, though each thread can consume
> up to a single CPU, resulting in a four-core consumption instead.
> With different vCPU count/quota/host CPU count ratios there would be
> different practical limits for the same period-to-quota ratio, whereas
> a single total quota results in a much more predictable maximum
> consumption. I had put the same quota-to-period ratio in the VM-level
> directory to meet the expectations from the config setting, and there
> one can observe the mentioned performance drop.
> With the default placement there is no difference in the performance
> numbers, but the behavior of libvirt itself is somewhat controversial
> here. The documentation says that this is the right behavior as well,
> but I think that limiting the vCPU group with a total quota is far
> more flexible than per-vCPU limits, which can negatively impact
> single-threaded processes in the guest; moreover, the overall
> consumption must be recalculated every time the host core count or
> guest core count changes. Sorry for not mentioning the custom scheme
> before; if my assumption about execution flexibility is plainly wrong,
> I'll withdraw my concerns from above. I have been using 'my' scheme in
> production for a couple of years and it has proved (for me) to be far
> less complex for workload balancing on a CPU-congested hypervisor than
> the generic one.
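The effect described in the quoted report can be sketched with a little arithmetic. The helper below is a minimal illustration of my own (not part of libvirt or the kernel): a single thread can never occupy more than one host CPU at a time, so a per-vCPU quota larger than the period imposes no real cap on that thread, and the per-thread ceilings simply add up across vCPUs.

```python
def effective_cap_cpus(period_us, quota_us, nvcpus, per_vcpu=True):
    """Upper bound on host CPUs a guest can burn under CFS bandwidth limits.

    per_vcpu=True models libvirt's current placement (one quota per vCPU
    thread cgroup); per_vcpu=False models a single quota in a parent
    cgroup shared by all vCPU threads.
    """
    ratio = quota_us / period_us
    if per_vcpu:
        # Each vCPU thread is capped independently, but a thread can
        # also never exceed one host CPU.
        return nvcpus * min(ratio, 1.0)
    # One quota shared by all vCPU threads in the VM-level cgroup.
    return min(ratio, float(nvcpus))

# period=100000, quota=200000 applied per vCPU on a 4-vCPU guest:
print(effective_cap_cpus(100000, 200000, 4, per_vcpu=True))   # 4.0 CPUs
# The same values applied once at the VM level:
print(effective_cap_cpus(100000, 200000, 4, per_vcpu=False))  # 2.0 CPUs
```

This matches the complaint above: the same period/quota pair limits the guest to two cores when applied at the VM level, but allows four cores' worth of consumption when applied per vCPU thread.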

As you say, there were two possible directions libvirt could take
when implementing the scheduler tunables: either apply them to the
VM as a whole, or apply them to the individual vCPUs. We debated this
a fair bit, but in the end we took the per-vCPU approach, for two
compelling reasons. First, if users have two guests with identical
configurations, but give one guest 2 vCPUs and the other 4 vCPUs,
the general expectation is that the one with 4 vCPUs will have twice
the performance. If we applied the CFS tuning at the VM level, then
adding vCPUs would yield no increase in performance. Second, people
wanted to be able to control the performance of the emulator threads
separately from the vCPU threads. Now we also have dedicated I/O
threads that can have different tuning set. This would be impossible
if we always set everything at the VM level.

It would in theory be possible for us to add a further tunable to the
VM config which allowed VM-level tuning, e.g. we could define something
like:
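(A hypothetical sketch only: the <vmtune> element name and structure below mirror the existing <cputune> syntax and are not an implemented libvirt schema.)

```xml
<vmtune>
  <period>100000</period>
  <quota>200000</quota>
</vmtune>
```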


Semantically, if <vmtune> were set, we would then forbid use of the
<cputune> and <emulatortune> configurations, as they'd be mutually
exclusive. In such a case we'd avoid creating the sub-cgroups for
vCPUs, emulator threads, etc.

The question is whether the benefit would outweigh the extra code
complexity to deal with this. I appreciate that you would desire this
kind of setup, but I think we'd probably need more than one person
requesting it in order to justify the work.
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
