[libvirt] Suboptimal default cpu Cgroup

Radim Krčmář rkrcmar at redhat.com
Thu Aug 14 18:35:37 UTC 2014


2014-08-14 13:55-0400, Andrew Theurer:
> 
> 
> ----- Original Message -----
> > From: "Radim Krčmář" <rkrcmar at redhat.com>
> > To: libvir-list at redhat.com
> > Cc: "Daniel P. Berrange" <berrange at redhat.com>, "Andrew Theurer" <atheurer at redhat.com>
> > Sent: Thursday, August 14, 2014 9:25:05 AM
> > Subject: Suboptimal default cpu Cgroup
> > 
> > Hello,
> > 
> > by default, libvirt with KVM creates a Cgroup hierarchy in 'cpu,cpuacct'
> > [1], with 'shares' set to 1024 on every level.  This raises two points:
> > 
> > 1) Every VM is given an equal amount of CPU time. [2]
> >    ($CG/machine.slice/*/shares = 1024)
> > 
> >    Which means that smaller / less loaded guests are given an advantage.
> > 
> > 2) All VMs combined are given 1024 shares. [3]
> >    ($CG/machine.slice/shares)
> > 
> >    This is made even worse on RHEL7, by sched_autogroup_enabled = 0, so
> >    every other process in the system is given the same amount of CPU as
> >    all VMs combined.
> > 
> > It does not seem to be possible to tune shares and get a good general
> > behavior, so the best solution I can see is to disable the cpu cgroup
> > and let users do it when needed.  (Keeping all tasks in $CG/tasks.)
> 
> Could we have each VM's shares be nr_vcpu * 1024, and the shares for
> $CG/machine.slice be the sum of all VMs' shares?

That would be unfair in a different way ... some examples:

VM's shares = nr_vcpu * 1024:
- With a 1-VCPU and a 10-VCPU guest, each running only one task on an
  overcommitted host, the larger guest gets 10 times more CPU.  (Feature?)

$CG/machine.slice = sum (VM's shares):
- 'shares' is capped at 262144 right now, so it wouldn't scale beyond
  one large guest.  (Not a big problem, but has ugly solutions.)
- Default system tasks still have 1024, so their share would become
  unfairly small if we had some idle guests as well.

  Take a 10-CPU machine with ten 10-VCPU guests, only one of which is
  actively running: a non-VM task would get just ~1% of the CPU, not the
  ~10% we would expect with 11 running tasks.
  And it would be even worse with autogrouping.
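
The numbers above can be checked with a rough sketch.  It assumes an
idealised model in which runnable siblings split CPU time strictly in
proportion to their 'shares' and SMP load balancing is ignored; the
helper name fraction() and the constants are only illustrative, not
anything libvirt or the kernel provides.

```python
# Idealised model: runnable siblings in one cgroup split CPU time in
# proportion to their 'shares'. SMP load balancing is ignored.
SHARES_PER_VCPU = 1024   # per-VCPU weight from the proposal
DEFAULT_SHARES = 1024    # weight of an ordinary task outside machine.slice
MAX_SHARES = 262144      # current upper bound on 'shares' quoted above


def fraction(own, *others):
    """CPU fraction for an entity with weight `own` against runnable siblings."""
    return own / (own + sum(others))


# Per-VM shares = nr_vcpu * 1024: 1-VCPU vs 10-VCPU guest, one task each.
small, large = 1 * SHARES_PER_VCPU, 10 * SHARES_PER_VCPU
ratio = fraction(large, small) / fraction(small, large)

# machine.slice = sum of VM shares: ten 10-VCPU guests, one of them busy.
slice_shares = 10 * 10 * SHARES_PER_VCPU
task_with_sum = fraction(DEFAULT_SHARES, slice_shares)
# What 11 equal runnable tasks (10 VCPUs + 1 host task) would give each task.
task_expected = 1 / 11

# Total number of VCPUs the summed value can represent before hitting the cap.
max_vcpus = MAX_SHARES // SHARES_PER_VCPU

print(f"large/small guest CPU ratio: {ratio:.0f}x")
print(f"non-VM task, summed slice:   {task_with_sum:.1%}")
print(f"non-VM task, per-task fair:  {task_expected:.1%}")
print(f"VCPUs representable at cap:  {max_vcpus}")
```

Under these assumptions the idle guests still count towards
machine.slice, which is what pushes the non-VM task from roughly 1/11
of the CPU down to about 1%.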

---
> [...]
> > 2: To reproduce, run two guests with > 1 VCPU and execute two spinners
> >    on the first and one on the second.
> >    The result will be 50%/50% CPU assignment between guests;  66%/33%
> >    seems more natural, but it could still be considered as a feature.

(Please note a mistake here: the example silently assumes a host with
 1-2 CPUs.  It would have been better to state it in terms of nr_cpus
 as well ...)
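
A small sketch of why the host CPU count matters for that reproducer.
It assumes an idealised allocator that hands out nr_cpus in proportion
to 'shares', never giving a guest more CPUs than it has runnable
spinners; allocate() and the guest names are hypothetical, and the real
scheduler balances per CPU rather than globally.

```python
def allocate(nr_cpus, groups):
    """Split nr_cpus among groups by shares, capped at runnable tasks.

    groups maps a name to (shares, runnable_tasks); returns name -> CPUs.
    """
    alloc = {}
    remaining = dict(groups)
    capacity = float(nr_cpus)
    while remaining:
        total = sum(shares for shares, _ in remaining.values())
        # Groups whose proportional entitlement already covers their demand.
        capped = {name: demand
                  for name, (shares, demand) in remaining.items()
                  if capacity * shares / total >= demand}
        if not capped:
            # Everyone left is contending: split what is left by shares.
            for name, (shares, _) in remaining.items():
                alloc[name] = capacity * shares / total
            break
        for name, demand in capped.items():
            alloc[name] = float(demand)
            capacity -= demand
            del remaining[name]
    return alloc


# Two guests with equal shares: two spinners in the first, one in the second.
guests = {"guest1": (1024, 2), "guest2": (1024, 1)}
for nr_cpus in (1, 2, 3, 4):
    result = allocate(nr_cpus, guests)
    used = sum(result.values())
    split = {name: f"{cpus / used:.0%}" for name, cpus in result.items()}
    print(nr_cpus, split)
```

In this model the 50%/50% split only shows up with 1 or 2 host CPUs;
from 3 CPUs on every spinner gets its own CPU and the split is 2:1
regardless of 'shares'.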



