[libvirt] Suboptimal default cpu Cgroup
Radim Krčmář
rkrcmar at redhat.com
Fri Aug 15 14:08:34 UTC 2014
2014-08-15 10:50+0200, Martin Kletzander:
> On Thu, Aug 14, 2014 at 04:25:05PM +0200, Radim Krčmář wrote:
> >Hello,
> >
> >by default, libvirt with KVM creates a Cgroup hierarchy in 'cpu,cpuacct'
> >[1], with 'shares' set to 1024 on every level. This raises two points:
> >
> >1) Every VM is given an equal amount of CPU time. [2]
> > ($CG/machine.slice/*/shares = 1024)
> >
> > Which means that smaller / less loaded guests are given an advantage.
> >
>
> This is a default with which we do nothing unless the user (or mgmt
> app) wants to.
(I'd argue that the default is to do nothing at all ;)
> What you say is true only when there is no spare time
> (the machines need more time than available). Such overcommit is the
> problem of the user, I'd say.
I don't like that it breaks the assumption that a VCPU behaves like a task.
(Complicated systems are hard to operate without consistency, and our
behavior really punishes users who don't read everything.)
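
(To illustrate point 1, here is a minimal sketch of the arithmetic. It
assumes an idealized, fully contended host where CPU time is split
strictly in proportion to shares and every VCPU is busy; real CFS
behavior is per-CPU and depends on load balancing. The function name
and the VM names are hypothetical, chosen only for this example.)

```python
from fractions import Fraction


def per_vcpu_fraction(vms, shares=1024):
    """Return the fraction of contended CPU time each VCPU receives.

    vms maps a (hypothetical) VM name to its number of busy VCPUs.
    Every VM cgroup is assumed to have the same 'shares' value, as in
    the default described in this thread.
    """
    total_shares = shares * len(vms)
    result = {}
    for name, vcpus in vms.items():
        vm_fraction = Fraction(shares, total_shares)  # equal per VM
        result[name] = vm_fraction / vcpus            # split among VCPUs
    return result


if __name__ == "__main__":
    for name, frac in per_vcpu_fraction({"small": 1, "big": 8}).items():
        print(f"{name}: each VCPU gets {frac} of the contended CPU time")
```

(Under these assumptions a VCPU of the 1-VCPU guest gets eight times
more than a VCPU of the 8-VCPU guest, which is not what you'd expect if
VCPUs were scheduled as plain tasks.)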
> >2) All VMs combined are given 1024 shares. [3]
> > ($CG/machine.slice/shares)
> >
>
> This is a problem even on systems without slices (systemd), because
> there is /machine/cpu.shares == 1024 anyway.
(Thanks, I hadn't noticed this because of my professionally deformed
userspace choices.)
> Is there a way to
> disable hierarchy in this case (to say cpu.shares=-1 for example)?
Apart from the obvious "don't create what you don't want", probably not:
cpu.shares is clamped to the range between 2 and 2^18.
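
(A tiny sketch of what that clamping implies, using only the two bounds
mentioned above. The function and constant names are made up for
illustration and this is not the kernel's code; the point is just that
no in-range value can mean "this level does not exist".)

```python
MIN_SHARES = 2        # lower bound mentioned above
MAX_SHARES = 1 << 18  # upper bound mentioned above (2^18)


def clamp_shares(requested):
    """Clamp a requested shares value into [MIN_SHARES, MAX_SHARES]."""
    return max(MIN_SHARES, min(MAX_SHARES, requested))


if __name__ == "__main__":
    for value in (0, 1024, 10**9):
        print(value, "->", clamp_shares(value))
```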
> Because if not, then it has only limited use (we cannot prepare the
> hierarchy and just write a number in some file when we want to start
> using it). That's a pity, but there are probably fewer use cases than
> the hundreds of lines of code that would need to be changed in order to
> support this in the kernel.
And the hierarchy imposes a performance penalty as well, so the developers
probably never expected that we'd create useless cgroups.
(The cost should be proportional to the hierarchy's depth, so having
{emulator,vcpu*} by default is counterproductive as well.)
Creating the hierarchy on demand is not much harder than writing a
value, especially if we do it through libvirt anyway.
A version of your proposal would extend cgroups with something like
categorization: we could add an "effective control group" variable that
allows scheduler code to start at a point higher in the hierarchy.
Libvirt could continue doing what it does now and performance would
improve without creating too many special cases.
I can already see the flame war on LKML.
> > This is made even worse on RHEL7, by sched_autogroup_enabled = 0, so
> > every other process in the system is given the same amount of CPU as
> > all VMs combined.
> >
>
> But sched_autogroup_enabled = 1 wouldn't make it much better, since it
> would group the machines together anyway, right?
Yes, it would be just a bit better for VMs, because other processes
would be grouped as well.
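
(A rough sketch of point 2 without autogroup, under the same idealized
assumptions as before: one fully contended CPU, strictly proportional
sharing, and every competing task at nice 0, which corresponds to a
weight of 1024. The function name is hypothetical.)

```python
from fractions import Fraction


def vm_total_fraction(other_tasks, slice_shares=1024, task_weight=1024):
    """Fraction of contended CPU time that all VMs together receive.

    The whole VM hierarchy competes as a single entity with
    'slice_shares' against 'other_tasks' busy tasks living directly in
    the root cgroup, each weighing 'task_weight'.
    """
    total = slice_shares + other_tasks * task_weight
    return Fraction(slice_shares, total)


if __name__ == "__main__":
    for n in (1, 3, 9):
        print(f"{n} busy host task(s): all VMs combined get "
              f"{vm_total_fraction(n)}")
```

(So a single busy process on the host is enough to take half of the
contended CPU time from all guests combined, no matter how many guests
there are.)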
> >It does not seem to be possible to tune shares and get a good general
> >behavior, so the best solution I can see is to disable the cpu cgroup
> >and let users do it when needed. (Keeping all tasks in $CG/tasks.)
> >
>
> I agree with you that it's not the best default scenario we can do,
> and maybe not using cgroups until needed would bring us a good
> benefit. That is for cgroups like cpu and blkio only, I think.
I haven't delved into the other cgroups much, but it is a good question
whether we want them :)
Does $feature do something useful on top of complicating things?