[libvirt] [PATCH] cgroup: Drop resource partition from virSystemdMakeScopeName

Daniel P. Berrange berrange at redhat.com
Tue Jul 21 15:12:09 UTC 2015


On Mon, Jul 20, 2015 at 11:10:18AM +0200, Peter Krempa wrote:
> On Fri, Jul 17, 2015 at 15:40:31 +0100, Daniel Berrange wrote:
> > On Thu, Jul 16, 2015 at 04:18:08PM +0200, Peter Krempa wrote:
> > > The scope name, even according to our docs is
> > > "machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the
> > > resource partition name instead of "machine-" if it was specified thus
> > > creating invalid scope paths.
> > > 
> > > This makes libvirt drop cgroups for a VM that uses custom resource
> > > partition upon reconnecting since the detected scope name would not
> > > match the expected name generated by virSystemdMakeScopeName.
> > > 
> > > The error is exposed by the following log entry:
> > > 
> > > debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'
> > > 
> > > for a "/machine/test" resource and "testvm" vm.
> > 
> > This doesn't look right - the whole idea of the resource partition
> > name is that it overrides the default /machine based path. So
> > ignoring the resource partition is not right, unless I'm mis-
> > understanding what this patch is doing.
> 
> Perhaps I wasn't clear enough in the explanation. The code path that
> calls the function where I've removed the partition is called on daemon
> restart, where the paths to cgroups are re-detected upon startup.
> 
> At first virCgroupNewDetect() is called on a VM's pid and the function
> detects cgroup paths for the given process.
> 
> The detected data for a VM with
> <partition>/machine/asdf/test</partiotion> look like:
> (gdb) p **group
> $2 = {path = 0x7fff94000ec0 "",
>       controllers = {{type = 0,
>                       mountPoint = 0x7fff94001330 "/sys/fs/cgroup/cpu,cpuacct", 
>                       linkPoint = 0x7fff94001470 "/sys/fs/cgroup/cpu",
>                       placement = 0x7fff94001fb0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"},
>                      {type = 0,
>                       mountPoint = 0x7fff94001200 "/sys/fs/cgroup/cpu,cpuacct",
>                       linkPoint = 0x7fff94001450 "/sys/fs/cgroup/cpuacct", 
>                       placement = 0x7fff94002020 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"},
>                       {type = 0,
>                        mountPoint = 0x7fff94001410 "/sys/fs/cgroup/cpuset",
>                        linkPoint = 0x0,
>                        placement = 0x7fff94001ad0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"},
> 
>         ... and so on for other cgroups.
> 
> After the detection virCgroupValidateMachineGroup() is called on the
> detected cgroup array to validate them with the following algorithm:
> 
> FOREACH entry IN 'cgroup' array:
>     Find the last directory name in entry.placement (strrchr(..., '/')).
> 
>     IF the cgroup is one of CPU* cgroups AND
>        the last directory component is '/emulator':
> 
>        find next previous directory component and strip '/emuator' from it
> 
> 
>     # This equals to 'machine-qemu\\x2df20live2.scope' in the example above.
> 
>     # validate the name
> 
>     IF directory name IS EQUAL TO 'MACHINE_NAME' OR
>                                   'MACHINE_NAME.libvirt-DRIVER_NAME' OR
>                                   virSystemdMakeScopeName(...):
>         cgroup is valid
> 
> 
> Since the algorithm does not extract the partition name or validate it
> in any way in non-systemd scenarios and in scenarios where systemd is
> used the generated name wouldn't ever match since the scope name as
> extracted in the code does not include the slice generated from
> <partition> the cgroups code does not work on systemd machines after
> libvirt restart.
> 
> The patch as-is removes the partition name from the code that generates
> the systemd SCOPE name since that does not contain it. The partition is
> used to generate SLICE name which is contained in the cgroup path before
> the SCOPE.

Ok, I understand what's going on now and reproduced the problem locally
too. So ACK to your proposed patch.

> The question that comes into my mind is whether we should actually make
> sure that the <partition> corresponds to the SLICE in the detected path,
> or just matching the machine SCOPE is good enough in this case.

I'm not sure it matters much - as long as we've detected the cgroup tree
we'll do the right thing cleaning it up on guest shutdown, even if the
<partition> in XML is different.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list