[libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo

Serge Hallyn serge.hallyn at ubuntu.com
Mon Sep 7 15:55:57 UTC 2015


Quoting Daniel P. Berrange (berrange at redhat.com):
> On Mon, Sep 07, 2015 at 03:39:13PM +0000, Serge Hallyn wrote:
> > Quoting Daniel P. Berrange (berrange at redhat.com):
> > > On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> > > > We already have a fuse mount to reflect the cgroup memory restrictions
> > > > in the container. This commit adds the same for the number of available
> > > > CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> > > > container's cpuinfo.
> > > 
> > > So this (re-)raises some interesting / difficult questions that I'm
> > > not sure we have a good answer to.
> > > 
> > > The main concern is that actually this is not really a problem specific
> > > to containers, rather it is related to cgroup resource confinement.
> > > ie the cgroup has confined a process(es) to a set of CPUs and the process
> > > is using /proc/cpuinfo to count CPUs and so is wrong. Cgroups are being
> > > increasingly widely used in Linux, particularly since systemd, so pretty
> > > much any process has to expect that it can be confined to a subset of
> > > CPUs.
> > > 
> > > IOW, any application using /proc/cpuinfo to determine "available"
> > > resource is already broken, even when run on bare metal. The same also
> > > applies to the use of /proc/meminfo, which we previously faked via
> > > fuse.
> > > 
> > > So the question is whether we should invest time trying to fake the
> > > /proc/cpuinfo in containers, when any apps we'd be fixing are already
> > > broken in bare metal. Apps might have avoided /proc/cpuinfo and instead
> > > be trying /sys/devices/system/cpu/ which your patch isn't trying to
> > > fake. This is just as broken, because sysfs doesn't reflect cgroup
> > > confinement either.
> > > 
> > > I think what is ultimately needed for applications is some kind of
> > > libresource.so library that they can use to query what resources
> > 
> > Does anyone remember who it was that announced an effort to this
> > end a year or two ago, or know what the status of it is?
> 
> I don't recall seeing any formal announcement about this, but I have
> had this exact same discussion with Red Hat folks involved with
> Docker and similar higher level container mgmt tools, so perhaps
> someone involved in those efforts is working on it ?

Ah, my memory was failing me, so it took a bit of searching, but here it is:

http://fabiokung.com/2014/03/13/memory-inside-linux-containers/

I can't find anything called 'libmymem', and in 2014 he said

https://github.com/docker/docker/issues/8427#issuecomment-58255159

so maybe this never went anywhere.

For the same reasons you cited above, and because everyone is rolling
their own at the fuse level, I still think that a libresource, plus
patches to the proc tools to use it, is the right way to go.  We have no
shortage of sample code for the functions doing the actual work, between
libvirt, lxc, docker, etc :)

Should we just go ahead and start a libresource github project?
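
To make the problem concrete, here is a rough sketch (Python, Linux
only) of the difference between what a naive application counts in
/proc/cpuinfo and what it is actually allowed to run on.  The function
names are made up for illustration and are not an API of libvirt, lxc,
or any existing libresource; a real library would also have to handle
CPU quotas, memory limits, and so on, which this ignores.

```python
import os


def count_cpuinfo_processors(cpuinfo_text):
    """Count 'processor' entries the way a naive application would.

    This reflects the CPUs the kernel knows about, not the CPUs the
    calling process is confined to.
    """
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def parse_cpu_list(spec):
    """Parse a kernel-style CPU list such as '0-2,5' into a set of ints."""
    cpus = set()
    for part in spec.strip().split(","):
        if not part:
            continue  # empty list, e.g. an empty cpuset file
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def available_cpus():
    """Hypothetical helper: CPUs the current process may be scheduled on.

    Uses the scheduler affinity mask (sched_getaffinity), which is the
    same kind of information the patch gets from virProcessGetAffinity.
    """
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("listed in /proc/cpuinfo:", count_cpuinfo_processors(f.read()))
    print("usable by this process: ", len(available_cpus()))
```

Run under something like "taskset -c 0 python3 sketch.py" on a multi-CPU
host, the two numbers should differ, with no container involved at all.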



