[libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo

Cedric Bosdonnat cbosdonnat at suse.com
Mon Sep 7 13:21:12 UTC 2015


On Mon, 2015-09-07 at 13:23 +0100, Daniel P. Berrange wrote:
> On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> > We already have a fuse mount to reflect the cgroup memory restrictions
> > in the container. This commit adds the same for the number of available
> > CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> > container's cpuinfo.
> 
> So this (re-)raises some interesting / difficult questions that I'm
> not sure we have a good answer to.
> 
> The main concern is that actually this is not really a problem specific
> to containers, rather it is related to cgroup resource confinement.
> ie the cgroup has confined a process(es) to a set of CPUs and the process
> is using /proc/cpuinfo to count CPUs and so is wrong. Cgroups are being
> increasingly widely used in Linux, particularly since systemd, so pretty
> much any process has to expect that it can be confined to a subset of
> CPUs.

I agree.
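
To make the mismatch concrete, here is a minimal sketch (not part of the
patch) comparing the two counts. The function names are made up for
illustration, the parser assumes the x86-style "processor : N" lines, and
os.sched_getaffinity() is only available on Linux.

```python
import os


def count_cpuinfo_processors(cpuinfo_text):
    """Count 'processor : N' lines, the way a naive application might."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _ = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def count_usable_cpus():
    """Count the CPUs the calling process is actually allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity (assumption: no
    # confinement information is available there).
    return os.cpu_count() or 1
```

On a confined process, count_cpuinfo_processors(open("/proc/cpuinfo").read())
can be larger than count_usable_cpus(), which is the case discussed here.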

> IOW, any application using /proc/cpuinfo to determine "available"
> resource is already broken, even when run on bare metal. The same also
> applies to the use of /proc/meminfo, which we previously faked via
> fuse.
> 
> So the question is whether we should invest time trying to fake the
> /proc/cpuinfo in containers, when any apps we'd be fixing are already
> broken in bare metal. Apps might have avoided /proc/cpuinfo and instead
> be trying /sys/devices/system/cpu/ which your patch isn't trying to
> fake. This is just as broken, because sysfs doesn't reflect cgroup
> confinement either.

I agree /sys/devices/system/cpu should be patched too... but it contains
much more subtle things to handle. At least I don't have a good enough
knowledge of that FS to fake it properly.
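
As a small illustration of just one piece of that tree: files such as
/sys/devices/system/cpu/online hold a CPU list like "0-3,8,10-11". The
sketch below only expands that list format; parse_cpu_list is a
hypothetical helper name, and it does nothing about the per-CPU
directories, topology or cache entries that would also need faking.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. an empty file
        first, sep, last = part.partition("-")
        if sep:
            # Inclusive range "first-last"
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(first))
    return sorted(cpus)
```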

> I think what is ultimately needed for applications is some kind of
> libresource.so library that they can use to query what resources
> are available in their compute environment, which can intelligently
> query cgroups directly, and ignore the legacy /proc & /sys interfaces
> for counting memory / cpu availability. I don't think that's something
> that libvirt should solve - if anything it could be systemd, or a
> standalone project.

OK, so that won't be available in a reasonable time frame unless we
start it ourselves. Do you know whether anyone in another project is
working on that problem?
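
Just to sketch what the core of such a library might compute (purely
hypothetical, no such API exists as far as I know): take the affinity set
and cap it by the cgroup CPU bandwidth limit. The values are assumed to
come from the cgroup v1 cpu controller files cpu.cfs_quota_us and
cpu.cfs_period_us, where a quota of -1 means no limit; locating the right
cgroup directory is left out.

```python
import math


def effective_cpu_count(affinity_cpus, quota_us, period_us):
    """Estimate how many CPUs a process can really use.

    affinity_cpus: set of CPU ids the process may run on.
    quota_us, period_us: CFS bandwidth settings; quota_us <= 0 is
    treated as "no limit".
    """
    limit = len(affinity_cpus)
    if quota_us > 0 and period_us > 0:
        # A quota of 150000us per 100000us period is worth 1.5 CPUs;
        # round up so a fractional share still counts as a usable CPU.
        limit = min(limit, math.ceil(quota_us / period_us))
    return max(1, limit)
```

For example, four allowed CPUs with a 150000/100000 quota give 2, and the
same set with a quota of -1 gives 4.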

> So I'm increasingly convinced that LXC should not try to fake out
> any /proc & /sys file content, and instead document the limitations.
> I'm also thinking that we should kill off our existing meminfo fake
> fuse at some point.

OK.

> The more minor concern I have is around the implementation. AFAIR, the
> /proc/cpuinfo file content is not standardized across architectures,
> so I'm concerned whether your parsing code is robust on non-x86 arches.

Hum... I didn't even know that file differed between architectures.
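
For reference, the filtering idea boils down to something like the sketch
below. This is an illustration in Python, not the code from the patch, and
filter_cpuinfo is a made-up name. It assumes the x86-style layout: one
blank-line-separated block per CPU, each with a "processor : N" line. A
block without such a line is passed through untouched, so on an
architecture with a different layout nothing would be hidden.

```python
def filter_cpuinfo(cpuinfo_text, allowed_cpus):
    """Keep only the cpuinfo blocks whose processor id is in allowed_cpus."""
    kept = []
    for block in cpuinfo_text.strip().split("\n\n"):
        cpu_id = None
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() == "processor":
                try:
                    cpu_id = int(value.strip())
                except ValueError:
                    cpu_id = None  # unexpected format, do not filter
                break
        # Blocks with no recognizable processor id are kept as-is.
        if cpu_id is None or cpu_id in allowed_cpus:
            kept.append(block)
    return "\n\n".join(kept) + "\n"
```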

--
Cedric



