[libvirt-users] Can we disable write to /sys/fs/cgroup tree inside container ?

Daniel P. Berrange berrange at redhat.com
Wed Oct 18 18:09:30 UTC 2017


On Wed, Oct 18, 2017 at 09:00:17PM +0300, mxs kolo wrote:
>  Hi all
> 
> Each LXC container on the node has a tmpfs mounted for the cgroup tree:
> [root-inside-lxc@tst1 ~]# mount | grep cgroups
> cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup
> (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
> cgroup on /sys/fs/cgroup/cpuset type cgroup
> (rw,nosuid,nodev,noexec,relatime,cpuset)
> cgroup on /sys/fs/cgroup/memory type cgroup
> (rw,nosuid,nodev,noexec,relatime,memory)
> cgroup on /sys/fs/cgroup/devices type cgroup
> (rw,nosuid,nodev,noexec,relatime,devices)
> cgroup on /sys/fs/cgroup/freezer type cgroup
> (rw,nosuid,nodev,noexec,relatime,freezer)
> cgroup on /sys/fs/cgroup/blkio type cgroup
> (rw,nosuid,nodev,noexec,relatime,blkio)
> cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup
> (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
> cgroup on /sys/fs/cgroup/perf_event type cgroup
> (rw,nosuid,nodev,noexec,relatime,perf_event)
> cgroup on /sys/fs/cgroup/systemd type cgroup
> (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
> cgroup on /sys/fs/cgroup/hugetlb type cgroup
> (rw,nosuid,nodev,noexec,relatime,hugetlb)
> cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
> 
> This is the default, at least in my case.
> The problem is that it is the full cgroup tree: that of the hardware node
> and of all the other containers on the node.
> [root-inside-lxc@tst1 ~]# for i in `ls /sys/fs/cgroup/devices/machine.slice/machine-lxc*/devices.list`; do echo $i; cat $i; done
> /sys/fs/cgroup/devices/machine.slice/machine-lxc\x2d10297\x2dtst2.scope/devices.list
> c 1:3 rwm
> c 1:5 rwm
> c 1:7 rwm
> c 1:8 rwm
> c 1:9 rwm
> c 5:0 rwm
> c 5:2 rwm
> c 10:229 rwm
> b 253:6 rw
> c 136:* rwm
> /sys/fs/cgroup/devices/machine.slice/machine-lxc\x2d9951\x2dtst1.scope/devices.list
> c 1:3 rwm
> c 1:5 rwm
> c 1:7 rwm
> c 1:8 rwm
> c 1:9 rwm
> c 5:0 rwm
> c 5:2 rwm
> c 10:229 rwm
> b 253:7 rw
> c 136:* rwm
> 
> The hardware node's file, as seen from inside the tst1 container:
> [root-inside-lxc@tst1 ~]# cat /sys/fs/cgroup/devices/devices.list
> a *:* rwm
> 
> What is the best way to prevent viewing and editing of all cgroup
> structures except those belonging to the current LXC container (SELinux,
> AppArmor)?
> Why does libvirt mount /sys/fs/cgroup/* read-write inside the container?
> 
> We use kernel 3.10.0-693.2.2.el7.x86_64 with XFS, and therefore our
> containers are privileged. Yes, we know that in such containers root
> can at least use SysRq to reboot the hardware node. But problems with
> cgroups can be more hidden and cryptic.
> 
> P.S.
> As a short test shows, the root user can disable the zero device on the node:
> [root-lxc@tst1 ~]# echo "c 1:5 rwm" > /sys/fs/cgroup/devices/devices.deny
> or all devices in another container:
> [root-lxc@tst1 ~]# echo "a *:* rwm" > /sys/fs/cgroup/devices/machine.slice/machine-lxc\x2d10297\x2dtst2.scope/devices.deny

There are only two ways to make a container secure:

 - Use user namespaces
 - Apply SELinux policy to the container

If neither of those is used, we don't try to play games to hide things
like cgroups from root inside a container, as that's just security through
obscurity.
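
As a rough illustration of the first option (a sketch only, not something
libvirt ships): with user namespaces, root inside the container maps to an
unprivileged uid on the host, so writes into the host's cgroup files should
be refused by ordinary permission checks. One way to see from inside a
container whether a uid mapping is in effect is to read /proc/self/uid_map;
the initial namespace shows the single identity line "0 0 4294967295". The
function name uid_map_status and its output strings are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): reports whether a uid_map file
# describes the initial user namespace, i.e. no uid remapping is in effect.
uid_map_status() {
    # $1: path to a uid_map file; defaults to the current process's map.
    map_file="${1:-/proc/self/uid_map}"
    # Normalize whitespace so the comparison ignores column padding.
    mapping=$(awk '{ print $1, $2, $3 }' "$map_file")
    # The initial namespace has exactly one line: the full identity mapping.
    if [ "$mapping" = "0 0 4294967295" ]; then
        echo "initial-userns"
    else
        echo "userns-mapped"
    fi
}
```

Run with no argument inside a privileged container such as the ones
described above, uid_map_status would be expected to print
"initial-userns", meaning root in the container is also root on the host.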

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



