[libvirt] [RFC PATCH 2/2] LXC: Create ro overlay mounts only if we're not within a user namespace

Gao feng gaofeng at cn.fujitsu.com
Mon Jul 1 02:26:51 UTC 2013


On 06/28/2013 06:17 PM, Daniel P. Berrange wrote:
> On Thu, Jun 27, 2013 at 08:56:25AM +0800, Gao feng wrote:
>> On 06/26/2013 07:01 PM, Daniel P. Berrange wrote:
>>> On Wed, Jun 26, 2013 at 05:56:19PM +0800, Gao feng wrote:
>>>> On 06/26/2013 05:38 PM, Daniel P. Berrange wrote:
>>>>> On Wed, Jun 26, 2013 at 10:26:10AM +0800, Gao feng wrote:
>>>>>> On 06/26/2013 04:39 AM, Daniel P. Berrange wrote:
>>>>>>> On Thu, Jun 13, 2013 at 08:02:18PM +0200, Richard Weinberger wrote:
>>>>>>>> Within a user namespace, root can remount these filesystems rw at
>>>>>>>> any time.
>>>>>>>> Create these mappings only if we're not playing with user namespaces.
>>>>>>>
>>>>>>> This is a problem with the way we're initializing mounts in the
>>>>>>> user namespace. 
>>>>>>
>>>>>> This problem exists even if libvirt lxc doesn't support user namespaces.
>>>>>
>>>>> Yes, and this is a problem that user namespace is intended to
>>>>> solve.
>>>>>
>>>>>>> We need to ensure that the initial mounts set up
>>>>>>> by libvirt can't be changed by the admin inside the container. Preventing
>>>>>>> the container admin from remounting or unmounting these mounts is key
>>>>>>> to security.
>>>>>>>
>>>>>>> IIUC, the only way to ensure this is to start a new user namespace
>>>>>>> /after/ setting up all mounts.
>>>>>>>
>>>>>>
>>>>>> Starting a new user namespace means the container will lose control of
>>>>>> the mount namespace, so the container can't do mount operations either,
>>>>>> though only a few filesystems can be mounted in a non-init user namespace.
>>>>>
>>>>> Merely being able to unmount is sufficient to exploit the host. Consider
>>>>> that the container was configured with the following mapping
>>>>>
>>>>>   / -> /
>>>>>   /export/mycontainer/home -> /home
>>>>>
>>>>> Now, if the container admin can umount /home, then they can now
>>>>> see the home directory contents of the host. At least this is
>>>>> likely to be information leakage, and if any of the host home
>>>>> directories have UIDs that overlap with those assigned to the
>>>>> container ID map, you have a potentially exploitable situation.
>>>>>
>>>>> Hence we need to ensure that the container cannot unmount or
>>>>> remount anything set up by libvirt. AFAICT, this means that all
>>>>> the mounts libvirt does must be performed in a separate user
>>>>> namespace from the one the container will eventually run in.
>>>>>
>>>>
>>>> Libvirt mounts something for the container in one user namespace,
>>>> and then libvirt calls unshare to create a new user namespace and
>>>> start the init task of the container.
>>>>
>>>> Yes, the users in the container can't do mount/unmount/remount on any
>>>> of the filesystems. But they can call unshare to create a new mount
>>>> namespace, and they will have rights to mount/unmount/remount in this
>>>> newly created mount namespace. They can still umount /home to see the
>>>> home directory contents of the host.
>>>
>>> An existing filesystem mount can only be remounted/unmounted by the
>>> (user ID, user namespace) that originally mounted it. So even if you
>>> start a new mount namespace, you cannot unmount stuff set up by the
>>> parent user namespace.
>>>
>>
>> Please also set up the uid_map/gid_map for the unshared user namespace.
>> Even in a container, the user has the right to set up these two files.
>>
>>> # unshare --mount --user /bin/sh
>>> sh-4.2$ umount /sys/kernel/debug
>>> umount: /sys/kernel/debug: Invalid argument
>>>
>>
>> in terminal one
>> $ id
>> uid=1000(gaofeng) gid=1000(gaofeng) groups=1000(gaofeng)
>> $ ./unshare --mount --user /bin/sh
>> sh-4.2$ echo $$
>> 17110
>> sh-4.2$
>>
>> in another terminal, set up the id map for the new userns.
>> $ echo 0 1000 1 > /proc/17110/uid_map
>> $ echo 0 1000 1 > /proc/17110/gid_map
>>
>> and then in terminal one
>> sh-4.2$ umount -l /home/
> 
> Oh, hmm, I forgot about the uid mapping. I thought the capabilities would
> allow me to unmount regardless.
> 
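
For reference, the "0 1000 1" line written to uid_map above has three
fields: first ID inside the namespace, first ID in the parent namespace,
and the length of the range. The following is only a minimal sketch of
how such a line translates an ID; the helper names parse_id_map and
to_parent_id are hypothetical and are not part of libvirt or the kernel.

```python
from typing import List, Optional, Tuple

# One map line: (first ID inside the namespace,
#                first ID in the parent namespace,
#                number of consecutive IDs covered)
Extent = Tuple[int, int, int]


def parse_id_map(text: str) -> List[Extent]:
    """Parse text in the format written to /proc/<pid>/uid_map."""
    extents: List[Extent] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError("expected 3 fields, got: %r" % line)
        inside, outside, count = (int(f) for f in fields)
        extents.append((inside, outside, count))
    return extents


def to_parent_id(extents: List[Extent], inside_id: int) -> Optional[int]:
    """Translate an ID seen inside the namespace to the parent's ID.

    Returns None when no line of the map covers the ID.
    """
    for inside, outside, count in extents:
        if inside <= inside_id < inside + count:
            return outside + (inside_id - inside)
    return None


if __name__ == "__main__":
    id_map = parse_id_map("0 1000 1\n")
    print(to_parent_id(id_map, 0))
    print(to_parent_id(id_map, 1))
```

With that single line, root inside the new user namespace is uid 1000
(gaofeng) in the parent, and no other ID is mapped.
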
> Well, given that we're at rc2 now & I'm still unclear about how some
> aspects of the userns setup are working, I'm afraid we'll have to wait
> until 1.1.1 for the userns LXC code to merge.  I'll aim to do it next
> week, so that we have plenty of time for further testing before the
> 1.1.1 release.
> 

Ok, I think Richard has tested the userns support.
Hi Richard, can you give me your ack or tested-by?

Thanks!



