[libvirt] [RFC PATCH 2/2] LXC: Create ro overlay mounts only if we're not within a user namespace

Gao feng gaofeng at cn.fujitsu.com
Mon Jul 1 12:29:56 UTC 2013


On 07/01/2013 07:57 PM, Gao feng wrote:
> On 07/01/2013 07:05 PM, Richard Weinberger wrote:
>> Am 01.07.2013 12:33, schrieb Daniel P. Berrange:
>>> On Mon, Jul 01, 2013 at 08:29:14AM +0200, Richard Weinberger wrote:
>>>> Am 01.07.2013 04:26, schrieb Gao feng:
>>>>>> Well, given that we're at rc2 now & I'm still unclear about how some
>>>>>> aspects of the userns setup are working, I'm afraid we'll have to wait
>>>>>> until 1.1.1 for the userns LXC code to merge.  I'll aim to do it next
>>>>>> week, so that we have plenty of time for further testing before the
>>>>>> 1.1.1 release.
>>>>>>
>>>>>
>>>>> Ok, I think Richard has tested the userns support.
>>>>> Hi Richard, can you give me your ack or tested-by?
>>>>
>>>> I'm still facing one userns related issue.
>>>
>>> [snip]
>>>
>>>> After creating it, attach to its console; you'll find bash as pid 1.
>>>> And you'll find that /proc/1/ is not fully uid/gid-mapped:
>>>> ---cut---
>>>> # ls -la /proc/1/
>>>> total 0
>>>> dr-xr-xr-x  8 root   root    0 Jul  1 06:06 .
>>>> dr-xr-xr-x 74 nobody nogroup 0 Jul  1 06:06 ..
>>>> dr-xr-xr-x  2 root   root    0 Jul  1 06:06 attr
>>>
>>> [snip]
>>>
>>>> Any ideas what's going on here?
>>>
>>> No, it is very odd. It smells like a kernel issue to me. What
>>> version are you running ?
>>
>> I see this issue on all kernels.
>> Currently I'm using vanilla v3.9.x and v3.10.
>>
>>> I've also tried running the demo programs shown on the LWN.net
>>> article
>>>
>>>    https://lwn.net/Articles/532593/
>>>
>>> and they don't operate in the way described by the article - the demo
>>> programs continue to run as 'nfsnobody' even after the mappings are
>>> set up.
>>>
>>> I'm just using the Fedora 3.9.4-303 kernel, rebuilt with userns enabled
>>> in KConfig.  I'm wondering if there is still stuff missing in 3.9.x
>>> that prevents this from working properly, or if the kernel behaviour
>>> changed after those LWN articles were written.
>>
>> To me it looks like the capability system behaves oddly.
>> The mappings in /proc are fine as long as I do not call capng_updatev().
>> Calling capng_updatev() with parameters that do not change the current cap set
>> triggers the odd behavior too.
>>
> 
> This issue occurs after we call setuid: the init task of the container becomes un-dumpable
> after setuid. I don't know why, but the kernel sets the owner of /proc/<pid>/* to the root user
> of the host when the task is un-dumpable.
> 
>> So we see two (related?) issues:
>> 1. If we try updating the capabilities of pid 1, /proc/1/ has unmapped files until we exec().
>> 2. Dropping capabilities does not work; we always gain a fresh and full capability set.
>>
> 
> This problem disappeared after:
> 1. removing the capability dropping
> 2. calling prctl(PR_SET_DUMPABLE, 1) after setuid/gid.
> 
>> BTW: I'm sure the issues are not caused by Gao feng's userns patches.
> 
> I think this is more like a kernel bug. We should set the owner of /proc/<pid>/* to the root user
> of the container, not the host.


You can try the attached program; the owner of /proc/<pid of this program>/* is incorrect too.

Hmm, it's better to fix this problem in the kernel. It's most likely a userns bug.
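
Until then, the workaround mentioned above (calling prctl(PR_SET_DUMPABLE, 1)
after setuid/gid) could look roughly like the sketch below. This is only an
illustration, not the attached a.c and not the libvirt code; the helper names
get_dumpable() and switch_ids_keep_dumpable() are made up for this example.
As far as I understand prctl(2) and proc(5), the kernel may reset the dumpable
flag when the effective uid/gid changes, and /proc/<pid>/* is owned by
root:root whenever the flag is not 1.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the calling process's "dumpable" flag,
 * or -1 if prctl() fails. */
static int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/* Hypothetical helper: switch to the given gid/uid, then explicitly
 * re-enable the dumpable flag, since the kernel may have cleared it
 * during the credential change. Returns the resulting flag value
 * (1 on success) or -1 on error. */
static int switch_ids_keep_dumpable(uid_t uid, gid_t gid)
{
    /* Change the gid first: after dropping the uid we may no longer
     * be allowed to change the gid. */
    if (setgid(gid) < 0 || setuid(uid) < 0)
        return -1;

    /* Re-enable dumpable so /proc/<pid>/* follows the process's
     * effective uid/gid again instead of root:root. */
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) < 0)
        return -1;

    return get_dumpable();
}
```

Calling switch_ids_keep_dumpable(0, 0) in the container init right before
exec() would be one place to try it. Note that this only hides the symptom;
it also makes the task ptrace-able and core-dumpable again, which may not
always be wanted.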

Thanks
-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.c
Type: text/x-csrc
Size: 147 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20130701/26494ae1/attachment-0001.bin>

