[libvirt] [PATCHv2 3/3] util: make virSetUIDGID async-signal-safe

Eric Blake eblake at redhat.com
Thu Jul 11 21:50:52 UTC 2013


On 07/09/2013 11:41 PM, Laine Stump wrote:
> On 07/09/2013 09:17 PM, Eric Blake wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=964358
>>
>> POSIX states that multi-threaded apps should not use functions
>> that are not async-signal-safe between fork and exec, yet we
>> were using getpwuid_r and initgroups.  Although rare, it is
>> possible to hit deadlock in the child, when it tries to grab
>> a mutex that was already held by another thread in the parent.
>> I actually hit this deadlock when testing multiple domains
>> being started in parallel with a command hook, with the following
>> backtrace in the child:
>>

>> +++ b/src/lxc/lxc_container.c
>> @@ -1,5 +1,5 @@
>>  /*
>> - * Copyright (C) 2008-2012 Red Hat, Inc.
>> + * Copyright (C) 2008-2013 Red Hat, Inc.
>>   * Copyright (C) 2008 IBM Corp.
>>   *
>>   * lxc_container.c: file description
>> @@ -347,12 +347,17 @@ int lxcContainerWaitForContinue(int control)
>>   */
>>  static int lxcContainerSetID(virDomainDefPtr def)
>>  {
>> +    gid_t *groups;
>> +    int ngroups;
>> +
>>      /* Only call virSetUIDGID when user namespace is enabled
>>       * for this container. And user namespace is only enabled
>>       * when nuidmap&ngidmap is not zero */
>>
>>      VIR_DEBUG("Set UID/GID to 0/0");
>> -    if (def->idmap.nuidmap && virSetUIDGID(0, 0) < 0) {
>> +    if (def->idmap.nuidmap &&
>> +        ((ngroups = virGetGroupList(0, 0, &groups) < 0) ||
>> +         virSetUIDGID(0, 0, groups, ngroups) < 0)) {
>>          virReportSystemError(errno, "%s",
>>                               _("setuid or setgid failed"));
>>          return -1;

[1]

>> @@ -1904,7 +1928,7 @@ parenterror:
>>
>>      /* set desired uid/gid, then attempt to create the directory */
>>
>> -    if (virSetUIDGID(uid, gid) < 0) {
>> +    if (virSetUIDGID(uid, gid, groups, ngroups) < 0) {
>>          ret = -errno;
>>          goto childerror;
>>      }
> 
> 
> It's too late for me to be reviewing code - I can tell because I spent
> several minutes being concerned that you didn't free groups in this code
> path. Duh. That's really all I can say to myself :-)

Only _after_ posting, did I reread my patch and your comments, and while
_this_ instance doesn't need a free, I noticed that the lxc_container.c
instance [1] leaks groups (at least, until exec() makes leaks
irrelevant).  But _that_ in turn implies that I called virGetGroupList
in between clone() and exec(), which is precisely what I documented
should not be done.

I guess I'd better work on a followup patch that hoists the
virGetGroupList to occur before the lxc clone().

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 621 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20130711/c55050fe/attachment-0001.sig>


More information about the libvir-list mailing list