[Libguestfs] [PATCH libnbd 2/2] api: Implement local command with systemd socket activation.

Fri Sep 27 13:16:01 UTC 2019

On 9/27/19 7:28 AM, Richard W.M. Jones wrote:

>>> +      /* We must unset CLOEXEC on the fd.  (dup2 above does this
>>> +       * implicitly because CLOEXEC is set on the fd, not on the
>>> +       * socket).
>>> +       */
>>> +      flags = fcntl (s, F_GETFD, 0);
>>> +      if (flags == -1) {
>>> +        perror ("fcntl: F_GETFD");
>>
>> perror is not async-signal-safe.  Calling it in a child of a
>> potentially-multi-threaded parent is therefore prone to deadlock, if
>> perror() triggers a request to grab any resource that was left
>> locked by a different thread holding the lock but stranded by the
>> fork.
> 
> I suspect these sort of problems are unsolvable.  I've studied enough
> customer bug reports where we've had to reverse engineer exactly how
> something failed given only a slim error log to know that error
> messages are always useful and I wouldn't want to remove these.
> However because of the chance of deadlock, how about a deadlock-free
> function that write(2)'s the message + errno to stderr?

Yes, sticking to write() is better than nothing.  Open-coding an 
int-to-string for displying errno is doable, or bypassing strerror() and 
going straight for the non-portable sys_errlist[errno] lookup (if 
configure says sys_errlist is available) can also be used - it's a 
question of how nice do you want to make things, while still trying to 
remain safe.

>>> +
>>> +    snprintf (listen_pid, sizeof listen_pid, "%ld", (long) getpid ());
>>> +    setenv ("LISTEN_PID", listen_pid, 1);
>>> +    setenv ("LISTEN_FDS", "1", 1);
> 
> And we could use similar code here.

Yeah, if we're going to open-code a signal-safe int-to-string, we have 
several uses for it :)

> 
>> snprintf() and setenv() are also not async-signal-safe.  Which is a
>> bummer, since you really don't know the child pid until in the
>> child. You could open-code the translation of a decimal number into
>> a buffer without snprintf, but you also have to figure out how to
>> safely put it into the environ seen by the child.  How does systemd
>> do it?
> 
> For setenv(), systemd does this by building a char **environ which is
> passed to execve.  The code is complicated to say the least:
> 
> https://github.com/systemd/systemd/blob/5ac1530eca652cd16819853fe06e76e156f5cf5e/src/core/dbus-execute.c

Can we pre-build the bulk of the replacement environ in the parent, 
leaving just the one entry to munge in the child?  At least that way we 
can malloc without worries.

> 
>>> +    _exit (EXIT_FAILURE);
>>
>> Is EXIT_FAILURE the best here, or should we consider exiting with
>> status 126/127 to match what other programs (sh, env, nice, nohup,
>> ...) do when execvp() fails?
> 
> Yes (this is also a bug elsewhere).  Not sure if 126 or 127 is
> right:
> 
> https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html

Depends on the value of errno after the fact.  ENOENT => 127, all others 
126.

> 
> ...
>> Do we document that nbd_close() may block?
> 
> We should do.  I've added it in a separate patch.
> 
> I'll have a rethink about this patch and come back with something
> better in a while.
> 
> Rich.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org