[Libguestfs] [PATCH libnbd 2/2] api: Implement local command with systemd socket activation.
Richard W.M. Jones
rjones at redhat.com
Fri Sep 27 12:28:10 UTC 2019
On Thu, Sep 26, 2019 at 03:56:27PM -0500, Eric Blake wrote:
> >+ /* Use /tmp instead of TMPDIR because we must ensure the path is
> >+ * short enough to store in the sockaddr_un. On some platforms this
> >+ * may cause problems so we may need to revisit it. XXX
> >+ */
>
> Is the use of socketpair() any better than creating a socket under /tmp?
AIUI socketpair() can't be used to create a listening socket. Systemd
socket activation has two modes -- one where you pass an already
accepted socket, and one where you pass a listening socket and the
server accepts connections until it is killed. In nbdkit and qemu-nbd
we are using the latter one.
...
> If we fail here or later, do/should we try to clean up the
> /tmp/libnbdXXX directory created earlier?
>
> /me reads ahead - nbd_close tries to address it
>
> Still, if we fail at this point, h->sa_sockpath is set but not yet
> created [1]
I modified the code so that it doesn't try to delete literal
/tmp/libnbdXXXXXX on some error paths and should be more robust.
> >+ close (s);
> >+ }
> >+ else {
> >+ /* We must unset CLOEXEC on the fd. (dup2 above does this
> >+ * implicitly because CLOEXEC is set on the fd, not on the
> >+ * socket).
> >+ */
> >+ flags = fcntl (s, F_GETFD, 0);
> >+ if (flags == -1) {
> >+ perror ("fcntl: F_GETFD");
>
> perror is not async-signal-safe. Calling it in a child of a
> potentially-multi-threaded parent is therefore prone to deadlock, if
> perror() triggers a request to grab any resource that was left
> locked by a different thread holding the lock but stranded by the
> fork.
I suspect these sort of problems are unsolvable. I've studied enough
customer bug reports where we've had to reverse engineer exactly how
something failed given only a slim error log to know that error
messages are always useful and I wouldn't want to remove these.
However because of the chance of deadlock, how about a deadlock-free
function that write(2)'s the message + errno to stderr?
> >+ _exit (EXIT_FAILURE);
> >+ }
> >+ if (fcntl (s, F_SETFD, flags & ~FD_CLOEXEC) == -1) {
> >+ perror ("fcntl: F_SETFD");
> >+ _exit (EXIT_FAILURE);
> >+ }
>
> About all we can _safely_ do is let our _exit() status inform the
> parent process that something failed, and let the parent process
> print the error message. But even passing errno as the _exit()
> value is not portable (GNU Hurd errno values are intentionally
> larger than what fit in 8-bit returns that the parent would see).
> It's also possible to set up a cloexec pipe (in addition to
> everything else) so that the child can write() error details into
> the pipe, but that's a lot of effort compared to just blindly
> stating that the child failed but declaring full details of why as
> lost.
>
> >+ }
> >+
> >+ snprintf (listen_pid, sizeof listen_pid, "%ld", (long) getpid ());
> >+ setenv ("LISTEN_PID", listen_pid, 1);
> >+ setenv ("LISTEN_FDS", "1", 1);
And we could use similar code here.
> snprintf() and setenv() are also not async-signal-safe. Which is a
> bummer, since you really don't know the child pid until in the
> child. You could open-code the translation of a decimal number into
> a buffer without snprintf, but you also have to figure out how to
> safely put it into the environ seen by the child. How does systemd
> do it?
For setenv(), systemd does this by building a char **environ which is
passed to execve. The code is complicated to say the least:
https://github.com/systemd/systemd/blob/5ac1530eca652cd16819853fe06e76e156f5cf5e/src/core/dbus-execute.c
> >+ _exit (EXIT_FAILURE);
>
> Is EXIT_FAILURE the best here, or should we consider exiting with
> status 126/127 to match what other programs (sh, env, nice, nohup,
> ...) do when execvp() fails?
Yes (this is also a bug elsewhere). Not sure if 126 or 127 is
right:
https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
...
> Do we document that nbd_close() may block?
We should do. I've added it in a separate patch.
I'll have a rethink about this patch and come back with something
better in a while.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
More information about the Libguestfs
mailing list