[Libguestfs] [PATCH libnbd v2 2/2] api: Implement local command with systemd socket activation.

Tue Oct 1 14:11:52 UTC 2019

On Tue, Oct 01, 2019 at 08:24:33AM -0500, Eric Blake wrote:
> >+#else /* !HAVE_EXECVPE */
> >+  SET_NEXT_STATE (%.DEAD)
> >+  set_error (ENOTSUP, "platform does not support socket activation");
> >+  return 0;
> >+#endif
> 
> We probably ought to add a matching nbd_supports_socket_activation()
> feature function.
> 
> Or, it would be possible to create a fallback for execvpe() on
> platforms that lack it by using execlpe() and our own path-walker
> utility function.  Can be done as a followup patch.  If we do that,
> then the mere presence of LIBNBD_HAVE_NBD_CONNECT_SA is witness
> enough of the functionality, rather than needing a runtime probe.

I'm hoping we will find the time to write a replacement execvpe so
that we can implement this on all platforms.  That way we can avoid
having a redundant nbd_supports_socket_activation() call that (in
future) always returns true.

> >+  if (h->argv)
> >+    nbd_internal_free_string_list (h->argv);
> 
> How can h->argv ever be previously set?

Probably not right now, but it might happen if we ever implement error
recovery for either nbd_connect_socket_activation or
nbd_connect_command.  At the moment these functions move the handle to
the DEAD state if they fail, but that's not really necessary in all
cases.

> >--- a/lib/handle.c
> >+++ b/lib/handle.c
> >@@ -129,6 +129,16 @@ nbd_close (struct nbd_handle *h)
> >    free_cmd_list (h->cmds_in_flight);
> >    free_cmd_list (h->cmds_done);
> >    nbd_internal_free_string_list (h->argv);
> >+  if (h->sa_sockpath) {
> >+    if (h->pid > 0)
> >+      kill (h->pid, SIGTERM);
> >+    unlink (h->sa_sockpath);
> >+    free (h->sa_sockpath);
> >+  }
> >+  if (h->sa_tmpdir) {
> >+    rmdir (h->sa_tmpdir);
> >+    free (h->sa_tmpdir);
> >+  }
> >    free (h->unixsocket);
> >    free (h->hostname);
> >    free (h->port);
> 
> Somewhat pre-existing: we have a waitpid() here (good, so we don't
> hang on to a zombie process), but we are relying on the child
> process to gracefully go away (whether for connect_command when
> stdin closes, or for connect_sa on receipt of SIGTERM).  Do we need
> a retry loop that escalates to SIGKILL if the child process does not
> quickly respond to the initial condition?  On the other hand, the
> fact that our waitpid() blocks until the child changes status means
> that if a child ever wedges, the fact that we wedge too gives some
> visibility to the client that it's not libnbd's fault and that they
> need to get the bug fixed in their child process.

I added a note to the documentation of nbd_close to say it can block.
However that's unfortunate and we should really provide a way to let
callers avoid that behaviour.  See:

https://github.com/libguestfs/libnbd/commit/b9ed0d56a5e26b1a051f68e7565dd321f2bcd2f8

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/