[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Libguestfs] [nbdkit PATCH 1/2] server: Propagate unexpected nbdkit failure with --run



On 9/30/19 12:15 PM, Eric Blake wrote:
When running captive, we were blindly calling kill(pid) of the captive
nbdkit child process, even if that process has already exited
unexpectedly (most likely, from an assertion failure) and another
opened in its place (pid recycling is rare, but not impossible).

Actually, pid recycling is NOT possible in this case, because the nbdkit process is in zombie state until we either wait() on it (which we didn't) or exit (in which case init(1) reaps it instead). Still, the fact that we did not check whether kill() failed due to ESRCH was not helping the fact that we need the wait to detect early failure.

(Pid recycling is more of an issue when dealing with pids that are not your direct children, or if you ignore SIGCHLD)

I'm tempted to send a v2 of this that unconditionally tries kill() (which is safe after all), and which waits until the nbdkit process is gone (possibly sending SIGKILL if too much time elapses) rather than hoping that SIGTERM was sufficient (it's not nice to strand a wedged nbdkit child process), and which improves the commit message.

 We
need to check that the child process still exists, and if it
unexpectedly died, ensure that our exit status reflects that fact.

This is true regardless of how we handle a zombie process.

Note that nbdkit normally should exit with status 0 (even when it is
sent a signal like SIGHUP or SIGINT) or die from a signal.

While at it, fix the fact that system() can fail with a value that is
not appropriate to hand to WIFEXITED() if the child was not even
spawned, but cannot fail with WIFSTOPPED.  Also, reflect death from
signal to a status > 128 rather than 1 (we could be even fancier and
also re-raise the signal so that we die from the same thing, but it's
not obvious we need that much work...).

Signed-off-by: Eric Blake <eblake redhat com>
---
  server/captive.c | 43 +++++++++++++++++++++++++++++++++----------
  1 file changed, 33 insertions(+), 10 deletions(-)


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]