[Libguestfs] [nbdkit PATCH] tests: Don't strand hung nbdkit processes

Richard W.M. Jones rjones at redhat.com
Fri Mar 20 21:59:28 UTC 2020


On Fri, Mar 20, 2020 at 02:47:12PM -0500, Eric Blake wrote:
> We've recently been hitting a transient hung rpm build when using make
> 4.3, due to a bug in test-nbd-tls-psk.sh.  We're still trying to
> isolate the correct fix for that bug (it might be in the nbd plugin
> proper, but more likely is an issue in libnbd's tls handling of
> connection close), but in the meantime, this patch should at least
> cause a graceful fail rather than make hanging due to an nbdkit
> process that has hung.
> 
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
>  tests/functions.sh.in | 31 ++++++++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/functions.sh.in b/tests/functions.sh.in
> index 8406fcf9..e483505e 100644
> --- a/tests/functions.sh.in
> +++ b/tests/functions.sh.in
> @@ -85,8 +85,8 @@ requires_ipv6_loopback ()
>      # instead.  It's very unlikely that port 1 is open.
>      if LANG=C qemu-img info "nbd:[::1]:1" |& \
>         grep -sq "Address family for hostname not supported"; then
> -	echo "$0: IPv6 loopback is not available, skipping this test"
> -	exit 77
> +        echo "$0: IPv6 loopback is not available, skipping this test"
> +        exit 77
>      fi
>  }
> 
> @@ -139,7 +139,32 @@ start_nbdkit ()
>      fi
> 
>      # Kill nbdkit on exit.
> -    cleanup_fn kill "$(cat "$pidfile")"
> +    cleanup_fn kill_nbdkit "$(cat "$pidfile")"
> +}
> +
> +# kill_nbdkit pid
> +#
> +# End the nbkdit process that created pidfile.  Exit this script with an
> +# error if nbdkit does not gracefully shutdown in a timely manner.
> +kill_nbdkit ()
> +{
> +    local pid=$1 i
> +
> +    # Start with SIGTERM, and wait for graceful exit
> +    kill $pid
> +    for i in {1..60}; do
> +        if ! kill -0 $pid 2>/dev/null; then
> +            break
> +        fi
> +        sleep 1
> +    done
> +    # If nbdkit has not exited, try SIGKILL and fail the test
> +    if test $i = 60; then
> +        echo "error: nbdkit pid $pid failed to respond to SIGTERM"
> +        kill -9 $pid
> +        # Append our failure after other cleanups
> +        cleanup_fn exit 1

Interesting, but OK.  I wonder if we need a more
general case for this? But ...

> +    fi
>  }
> 
>  # foreach_plugin f [args]

... but this all looks good to me thanks:

ACK

I think I'll actually do another small release once you've
pushed this because I will reenable Fedora tests that I
disabled yesterday.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




More information about the Libguestfs mailing list