[Libguestfs] [nbdkit PATCH] tests: Test for faster shutdown

Eric Blake eblake at redhat.com
Fri Aug 23 18:18:02 UTC 2019


On 8/6/19 10:46 AM, Eric Blake wrote:
> The test relies on the timeout program.  Also, since the
> nbdkit_nanosleep code relies on the Linux extension POLLRDHUP to
> detect early client closure, we may have to relax that part of the
> test when porting to platforms that lack ppoll/POLLRDHUP.  (That is,
> while we should still be able to let a signal to the server shut down
> nbdkit quickly, it's harder to let a client close()ing its end cause
> nbdkit to shut down a single connection quickly when limited to a poll
> that can't identify EOF as a way to break a poll).
> 
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
> 
> I'm pushing this along with the nbdkit_nanosleep patches, now that I'm
> reliably getting the test to fail in isolation and to pass with the
> sleep patches applied.  However, I suspect that we will definitely need
> to fix things for other platforms, and/or skip this test if we can't
> easily make other platforms match what Linux can do with POLLRDHUP.


> +# The server's response to shutdown signals should not stall on delays
> +qemu-io -f raw "nbd+unix:///?socket=$sock" -c 'r 0 512' </dev/null &
> +pid=$!
> +sleep 1
> +kill -s INT "$(cat "$pidfile")"
> +sleep 1
> +kill -s 0 "$(cat "$pidfile")" && {
> +    echo "Unexpected status; nbdkit didn't react fast enough to signal"
> +    fail=1

Rich already hit this failure on a heavily-loaded slow machine ('make
-j16), so I'm pushing this followup to turn a hard-coded one-second
sleep into a loop (5 seconds for now, but we could go longer) to let
things have longer to settle under load but to not tie up the test if it
is still fast:

From 05ccea5f9717c5117785c6cc6125596088ced328 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake at redhat.com>
Date: Fri, 23 Aug 2019 11:59:57 -0500
Subject: [nbdkit PATCH] tests: Increase timeouts on test-shutdown.sh

Testing that shutdown happens in a timely manner is inherently a racy
aspect (it's hard to guarantee that a slow heavily-loaded machine will
react as fast as desired.  Rich hit a test failure when running 'make
-j16', complaining "Unexpected status; nbdkit didn't react fast enough
to signal".

Improve the test by lengthening the delay of the server (so we have
more time to react to other timing events even uner load), lengthening
the timeout of the 'quit' command that normally completes quickly
without load, and rewriting the 'kill -0' test for a completed server
into a longer timeout via a loop (to let the test remain responsive
and complete quickly on a machine without load, but to allow longer
before failing the test on a machine with load).  On my machine, the
test still completes in under 3 seconds when run in isolation (most of
that time spent idling), but this should make the test friendlier to a
loaded slow machine.

Fixes: 782ad5b2
Reported-by: Richard W.M. Jones <rjones at redhat.com>
Signed-off-by: Eric Blake <eblake at redhat.com>
---
 tests/test-shutdown.sh | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tests/test-shutdown.sh b/tests/test-shutdown.sh
index 0ac9c1eb..7d17045a 100755
--- a/tests/test-shutdown.sh
+++ b/tests/test-shutdown.sh
@@ -46,7 +46,7 @@ fail=0
 # react to both connection death and server shutdown without finishing
 # the entire delay duration.
 start_nbdkit -P shutdown.pid -U $sock --filter=noparallel --filter=delay \
-    null 1M serialize=connections rdelay=10
+    null 1M serialize=connections rdelay=60

 # Early client death should not stall connection of second client.
 trap '' ERR
@@ -55,7 +55,7 @@ test $? = 124 || {
     echo "Unexpected status; qemu-io should have been killed for timing
out"
     fail=1
 }
-timeout 1s qemu-io -f raw "nbd+unix:///?socket=$sock" -c 'quit' </dev/null
+timeout 5s qemu-io -f raw "nbd+unix:///?socket=$sock" -c 'quit' </dev/null
 test $? = 0 || {
     echo "Unexpected status; nbdkit was not responsive to allow second
qemu-io"
     fail=1
@@ -66,11 +66,14 @@ qemu-io -f raw "nbd+unix:///?socket=$sock" -c 'r 0
512' </dev/null &
 pid=$!
 sleep 1
 kill -s INT "$(cat "$pidfile")"
-sleep 1
-kill -s 0 "$(cat "$pidfile")" && {
+for (( i=0; i < 5; i++ )); do
+    kill -s 0 "$(cat "$pidfile")" || break
+    sleep 1
+done
+if [ $i = 5 ]; then
     echo "Unexpected status; nbdkit didn't react fast enough to signal"
     fail=1
-}
+fi
 wait $pid

 exit $fail
-- 
2.21.0



-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190823/2ca289e8/attachment.sig>


More information about the Libguestfs mailing list