[libvirt] [PATCH] [TCK] nwfilter: add a test case using concurrency

Tue Nov 16 01:09:55 UTC 2010

On 11/15/2010 07:01 PM, Eric Blake wrote:
> On 11/15/2010 01:22 PM, Stefan Berger wrote:
>> Now that the existing scripts are (hopefully) cleaned up and my
>> POSIX-compliant shell-scripting skills have temporarily :-) improved, I am
>> now adding a test case that exercises concurrency. The test case creates
>> and destroys 2 VMs concurrently as well as changes their referenced
>> filters in a tight loop. This kind of test previously uncovered
>>
>> - deadlocks in libvirt due to lock-ordering in the nwfilter subsystem
>> - libvirt termination bug in libaugeas due to doubly closed file
>> descriptors and the side effects
>>
>> All of these have been fixed recently.
>>
>> The test script is known to run in bash, dash and ksh shells.
>>
>> Signed-off-by: Stefan Berger <stefanb at us.ibm.com>
>> Index: libvirt-tck/scripts/nwfilter/060-concurrency.t
>> ===================================================================
>> --- /dev/null
>> +++ libvirt-tck/scripts/nwfilter/060-concurrency.t
>> @@ -0,0 +1,5 @@
>> +#!/bin/sh
>> +
>> +pwd=$(dirname -- "$0")
>> +
>> +(cd -- "${pwd}"; sh ./nwfilter_concurrent.sh --tap-test)
> I'd use && instead of ; so that if the cd fails we don't try to run a
> random file in the wrong directory.
>
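To illustrate the point about && versus ;, here is a minimal sketch. The
helper name run_in_dir is hypothetical and not part of the patch; it only
shows that the script is never started when the cd fails.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): run a script from inside its
# own directory, but only if the cd actually succeeded.
run_in_dir() {
  dir=$1
  shift
  # With ';' the sh command would still run from the wrong directory
  # after a failed cd; with '&&' the subshell stops at the failed cd
  # and returns its non-zero status.
  (cd -- "$dir" && sh "$@")
}

# Possible use in 060-concurrency.t:
#   run_in_dir "$(dirname -- "$0")" ./nwfilter_concurrent.sh --tap-test
```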
>> Index: libvirt-tck/scripts/nwfilter/nwfilter_concurrent.sh
>> ===================================================================
>> --- /dev/null
>> +++ libvirt-tck/scripts/nwfilter/nwfilter_concurrent.sh
>> @@ -0,0 +1,371 @@
>> +#!/bin/sh
>> +
>> +VIRSH=virsh
>> +
>> +# For each line starting with uri=, remove the prefix and set the hold
>> +# space to the rest of the line.  Then at file end, print the hold
>> +# space, which is effectively the last uri= line encountered.
>> +uri=$(sed -n '/^uri[     ]*=[     ]*/ {
>> +  s///
>> +  h
>> +}
>> +$ {
>> +  x
>> +  p
>> +}' < "$LIBVIRT_TCK_CONFIG")
>> +: "${uri:=qemu:///system}"
>> +
>> +LIBVIRT_URI=${uri}
>> +
>> +FLAG_WAIT="$((1<<0))"
>> +FLAG_ATTACH="$((1<<1))"
>> +FLAG_VERBOSE="$((1<<2))"
>> +FLAG_LIBVIRT_TEST="$((1<<3))"
>> +FLAG_TAP_TEST="$((1<<4))"
>> +FLAG_FORCE_CLEAN="$((1<<5))"
> I see some common patterns here with your other tck shell scripts :)
>
> Would it be better to factor out some of this common initialization and
> common script routines (such as tap_fail) into a single .sh file and
> source that file up front, rather than copying them into each driver?
> Especially since if we fix a bug in one, we have to copy that fix to
> multiple files at the moment?
>
At some point this could be done. Mostly the tap functions could be 
moved into a common file.
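
A rough sketch of what such a common file might look like. The file name
tap_common.sh, the counter tap_idx, and the helpers tap_pass and tap_final
are assumptions for illustration; only tap_fail is named in this thread,
and its real implementation may differ.

```shell
#!/bin/sh
# tap_common.sh (hypothetical file name): TAP helpers shared by the
# nwfilter test drivers, so that a fix only has to be made once.

tap_idx=0   # number of test results reported so far

# Report a passing test with the description given in $1.
tap_pass() {
  tap_idx=$((tap_idx + 1))
  echo "ok ${tap_idx} - $1"
}

# Report a failing test with the description given in $1.
tap_fail() {
  tap_idx=$((tap_idx + 1))
  echo "not ok ${tap_idx} - $1"
}

# Print the TAP plan line once all tests have run.
tap_final() {
  echo "1..${tap_idx}"
}

# Each driver would then start with something like:
#   . "$(dirname -- "$0")/tap_common.sh"
```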
>> +
>> +killPrgs()
>> +{
>> +  msg="$1"
>> +
>> +  # terminate all processes
>> +  [ "x${CREATE_DES_VM1_THR}x" != "xx" ] && \
>> +    kill -9 ${CREATE_DES_VM1_THR}
> Should we try 'kill -2' rather than 'kill -9' to send a SIGINT and try
> and allow the subprocesses a chance to gracefully clean up after
> themselves rather than forcefully quitting?
>
Fine by me.
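
A sketch of how a subprocess could clean up when it is asked to stop. Note
that it uses SIGTERM rather than the SIGINT suggested above; that
substitution is mine. POSIX says that commands started with & while job
control is off inherit SIGINT and SIGQUIT as ignored, so 'kill -2' may have
no effect on background jobs in some shells. The names worker and
stop_worker are hypothetical.

```shell
#!/bin/sh
# Hypothetical background worker: creates a scratch file and removes it
# again when it receives SIGTERM.
worker() {
  scratch=$1
  # Install the handler before creating the file, so that the file's
  # existence implies that the handler is in place.
  trap 'rm -f "$scratch"; exit 0' TERM
  : > "$scratch"
  while :; do
    sleep 1
  done
}

# Ask the worker with PID $1 to stop and wait until it has finished its
# cleanup. The trap runs once the current 'sleep 1' has returned.
stop_worker() {
  kill -TERM "$1" 2>/dev/null
  wait "$1"
}

# Possible use:
#   worker /tmp/some-scratch-file &
#   stop_worker $!
```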
>> +runTest()
>> +{
>> +  flags="$1"
>> +
>> +  passctr=0
>> +  failctr=0
>> +
>> +  tmpdir=`mktmpdir`
>> +  failctr=0
>> +  passctr=0
>> +  logvm1="${PWD}/${tmpdir}/logvm1"
>> +  logvm2="${PWD}/${tmpdir}/logvm2"
>> +  logfivm1="${PWD}/${tmpdir}/logfivm1"
>> +  logfivm2="${PWD}/${tmpdir}/logfivm2"
>> +
>> +  loops=15
>> +
> ...
>
>> +    val=$(cat "${logfivm2}" 2>/dev/null | tail -n 1)
>> +    ( [ "x${val}x" = "xx" ] || [ ${val} -lt ${tmp} ] ) \
>> +      && testFail "${flags}" \
>> +        "VM2 filter log - step ${expect} ($val < $tmp)" \
>> +      || testPass "${flags}" \
>> +        "VM2 filter log - step ${expect} ($val >= $tmp)"
>> +
>> +    expect=$(($expect + 1))
>> +    [ ${expect} -gt ${loops} ] && break;
>> +
>> +    sleep 4
>> +  done
> This seems very sensitive to timing measured on your machine.  Is there
> any way to make it more robust, and less likely to fail on a much faster
> or much slower machine?
>
A much slower machine, or one that is extremely busy, would be a concern.
Faster ones should not be a problem.

I also wasn't sure how to go about it, or how to determine when a
process really is stuck so that the test case can fail. Theoretically,
one of the start-destroy processes could get through only a single
'cycle', and it is then difficult to say whether the test succeeded.
Another possibility would be to require that each of the start-destroy
processes completes 5 cycles before the test ends, independent of how
long that takes.
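
The cycle-count idea could be sketched roughly as follows. The helper name
wait_for_cycles and its timeout argument are assumptions; the timeout is
only there as a generous upper bound so that a process that is really stuck
still makes the test fail. As in the patch, the last line of the log file
is expected to hold the current cycle count.

```shell
#!/bin/sh
# Hypothetical helper: poll a log file until its last line holds a cycle
# count of at least $2, giving up after roughly $3 seconds.
# Returns 0 once the count is reached and 1 on timeout.
wait_for_cycles() {
  log=$1
  want=$2
  timeout=$3
  waited=0
  while :; do
    val=$(tail -n 1 "$log" 2>/dev/null)
    case $val in
      ''|*[!0-9]*) val=0 ;;   # missing or non-numeric: no progress yet
    esac
    [ "$val" -ge "$want" ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# Possible use in runTest, replacing the fixed 'sleep 4' steps:
#   wait_for_cycles "${logfivm2}" 5 600 \
#     && testPass "${flags}" "VM2 filter log - 5 cycles" \
#     || testFail "${flags}" "VM2 filter log - stuck"
```

With this, the test ends as soon as every process has done its cycles on a
fast machine, and a slow machine gets the whole timeout before a failure is
reported.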

    Stefan