[Avocado-devel] RFC: Nested tests (previously multi-stream test) [v5]

Lukáš Doktor ldoktor at redhat.com
Tue May 24 14:53:14 UTC 2016


Hello guys,

this version returns to the roots and tries to clearly define the single 
solution I find most promising for multi-host and other complex tests.

Changes:

     v2: Rewritten from scratch
     v2: Added examples for the demonstration to avoid confusion
     v2: Removed the mht format (which was there to demonstrate manual
         execution)
     v2: Added 2 solutions for multi-tests
     v2: Described ways to support synchronization
     v3: Renamed to multi-stream as it befits the purpose
     v3: Improved introduction
     v3: Workers are renamed to streams
     v3: Added example which uses library, instead of new test
     v3: Multi-test renamed to nested tests
     v3: Added section regarding Job API RFC
     v3: Better description of the Synchronization section
     v3: Improved conclusion
     v3: Removed the "Internal API" section (it was a transition between
         no support and "nested test API", not a "real" solution)
     v3: Using per-test granularity in nested tests (requires plugins
         refactor from Job API, but allows greater flexibility)
     v4: Removed "Standard python libraries" section (rejected)
     v4: Removed "API backed by cmdline" (rejected)
     v4: Simplified "Synchronization" section (only describes the
         purpose)
     v4: Refined all sections
     v4: Improved the complex example and added comments
     v4: Formulated the problem of multiple tasks in one stream
     v4: Rejected the idea of bounding it inside MultiTest class
         inherited from avocado.Test, using a library-only approach
     v5: Avoid mapping ideas to the multi-stream definition and
         clearly define the idea I have in mind for test building
         blocks called nested tests.


Motivation
==========

Allow building complex tests out of existing tests, producing a single 
result depending on the complex test's requirements. Importantly, the 
complex test might run those tests on the same machine, but also on 
different machines, allowing simple development of multi-host tests. 
Note that the existing tests should stay (mostly) unchanged and remain 
executable as simple scenarios, or be invoked by those complex tests.

Examples of what could be implemented using this feature:

1. Adding background (stress) tasks to an existing test, producing 
real-world scenarios.
    * cpu stress test + cpu hotplug test
    * memory stress test + migration
    * network+cpu+memory test on host, memory test on guest while
      running migration
    * running several migration tests (of the same and different type)

2. Multi-host tests implemented by splitting them into components and 
leveraging them from the main test.
    * multi-host migration
    * stressing a service from different machines


Nested tests
============

Test
----

A test is a recipe describing the prerequisites, the steps to check how 
the unit under test behaves, and the cleanup after a successful or 
unsuccessful execution.

The test itself contains lots of neat features which evolved to 
simplify testing: logging, results analysis and error handling.

Test runner
-----------

The test runner is responsible for driving the test(s) execution, which 
includes the standard test workflow (setUp/test/tearDown), handling 
plugin hooks (results/pre/post), as well as safe interruption.

Nested test
-----------

A nested test is a test invoked by another test. It can either be 
executed in the foreground (while the main test is waiting) or in the 
background, along with the main test and other background tests. It 
should follow the default test workflow (setUp/test/tearDown), it 
should keep all the neat test features like logging and error handling, 
and its results should also go into the main test's output, with the 
nested test's id as a prefix. All the files produced by the nested test 
should be located in a new directory inside the main test's results dir 
in order to be able to browse either the overall results (main test + 
nested tests) or just the nested tests' ones.

Resolver
--------

The resolver is an avocado component which resolves a test reference 
into a list of test templates composed of the test name, params and 
other `avocado.Test.__init__` arguments.
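
For illustration only, a resolved template might look roughly like 
this; the exact structure is an implementation detail and the shape 
below is an assumption, not a defined API:

     # Hypothetical shape of the resolved templates for the reference
     # "/usr/bin/wget example.org"; the structure is illustrative only
     templates = [("1-/usr/bin/wget example.org",  # test name
                   {"params": {}})]  # other avocado.Test.__init__ args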

Very simple example
-------------------

This example demonstrates how to use an existing test (SimpleTest 
"/usr/bin/wget example.org") in order to create a complex scenario 
(download the main page from example.org from multiple computers almost 
concurrently), without any modifications of the `SimpleTest`.

     import avocado

     class WgetExample(avocado.Test):
         def test(self):
             # Initialize the nested test runner
             self.runner = avocado.NestedRunner(self)
             # This is what one passes to "avocado run"
             test_reference = "/usr/bin/wget example.org"
             # This is the resolved list of test templates
             tests = avocado.resolver.resolve(test_reference)
             # We could support a list of tests, but for simplicity
             # allow only a single test here.
             assert len(tests) == 1, ("Resolver produced multiple test "
                                      "names: %s\n%s" % (test_reference,
                                                         tests))
             test = tests[0]
             for machine in self.params.get("machines"):
                 # Queue a background task on the machine (local or
                 # remote); the returned test id can be used to query
                 # the particular results, interrupt the task, ...
                 self.runner.run_bg(machine, test)
             # Wait for all background tasks to finish, raise an
             # exception if any of them failed.
             self.runner.wait(ignore_errors=False)

When nothing fails, this usage has no benefit over simply logging into 
a machine and firing up the command. The difference shows when 
something does not work as expected. With nested tests, one gets a 
runner exception if the machine is unreachable, and on a test error one 
gets not only the overall log, but also the per-nested-test results, 
simplifying the error analysis. For 1, 2 or 3 machines this makes 
little difference, but imagine you want to run this on hundreds of 
machines. Try finding the exception there.

Yes, you can implement the above without nested tests, but it requires 
a lot of boilerplate code to establish the connection (or raise an 
exception explaining why that was not possible, and I'm not talking 
about a bare "unable to establish connection", but granularity like 
"Invalid password", "Host is down", ...). Then you'd have to set up the 
output logging for that particular task, add the prefix, run the task 
(handling all possible exceptions) and interpret the results. All of 
this just to get the same benefits a very simple avocado test provides 
you with.

Advanced example
----------------

Imagine a very complex scenario, for example a cloud with several 
services. One could write a big fat test tailored just for this 
scenario and keep adding sub-scenarios, producing unreadable source code.

With nested tests one could split this task into tests:

  * Setup a fake network
  * Setup cloud service
  * Setup in-cloud service A/B/C/D/...
  * Test in-cloud service A/B/C/D/...
  * Stress network
  * Migrate nodes

New variants could be easily added, for example DDoS attack to some 
nodes, node hotplug/unplug, ... by invoking those existing tests and 
combining them into a complex test.

Additionally, note that some of the tests, e.g. the cloud service setup 
and the in-cloud service setup, are quite generic tests which could be 
reused many times in different tests. Yes, one could write a library to 
do that, but in that library one would have to handle all exceptions 
and provide nice logging, while not cluttering the main output with 
unnecessary information.
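
For illustration, the composed test might look roughly like this 
sketch (the `run_fg` method and the test file names are assumptions 
layered on top of the proposed `NestedRunner` API, not an existing 
interface):

     import avocado

     class CloudScenario(avocado.Test):
         def test(self):
             runner = avocado.NestedRunner(self)
             resolve = avocado.resolver.resolve
             # Foreground setup steps, executed one after another
             runner.run_fg("controller", resolve("setup_network.py")[0])
             runner.run_fg("controller", resolve("setup_cloud_service.py")[0])
             for node in self.params.get("nodes"):
                 runner.run_fg(node, resolve("setup_incloud_service.py")[0])
             # Background stress running alongside the actual checks
             runner.run_bg("controller", resolve("stress_network.py")[0])
             for node in self.params.get("nodes"):
                 runner.run_bg(node, resolve("test_incloud_service.py")[0])
             # Migrate the nodes while the in-cloud tests are running
             runner.run_fg("controller", resolve("migrate_nodes.py")[0])
             # Fail the main test if any nested test failed
             runner.wait(ignore_errors=False)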

Job results
-----------

The job results combine (multiple) test results into an understandable 
format. There are several formats; the most generic one is the file 
format:

.
├── id  -- id of this job
├── job.log  -- overall job log
└── test-results  -- per-test-directories with test results
     ├── 1-passtest.py:PassTest.test  -- first test's results
     └── 2-failtest.py:FailTest.test  -- second test's results

Additionally, it contains other files and directories produced by 
avocado plugins, like the json, xunit and html results, sysinfo 
gathering and info regarding the replay feature.

Test results
------------

In the end, every test produces results, which is what we're interested 
in. The results must clearly define the test status, should provide a 
record of what was executed, and in case of failure they should provide 
all the information needed to find the cause and understand the failure.

Standard tests do that by providing the test log (debug, info, warning, 
error, critical), stdout and stderr, and by allowing the test to write 
to the whiteboard and attach files in the results directory. 
Additionally, due to the structure of the test, one knows what stage(s) 
of the test failed and can pinpoint the exact location of the failure 
(traceback in the log).

.
├── data  -- place for other files produced by a test
├── debug.log  -- debug, info, warn, error log
├── remote.log  -- additional log regarding remote session
├── stderr  -- standard error
├── stdout  -- standard output
├── sysinfo  -- provided by sysinfo plugin
│   ├── post
│   ├── pre
│   └── profile
└── whiteboard  -- file for arbitrary test data

I'd like to extend this structure with either a "subtests" directory, 
or a naming convention for directories intended for nested test results 
(`r"\d+-.*"`).

The `r"\d+-.*"` pattern reflects the current test-id notation, which 
nested tests should also respect, replacing the serialized id by an 
in-test serialized id. That way we can easily identify which of the 
nested tests was executed first (which does not necessarily mean it 
finished first).

In the end, nested tests should be assigned a directory inside the main 
test's results (or the main test's results/subtests) and should produce 
the data/debug.log/stdout/stderr/whiteboard in there, as well as 
propagate the debug.log, with a prefix, to the main test's debug.log 
(as well as to the job.log); one possible prefixing mechanism is 
sketched after the tree below.

└── 1-parallel_wget.py:WgetExample.test  -- main test
     ├── data
     ├── debug.log  -- contains main log + nested logs with prefixes
     ├── remote.log
     ├── stderr
     ├── stdout
     ├── sysinfo
     │   ├── post
     │   ├── pre
     │   └── profile
     ├── whiteboard
     ├── 1-_usr_bin_wget\ example.org  -- first nested test
     │   ├── data
     │   ├── debug.log  -- contains only this nested test log
     │   ├── remote.log
     │   ├── stderr
     │   ├── stdout
     │   └── whiteboard
     ├── 2-_usr_bin_wget\ example.org  -- second nested test
...
     └── 3-_usr_bin_wget\ example.org  -- third nested test
...
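
One possible mechanism for the prefix propagation (a minimal sketch 
only; avocado's actual implementation may differ) is a logging handler 
that re-emits the nested test's records into the main test's logger 
with the nested test's id prepended:

     import logging

     class PrefixHandler(logging.Handler):
         """Re-emit log records into a parent logger, prefixed with
         the nested test's id (illustrative sketch only)."""
         def __init__(self, parent_logger, prefix):
             logging.Handler.__init__(self)
             self.parent_logger = parent_logger
             self.prefix = prefix

         def emit(self, record):
             self.parent_logger.log(record.levelno, "[%s] %s",
                                    self.prefix, record.getMessage())

     # Usage: attach to the nested test's logger so its messages also
     # land in the main test's debug.log with a prefix
     nested = logging.getLogger("nested.1-wget")
     nested.addHandler(PrefixHandler(logging.getLogger("avocado.test"),
                                     "1-_usr_bin_wget example.org"))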

Note that nested tests can finish with any result and it's up to the 
main test to evaluate that. This means that theoretically you could 
find nested tests whose state is `FAIL` or `ERROR` in the end. That 
might be confusing, so I think the `NestedRunner` should append a last 
line to the test's log saying `Expected FAILURE` to avoid confusion 
while looking at the results.

Note2: It might be impossible to pass messages in real time across 
multiple machines, so I think that at the end the main job.log should 
be copied to `raw_job.log` and the `job.log` should be reordered 
according to the date-time of the messages (alternatively, we could 
just add a contrib script to do that).
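
Such a reordering could be as simple as the following sketch (assuming 
the usual log line format with a leading `YYYY-MM-DD HH:MM:SS,mmm` 
timestamp; continuation lines such as tracebacks stay attached to the 
preceding entry):

     import re
     import sys

     # Assumed log line format: "2016-05-24 14:53:14,123 ..."
     TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")

     def reorder(lines):
         """Sort log lines by timestamp, keeping continuation lines
         (e.g. tracebacks) attached to the preceding entry."""
         entries = []
         for line in lines:
             match = TIMESTAMP.match(line)
             if match:
                 entries.append((match.group(1), [line]))
             elif entries:
                 entries[-1][1].append(line)   # continuation line
             else:
                 entries.append(("", [line]))  # header before any timestamp
         entries.sort(key=lambda entry: entry[0])  # stable sort
         return [line for _, chunk in entries for line in chunk]

     if __name__ == "__main__":
         sys.stdout.writelines(reorder(open(sys.argv[1]).readlines()))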


Conclusion
==========

I believe nested tests would help people cover very complex scenarios 
by splitting them into pieces, similarly to Lego. It allows easier 
per-component development, consistent results which are easy to 
analyze, as one can see both the overall picture and the specific 
pieces, and it allows fixing bugs in all tests by fixing a single piece 
(nested test).
