[Avocado-devel] RFC: Nested tests (previously multi-stream test) [v5]

Vincent Matossian vastinnocentaims at gmail.com
Wed May 25 05:40:29 UTC 2016


Hi Lukáš,

I often run into the need to orchestrate test units, so your note is
quite interesting to me. I do wonder about the high-level workflow that
weaves through those nested tests: they can end up being quite complex,
and it seems that describing what to do at every step would have to be
part of describing the relationships between the nested tests.

The examples you showed had a fairly linear/serial relationship; do you
also consider cases that are better described as directed acyclic graphs?

In the end it's a tradeoff between which capabilities to push into the
core test framework and what remains strictly in the body of the test,
up to the test writer to implement.

Thanks

-
Vincent


On Tue, May 24, 2016 at 7:53 AM, Lukáš Doktor <ldoktor at redhat.com> wrote:

> Hello guys,
>
> this version returns to the roots and tries to clearly define the single
> solution I find most appealing for multi-host and other complex tests.
>
> Changes:
>
>     v2: Rewritten from scratch
>     v2: Added examples for the demonstration to avoid confusion
>     v2: Removed the mht format (which was there to demonstrate manual
>         execution)
>     v2: Added 2 solutions for multi-tests
>     v2: Described ways to support synchronization
>     v3: Renamed to multi-stream as it befits the purpose
>     v3: Improved introduction
>     v3: Workers are renamed to streams
>     v3: Added example which uses library, instead of new test
>     v3: Multi-test renamed to nested tests
>     v3: Added section regarding Job API RFC
>     v3: Better description of the Synchronization section
>     v3: Improved conclusion
>     v3: Removed the "Internal API" section (it was a transition between
>         no support and "nested test API", not a "real" solution)
>     v3: Using per-test granularity in nested tests (requires plugins
>         refactor from Job API, but allows greater flexibility)
>     v4: Removed "Standard python libraries" section (rejected)
>     v4: Removed "API backed by cmdline" (rejected)
>     v4: Simplified "Synchronization" section (only describes the
>         purpose)
>     v4: Refined all sections
>     v4: Improved the complex example and added comments
>     v4: Formulated the problem of multiple tasks in one stream
>     v4: Rejected the idea of bounding it inside MultiTest class
>         inherited from avocado.Test, using a library-only approach
>     v5: Avoid mapping ideas to the multi-stream definition and clearly
>         define the idea I have in mind for test building blocks
>         called nested tests.
>
>
> Motivation
> ==========
>
> Allow building complex tests out of existing tests, producing a single
> result according to the complex test's requirements. The important thing
> is that the complex test may run those tests on the same machine, but
> also on different machines, allowing simple development of multi-host
> tests. Note that the existing tests should stay (mostly) unchanged and
> remain executable as simple scenarios, or be invoked by those complex
> tests.
>
> Examples of what could be implemented using this feature:
>
> 1. Adding background (stress) tasks to an existing test, producing
> real-world scenarios.
>    * cpu stress test + cpu hotplug test
>    * memory stress test + migration
>    * network+cpu+memory test on host, memory test on guest while
>      running migration
>    * running several migration tests (of the same and different type)
>
> 2. Multi-host tests implemented by splitting them into components and
> leveraging them from the main test.
>    * multi-host migration
>    * stressing a service from different machines
>
>
> Nested tests
> ============
>
> Test
> ----
>
> A test is a recipe describing the prerequisites, the steps to check how
> the unit under test behaves, and the cleanup after a successful or
> unsuccessful execution.
>
> A test itself contains lots of neat features to simplify logging,
> results analysis and error handling, all of which evolved to make
> testing simpler.
>
> Test runner
> -----------
>
> The test runner is responsible for driving the execution of test(s),
> which includes the standard test workflow (setUp/test/tearDown),
> handling plugin hooks (results/pre/post) as well as safe interruption.
>
> Nested test
> -----------
>
> A nested test is a test invoked by another test. It can either be
> executed in the foreground (while the main test is waiting) or in the
> background along with the main test (and other background tests). It
> should follow the default test workflow (setUp/test/tearDown), it should
> keep all the neat test features like logging and error handling, and its
> results should also go into the main test's output, with the nested
> test's id as a prefix. All the files produced by the nested test should
> be located in a new directory inside the main test's results dir, in
> order to be able to browse either the overall results (main test +
> nested tests) or just the nested ones.
>
> Resolver
> --------
>
> The resolver is an avocado component that resolves a test reference into
> a list of test templates composed of the test name, params and other
> `avocado.Test.__init__` arguments.
>
> Very simple example
> -------------------
>
> This example demonstrates how to use an existing test (the SimpleTest
> "/usr/bin/wget example.org") in order to create a complex scenario
> (downloading the main page of example.org from multiple computers almost
> concurrently), without any modification of the `SimpleTest`.
>
>     import avocado
>
>     class WgetExample(avocado.Test):
>         def test(self):
>             # Initialize nested test runner
>             self.runner = avocado.NestedRunner(self)
>             # This is what one calls on "avocado run"
>             test_reference = "/usr/bin/wget example.org"
>             # This is the resolved list of templates
>             tests = avocado.resolver.resolve(test_reference)
>             # We could support list of results, but for simplicity
>             # allow only single test.
>             assert len(tests) == 1, ("Resolver produced multiple test "
>                                      "names: %s\n%s" % (test_reference,
>                                                         tests))
>             test = tests[0]
>             for machine in self.params.get("machines"):
>                 # Queue a background job on the machine (local or
>                 # remote) and return the test id, which can be used to
>                 # query the particular results, interrupt the task, ...
>                 self.runner.run_bg(machine, test)
>             # Wait for all background tasks to finish, raise exception
>             # if any of them fails.
>             self.runner.wait(ignore_errors=False)
>
> When nothing fails, this usage has no benefit over simply logging into a
> machine and firing up the command. The difference shows when something
> does not work as expected. With nested tests, one gets a runner
> exception if the machine is unreachable. And on a test error one gets
> not only the overall log, but also the per-nested-test results,
> simplifying the error analysis. For 1, 2 or 3 machines this makes no
> difference, but imagine you want to run this on hundreds of machines.
> Try finding the exception there.
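>
> Purely as an illustration of that analysis, one could collect the ids
> returned by `run_bg` and report which machines failed. Note that the
> `get_result` call and its `status` attribute below are only hypothetical
> names for "query the results by the returned test id"; that part of the
> API is not defined by this RFC:
>
>     import avocado
>
>     class WgetReport(avocado.Test):
>         def test(self):
>             runner = avocado.NestedRunner(self)
>             test = avocado.resolver.resolve("/usr/bin/wget example.org")[0]
>             # Map each machine to the id returned by run_bg()
>             ids = {machine: runner.run_bg(machine, test)
>                    for machine in self.params.get("machines")}
>             # Don't interrupt on the first failure, evaluate them below
>             runner.wait(ignore_errors=True)
>             # Per-nested-test results make it easy to pinpoint failures
>             failed = [machine for machine, tst_id in ids.items()
>                       if runner.get_result(tst_id).status != "PASS"]
>             if failed:
>                 self.fail("wget failed on: %s" % ", ".join(failed))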
>
> Yes, you can implement the above without nested tests, but it requires a
> lot of boilerplate code to establish the connection (or raise an
> exception explaining why it was not possible, and I'm not talking about
> "unable to establish connection", but granularity like "Invalid
> password", "Host is down", ...). Then you'd have to set up the output
> logging for that particular task, add the prefix, run the task (handling
> all possible exceptions) and interpret the results. All of this just to
> get the same benefits a very simple avocado test already provides.
>
> Advanced example
> ----------------
>
> Imagine a very complex scenario, for example a cloud with several
> services. One could write a big fat test tailored just for this scenario
> and keep adding sub-scenarios, producing unreadable source code.
>
> With nested tests one could split this task into tests:
>
>  * Setup a fake network
>  * Setup cloud service
>  * Setup in-cloud service A/B/C/D/...
>  * Test in-cloud service A/B/C/D/...
>  * Stress network
>  * Migrate nodes
>
> New variants could easily be added, for example a DDoS attack on some
> nodes, node hotplug/unplug, ... by invoking those existing tests and
> combining them into a complex test.
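>
> A rough sketch of how such a combination could look. The test names are
> purely illustrative, and `run_fg` (run a nested test in the foreground)
> is only an assumed counterpart of the `run_bg` used above, not a settled
> API:
>
>     import avocado
>
>     class CloudScenario(avocado.Test):
>         def test(self):
>             runner = avocado.NestedRunner(self)
>             resolve = avocado.resolver.resolve
>             # Serial setup steps, executed in the foreground
>             runner.run_fg("master", resolve("setup_network.py")[0])
>             runner.run_fg("master", resolve("setup_cloud.py")[0])
>             for node in self.params.get("nodes"):
>                 runner.run_fg(node, resolve("setup_service_a.py")[0])
>             # Background stress while the in-cloud services are tested
>             runner.run_bg("master", resolve("stress_network.py")[0])
>             for node in self.params.get("nodes"):
>                 runner.run_bg(node, resolve("test_service_a.py")[0])
>             # Migrate nodes while the services are being exercised
>             runner.run_fg("master", resolve("migrate_nodes.py")[0])
>             # Wait for the background tests, fail if any of them failed
>             runner.wait(ignore_errors=False)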
>
> Additionally note that some of the tests, e.g. the cloud service setup
> and the in-cloud service setup, are quite generic tests which could be
> reused many times in different tests. Yes, one could write a library to
> do that, but in that library one would have to handle all exceptions and
> provide nice logging, while not cluttering the main output with
> unnecessary information.
>
> Job results
> -----------
>
> Combine (multiple) test results into an understandable format. There are
> several formats; the most generic one is the file-based format:
>
> .
> ├── id  -- id of this job
> ├── job.log  -- overall job log
> └── test-results  -- per-test-directories with test results
>     ├── 1-passtest.py:PassTest.test  -- first test's results
>     └── 2-failtest.py:FailTest.test  -- second test's results
>
> Additionally it contains other files and directories produced by avocado
> plugins like json, xunit, html results, sysinfo gathering and info
> regarding the replay feature.
>
> Test results
> ------------
>
> In the end, every test produces results, which is what we're interested
> in. The results must clearly define the test status, should provide a
> record of what was executed and, in case of failure, should provide all
> the information needed to find the cause and understand the failure.
>
> Standard tests do that by providing the test log (debug, info, warning,
> error, critical), stdout and stderr, and by allowing one to write to the
> whiteboard and attach files in the results directory. Additionally, due
> to the structure of the test, one knows which stage(s) of the test
> failed and can pinpoint the exact location of the failure (traceback in
> the log).
>
> .
> ├── data  -- place for other files produced by a test
> ├── debug.log  -- debug, info, warn, error log
> ├── remote.log  -- additional log regarding remote session
> ├── stderr  -- standard error
> ├── stdout  -- standard output
> ├── sysinfo  -- provided by sysinfo plugin
> │   ├── post
> │   ├── pre
> │   └── profile
> └── whiteboard  -- file for arbitrary test data
>
> I'd like to extend this structure with either a directory "subtests", or
> a convention for directories intended for nested test results,
> `r"\d+-.*"`.
>
> The `r"\d+-.*"` reflects the current test-id notation, which nested tests
> should also respect, replacing the serialized-id by in-test-serialized-id.
> That way we easily identify which of the nested tests was executed first
> (which does not necessarily mean it finished as first).
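>
> A minimal sketch of that convention, assuming the in-test serialized-id
> is simply the order in which the main test started the nested tests and
> that path separators in the name are replaced as in the tree below:
>
>     import re
>
>     def nested_result_dirname(serial_id, test_name):
>         # e.g. (1, "/usr/bin/wget example.org") ->
>         #      "1-_usr_bin_wget example.org"
>         return "%d-%s" % (serial_id, test_name.replace("/", "_"))
>
>     def is_nested_result_dir(dirname):
>         # Directories following the current test-id notation
>         return re.match(r"\d+-.*", dirname) is not None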
>
> In the end each nested test should be assigned a directory inside the
> main test's results (or the main test's results/subtests) and it should
> produce its data/debug.log/stdout/stderr/whiteboard in there, as well as
> propagate its debug.log with a prefix to the main test's debug.log (as
> well as to the job.log).
>
> └── 1-parallel_wget.py:WgetExample.test  -- main test
>     ├── data
>     ├── debug.log  -- contains main log + nested logs with prefixes
>     ├── remote.log
>     ├── stderr
>     ├── stdout
>     ├── sysinfo
>     │   ├── post
>     │   ├── pre
>     │   └── profile
>     ├── whiteboard
>     ├── 1-_usr_bin_wget\ example.org  -- first nested test
>     │   ├── data
>     │   ├── debug.log  -- contains only this nested test log
>     │   ├── remote.log
>     │   ├── stderr
>     │   ├── stdout
>     │   └── whiteboard
>     ├── 2-_usr_bin_wget\ example.org  -- second nested test
> ...
>     └── 3-_usr_bin_wget\ example.org  -- third nested test
> ...
>
> Note that nested tests can finish with any result and it's up to the
> main test to evaluate that. This means that theoretically you could find
> nested tests whose state is `FAIL` or `ERROR` in the end. That might be
> confusing, so I think the `NestedRunner` should append a last line to
> the test's log saying `Expected FAILURE` to avoid confusion while
> looking at the results.
>
> Note2: It might be impossible to pass messages in real time across
> multiple machines, so I think that at the end the main job.log should be
> copied to `raw_job.log` and the `job.log` should be reordered according
> to the date-time of the messages (alternatively we could just add a
> contrib script to do that).
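>
> Such a contrib script could be a few lines. The sketch below assumes
> every log entry starts with a "YYYY-MM-DD HH:MM:SS,mmm" timestamp and
> keeps timestamp-less lines (e.g. traceback continuations) attached to
> the preceding entry:
>
>     import re
>     import sys
>
>     TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
>
>     def reorder(lines):
>         entries = []
>         for line in lines:
>             if TIMESTAMP.match(line) or not entries:
>                 entries.append(line)
>             else:
>                 entries[-1] += line  # continuation of the previous entry
>         # The 23-character timestamp prefix sorts chronologically
>         return sorted(entries, key=lambda entry: entry[:23])
>
>     if __name__ == "__main__":
>         # usage: reorder_log.py raw_job.log > job.log
>         with open(sys.argv[1]) as raw_log:
>             sys.stdout.writelines(reorder(raw_log.readlines()))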
>
>
> Conclusion
> ==========
>
> I believe nested tests would help people cover very complex scenarios by
> splitting them into pieces, Lego-like. It allows easier per-component
> development and consistent results which are easy to analyze, as one can
> see both the overall picture and the specific pieces, and it allows
> fixing bugs in all tests by fixing the single piece (the nested test).
>
>

