[Avocado-devel] RFC: Nested tests (previously multi-stream test) [v5]

Lukáš Doktor ldoktor at redhat.com
Wed May 25 17:26:07 UTC 2016


Hello Vincent,

Dne 25.5.2016 v 18:35 Vincent Matossian napsal(a):
> Hey Lukáš, what I meant by a DAG of dependencies is that  tests in the
> chain may depend on the output of a previous test to execute a
> subsequent test, to borrow from the cloud provisioning example you'd
> outlined
>
> Test Cloud Provision
>
> SubTest1- Provision a cloud service
> SubTest2- Run a configuration check
>
> At this point there may be several paths to go by depending on the
> results of the test, for example
I'm not sure what the dependency would be for those tests, but let's say
you want to run `Check CPU` and `Check Storage` in case the
`configuration check` fails with `Configuration failed, check your
hardware`. On the other hand, when it fails with `Configuration failed,
unable to access storage`, you only want to run `Check Storage`. When it
passes, you want to run `Stress cloud service`.

>
> SubTest4.1- Stress CPU
> Or
> SubTest4.2- Stress Storage

This depends on the point of view. If you consider the whole scenario a
single test, then nested tests are the proposed solution. If your goal
is to run various dependent subtests and you expect results from all of
them, then the Job API RFC is the answer.


Basically, this example, when run as nested tests, looks like:

$ avocado run cloud-provision.py
...
  (1/1) cloud-provision.py:Cloud.test_scenario1: PASS
...

where inside `test_scenario1` you describe the dependencies as proposed
in this RFC:

...
self.runner.run_fg(provision_test, ignore_error=False)
result = self.runner.run_fg(config_check_test, ignore_error=True)
if result.passed:
    self.runner.run_fg(stress_cloud_service, ignore_error=False)
elif "check your hardware" in result.exception:
    self.runner.run_fg(check_cpu_test, ignore_error=True)
    self.runner.run_fg(check_storage_test, ignore_error=True)
    raise result.exception
elif "unable to access storage" in result.exception:
    self.runner.run_fg(check_storage_test, ignore_error=True)
else:
    raise result.exception

and inside the test results directory you'll find all the individual
results.
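
For instance, the results layout of such a run could look roughly like
this (the nested test names are purely illustrative):

└── 1-cloud-provision.py:Cloud.test_scenario1  -- main test results
    ├── debug.log  -- main log + nested logs with prefixes
    ├── 1-provision_test.py:Provision.test
    ├── 2-config_check_test.py:ConfigCheck.test
    └── 3-stress_cloud.py:Cloud.test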


When we describe this in Job API terminology:

$ avocado job cloud-provision.py
...
(1/?) cloud-provision.py:Cloud.test_scenario1: PASS
(2/?) provision_test.py:Provision.test: PASS
(3/?) stress_cloud.py:Cloud.test: PASS
...

Where inside `cloud-provision.py` you add tasks to the job based on
their results. It's really similar to nested tests, only the Job API
adds tests to the current job, each producing its own results, and the
overall status depends on all of the test statuses. With nested tests
you use existing tests to create a complex task producing just a single
result, based on the main test's decision (you can ignore some failures,
or expect them, ...).
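
A rough sketch of the same scenario written against that approach (the
Job API is still an RFC, so the `job` object and its `run_test` method
are purely illustrative names):

result = job.run_test(provision_test)
if result.passed:
    result = job.run_test(config_check_test)
    if result.passed:
        job.run_test(stress_cloud_service)
    elif "check your hardware" in result.exception:
        job.run_test(check_cpu_test)
        job.run_test(check_storage_test)
    elif "unable to access storage" in result.exception:
        job.run_test(check_storage_test)

Each `run_test` call adds one more test to the current job, so each of
them shows up as a separate `(n/?)` entry in the job results and the
overall job status is computed from all of their statuses.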

Does either of these suit your needs?
Lukáš

>
>
> I definitely like the idea, just wondering how it will handle
> relationship definitions between independent tests
>
> Thanks!
>
> -
> Vincent
>
>
> On Wed, May 25, 2016 at 6:18 AM, Lukáš Doktor <ldoktor at redhat.com> wrote:
>
>     Hello Vincent,
>
>     could you please provide an example? I'm not sure I understand your
>     concern. The beauty of nested tests is the simplicity. Basically the
>     main test just triggers the test(s) and waits for them to finish.
>     Then it can decide what to do with the results (bail out, ignore,
>     include in results, trigger another test(s), ...).
>
>     For complex tasks (like the advanced example) synchronization
>     mechanisms would have to be used, for example inside the `Setup a
>     fake network` test, to wait until all the tests finish and then
>     post-process/stop the fake network.
>
>     Obviously there is nothing that should prevent nested tests from
>     invoking other nested tests, but then the situation is the same.
>     They act as the main test for their nested-nested tests, and when
>     the nested-nested tests finish they report a single result; the
>     main test retrieves just that single result and can decide what
>     to do next.
>
>     All of those together should allow great flexibility and
>     understandable/predictable results.
>
>     Regards,
>     Lukáš
>
>
>     Dne 25.5.2016 v 07:40 Vincent Matossian napsal(a):
>
>         Hi Lukáš,
>
>         I often come up with the need to orchestrate test units, so your
>         note is quite interesting to me. I wonder about the high-level
>         workflow that weaves through those nested tests; these can end up
>         being quite complex, and it seems that describing what to do at
>         every step would need to be part of the description of the
>         relationships between nested tests.
>
>         The examples you showed had a fairly linear/serial relationship;
>         do you consider cases that are better described as directed
>         acyclic graphs?
>
>         In the end it's a tradeoff between what capabilities to push into
>         the core test framework vs. what remains strictly in the body of
>         the test, up to the test writer to implement.
>
>         Thanks
>
>         -
>         Vincent
>
>
>         On Tue, May 24, 2016 at 7:53 AM, Lukáš Doktor
>         <ldoktor at redhat.com> wrote:
>
>             Hello guys,
>
>             this version returns to the roots and tries to clearly define
>             the single solution I find most appealing for multi-host and
>             other complex tests.
>
>             Changes:
>
>                 v2: Rewritten from scratch
>                 v2: Added examples for the demonstration to avoid confusion
>                 v2: Removed the mht format (which was there to
>         demonstrate manual
>                     execution)
>                 v2: Added 2 solutions for multi-tests
>                 v2: Described ways to support synchronization
>                 v3: Renamed to multi-stream as it befits the purpose
>                 v3: Improved introduction
>                 v3: Workers are renamed to streams
>                 v3: Added example which uses library, instead of new test
>                 v3: Multi-test renamed to nested tests
>                 v3: Added section regarding Job API RFC
>                 v3: Better description of the Synchronization section
>                 v3: Improved conclusion
>                 v3: Removed the "Internal API" section (it was a
>         transition between
>                     no support and "nested test API", not a "real" solution)
>                 v3: Using per-test granularity in nested tests (requires
>         plugins
>                     refactor from Job API, but allows greater flexibility)
>                 v4: Removed "Standard python libraries" section (rejected)
>                 v4: Removed "API backed by cmdline" (rejected)
>                 v4: Simplified "Synchronization" section (only describes the
>                     purpose)
>                 v4: Refined all sections
>                 v4: Improved the complex example and added comments
>                 v4: Formulated the problem of multiple tasks in one stream
>                 v4: Rejected the idea of bounding it inside MultiTest class
>                     inherited from avocado.Test, using a library-only
>         approach
>                 v5: Avoid mapping ideas to multi-stream definition and
>         clearly
>                     define the idea I bear in my head for test building
>         blocks
>                     called nested tests.
>
>
>             Motivation
>             ==========
>
>             Allow building complex tests out of existing tests, producing
>             a single result depending on the complex test's requirements.
>             The important thing is that the complex test might run those
>             tests on the same, but also on a different machine, allowing
>             simple development of multi-host tests. Note that the existing
>             tests should stay (mostly) unchanged and be executable either
>             as simple scenarios or when invoked by those complex tests.
>
>             Examples of what could be implemented using this feature:
>
>             1. Adding background (stress) tasks to existing tests,
>             producing real-world scenarios.
>                * cpu stress test + cpu hotplug test
>                * memory stress test + migration
>                * network+cpu+memory test on host, memory test on guest while
>                  running migration
>                * running several migration tests (of the same and
>         different type)
>
>             2. Multi-host tests implemented by splitting them into
>         components
>             and leveraging them from the main test.
>                * multi-host migration
>                * stressing a service from different machines
>
>
>             Nested tests
>             ============
>
>             Test
>             ----
>
>             A test is a recipe describing the prerequisites, the steps to
>             check how the unit under test behaves, and the cleanup after a
>             successful or unsuccessful execution.
>
>             The test itself contains lots of neat features to simplify
>             logging, results analysis and error handling, which evolved to
>             simplify testing.
>
>             Test runner
>             -----------
>
>             Is responsible for driving the execution of the test(s), which
>             includes the standard test workflow (setUp/test/tearDown),
>             handling plugin hooks (results/pre/post) as well as safe
>             interruption.
>
>             Nested test
>             -----------
>
>             Is a test invoked by another test. It can either be executed
>             in the foreground (while the main test is waiting) or in the
>             background along with the main test (and other background
>             tests). It should follow the default test workflow
>             (setUp/test/tearDown), it should keep all the neat test
>             features like logging and error handling, and its results
>             should also go into the main test's output, with the nested
>             test's id as prefix. All the files produced by the nested test
>             should be located in a new directory inside the main test's
>             results dir in order to be able to browse either the overall
>             results (main test + nested tests) or just the nested tests'
>             own.
>
>             Resolver
>             --------
>
>             Resolver is an avocado component resolving a test reference
>             into a list of test templates composed of the test name,
>             params and other `avocado.Test.__init__` arguments.
>
>             Very simple example
>             -------------------
>
>             This example demonstrates how to use an existing test
>             (SimpleTest "/usr/bin/wget example.org") in order to create a
>             complex scenario (download the main page from example.org from
>             multiple computers almost concurrently), without any
>             modification of the `SimpleTest`.
>
>                 import avocado
>
>                 class WgetExample(avocado.Test):
>                     def test(self):
>                         # Initialize nested test runner
>                         self.runner = avocado.NestedRunner(self)
>                         # This is what one calls on "avocado run"
>                         test_reference = "/usr/bin/wget example.org"
>                         # This is the resolved list of templates
>                         tests = avocado.resolver.resolve(test_reference)
>                         # We could support a list of results, but for
>                         # simplicity allow only a single test.
>                         assert len(tests) == 1, ("Resolver produced "
>                                                  "multiple test names: "
>                                                  "%s\n%s" % (test_reference,
>                                                              tests))
>                         test = tests[0]
>                         for machine in self.params.get("machines"):
>                             # Queue a background job on the machine (local
>                             # or remote) and return the test id in order to
>                             # query for the particular results, task
>                             # interruption, ...
>                             self.runner.run_bg(machine, test)
>                         # Wait for all background tasks to finish, raise an
>                         # exception if any of them fails.
>                         self.runner.wait(ignore_errors=False)
>
>             When nothing fails, this usage has no benefit over simply
>             logging into a machine and firing up the command. The
>             difference shows when something does not work as expected.
>             With nested tests, one gets a runner exception if the machine
>             is unreachable. And on a test error one gets not only the
>             overall log, but also the per-nested-test results, simplifying
>             the error analysis. For 1, 2 or 3 machines this makes no
>             difference, but imagine you want to run this from hundreds of
>             machines. Try finding the exception there.
>
>             Yes, you can implement the above without nested tests, but it
>             requires a lot of boilerplate code to establish the connection
>             (or raise an exception explaining why it was not possible, and
>             I'm not talking about "unable to establish connection", but
>             granularity like "Invalid password", "Host is down", ...).
>             Then you'd have to set up the output logging for that
>             particular task, add the prefix, run the task (handling all
>             possible exceptions) and interpret the results. All of this to
>             get the same benefits a very simple avocado test already
>             provides you.
>
>             Advanced example
>             ----------------
>
>             Imagine a very complex scenario, for example a cloud with
>             several services. One could write a big fat test tailored just
>             for this scenario and keep adding sub-scenarios, producing
>             unreadable source code.
>
>             With nested tests one could split this task into tests:
>
>              * Setup a fake network
>              * Setup cloud service
>              * Setup in-cloud service A/B/C/D/...
>              * Test in-cloud service A/B/C/D/...
>              * Stress network
>              * Migrate nodes
>
>             New variants could be easily added, for example a DDoS attack
>             on some nodes, node hotplug/unplug, ... by invoking those
>             existing tests and combining them into a complex test.
>
>             Additionally note that some of the tests, e.g. the setup cloud
>             service and setup in-cloud service, are quite generic tests
>             which could be reused many times in different tests. Yes, one
>             could write a library to do that, but in that library one
>             would have to handle all exceptions and provide nice logging,
>             while not cluttering the main output with unnecessary
>             information.
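>
>             To give a rough idea of how a main test could combine those
>             pieces using the API from the simple example above (the
>             `*_test` variables stand for resolved test templates; the
>             `run_fg` calls and the machine names are only illustrative):
>
>                 # Infrastructure pieces (setup tests)
>                 self.runner.run_bg(controller, fake_network_test)
>                 self.runner.run_fg(controller, cloud_service_setup_test)
>                 for node in self.params.get("nodes"):
>                     self.runner.run_fg(node, in_cloud_service_setup_test)
>                 # Exercise the services while stressing the network
>                 for node in self.params.get("nodes"):
>                     self.runner.run_bg(node, in_cloud_service_test)
>                 self.runner.run_bg(controller, stress_network_test)
>                 self.runner.run_fg(controller, migrate_nodes_test)
>                 # Wait for the background pieces, fail on any error
>                 self.runner.wait(ignore_errors=False)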
>
>             Job results
>             -----------
>
>             Combine (multiple) test results into an understandable format.
>             There are several formats; the most generic one is the file
>             format:
>
>             .
>             ├── id  -- id of this job
>             ├── job.log  -- overall job log
>             └── test-results  -- per-test-directories with test results
>                 ├── 1-passtest.py:PassTest.test  -- first test's results
>                 └── 2-failtest.py:FailTest.test  -- second test's results
>
>             Additionally it contains other files and directories produced
>             by avocado plugins like json, xunit, html results, sysinfo
>             gathering and info regarding the replay feature.
>
>             Test results
>             ------------
>
>             In the end, every test produces results, which is what we're
>             interested in. The results must clearly define the test
>             status, should provide a record of what was executed and, in
>             case of failure, they should provide all the information
>             needed to find the cause and understand the failure.
>
>             Standard tests do that by providing a test log (debug, info,
>             warning, error, critical), stdout, stderr, allowing writes to
>             the whiteboard and attaching files in the results directory.
>             Additionally, due to the structure of the test, one knows what
>             stage(s) of the test failed and can pinpoint the exact
>             location of the failure (traceback in the log).
>
>             .
>             ├── data  -- place for other files produced by a test
>             ├── debug.log  -- debug, info, warn, error log
>             ├── remote.log  -- additional log regarding remote session
>             ├── stderr  -- standard error
>             ├── stdout  -- standard output
>             ├── sysinfo  -- provided by sysinfo plugin
>             │   ├── post
>             │   ├── pre
>             │   └── profile
>             └── whiteboard  -- file for arbitrary test data
>
>             I'd like to extend this structure with either a "subtests"
>             directory, or a naming convention for directories intended for
>             nested test results (`r"\d+-.*"`).
>
>             The `r"\d+-.*"` reflects the current test-id notation, which
>             nested tests should also respect, replacing the serialized id
>             with an in-test serialized id. That way we can easily identify
>             which of the nested tests was executed first (which does not
>             necessarily mean it finished first).
>
>             In the end each nested test should be assigned a directory
>             inside the main test's results (or the main test's
>             results/subtests) and should produce the
>             data/debug.log/stdout/stderr/whiteboard in there, as well as
>             propagate the debug.log with a prefix to the main test's
>             debug.log (and to the job.log).
>
>             └── 1-parallel_wget.py:WgetExample.test  -- main test
>                 ├── data
>                 ├── debug.log  -- main log + nested logs with prefixes
>                 ├── remote.log
>                 ├── stderr
>                 ├── stdout
>                 ├── sysinfo
>                 │   ├── post
>                 │   ├── pre
>                 │   └── profile
>                 ├── whiteboard
>                 ├── 1-_usr_bin_wget\ example.org  -- first nested test
>                 │   ├── data
>                 │   ├── debug.log  -- contains only this nested test's log
>                 │   ├── remote.log
>                 │   ├── stderr
>                 │   ├── stdout
>                 │   └── whiteboard
>                 ├── 2-_usr_bin_wget\ example.org  -- second nested test
>                 │   ...
>                 └── 3-_usr_bin_wget\ example.org  -- third nested test
>                     ...
>
>             Note that nested tests can finish with any result and it's up
>             to the main test to evaluate that. This means that
>             theoretically you could find nested tests whose status is
>             `FAIL` or `ERROR` in the end. That might be confusing, so I
>             think the `NestedRunner` should append a last line to the
>             test's log saying `Expected FAILURE` to avoid confusion while
>             looking at the results.
>
>             Note2: It might be impossible to pass messages in real time
>             across multiple machines, so I think at the end the main
>             job.log should be copied to `raw_job.log` and the `job.log`
>             should be reordered according to the date-time of the messages
>             (alternatively we could just add a contrib script to do that).
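>
>             Such a contrib script could be fairly small; a sketch,
>             assuming every merged record starts with a sortable
>             `YYYY-MM-DD HH:MM:SS,mmm` timestamp and continuation lines
>             (e.g. tracebacks) follow their record:
>
>                 import re
>                 import sys
>
>                 TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} "
>                                        r"\d{2}:\d{2}:\d{2},\d{3}")
>
>                 def reorder(lines):
>                     records = []  # list of (timestamp, [lines]) items
>                     for line in lines:
>                         if TIMESTAMP.match(line):
>                             records.append((line[:23], [line]))
>                         elif records:  # continuation (traceback, ...)
>                             records[-1][1].append(line)
>                         else:  # leading lines without a timestamp
>                             records.append(("", [line]))
>                     records.sort(key=lambda record: record[0])
>                     return [l for _, chunk in records for l in chunk]
>
>                 if __name__ == "__main__":
>                     sys.stdout.writelines(
>                         reorder(open(sys.argv[1]).readlines()))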
>
>
>             Conclusion
>             ==========
>
>             I believe nested tests would help people cover very complex
>             scenarios by splitting them into pieces, similar to Lego
>             bricks. It allows easier per-component development and
>             consistent results which are easy to analyze, as one can see
>             both the overall picture and the specific pieces, and it
>             allows fixing bugs in all tests by fixing a single piece
>             (nested test).
>
>
>             _______________________________________________
>             Avocado-devel mailing list
>             Avocado-devel at redhat.com
>             https://www.redhat.com/mailman/listinfo/avocado-devel
>
>
>
>
>

