[Avocado-devel] RFC: Nested tests (previously multi-stream test) [v5]

Ademar Reis areis at redhat.com
Fri May 27 15:24:40 UTC 2016


On Fri, May 27, 2016 at 10:33:30AM +0200, Lukáš Doktor wrote:
> Dne 25.5.2016 v 23:36 Ademar Reis napsal(a):
> > On Wed, May 25, 2016 at 04:18:38PM -0300, Cleber Rosa wrote:
> > > 
> > > 
> > > On 05/24/2016 11:53 AM, Lukáš Doktor wrote:
> > > > Hello guys,
> > > > 
> > > > this version returns to the roots and tries to clearly define the single
> > > > solution I find appealing for multi-host and other complex tests.
> > > > 
> > > > Changes:
> > > > 
> > > >     v2: Rewritten from scratch
> > > >     v2: Added examples for the demonstration to avoid confusion
> > > >     v2: Removed the mht format (which was there to demonstrate manual
> > > >         execution)
> > > >     v2: Added 2 solutions for multi-tests
> > > >     v2: Described ways to support synchronization
> > > >     v3: Renamed to multi-stream as it befits the purpose
> > > >     v3: Improved introduction
> > > >     v3: Workers are renamed to streams
> > > >     v3: Added example which uses library, instead of new test
> > > >     v3: Multi-test renamed to nested tests
> > > >     v3: Added section regarding Job API RFC
> > > >     v3: Better description of the Synchronization section
> > > >     v3: Improved conclusion
> > > >     v3: Removed the "Internal API" section (it was a transition between
> > > >         no support and "nested test API", not a "real" solution)
> > > >     v3: Using per-test granularity in nested tests (requires plugins
> > > >         refactor from Job API, but allows greater flexibility)
> > > >     v4: Removed "Standard python libraries" section (rejected)
> > > >     v4: Removed "API backed by cmdline" (rejected)
> > > >     v4: Simplified "Synchronization" section (only describes the
> > > >         purpose)
> > > >     v4: Refined all sections
> > > >     v4: Improved the complex example and added comments
> > > >     v4: Formulated the problem of multiple tasks in one stream
> > > >     v4: Rejected the idea of bounding it inside MultiTest class
> > > >         inherited from avocado.Test, using a library-only approach
> > > >     v5: Avoid mapping ideas to the multi-stream definition and clearly
> > > >         define the idea I have in mind for test building blocks
> > > >         called nested tests.
> > > > 
> > > > 
> > > > Motivation
> > > > ==========
> > > > 
> > > > Allow building complex tests out of existing tests, producing a single
> > > > result according to the complex test's requirements. The important thing
> > > > is that the complex test might run those tests on the same machine, but
> > > > also on a different one, allowing simple development of multi-host tests.
> > > > Note that the existing tests should stay (mostly) unchanged and remain
> > > > executable as simple scenarios, or be invoked by those complex tests.
> > > > 
> > > > Examples of what could be implemented using this feature:
> > > > 
> > > > 1. Adding background (stress) tasks to an existing test, producing
> > > > real-world scenarios.
> > > >    * cpu stress test + cpu hotplug test
> > > >    * memory stress test + migration
> > > >    * network+cpu+memory test on host, memory test on guest while
> > > >      running migration
> > > >    * running several migration tests (of the same and different type)
> > > > 
> > > > 2. Multi-host tests implemented by splitting them into components and
> > > > leveraging them from the main test.
> > > >    * multi-host migration
> > > >    * stressing a service from different machines
> > > > 
> > > > 
> > > > Nested tests
> > > > ============
> > > > 
> > > > Test
> > > > ----
> > > > 
> > > > A test is a receipt explaining prerequisites, steps to check how the
> > > > unit under testing behaves and cleanup after successful or unsuccessful
> > > > execution.
> > > > 
> > > 
> > > You probably meant "recipe" instead of "receipt".  OK, so this is an
> > > abstract definition...
> > > 
> > > > The Test class itself contains lots of neat features, evolved to simplify
> > > > testing, such as logging, results analysis and error handling.
> > > > 
> > > 
> > > ... while this describes concrete conveniences and utilities that users of
> > > the Avocado Test class can expect.
> > > 
> > > > Test runner
> > > > -----------
> > > > 
> > > > Is responsible for driving the test(s) execution, which includes the
> > > > standard test workflow (setUp/test/tearDown), handling plugin hooks
> > > > (results/pre/post) as well as safe interruption.
> > > > 
> > > 
> > > OK.
> > > 
> > > > Nested test
> > > > -----------
> > > > 
> > > > Is a test invoked by another test. It can either be executed in the foreground
> > > 
> > > I got from this proposal that a nested test always has a parent.  Basic
> > > question is: does this parent have to be a regular (that is, non-nested)
> > > test?
> > > 
> > > Then, depending on the answer, the following question would also apply: do
> > > you believe a nesting level limit should be enforced?
> > 
> > Let's not introduce yet another concept here. I don't think there
> > would be a need for a "non-nested parent" rule.
> > 
> > Ditto for nesting level limits, I see no reason to break the
> > current abstraction. If we introduce enforcement, then the test
> > will have to be run in an environment that knows if it's nested
> > or not (and at which level). Proper error handling will require
> > this information and/or even worse: test writers may have access
> > to this variable and start using it.
> > 
> > > 
> > > > (while the main test is waiting) or in the background along with the
> > > > main test (and other background tests). It should follow the default test
> > > > workflow (setUp/test/tearDown), it should keep all the neat test features
> > > > like logging and error handling, and the results should also go into the
> > > > main test's output, with the nested test's id as a prefix. All the files
> > > > produced by the nested test should be located in a new directory inside
> > > > the main test's results dir, in order to be able to browse either the
> > > > overall results (main test + nested tests) or just the nested tests' ones.
> > > > 
> > > 
> > > Based on the example given later, you're attributing to the NestedRunner the
> > > responsibility to put the nested test results "in the right" location.  It
> > > sounds appropriate.  The tricky questions are really how they show up in the
> > > overall job/test result structure, because that reflects how much the
> > > NestedRunner looks like a "Job".
> > 
> > Since it's a test, what kind of visibility does the job have
> > about it? Is the result interpretation entirely up to the parent
> > test? Would nested-test results be considered arbitrary data?
> > (think of test-results storage, or a database).
> > 
> Basically nested test results are similar to `whiteboard`. They
> are extra information one can use and find in the results
> directory. I'd consider listing the directory in the `json`
> output in case there are any.

Which directory are you talking about here? Can you give an
example (I'm trying to understand what you have in mind).

> 
> > > 
> > > > Resolver
> > > > --------
> > > > 
> > > > Resolver is an avocado component resolving a test reference into a list
> > > > of test templates composed of the test name, params and other
> > > > `avocado.Test.__init__` arguments.
> > > > 
> > > > Very simple example
> > > > -------------------
> > > > 
> > > > This example demonstrates how to use an existing test (SimpleTest
> > > > "/usr/bin/wget example.org") in order to create a complex scenario
> > > > (download the main page from example.org from multiple computers almost
> > > > concurrently), without any modification of the `SimpleTest`.
> > 
> > This won't come as a surprise to you, but although this is a
> > valid use-case, I don't think nested tests are the proper solution
> > for it. :-)
> > 
> It's just the simplest demonstration example I could come up
> with.
> 
> > > > 
> > > >     import avocado
> > > > 
> > > >     class WgetExample(avocado.Test):
> > > >         def test(self):
> > > >             # Initialize nested test runner
> > > >             self.runner = avocado.NestedRunner(self)
> > > >             # This is what one calls on "avocado run"
> > > >             test_reference = "/usr/bin/wget example.org"
> > > >             # This is the resolved list of templates
> > > >             tests = avocado.resolver.resolve(test_reference)
> > > >             # We could support list of results, but for simplicity
> > > >             # allow only single test.
> > > >             assert len(tests) == 1, ("Resolver produced multiple test "
> > > >                                      "names: %s\n%s" % (test_reference,
> > > >                                                         tests))
> > > >             test = tests[0]
> > > >             for machine in self.params.get("machines"):
> > > >                 # Queue a background task on the machine (local or
> > > >                 # remote) and return the test id in order to query for
> > > >                 # the particular results, task interruption, ...
> > > >                 self.runner.run_bg(machine, test)
> > 
> > Here we're missing something: you've described what a nested-test
> > is, but now you're introducing another concept: the ability to
> > run (some of these) sub-tests on different machines or
> > environments.
> > 
> > Which is precisely where it gets ugly, as it brings into the
> > test layer concepts which belong to a job. You should write at
> > least one section in your RFC to describe what you have in
> > mind in this case.
> > 
> 
> Yes, a job defines what tests should be executed, in what order
> and where, but it only defines this and then hands the tasks
> over to the runner. The runner itself is responsible for
> running local tasks locally, remote tasks remotely and parallel
> tasks in parallel. It's only a current avocado limitation and
> implementation detail that it does not support parallel
> execution in the runner and that it runs tests remotely by
> running the full job remotely and reporting the results.
> 
> If we want parallel/remote per-test granularity in Job API,
> then we have to change this anyway. Then the job would define
> those relations, but the runner would be the one responsible for
> leveraging that.
> 
> When we come back to nested tests, the situation is absolutely
> the same. We have a test and the test decides to split some
> tasks into several nested tests. So the test itself knows what
> tests it needs to run, when and where. Then it invokes the
> nested runner to do the hard work and report the results. Finally
> the test gets to choose what to do with the results.
> 
> So yes, they are similar, connected, there is some overlap, but
> it's the same overlap as everywhere else. Basically the Job API
> and nested tests only differ in point of view. The Job API is
> focused on describing what tests should be executed, test
> dependencies and producing unified results; the nested tests API
> is focused on composing tests out of building blocks (tests).
> Whether those blocks are executed on the same machine or in
> parallel is not important to the job at all.
> > 
> > > >             # Wait for all background tasks to finish, raise exception
> > > >             # if any of them fails.
> > > >             self.runner.wait(ignore_errors=False)
> > 
> > You also didn't say anything about synchronization, although I'm
> > sure you do have something in mind. Do you expect nested-tests to
> > communicate with, or depend on, each other?
> > 
> You mentioned in previous versions that it's not really
> important and is a detail. Yes, it'd be useful for most of the
> tests, but not essential. At least a barrier mechanism
> would be needed for most of the tests. (I mentioned the usage
> in my response to Vincent Matossian)

Indeed, it's an implementation detail of multi-flow tests, but
nested tests are not necessarily the same thing. We could have
support for nested-tests completely independent of support for
multi-flow tests, solving different problems.

So I guess what I missed is a paragraph explaining your intention
here (imagine reading this RFC without the context from the
previous RFCs, where this was called "multi-stream" or
"multi-test"). I understand it now.

> 
> > > > 
> > > 
> > > Just for accounting purposes at this point, and not for applying judgment,
> > > let's take note that this approach requires the following sets of APIs to
> > > become "Test APIs":
> > > 
> > > * avocado.NestedRunner
> > > * avocado.resolver
> > > 
> > > Now, doing a bit of judgment. If I were an Avocado newcomer, looking at the
> > > Test API docs, I'd be intrigued at how these belong to the same very select
> > > group that includes only:
> > > 
> > > * avocado.Test
> > > * avocado.fail_on
> > > * avocado.main
> > > * avocado.VERSION
> > > 
> > > I'm not proposing a different approach or a different architecture.  If the
> > > proposed architecture included something like a NestedTest class, then
> > > probably the feeling is that it would indeed naturally belong to the same
> > > group.  I hope I managed to express my feeling, which may just be an
> > > overreaction. If others share the same feeling, then it may be a
> > > red flag.
> > > 
> > > Now, considering my feeling is not an overreaction, this is how such an
> > > example could be written so that it does not put NestedRunner and resolver
> > > in the Test API namespace:
> > > 
> > >     from avocado import Test
> > >     from avocado.utils import nested
> > > 
> > >     class WgetExample(Test):
> > >         def test(self):
> > >             reference = "/usr/bin/wget example.org"
> > >             tests = []
> > >             for machine in self.params.get("machines"):
> > >                 tests.append(nested.run_test_reference(self, reference, machine))
> > >             nested.wait(tests, ignore_errors=False)
> > > 
> > > This would solve the crossing (or pollution) of the Test API namespace, but
> > > it has a catch: the test reference resolution is either included in
> > > `run_test_reference` (which is a similar problem) or delegated to the remote
> > > machine.  Having the reference delegated sounds nice, until you need to
> > > identify backing files for the tests and copy them over to the remote
> > > machine.  So, take this as food for thought, and not as a foolproof
> > > solution.
> > > 
> > > > When nothing fails, this usage has no benefit over simply logging
> > > > into a machine and firing up the command. The difference shows when
> > > > something does not work as expected. With a nested test, one gets a runner
> > > > exception if the machine is unreachable. And on a test error one gets not
> > > > only the overall log, but also the per-nested-test results, simplifying the
> > > > error analysis. For 1, 2 or 3 machines, this makes no difference, but
> > > > imagine you want to run this on hundreds of machines. Try finding the
> > > > exception there.
> > > > 
> > > 
> > > I agree that it's nice to have the nested tests' logs.  What you're
> > > proposing is *core* (as in Test API) convenience, over something like:
> > > 
> > >     import os
> > > 
> > >     from avocado import Test
> > >     from avocado.utils import nested
> > > 
> > >     class WgetExample(Test):
> > >         def test(self):
> > >             reference = "/usr/bin/wget example.org"
> > >             tests = []
> > >             for machine in self.params.get("machines"):
> > >                 tests.append(nested.run_test_reference(self, reference, machine))
> > >             nested.wait(tests, ignore_errors=False)
> > >             nested.save_results(tests,
> > >                                 os.path.join(self.resultsdir, "nested"))
> > 
> > I agree it should not be part of the core API.
> > 
> > > 
> > > > Yes, you can implement the above without nested tests, but it requires a
> > > > lot of boilerplate code to establish the connection (or raise an
> > > > exception explaining why it was not possible, and I'm not talking about
> > > > "unable to establish connection", but granularity like "Invalid
> > > > password", "Host is down", ...). Then you'd have to set up the output
> > > > logging for that particular task, add the prefix, run the task (handling
> > > > all possible exceptions) and interpret the results. All of this to get
> > > > the same benefits a very simple avocado test provides you.
> > > > 
> > > 
> > > Having boilerplate code repeatedly written by users is indeed not a good
> > > thing.  And a well-thought-out API for users is the way to prevent
> > > boilerplate code from spreading around in tests.
> > > 
> > > The exception handling, that is, raising exceptions to flag failures in the
> > > nested tests execution is also a given IMHO.
> > > 
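
To make the "boilerplate" both of the paragraphs above refer to
more concrete, here is a rough sketch (hypothetical code, not
taken from any existing test) of what running a single remote
task without any nested-test support tends to look like: open the
connection, set up per-machine logging, run the command and turn
the outcome into something a test can act on:

    import logging
    import subprocess

    def run_remote(machine, command):
        """Run `command` on `machine` over ssh with a per-machine log prefix."""
        log = logging.getLogger("remote.%s" % machine)
        log.info("running: %s", command)
        try:
            proc = subprocess.Popen(["ssh", machine, command],
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.PIPE)
        except OSError as details:
            log.error("unable to start ssh: %s", details)
            raise
        stdout, stderr = proc.communicate()
        log.debug("stdout:\n%s", stdout)
        log.debug("stderr:\n%s", stderr)
        if proc.returncode != 0:
            log.error("command failed (exit %s)", proc.returncode)
            raise RuntimeError("%s failed on %s" % (command, machine))
        return stdout

And even this does not yet distinguish "Invalid password" from
"Host is down", store per-task results, or prefix the output the
way the RFC describes.
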
> > > > Advanced example
> > > > ----------------
> > > > 
> > > > Imagine a very complex scenario, for example a cloud with several
> > > > services. One could write a big-fat test tailored just for this scenario
> > > > and keep adding sub-scenarios producing unreadable source code.
> > > > 
> > > > With nested tests one could split this task into tests:
> > > > 
> > > >  * Setup a fake network
> > > >  * Setup cloud service
> > > >  * Setup in-cloud service A/B/C/D/...
> > > >  * Test in-cloud service A/B/C/D/...
> > > >  * Stress network
> > > >  * Migrate nodes
> > 
> > I don't understand your motivation here. Do you mean that setting
> > up a fake network as a (sub-)test would be a positive thing?
> > 
> Yes, by fake network I mean the added latency, additional hops,
> ... to emulate usage over the internet. This is something you
> can reuse for many tests. You can implement it as a library,
> but then you have to set up the logging, carefully handle all
> exceptions, ... Writing it as a test and, more importantly,
> analyzing the test results is way easier.

That's not a test. I've seen this idea from you in the previous
discussion, of writing everything as a test. I don't see the
point of encapsulating code that does setup or configuration as a
test. If it's not testing something, it's not a test, by
definition. It can be a script or a library, but it's not a test.

"Just because you can do something, it doesn't mean you should"

Writing it as a test will cause headaches and wasted resources
when using things like the multiplexer.

> 
> > > > 
> > > > New variants could be easily added, for example a DDoS attack on some
> > > > nodes, node hotplug/unplug, ... by invoking those existing tests and
> > > > combining them into a complex test.
> > > > 
> > > > Additionally note that some of the tests, e.g. the setup cloud service
> > > > and setup in-cloud service, are quite generic tests which could be reused
> > > > many times in different tests. Yes, one could write a library to do
> > > > that, but in that library one would have to handle all exceptions and
> > > > provide nice logging, while not cluttering the main output with
> > > > unnecessary information.
> > 
> > Or one could create a job that runs the individual tests as
> > needed.
> > 
> > For this particular use-case, a custom job has many advantages.
> > To mention just one: the multiplexer.
> > 
> That depends. You can pass params to tests too, you can
> multiplex the full test too and you can expect some failures.

Please don't tell me you had the intention of allowing usage of
different multiplex files for each nested test. :-)

> For example you can have a variant where you run the `migrate-node`
> test, but due to some setting you expect it to fail. When using
> the Job API that would produce a failure in the results and an
> overall job failure.

If I expect it to fail, then it's not a failure.

> 
> So the only benefit of implementing this as a job is that you get
> the results of all steps, which is useful for
> test-in-cloud-services but not for all the tests before and
> after. They are just background/setup tasks created in the form
> of tests in order to simplify failure analysis.
> 
> Basically, as everywhere else in the RFC: if you want a single
> result from your scenario, it's nested-test material. If you
> want per-test results, you want the Job API.

Let me elaborate a bit: right now the multiplexer is global per
job due to limitations in the test runner, but the plan for the
short term is to allow individual multiplexer files per test.
The high level design/architecture allows that and the job API
will certainly expose a way to do it.

So, for example, it should be possible to do something similar to:

 $ avocado run test1.py --multiplexer=mux1.yaml \
               test2.py --multiplexer=mux2.yaml \
               test3.py --multiplexer=mux3.yaml \
               test4.py --multiplexer=mux4.yaml

 Resulting in something like:

   01-test1.py:mux1.var1
   02-test1.py:mux1.var2
   03-test1.py:mux1.var3
   04-...
   05-test2.py:mux2.var1
   06-test2.py:mux2.var2
   07-...
   08-test3.py:mux3.var1
   09-test3.py:mux3.var2
   10-test3.py:mux3.var3
   11-test3.py:mux3.var4
   12-...
   13-test4.py:mux4.var1
   14-...


But with nested tests, you can't:

  $ avocado run super-test.py --multiplexer 3variants.yaml
  ("super-test" has 4 nested tests)

     1-super-test.py:var1
         - nested-test1.py
         - nested-test2.py
         - nested-test3.py
         - nested-test4.py
         (all using var1)
     2-super-test.py:var2
         - nested-test1.py
         - nested-test2.py
         - nested-test3.py
         - nested-test4.py
         (all using var2)
     3-super-test.py:var3
         - nested-test1.py
         - nested-test2.py
         - nested-test3.py
         - nested-test4.py
         (all using var3)

> 
> > > > 
> > > > Job results
> > > > -----------
> > > > 
> > > > Combine (multiple) test results into an understandable format. There are
> > > > several formats, the most generic one being the file format:
> > > > 
> > > > .
> > > > ├── id  -- id of this job
> > > > ├── job.log  -- overall job log
> > > > └── test-results  -- per-test-directories with test results
> > > >     ├── 1-passtest.py:PassTest.test  -- first test's results
> > > >     └── 2-failtest.py:FailTest.test  -- second test's results
> > > > 
> > > > Additionally it contains other files and directories produced by avocado
> > > > plugins like json, xunit, html results, sysinfo gathering and info
> > > > regarding the replay feature.
> > > > 
> > > 
> > > OK, this is pretty much a review.
> > > 
> > > > Test results
> > > > ------------
> > > > 
> > > > In the end, every test produces results, which is what we're interested
> > > > in. The results must clearly define the test status, should provide a
> > > > record of what was executed and, in case of failure, should provide
> > > > all the information needed to find the cause and understand the failure.
> > > > 
> > > > Standard tests do that by providing the test log (debug, info, warning,
> > > > error, critical), stdout, stderr, allowing writing to the whiteboard and
> > > > attaching files in the results directory. Additionally, due to the structure
> > > > of the test, one knows which stage(s) of the test failed and can pinpoint
> > > > the exact location of the failure (traceback in the log).
> > > > 
> > > > .
> > > > ├── data  -- place for other files produced by a test
> > > > ├── debug.log  -- debug, info, warn, error log
> > > > ├── remote.log  -- additional log regarding remote session
> > > > ├── stderr  -- standard error
> > > > ├── stdout  -- standard output
> > > > ├── sysinfo  -- provided by sysinfo plugin
> > > > │   ├── post
> > > > │   ├── pre
> > > > │   └── profile
> > > > └── whiteboard  -- file for arbitrary test data
> > > > 
> > > > I'd like to extend this structure with either a directory "subtests", or a
> > > > convention for directories intended for nested test results `r"\d+-.*"`.
> > > > 
> > > 
> > > Having them in a separate subdirectory is less intrusive IMHO.  I'd even
> > > argue that `data/nested` is the way to go.
> > 
> > +1.
> > 
> > > 
> > > > The `r"\d+-.*"` reflects the current test-id notation, which nested
> > > > tests should also respect, replacing the serialized-id by
> > > > in-test-serialized-id. That way we easily identify which of the nested
> > > > tests was executed first (which does not necessarily mean it finished as
> > > > first).
> > 
> > So the nested-tests will have "In-Test-Test-IDs", which are different
> > than "Test-IDs".
> > 
> Yes
> 
> > > > 
> > > > In the end each nested test should be assigned a directory inside the main
> > > > test's results (or the main test's results/subtests) and it should produce
> > > > its data/debug.log/stdout/stderr/whiteboard in there, as well as
> > > > propagate the debug.log with a prefix to the main test's debug.log (as
> > > > well as the job.log).
> > > > 
> > > > └── 1-parallel_wget.py:WgetExample.test  -- main test
> > > >     ├── data
> > > >     ├── debug.log  -- contains main log + nested logs with prefixes
> > > >     ├── remote.log
> > > >     ├── stderr
> > > >     ├── stdout
> > > >     ├── sysinfo
> > > >     │   ├── post
> > > >     │   ├── pre
> > > >     │   └── profile
> > > >     ├── whiteboard
> > > >     ├── 1-_usr_bin_wget\ example.org  -- first nested test
> > > >     │   ├── data
> > > >     │   ├── debug.log  -- contains only this nested test log
> > > >     │   ├── remote.log
> > > >     │   ├── stderr
> > > >     │   ├── stdout
> > > >     │   └── whiteboard
> > > >     ├── 2-_usr_bin_wget\ example.org  -- second nested test
> > > > ...
> > > >     └── 3-_usr_bin_wget\ example.org  -- third nested test
> > > > ...
> > 
> > And with the above, a test-ID is not unique anymore in logs and
> > in the results directory. For example, when looking for
> > "1-foobar.py", I may find:
> > 
> >   - foobar.py, the first test run inside the job
> >   AND
> >   - multiple foobar.py, run as a nested test inside an arbitrary
> >     parent test.
> > 
> > That's why I said you would need "In-Test-Test-IDs" (or
> > "Nested-Test-IDs").
> > 
> Yes, when using `find ...` you could get multiple directories,
> though not inside `test-results`, only in `test-results/*/nested/*`
> (or nested nested ones). For me that is convenient, because you
> are used to "parsing" the test-ids and you know where to expect
> them. So when you want to look for them, you don't want to
> think about whether they should have this form or the other.

That's a huge loss. The most important thing about test IDs is
that they're unique per job. One should be able to unambiguously
grep for a test-ID in logs and in the results.

That's why I said nested-tests would need "In-Test-Test-IDs".
They'll require a different format and a different specification.

(this is what happens when we break abstractions or violate
layers: we have to redefine or extend concepts previously
defined, a bad sign)

> 
> > > > 
> > > > Note that nested tests can finish with any result and it's up to the
> > > > main test to evaluate that. This means that theoretically you could find
> > > > nested tests whose state is `FAIL` or `ERROR` in the end. That might be
> > > > confusing, so I think the `NestedRunner` should append a last line to the
> > > > test's log saying `Expected FAILURE` to avoid confusion while looking at
> > > > the results.
> > > > 
> > > 
> > > This special injection, and special handling for that matter, actually makes
> > > me more confused.
> > 
> > Agree. This is something to add to the parent log (which is
> > waiting for the nested-test result).
> > 
> No problem with that.
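
For illustration, that could be as simple as the parent test doing
something like the following (hypothetical API; the exact return
value of the nested runner is not defined yet):

    result = self.runner.wait(ignore_errors=True)
    if result.status == "FAIL":
        self.log.info("Nested test %s FAILED, as expected by this test",
                      result.id)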
> 
> > > 
> > > > Note2: It might be impossible to pass messages in real-time across
> > > > multiple machines, so I think at the end the main job.log should be
> > > > copied to `raw_job.log` and the `job.log` should be reordered according
> > > > to the date-time of the messages. (Alternatively we could just add a
> > > > contrib script to do that.)
> > 
> > You probably mean debug.log (parent test), not job.log.
> > 
> > I'm assuming the nested tests would run in "jobless" mode (is
> > that the case? If yes, you need to specify what it means).
> > 
> Both actually. So the contrib script would be the best solution
> as one can point it at the file he's interested in.

Which "both" you're referencing? Are you talking about
job.log/debug.log, or answering my question about "jobless" mode?

> 
> > > > 
> > > 
> > > Definitely no to another special handling.  Definitely yes to a post-job
> > > contrib script that can reorder the log lines.
> > 
> > +1
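
For the record, such a reordering script can be pretty small. A
minimal sketch (assuming every log line starts with a sortable
timestamp such as "2016-05-24 11:53:00,123", and keeping
non-timestamped lines, e.g. tracebacks, attached to the entry
above them; this is not an existing avocado contrib script):

    import re
    import sys

    TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")

    def reorder(path):
        entries = []                    # list of (timestamp, [lines]) blocks
        for line in open(path):
            match = TIMESTAMP.match(line)
            if match:
                entries.append((match.group(0), [line]))
            elif entries:               # continuation line, keep with entry
                entries[-1][1].append(line)
            else:
                entries.append(("", [line]))
        entries.sort(key=lambda entry: entry[0])
        return "".join("".join(lines) for _, lines in entries)

    if __name__ == "__main__":
        sys.stdout.write(reorder(sys.argv[1]))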
> > 
> > > 
> > > > 
> > > > Conclusion
> > > > ==========
> > > > 
> > > > I believe nested tests would help people cover very complex scenarios
> > > > by splitting them into pieces, similarly to Lego. It allows easier
> > > > per-component development and consistent results which are easy to analyze,
> > > > as one can see both the overall picture and the specific pieces, and it
> > > > allows fixing bugs in all tests by fixing a single piece (nested test).
> > > > 
> > > 
> > > It's pretty clear that running other tests from tests is *useful*, that's
> > > why it's such a hot topic and we've been devoting so much energy to
> > > discussing possible solutions.  NestedTests is one to do it, but I'm not
> > > confident we have enough confidence to make it *the* way to do it. The
> > > feeling that I have at this point, is that maybe we should prototype it as
> > > utilities to:
> > > 
> > >  * give Avocado a kickstart on this niche/feature set
> > >  * avoid as much as possible user-written boilerplate code
> > >  * avoid introducing *core* test APIs that would be set in stone
> > > 
> > > The gotchas that we have identified so far are, IMHO, enough to restrain us
> > > from forcing this kind of feature into the core test API, which we're, in
> > > fact, trying to clean up.
> > > 
> > > With user exposure and feedback, this, a modified version or a completely
> > > different solution can evolve into *the* core (and supported) way to do it.
> > > 
> > 
> > I tend to disagree. I think it should be the other way around:
> > maybe, once we have a Job API, we can consider the possibilities
> > of supporting nested-tests, reusing some of the other concepts.
> Yes, the work on the Job API is shared with nested tests, so we can
> postpone this till then and see whether building blocks inside
> tests would be useful. I think they would be.
> 
> > 
> > Nested tests (as in: "simply running tests inside tests") is
> > relatively OK to digest. Not that I like it, but it's relatively
> > simple.
> > 
> > But what Lukas is proposing involves at least three more features
> > or APIs, all of which relate to a Job and should be implemented
> > there before being considered in the context of a test:
> > 
> >  - API and mechanism for running tests on different machines or
> >    environments (at least at first, a Job API)
> IMO this is the runner's task, not the job's (the job only demands
> that its tests run on a specific machine/environment)

That's because you're trying to portray the runner as something
that can be used to run tests inside other tests, which is not
the case. The runner runs a job, so they're strongly related.

In other words, you're proposing the runner to be independent of
the job. But runner and job are at the same layer in the current
avocado architecture. Allowing the full runner to be used inside
a test is a layer violation (a huge one).

That's why I said running simple nested tests defined as "tests
that run inside tests" would be relatively OK to digest. They
would have to be run in "jobless" mode, without any configuration
or customization. They would run in the exact same environment
and same location where the parent test is running. They would
not have Test-IDs. I wouldn't *support* this idea, but I wouldn't
be so strongly against it.

> 
> >  - API and mechanism for running tests in parallel (ditto)
> That's the same. The job specifies requirements, the runner fulfills
> them. If a test has certain demands, why shouldn't the nested runner
> fulfill them? The job should have no relation to, nor interest in,
> this.

Same thing here.

> 
> >  - API and mechanism to allow tests to synchronize and wait for
> >    barriers (which might be useful once we can run tests in
> >    parallel).
> Yep, basically the synchronization is shared and useful even
> for simple tests; only for the multi-host environment we might want
> to add some utils to simplify this.
> 
> > 
> > To me the idea of "nested tests that can be run on multiple
> > machines, under different configurations and with synchronization
> > between them" is fundamentally flawed. It's a huge layer
> > violation that brings all kinds of architectural problems.
> To me it's a useful and clean solution using a nested runner
> inherited from the job runner. It's easy to learn and gives expected
> results while learning just one API (the Job API and nested test
> API should be designed similarly). But I agree we should focus
> on the Job API first and then reconsider the usage inside tests, as
> inside tests the focus is not on triggering the tests, but on
> utilizing the task under testing the way we need to.
> 
> > 
> > Thanks.
> >    - Ademar
> > 
> 
> After reading this, I have another suggestion. We could
> actually completely shred this RFC (well, hopefully some ideas
> from this RFC would survive inside the Job API) if we consider
> allowing expected failures (skips, warns, non-skips, errors,
> ...) in the Job API. Then the only missing piece would be to allow
> running several workflows (job definitions) in a single job, but
> that's, I think, too much to ask (avocado run boot-test job-cloud
> job-parallel-tests; where boot-test is a test, job-cloud is a
> job-API scenario and job-parallel-tests is a job-API scenario to
> run some tests in parallel; the result should be a list of all
> triggered test results, so boot-test cloud-setup cloud-test
> parallel1 parallel2 parallel3 parallel4).

It's been an interesting exercise. I'm strongly against nested
tests as proposed because I think it's a layer violation that
breaks the architecture (although that sounds theoretical,
breaking the architecture causes tons of practical issues in the
future, and I've written the examples and scenarios that break it in
my previous e-mails in response to the other versions of this
RFC).

We will certainly use several of the ideas discussed here in
other RFCs, such as the Job API and multi-flow tests.

Thanks.
  - Ademar

-- 
Ademar Reis
Red Hat




