[Avocado-devel] RFC: Nested tests (previously multi-stream test) [v5]

Lukáš Doktor ldoktor at redhat.com
Thu May 26 07:15:11 UTC 2016


On 25. 5. 2016 at 21:18, Cleber Rosa wrote:
>
>
> On 05/24/2016 11:53 AM, Lukáš Doktor wrote:
>> Hello guys,
>>
>> this version returns to the roots and tries to clearly define the
>> single solution I find most tempting for multi-host and other
>> complex tests.
>>
>> Changes:
>>
>> v2: Rewritten from scratch
>> v2: Added examples for the demonstration to avoid confusion
>> v2: Removed the mht format (which was there to demonstrate manual
>>     execution)
>> v2: Added 2 solutions for multi-tests
>> v2: Described ways to support synchronization
>> v3: Renamed to multi-stream as it befits the purpose
>> v3: Improved introduction
>> v3: Workers are renamed to streams
>> v3: Added example which uses a library, instead of a new test
>> v3: Multi-test renamed to nested tests
>> v3: Added section regarding Job API RFC
>> v3: Better description of the Synchronization section
>> v3: Improved conclusion
>> v3: Removed the "Internal API" section (it was a transition between
>>     no support and "nested test API", not a "real" solution)
>> v3: Using per-test granularity in nested tests (requires plugins
>>     refactor from Job API, but allows greater flexibility)
>> v4: Removed "Standard python libraries" section (rejected)
>> v4: Removed "API backed by cmdline" (rejected)
>> v4: Simplified "Synchronization" section (only describes the purpose)
>> v4: Refined all sections
>> v4: Improved the complex example and added comments
>> v4: Formulated the problem of multiple tasks in one stream
>> v4: Rejected the idea of bounding it inside a MultiTest class
>>     inherited from avocado.Test, using a library-only approach
>> v5: Avoid mapping ideas to the multi-stream definition and clearly
>>     define the idea I bear in my head for test building blocks
>>     called nested tests.
>>
>>
>> Motivation
>> ==========
>>
>> Allow building complex tests out of existing tests, producing a
>> single result depending on the complex test's requirements. The
>> important thing is that the complex test might run those tests on
>> the same, but also on a different machine, allowing simple
>> development of multi-host tests. Note that the existing tests
>> should stay (mostly) unchanged and executable as simple scenarios,
>> or invoked by those complex tests.
>>
>> Examples of what could be implemented using this feature:
>>
>> 1. Adding background (stress) tasks to an existing test, producing
>>    real-world scenarios:
>>    * cpu stress test + cpu hotplug test
>>    * memory stress test + migration
>>    * network+cpu+memory test on host, memory test on guest while
>>      running migration
>>    * running several migration tests (of the same and different type)
>>
>> 2. Multi-host tests implemented by splitting them into components
>>    and leveraging them from the main test:
>>    * multi-host migration
>>    * stressing a service from different machines
>>
>>
>> Nested tests
>> ============
>>
>> Test
>> ----
>>
>> A test is a receipt explaining prerequisites, steps to check how
>> the unit under test behaves, and cleanup after successful or
>> unsuccessful execution.
>>
>
> You probably meant "recipe" instead of "receipt".  OK, so this is an
> abstract definition...
yep, sorry for confusion.

>
>> The test itself contains lots of neat features evolved to simplify
>> testing: logging, results analysis and error handling.
>>
>
> ... while this describes concrete conveniences and utilities that
> users of the Avocado Test class can expect.
>
>> Test runner
>> -----------
>>
>> Is responsible for driving the test(s) execution, which includes
>> the standard test workflow (setUp/test/tearDown), handling plugin
>> hooks (results/pre/post) as well as safe interruption.
>>
>
> OK.
>
>> Nested test
>> -----------
>>
>> Is a test invoked by another test. It can either be executed in
>> foreground
>
> I got from this proposal that a nested test always has a parent.
> Basic question is: does this parent have to be a regular (that is,
> non-nested) test?
I think it's mentioned later: a nested test should be an unmodified normal test executed from another test, which means there is no limit. On the other hand, the main test has no knowledge whatsoever about the nested-nested tests, as they are masked by the nested test.

Basically the knowledge transfer is:

   main test
   -> trigger a nested test
        nested test
        -> trigger nested test
            nested nested test
            <- report result (PASS/FAIL/...)
        process nested nested results
        <- report nested result (PASS/FAIL/...)
    process nested results
    <- report result (PASS/FAIL/...)

Therefore in the json/xunit results you only see the main test's result (PASS/FAIL/...), but you can poke around in the results directory for the details.

The main test's logs could look like this:

START 1-passtest.py:PassTest.test
Not logging /var/log/messages (lack of permissions)
1-nested.py:Nested.test: START 1-nested.py:Nested.test
1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: START 1-nestednested.py:NestedNested.test
1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: Some message from nestednested
1-nested.py:Nested.test: A message from nested
Main test message
1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: FAIL 1-nestednested.py:NestedNested.test
1-nested.py:Nested.test: Nestednested test failed as expected
1-nested.py:Nested.test: PASS 1-nested.py:Nested.test
The nested test passed, let's finish with PASS
PASS 1-passtest.py:PassTest.test

The nested test's logs:

START 1-nested.py:Nested.test
1-nestednested.py:NestedNested.test: START 1-nestednested.py:NestedNested.test
1-nestednested.py:NestedNested.test: Some message from nestednested
A message from nested
1-nestednested.py:NestedNested.test: FAIL 1-nestednested.py:NestedNested.test
Nestednested test failed as expected
PASS 1-nested.py:Nested.test

The nested nested test's log:

START 1-nestednested.py:NestedNested.test
Some message from nestednested
FAIL 1-nestednested.py:NestedNested.test

The results dir (of the main test):

job.log
\- test-results
     \- 1-passtest.py:PassTest.test
        \- nested-tests
           \- 1-nested.py:Nested.test
              \- nested-tests
                 \- 1-nestednested.py:NestedNested.test

And the json results (of the main test):

{
     "debuglog": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/job.log",
     "errors": 0,
     "failures": 0,
     "job_id": "1c816129aa3b10fc03270e4e32657b9e2893d5d7",
     "pass": 1,
     "skip": 0,
     "tests": [
         {
             "end": 1464243993.997021,
             "fail_reason": "None",
             "logdir": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/test-results/1-passtest.py:PassTest.test",
             "logfile": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/test-results/1-passtest.py:PassTest.test/debug.log",
             "start": 1464243993.996127,
             "status": "PASS",
             "test": "1-passtest.py:PassTest.test",
             "time": 0.0008940696716308594,
             "url": "1-passtest.py:PassTest.test",
             "whiteboard": ""
         }
     ],
     "time": 0.0008940696716308594,
     "total": 1
}


>
> Then, depending on the answer, the following question would also
> apply: do you believe a nesting level limit should be enforced?
>
I see your point; right now I'd leave it to the OOM killer, but we might think about it.
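
(If we ever wanted a hard limit, something as simple as this sketch would do; the environment variable name is made up:)

    import os

    MAX_NESTING_LEVEL = 5

    def check_nesting_level():
        # hypothetical: depth inherited via the nested test's environment
        depth = int(os.environ.get("AVOCADO_NESTING_LEVEL", "0"))
        if depth >= MAX_NESTING_LEVEL:
            raise RuntimeError("Maximum nesting level (%s) exceeded"
                               % MAX_NESTING_LEVEL)
        os.environ["AVOCADO_NESTING_LEVEL"] = str(depth + 1)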

>> (while the main test is waiting) or in background along with the
>> main test (and other background tests). It should follow the
>> default test workflow (setUp/test/tearDown), it should keep all the
>> neat test features like logging and error handling, and the results
>> should also go into the main test's output, with the nested test's
>> id as prefix. All the files produced by the nested test should be
>> located in a new directory inside the main test's results dir in
>> order to be able to browse either the overall results (main test +
>> nested tests) or just the nested tests' ones.
>>
>
> Based on the example given later, you're attributing to the
> NestedRunner the responsibility to put the nested test results "in
> the right" location.  It sounds appropriate.  The tricky questions
> are really how they show up in the overall job/test result structure,
> because that reflects how much the NestedRunner looks like a "Job".
>
Not really like a job, more like a runner. The NestedRunner should create a new process and set up the logger (as defined in `avocado.Test`) to write into the given nested test's result directory, and it should also forward the messages through a pipe/socket to the main test's logger, which adds the prefix and logs them (we can't reuse the way the normal runner sets up the logs, as that way we couldn't add the prefix).

So basically NestedRunner defines a slightly modified `avocado.core.runner.TestRunner._run_test` and modifies the value of the `base_logdir` argument of the nested test template.

It should not report the intermediary (nested test's) results.
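
To make it a bit more concrete, here is a minimal sketch of what I mean (all names are mine, nothing is settled, and I assume the main test exposes its results dir as `logdir`):

    import os

    class NestedRunner(object):

        def __init__(self, test):
            self.test = test    # the main (parent) test

        def _prepare_template(self, template):
            # template is (test class, kwargs) as produced by the
            # resolver; re-base its results dir into the main test's
            # results so all produced files land in the right place
            klass, kwargs = template
            kwargs = dict(kwargs)
            kwargs["base_logdir"] = os.path.join(self.test.logdir,
                                                 "nested")
            return klass, kwargs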

>> Resolver
>> --------
>>
>> Resolver is an avocado component resolving a test reference into a
>> list of test templates composed of the test name, params and other
>> `avocado.Test.__init__` arguments.
>>
>> Very simple example
>> -------------------
>>
>> This example demonstrates how to use an existing test (SimpleTest
>> "/usr/bin/wget example.org") in order to create a complex scenario
>> (downloading the main page from example.org from multiple computers
>> almost concurrently), without any modifications of the
>> `SimpleTest`.
>>
>> import avocado
>>
>> class WgetExample(avocado.Test):
>>
>>     def test(self):
>>         # Initialize nested test runner
>>         self.runner = avocado.NestedRunner(self)
>>         # This is what one calls on "avocado run"
>>         test_reference = "/usr/bin/wget example.org"
>>         # This is the resolved list of templates
>>         tests = avocado.resolver.resolve(test_reference)
>>         # We could support list of results, but for simplicity
>>         # allow only single test.
>>         assert len(tests) == 1, ("Resolver produced multiple test "
>>                                  "names: %s\n%s"
>>                                  % (test_reference, tests))
>>         test = tests[0]
>>         for machine in self.params.get("machines"):
>>             # Queue a background job on the machine (local or
>>             # remote) and return test id in order to query for
>>             # the particular results or task interruption, ...
>>             self.runner.run_bg(machine, test)
>>         # Wait for all background tasks to finish, raise exception
>>         # if any of them fails.
>>         self.runner.wait(ignore_errors=False)
>>
>
> Just for accounting purposes at this point, and not for applying
> judgment, let's take note that this approach requires the following
> sets of APIs to become "Test APIs":
>
> * avocado.NestedRunner
> * avocado.resolver
>
> Now, doing a bit of judgment. If I were an Avocado newcomer, looking
> at the Test API docs, I'd be intrigued at how these belong to the
> same very select group that includes only:
>
> * avocado.Test
> * avocado.fail_on
> * avocado.main
> * avocado.VERSION
>
> I'm not proposing a different approach or a different architecture.
> If the proposed architecture included something like a NestedTest
> class, then probably the feeling is that it would indeed naturally
> belong to the same group.  I hope I managed to express my feeling,
> which may just be overreaction. If others share the same feeling,
> then it may be a sign of a red flag.
>
> Now, considering my feeling is not an overreaction, this is how such
> an example could be written so that it does not put NestedRunner and
> resolver in the Test API namespace:
>
> from avocado import Test
> from avocado.utils import nested
>
> class WgetExample(Test):
>
>     def test(self):
>         reference = "/usr/bin/wget example.org"
>         tests = []
>         for machine in self.params.get("machines"):
>             tests.append(nested.run_test_reference(self, reference,
>                                                    machine))
>         nested.wait(tests, ignore_errors=False)
>
> This would solve the crossing (or pollution) of the Test API
> namespace, but it has a catch: the test reference resolution is
> either included in `run_test_reference` (which is a similar problem)
> or delegated to the remote machine.  Having the reference delegated
> sounds nice, until you need to identify backing files for the tests
> and copy them over to the remote machine.  So, take this as food for
> thought, and not as a fail-proof solution.
I couldn't agree more with moving nested to utils. Regarding `run_test_reference`, that's actually what I originally started with, but you were strictly against it. I thought about it (for a long time) and it shouldn't be that hard. I see two possible solutions (both could coexist for better results):

1. The resolver should not return a `tuple(class, arguments)`, but an object which supports means to investigate it. For example:

     >>> test = resolve("/bin/true")
     >>> print test.template
     (avocado.tests.SimpleTest, {"methodName": "test", "params": {}, ...)
     >>> print test.test_name
     "/bin/true"
     >>> instance = test.load()

2. Each test class could implement a method to define the test's dependencies; see the sketch below. For example, avocado.Test could report the current file, while avocado.SimpleTest should report the program. Other users might define tests depending on libraries and mark those as dependencies. Last but not least, avocado-vt could define either the test providers, or just say it can only be run without dependencies. The only problem with this method is that it either has to be a class method accepting the test template's arguments, or we'd have to instantiate the class locally before running it on the remote machine.
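
A rough sketch of idea 2 (the method name is hypothetical, nothing like it exists yet):

    import sys

    class Test(object):

        @classmethod
        def get_dependencies(cls, name, params=None):
            # by default, the test's own source file has to be copied
            return [sys.modules[cls.__module__].__file__]

    class SimpleTest(Test):

        @classmethod
        def get_dependencies(cls, name, params=None):
            # for a SimpleTest the executed program is the dependency
            return [name]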

>
>> When nothing fails, this usage has no benefit over simply logging
>> into a machine and firing up the command. The difference shows when
>> something does not work as expected. With nested tests, one gets a
>> runner exception if the machine is unreachable, and on test error
>> one gets not only the overall log, but also the per-nested-test
>> results, simplifying the error analysis. For 1, 2 or 3 machines
>> this makes no difference, but imagine you want to run this from
>> hundreds of machines. Try finding the exception there.
>>
>
> I agree that it's nice to have the nested tests' logs.  What you're
> proposing is *core* (as in Test API) convenience, over something
> like:
>
> from avocado import Test
> from avocado.utils import nested
>
> class WgetExample(Test):
>
>     def test(self):
>         reference = "/usr/bin/wget example.org"
>         tests = []
>         for machine in self.params.get("machines"):
>             tests.append(nested.run_test_reference(self, reference,
>                                                    machine))
>         nested.wait(tests, ignore_errors=False)
>         nested.save_results(tests, os.path.join(self.resultsdir,
>                                                 "nested"))
>
Well, I see the location of the results as an essential part of this RFC. Without a systematic storage location it makes little sense.

>> Yes, you can implement the above without nested tests, but it
>> requires a lot of boilerplate code to establish the connection (or
>> raise an exception explaining why it was not possible, and I'm not
>> talking about "unable to establish connection", but granularity
>> like "Invalid password", "Host is down", ...). Then you'd have to
>> set up the output logging for that particular task, add the prefix,
>> run the task (handling all possible exceptions) and interpret the
>> results. All of this to get the same benefits a very simple avocado
>> test provides you.
>>
>
> Having boilerplate code repeatedly written by users is indeed not a
> good thing.  And a well thought out API for users is the way to
> prevent boilerplate code from spreading around in tests.
>
> The exception handling, that is, raising exceptions to flag failures
> in the nested tests execution is also a given IMHO.
>
My point was about setUp, tearDown and the other convenience helpers. People are used to these, and they simplify reading the code/results.
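
To illustrate what I mean: a nested test stays a plain avocado test with the full workflow, so you get all of this for free (nothing below is nested-specific):

    from avocado import Test

    class ServiceStress(Test):

        def setUp(self):
            self.log.info("preparing the stress environment")

        def test(self):
            self.log.info("stressing the service")

        def tearDown(self):
            self.log.info("cleaning up")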

>> Advanced example
>> ----------------
>>
>> Imagine a very complex scenario, for example a cloud with several
>> services. One could write a big fat test tailored just for this
>> scenario and keep adding sub-scenarios, producing unreadable source
>> code.
>>
>> With nested tests one could split this task into tests:
>>
>> * Setup a fake network
>> * Setup cloud service
>> * Setup in-cloud service A/B/C/D/...
>> * Test in-cloud service A/B/C/D/...
>> * Stress network
>> * Migrate nodes
>>
>> New variants could be easily added, for example a DDoS attack on
>> some nodes, node hotplug/unplug, ..., by invoking those existing
>> tests and combining them into a complex test.
>>
>> Additionally note that some of the tests, e.g. the setup cloud
>> service and setup in-cloud service ones, are quite generic tests
>> which could be reused many times in different tests. Yes, one could
>> write a library to do that, but in that library one would have to
>> handle all exceptions and provide nice logging, while not cluttering
>> the main output with unnecessary information.
>>
>> Job results
>> -----------
>>
>> Combine (multiple) test results into an understandable format. There
>> are several formats, the most generic one being the file format:
>>
>> .
>> ├── id  -- id of this job
>> ├── job.log  -- overall job log
>> └── test-results  -- per-test directories with test results
>>     ├── 1-passtest.py:PassTest.test  -- first test's results
>>     └── 2-failtest.py:FailTest.test  -- second test's results
>>
>> Additionally it contains other files and directories produced by
>> avocado plugins like json, xunit, html results, sysinfo gathering
>> and info regarding the replay feature.
>>
>
> OK, this is pretty much a review.
>
>> Test results
>> ------------
>>
>> In the end, every test produces results, which is what we're
>> interested in. The results must clearly define the test status,
>> should provide a record of what was executed, and in case of
>> failure, they should provide all the information needed to find
>> the cause and understand the failure.
>>
>> Standard tests do that by providing the test log (debug, info,
>> warning, error, critical), stdout, stderr, allowing to write to the
>> whiteboard and to attach files in the results directory.
>> Additionally, due to the structure of the test, one knows what
>> stage(s) of the test failed and can pinpoint the exact location of
>> the failure (traceback in the log).
>>
>> .
>> ├── data  -- place for other files produced by a test
>> ├── debug.log  -- debug, info, warn, error log
>> ├── remote.log  -- additional log regarding remote session
>> ├── stderr  -- standard error
>> ├── stdout  -- standard output
>> ├── sysinfo  -- provided by sysinfo plugin
>> │   ├── post
>> │   ├── pre
>> │   └── profile
>> └── whiteboard  -- file for arbitrary test data
>>
>> I'd like to extend this structure with either a directory
>> "subtests", or a convention for directories intended for nested
>> test results: `r"\d+-.*"`.
>>
>
> Having them on separate sub directory is less intrusive IMHO.  I'd
> even argue that `data/nested` is the way to go.
I like the idea of `nested`. It's short and goes along with `avocado.utils.nested`. (If it were just `avocado.utils`, I'd prefer the results directly in the main dir.)

>
>> The `r"\d+-.*"` reflects the current test-id notation, which
>> nested tests should also respect, replacing the serialized-id by
>> in-test-serialized-id. That way we easily identify which of the
>> nested tests was executed first (which does not necessarily mean it
>> finished as first).
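
Just to illustrate the convention (a throwaway snippet, not part of the proposal):

    import re

    NESTED_RESULT_DIR = re.compile(r"\d+-.*")

    # only the "<in-test-serial-id>-<name>" entries are nested results
    for entry in ("1-_usr_bin_wget example.org", "data", "sysinfo"):
        print("%-30s %s" % (entry, bool(NESTED_RESULT_DIR.match(entry))))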
>>
>> In the end nested tests should be assigned a directory inside the
>> main test's results (or the main test's results/subtests) and should
>> produce the data/debug.log/stdout/stderr/whiteboard in there, as
>> well as propagate the debug.log with a prefix to the main test's
>> debug.log (and to the job.log).
>>
>> └── 1-parallel_wget.py:WgetExample.test  -- main test
>>     ├── data
>>     ├── debug.log  -- contains main log + nested logs with prefixes
>>     ├── remote.log
>>     ├── stderr
>>     ├── stdout
>>     ├── sysinfo
>>     │   ├── post
>>     │   ├── pre
>>     │   └── profile
>>     ├── whiteboard
>>     ├── 1-_usr_bin_wget\ example.org  -- first nested test
>>     │   ├── data
>>     │   ├── debug.log  -- contains only this nested test log
>>     │   ├── remote.log
>>     │   ├── stderr
>>     │   ├── stdout
>>     │   └── whiteboard
>>     ├── 2-_usr_bin_wget\ example.org  -- second nested test
>>     │   ...
>>     └── 3-_usr_bin_wget\ example.org  -- third nested test
>>         ...
>>
>> Note that nested tests can finish with any result and it's up to
>> the main test to evaluate that. This means that theoretically you
>> could find nested tests which state `FAIL` or `ERROR` in the end.
>> That might be confusing, so I think the `NestedRunner` should
>> append a last line to the test's log saying `Expected FAILURE` to
>> avoid confusion while looking at the results.
>>
>
> This special injection, and special handling for that matter,
> actually makes me more confused.
>
Hmm, I'd find it quite helpful when looking at the particular results. Anyway, I can live without it, and I demonstrated log results without this at the beginning of this mail. Let me demonstrate how this would look in case we include this feature:

The nested nested test's log:

     START 1-nestednested.py:NestedNested.test
     Some message from nestednested
     FAIL 1-nestednested.py:NestedNested.test

     Marked as PASS by the main test

I'd prefer that, but it's not a strong opinion.

>> Note2: It might be impossible to pass messages in real-time across
>> multiple machines, so I think at the end the main job.log should
>> be copied to `raw_job.log` and the `job.log` should be reordered
>> according to the date-time of the messages (alternatively we could
>> only add a contrib script to do that).
>>
>
> Definitely no to another special handling.  Definitely yes to a
> post-job contrib script that can reorder the log lines.
>
I thought this was going to be controversial. Imagine browsing those results in Jenkins; I'd welcome the possibility to see the results ordered. On the other hand, I could live with the contrib-script-only approach too (for now...).
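
Such a contrib script could start as simple as this sketch (assuming every line begins with a sortable date-time stamp, and ignoring multi-line records like tracebacks):

    import sys

    def reorder(lines):
        # lines like "2016-05-26 08:26:33,996 ..." sort correctly as
        # plain strings; sorted() is stable, so lines sharing a stamp
        # keep their relative order
        return sorted(lines)

    if __name__ == "__main__":
        with open(sys.argv[1]) as raw_job_log:
            sys.stdout.writelines(reorder(raw_job_log.readlines()))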

>>
>> Conclusion
>> ==========
>>
>> I believe nested tests would help people cover very complex
>> scenarios by splitting them into pieces, similarly to Lego. It
>> allows easier per-component development and consistent results which
>> are easy to analyze, as one can see both the overall picture and
>> the specific pieces, and it allows fixing bugs in all tests by
>> fixing the single piece (nested test).
>>
>
> It's pretty clear that running other tests from tests is *useful*,
> that's why it's such a hot topic and we've been devoting so much
> energy to discussing possible solutions.  NestedTests is one way to
> do it, but I'm not confident we have enough certainty to make it
> *the* way to do it. The feeling that I have at this point is that
> maybe we should prototype it as utilities to:
>
> * give Avocado a kickstart on this niche/feature set
> * avoid as much as possible user-written boilerplate code
> * avoid introducing *core* test APIs that would be set in stone
>
> The gotchas that we have identified so far are, IMHO, enough to
> restrain us from forcing this kind of feature into the core test
> API, which we're, in fact, trying to clean up.
>
> With user exposure and feedback, this, a modified version, or a
> completely different solution can evolve into *the* core (and
> supported) way to do it.
>
Thanks for the feedback. I see this more as a utility, so perhaps that's a better place for it.

Regards,
Lukáš


