[Avocado-devel] RFC: multi-stream test (previously multi-test) [v3]

Lukáš Doktor ldoktor at redhat.com
Thu Apr 21 06:45:54 UTC 2016


On 21.4.2016 at 01:58, Ademar Reis wrote:
> On Wed, Apr 20, 2016 at 07:38:10PM +0200, Lukáš Doktor wrote:
>> On 16.4.2016 at 01:58, Ademar Reis wrote:
>>> On Fri, Apr 15, 2016 at 08:05:09AM +0200, Lukáš Doktor wrote:
>>>> Hello again,
>>>
>>> Hi Lukas.
>>>
>> Hello to you, Ademar,
>>
>>> Thanks for v3. Some inline feedback below:
>>>
>>>>
>>>> There were a couple of changes and the new Job API RFC, which might sound
>>>> similar to this RFC, but it covers different parts. Let's update the
>>>> multi-test RFC and fix the terminology, which might have been a bit
>>>> misleading.
>>>>
>>>> Changes:
>>>>
>>>>     v2: Rewritten from scratch
>>>>     v2: Added examples for the demonstration to avoid confusion
>>>>     v2: Removed the mht format (which was there to demonstrate manual
>>>>         execution)
>>>>     v2: Added 2 solutions for multi-tests
>>>>     v2: Described ways to support synchronization
>>>>     v3: Renamed to multi-stream as it befits the purpose
>>>>     v3: Improved introduction
>>>>     v3: Workers are renamed to streams
>>>>     v3: Added example which uses library, instead of new test
>>>>     v3: Multi-test renamed to nested tests
>>>>     v3: Added section regarding Job API RFC
>>>>     v3: Better description of the Synchronization section
>>>>     v3: Improved conclusion
>>>>     v3: Removed the "Internal API" section (it was a transition between
>>>>         no support and "nested test API", not a "real" solution)
>>>>     v3: Using per-test granularity in nested tests (requires plugins
>>>>         refactor from Job API, but allows greater flexibility)
>>>>
>>>>
>>>> The problem
>>>> ===========
>>>>
>>>> Allow tests to have some of their blocks of code run in separate stream(s).
>>>> We'll discuss the range of "block of code" further in the text.
>>>>
>>>> One example could be a user who wants to run netperf on 2 machines, which
>>>> requires the following manual steps:
>>>>
>>>>
>>>>     machine1: netserver -D
>>>>     machine1: # Wait till netserver is initialized
>>>>     machine2: netperf -H $machine1 -l 60
>>>>     machine2: # Wait till it finishes and report the results
>>>>     machine1: # stop the netserver and report possible failures
>>>>
>>>> The test would have to contain the code for both machine1 and machine2 and
>>>> execute it in two separate streams, which might or might not run on the
>>>> same machine.
>>>>
>>>> You can see that each stream is valid even without the other, so an
>>>> additional requirement would be to allow easy sharing of those blocks of
>>>> code among other tests. Splitting the problem in two could also sometimes
>>>> help in analyzing failures.
>>>
>>> I would like to understand this requirement better, because to me
>>> it's not clear why this is important. I think this might be a
>>> consequence of a particular implementation, not necessarily a
>>> requirement.
>>>
>> Yes, I wanted to mention that there might be an additional benefit, not
>> directly related only to this RFC. I should probably mention it only
>> where it applies and not here.
>>
>>>>
>>>> Some other examples might be:
>>>>
>>>
>>> I suggest you add real world examples here (for a v4). My
>>> suggestions:
>>>
>>>> 1. A simple stress routine being executed in parallel (on the same or
>>>> different hosts)
>>>
>>>  - run a script in multiple hosts, all of them interacting with a
>>>    central service (like a DDoS test). Worth noting that this
>>>    kind of testing could also be done with the Job API.
>>>
>> ack
>>
>>>> 2. Several code blocks being combined into complex scenario(s)
>>>
>>>  - netperf
>>>  - QEMU live migration
>>>  - other examples?
>> ack
>>>
>>>> 3. Running the same test along with stress test in background
>>>>
>>>
>>>   - write your own stress test and run it (inside a guest, for
>>>     example) while testing live-migration, or collecting some
>>>     performance metrics
>>>   - run bonnie or trinity in background inside the guest while
>>>     testing migration in the host
>>>   - run bonnie or trinity in background while collecting real
>>>     time metrics
>> ack, thank you for the explicit examples
>>
>>>
>>>> For demonstration purposes this RFC uses a very simple example fitting into
>>>> category (1). It downloads the main page from the "example.org" location
>>>> using "wget" (almost) concurrently from several machines.
>>>>
>>>>
>>>> Standard python libraries
>>>> -------------------------
>>>>
>>>> One can run pieces of python code directly using python's multiprocessing
>>>> library, without any need for avocado-framework support. But there are
>>>> quite a lot of cons:
>>>>
>>>> + no need for framework API
>>>> - lots of boilerplate code in each test
>>>> - each solution would be unique and therefore its logs would be hard to analyze
>>>> - no decent way of sharing the code with other tests
>>>>
>>>> Yes, it's possible to share the code by writing libraries, but that does not
>>>> scale as well as other solutions...
>>>>
>>>> Example (simplified):
>>>>
>>>>     from avocado.core.remoter import Remote
>>>>     from threading import Thread
>>>>     ...
>>>>     class Wget(Thread):
>>>>
>>>>         def __init__(self, machine, url):
>>>>             Thread.__init__(self)
>>>>             self.remoter = Remote(machine)
>>>>             self.url = url
>>>>             self.status = None
>>>>
>>>>         def run(self):
>>>>             ret = self.remoter.run("wget %s" % self.url,
>>>>                                    ignore_status=True)
>>>>             self.status = ret.exit_status
>>>>     ...
>>>>
>>>>     threads = []
>>>>     for machine in machines:
>>>>         threads.append(Wget(machine, "example.org"))
>>>>     for thread in threads:
>>>>         thread.start()
>>>>     for thread in threads:
>>>>         thread.join()
>>>>         self.failif(thread.status != 0, ...)
>>>>     ...
>>>>
>>>>
>>>> This should serve the purpose, but to be able to understand failures, one
>>>> would have to add a lot of additional debug information, and if one wanted
>>>> to re-use the Wget class in other tests, it would have to become a library
>>>> shared by all the tests.
>>>
>>> I think we all agree this is not the way to go, so maybe you
>>> could drop this from a potential v4 (or just be very succinct
>>> about it).
>>>
>> OK, as only Cleber replied I wanted to keep most variants here, but it's
>> probably time to get rid of them and focus on the one we agree on.
>>
>>>>
>>>> Nested tests API
>>>> ----------------
>>>>
>>>> Another approach would be to say the "block of code" is a full avocado
>>>> test. The main benefits here are that each avocado test provides additional
>>>> debug information in a well-established format people are used to from
>>>> normal tests, allows one to split a complex problem into separate parts
>>>> (including separate development), and makes it easy to share existing tests
>>>> (eg. stress test, server setup, ...) and put them together like Lego bricks
>>>> into complex scenarios.
>>>
>>> I'm not sure we should say it's a "full avocado test". At least
>>> for me, this sounds like a real test as run by a job, so I think
>>> it's the wrong vocabulary.
>>>
>>> The Test ID RFC
>>> (https://www.redhat.com/archives/avocado-devel/2016-March/msg00024.html)
>>> clarifies the vocabulary to use. To recapitulate and summarize:
>>>
>>>  - Test Name: A name that identifies one test (as in code to be
>>>    executed) in the context of a job. A name doesn't imply
>>>    parameters or runtime configuration.
>>>
>>>  - Test Variant: a set of runtime parameters, currently provided
>>>    by the multiplexer as AvocadoParams.
>>>
>>>  - Test Reference: whatever one provides to the test resolver
>>>    (resulting in one or more Test Names). In current Avocado, we
>>>    usually call it "test URL".
>>>
>>>  - Test: An Avocado Test instantiated and run in a Job, with its
>>>    own unique Test ID and runtime configuration.  This is a
>>>    combination of a Test Name and a Test Variant that gets run in
>>>    a job.
>> I'm sorry, I still haven't got used to this terminology (as it's not yet
>> part of the code). Given this it should be "Test Reference" and yes, we
>> can consider some simplifications, but not a complete rewrite. Maybe
>> something like `standalone` execution, which executes the same way, but a
>> bit simpler. I would not support any solution which would require a new
>> runner.
> 
> I'm not sure I understand what you mean by "require a new
> runner". What is used to run tests as a stream (remotely, in a
> VM, in a docker container, or locally) is an implementation
> detail.
> 
I mean an independent test runner, which drives the workflow. Currently
most of this code is inside the Test itself, but with the cleanup Cleber
is working on, the runner should take over most of those
responsibilities instead.

I'm fine with running the tests similarly (though not identically) to standalone mode.

> That's the case even right now: the fact we use the same runner
> when running tests remotely is an implementation detail
> abstracted away from users. All they know is that using
> --remote-... and --vm-*, tests are run in a different location.
> How it happens doesn't concern them and is not relevant from a
> high level perspective.
> 
> For example, the options below are all valid for this RFC because
> they're implementation details, abstract for the user:
> 
>  $ avocado run <stream> # we use avocado run internally, hiding
>                         # everything job-related from the user
>                         # then we process the job results to fit
>                         # our format for streams.
>  $ avocado-stream-runner <stream> # we write a runner for streams
>  $ avocado run --stream <stream> # we introduce --stream
>  $ avocado run-stream <stream> # we introduce run-stream
>  ...
> 
> Users don't need to know anything about the above. All they need
> to know is how to write a multi-stream test (the API).
> Everything else is abstract.
> 
>>
>>>
>>> I think we need something that would be defined as "a class that
>>> inherits from Test". Maybe "Test Implementation", or simply "Test
>>> Class". Then you could say that a block of code should be a "Test
>>> Implementation" instead of using the more ambiguous and confusing
>>> "full avocado test".
>>>
>>> Also for this reason, I'm not a big fan of the "Nested Tests"
>>> nomenclature. We've established long ago that we don't want
>>> sub-tests or nested-tests inside a Job. We should make it extra
>>> clear that a Job has no knowledge whatsoever of nested tests (and
>>> hence I wouldn't call them "nested tests").
>> OK, let's not call that Nested Tests, feel free to suggest better names.
> 
> So you think tests should be invoking other tests via reference.
> 
> But a Test Reference may be resolved into more than one Test
> Name. This matches your idea of streams that can run more than
> one test, something I don't like because I believe this kind
> of abstraction should be at the job level (Job API), not at the
> test level.
> 
> Anyway, in v4 you should properly introduce this idea by defining
> what Streams are and also by defining that you want tests running
> other tests via Test References.
> 
> Personally, I would prefer to define it more or less like this
> instead:
> 
>    Tests can run multiple streams, which can be defined as
>    different processes or threads that run code in the form of
>    Avocado Test Implementations (classes that inherit from
>    avocado.Test, or as returned by the Avocado Resolver).
>    
>    Some runtime variables can be configured per-stream (such as
>    where it's run -- a VM, a container, or remotely).
> 
>    Notice Streams are not handled as tests: they are not visible
>    outside of the test that is running them.  For example,
>    they don't have individual variants, don't have Test IDs,
>    don't trigger pre/post hooks and their results are not visible
>    at the Job level.
> 
I'll try to merge those with Cleber's ideas, thanks.

> There we are: 3 simple paragraphs that explain the concept in an
> abstract way. Now it would be a matter of exploring it a bit
> further, providing examples, refinements and discussing the API.
> 
>>
>>>
>>>>
>>>> On the negative side, an avocado test is not the smallest piece of code and
>>>> it adds quite a bit of overhead. But for simpler code, one can execute the
>>>> code directly (threads, remoter) without framework support.
>>>
>>> What do you have in mind when you say "not the smallest piece of
>>> code" and "overhead"?
>>>
>> This is linked to v2 and Cleber's questionnaire. We can execute
>> functions, classes, ... . So limiting this only to avocado-like tests
>> decreases the granularity, but I see that as a benefit, and nothing
>> prevents users from using standard python means to do this.
> 
> With the definitions I proposed above, one should be free to
> implement it in multiple ways. Two examples:
> 
> (pseudo-code)
> 
> Example 1:
> 
> multi.py:
>     class MultiExample(avocado.Test):
>         def setUp(self):
>             whatever...
> 
>         def test(self):
>             ...
>             run_stream(Machine1())
>             run_stream(Machine2())
>             ...
> 
>     @stream ## don't be listed or run by default
>     class Machine1(avocado.Test):
>         def test(self):
>             ...
We have the system of docstrings for such things, so we can expand it,
or use a different test class. As I said, as long as we also support
providing tests by reference, I'm fine with it, and I see that some
users might like to bundle the test and the worker code together.
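
For illustration only, roughly what I have in mind with the docstring
approach. The ':avocado: disable' directive should already exist (if I
recall correctly); a dedicated stream/worker directive would be new and
is purely hypothetical here:

    import avocado


    class NetServer(avocado.Test):
        """
        Server half of a netperf-like scenario, meant to run as a stream.

        :avocado: disable
        """
        # ':avocado: disable' hides the class from 'avocado list' and from
        # a plain 'avocado run'; a dedicated (hypothetical) directive could
        # be added instead, so the class is skipped by default but stays
        # resolvable when referenced explicitly as netserver.py:NetServer.
        def test(self):
            pass  # the actual worker code would go here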

> 
>     @stream ## don't be listed or run by default
>     class Machine2(avocado.Test):
>         def test(self):
>             ...
> 
> $ avocado run multi.py --> runs only MultiExample:test
> $ avocado list multi.py --> lists only MultiExample:test
> $ avocado run multi.py:Machine2:test --> runs Machine2:test as a
>                                          test at the job level.
> 
> Example 2:
> 
>     class MultiExample(avocado.Test):
>         def setUp(self):
>             whatever...
> 
>         def test(self):
>             ...
>             run_stream(resolve("passtest.py"), strict=True)
>             run_stream(resolve("passtest.py"), strict=True)
> 
> The 'strict=True', when passed to resolve() means it should
> return one and only one Test Name (like I said above, I don't
> like the idea of a stream running multiple tests).
> 
> BTW, this would require a clarification that the Avocado Resolver
> is responsible for resolving and instantiating Test Classes,
> making them ready to be run. But that's an implementation detail.
> 
This is actually part of the runner workflow and a reason why I want to
be able to reuse it.

>>
>>>>
>>>> Example (simplified):
>>>>
>>>>     import avocado
>>>>
>>>>     class WgetExample(avocado.Test):
>>>>         def setUp(self):
>>>>             self.streams = avocado.Streams(self)
>>>>             for machine in machines:
>>>>                 self.streams.add_stream(machine)
>>>>         def test(self):
>>>>             for stream in self.streams:
>>>>                 stream.run_bg("/usr/bin/wget example.org")
>>>>             self.streams.wait(ignore_errors=False)
>>>>
>>>> where the `avocado.Stream` represents a worker (local or remote) which
>>>> allows running avocado tests in it (foreground or background). This should
>>>> provide enough flexibility to combine existing tests into complex tests.
>>>
>>> I'm not sure I understand the purpose of the example above... You
>>> explained you intent of using "avocado tests", but that's not
>>> what you're doing here.
>>>
>>> Or do you mean you're running "/usr/bin/wget example.org" as a
>>> "simple test" (as in "not an instrumented test")?
>> OK, this is my biggest failure. I forgot to mention that in this first
>> version the `stream.run_bg` accepts `Test References`, therefore the
>> `"/usr/bin/wget example.org"` would be resolved by the avocado resolver into
>> SimpleTest("/usr/bin/wget example.org") and it would be executed.
>>
>> I'm sorry for all the confusion. Additionally I'm open to ideas about
>> whether to accept the "Test Reference" or initialized classes. The
>> Job API RFC took on the goal of implementing resolvers and this code could
>> benefit from it, so the critical code above would need additional steps:
>>
>>     test = resolver.resolve("/usr/bin/wget example.org")
>>     stream.run_bg(test)
>>
>> I think the first version could live without this and use the loaders
>> from this avocado execution; later, when we have the `resolver` API, we
>> can support both.
> 
> Let's see what you propose in v4, then we can refine it.
> 
>>
>>>
>>>>
>>>> Instead of using a plugin library for streams, we can develop it as another
>>>
>>> What do you mean by "using plugin library" here?
>>>
>>>> test variant (not a new test type, only avocado.Test with some additional
>>>> initialization), called `avocado.MultiTest` or `avocado.NestedTest`:
>>>>
>>>>     import avocado
>>>>
>>>>     class WgetExample(avocado.NestedTest):
>>>>         # Machines are defined via params and initialized
>>>>         # in NestedTest.setUp
>>>>         def test(self):
>>>>             for stream in self.streams:
>>>>                 stream.run("/usr/bin/wget example.org")
>>>>             self.wait(ignore_errors=False)
>>>
>>> I'm lost here (sorry). The only difference I see between the two
>>> examples are:
>>>
>>>  - WgetExample now inherits from NestedTest instead of Test
>>>  - Instead of self.streams.wait(), you use self.wait().
>>>
>>> Can you provide a more concrete example here in this section?
>>> Maybe wget is too simplistic because it requires no
>>> instrumentation at all. Your example from the conclusion is
>>> better, you could have used it here.
>>>
>> My fault, I should have included the `avocado.NestedTest` definition:
>>
>>     class NestedTest(avocado.Test):
>>
>>         def setUp(self):
>>             self.streams = avocado.Streams(self)
>>             for machine in self.params.get("machines",
>>                                            "/plugins/nested_tests/",
>>                                            []):
>>                 self.streams.add_stream(machine)
>>
>>         def wait(self):
>>             return self.streams.wait()
>>         ...
>>
>> So it's not a new test type, it's a simple avocado.Test with some
>> pre-defined variables and methods, which could be extended with the common
>> patterns.
> 
> OK, at least I understand it now. I don't think this is really
> necessary, but I'll wait for v4.
> 
The difference is that if we develop __only__ the test class, this can be
part of it rather than a library.

The difference is in references and the way of using it (a library needs
to contain a reference to the assigned test). Anyway, the majority is for
the library and I'm only slightly inclined towards the new test type, so
v4 will contain only the library version.

>>
>>>>
>>>>
>>>> API backed by internal API
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>> _supported by Cleber in v2 and I agree now_
>>>>
>>>> This would implement the nested test API using the internal API (from
>>>> avocado.core).
>>>>
>>>> + runs native python
>>>> + easy interaction and development
>>>> + easily extensible by either using the internal API (and risking changes)
>>>> or by inheriting and extending the features.
>>>> - lots of internal API will be involved, thus with almost every change of
>>>> the internal API we'd have to adjust this code to keep NestedTest working
>>>> - fabric/paramiko is not thread/parallel-process safe and fails badly, so
>>>> first we'd have to rewrite our remote execution code (use autotest's
>>>> worker, or aexpect+ssh)
>>>>
>>>>
>>>> API backed by cmdline
>>>> ~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>> _liked by me in v2, hated by others, rejected in v3_
>>>>
>>>> This would implement the nested test API by translating it into "avocado
>>>> run" commands.
>>>>
>>>> + easy to debug as users are used to the "avocado run" syntax and its issues
>>>> + allows a manual mode where users trigger the "avocado run" commands manually
>>>> + cmdline args are part of the public API so they should stay stable
>>>> + no issues with fabric/paramiko as each process is separate
>>>> + even more easily extensible, as one just needs to implement the feature for
>>>> "avocado run" and can then use it as extra_params in the worker, or send a PR
>>>> to support it in the stable environment.
>>>> - would require additional features to be available on the cmdline, like a
>>>> streamlined way of triggering tests
>>>> - only features available on the cmdline can be supported (currently not
>>>> limiting)
>>>> - relies on stdout parsing (but avocado supports machine-readable output)
>>>>
>>>
>>> I think these two items are confusing for this RFC. We miss the
>>> context of why they were liked or "hated". Maybe you can simply
>>> drop them (that's how RFCs work: we refine newer versions based
>>> on conclusions and feedback from previous discussions).
>>>
>> Again, as previously only Cleber had commented, I didn't want to
>> remove any sections before getting the feedback. Now it's time to
>> remove them.
>>
>>>>
>>>> Synchronization
>>>> ===============
>>>>
>>>> Some tests do not need any synchronization, users just need to run them. But
>>>> some multi-stream tests need to be precisely synchronized or they need to
>>>> exchange data.
>>>>
>>>> For synchronization purposes usually "barriers" are used, where a barrier
>>>> guards the entry into a section identified by "name" and "number of
>>>> clients". All parties asking to enter the section will be delayed until the
>>>> "number of clients" reaches the section (or a timeout occurs). Then they are
>>>> resumed and can enter the section. Any failure while waiting for a barrier
>>>> propagates to the other waiting parties.
>>>>
>>>> This can all be automated inside `avocado.Streams`, which could start
>>>> listening on a free port and pass this information to the executed code
>>>> blocks. In the code blocks one simply imports `Sync`, initializes it with
>>>> the address+port and can use it for synchronization (or later for data
>>>> exchange).
>>>>
>>>>     from avocado.plugins.sync import Sync
>>>>     # Connect to the sync server at the address stored in params,
>>>>     # which could be injected by the multi-stream test
>>>>     # or set manually.
>>>>     sync = Sync(self, params.get("sync_server", "/plugins/sync_server"))
>>>>     # wait until 2 tests ask to enter "setup" barrier (60s timeout)
>>>>     sync.barrier("setup", 2, 60)
>>>>
>>>> As before, it can be part of the "NestedTest" test, initialized based on
>>>> params without the need for boilerplate code. The result would be the same:
>>>> avocado listens on some port and the tests can connect to this port and ask
>>>> for a barrier/data exchange, with support for re-connection.
>>>>
>>>> For debugging purposes it might be useful to allow starting the sync server
>>>> as an avocado plugin, eg. by `--sync-server ...` (or having another command
>>>> just to start listening, eg `avocado syncserver`). With that one could spawn
>>>> the multiple processes manually, without the need to run the main
>>>> multi-stream test, and communicate over this manually started server, or
>>>> even just debug the behavior of one existing piece of the bigger test and
>>>> fake the other components by sending the messages manually instead (eg. to
>>>> see how it handles errors, timing issues, unexpected situations).
>>>>
>>>> Again, there are several ways to implement this:
>>>>
>>>> Standard multiprocess API
>>>> -------------------------
>>>>
>>>> The standard python multiprocessing library supports synchronization over
>>>> TCP. The only problem is that "barriers" were introduced in python3, so we'd
>>>> have to backport them. Additionally it does not fit 100% of our needs, so
>>>> we'd have to adjust it a bit (eg. to allow manual interaction).
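
To make this option more concrete, a minimal sketch (illustrative only,
not avocado code, Python 3) of sharing a barrier over TCP with
multiprocessing.managers:

    # Illustrative sketch: share a Python 3 threading.Barrier over TCP.
    import threading
    from multiprocessing.managers import BaseManager

    class SyncManager(BaseManager):
        pass

    # server side -- e.g. started by the main multi-stream test
    barriers = {"setup": threading.Barrier(2, timeout=60)}
    SyncManager.register("get_barrier", callable=barriers.get)
    server = SyncManager(address=("", 6001), authkey=b"avocado")
    server.get_server().serve_forever()   # blocks; run in a helper thread

    # client side -- each executed code block (separate process)
    SyncManager.register("get_barrier")
    client = SyncManager(address=("127.0.0.1", 6001), authkey=b"avocado")
    client.connect()
    client.get_barrier("setup").wait()    # resumes once 2 parties arrive
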
>>>>
>>>>
>>>> Autotest's syncdata
>>>> -------------------
>>>>
>>>> Python 2.4 friendly, supports barriers and data synchronization. On the
>>>> other hand it's quite hackish and full of shortcuts.
>>>>
>>>>
>>>> Custom code
>>>> -----------
>>>>
>>>> We can take inspiration from the above and create a simple human-readable
>>>> (easy to debug or interact with manually) protocol to support barriers and
>>>> data exchange via pickling. IMO that would be easier to maintain than
>>>> backporting and adjusting multiprocessing or fixing the autotest syncdata. A
>>>> proof-of-concept can be found here:
>>>>
>>>>     https://github.com/avocado-framework/avocado/pull/1019
>>>>
>>>> It modifies the "passtest" so that it is only executed when run by 2 tests
>>>> at the same time. The proof-of-concept does not support multi-stream tests,
>>>> so one has to run "avocado run passtest" twice using the same
>>>> "--sync-server" (once --sync-server and once --sync).
>>>>
>>>
>>> Architecturally speaking, it doesn't make much difference.
>>>
>>> I consider this an implementation detail: the user should be free
>>> to use whatever mechanism they want to synchronize tests. We'll
>>> probably provide something via the avocado.utils library for
>>> convenience.
>>>
>> OK
>>
>>>>
>>>> Job API RFC
>>>> ===========
>>>>
>>>> The recently introduced Job API RFC covers a very similar topic to "nested
>>>> tests", but it's not the same. The Job API enables users to modify the job
>>>> execution, eventually even to write a runner which would suit them to run
>>>> groups of tests. By contrast, this RFC covers a way to combine
>>>> code-blocks/tests and reuse them inside a single test. In a hackish way, they
>>>> can supplement each other, but the purpose is different.
>>>>
>>>> One of the most obvious differences is that a failed "nested" test can be
>>>> intentional (eg. reusing the NetPerf test to check if unreachable machines
>>>> can talk to each other), while in the Job API it's always a failure.
>>>>
>>>> I hope you see the pattern. They are similar, but on a different layer.
>>>> Internally, though, they can share some pieces, like executing the individual
>>>> tests concurrently with different params/plugins (locally/remotely). All the
>>>> needed plugin modifications would also be useful for both of these RFCs.
>>>>
>>>> Some examples:
>>>>
>>>> User1 wants to run the "compile_kernel" test on a machine followed by
>>>> "install_compiled_kernel passtest failtest warntest" on "machine1 machine2".
>>>> They depend on the status of the previous test, but they don't create a
>>>> scenario. So the user should use the Job API (or execute 3 jobs manually).
>>>>
>>>> User2 wants to create a migration test, which starts migration from machine1
>>>> and receives the migration on machine2. It requires cooperation and together
>>>> it creates one complex use case, so the user should use a multi-stream test.
>>>>
>>>>
>>>> Conclusion
>>>> ==========
>>>>
>>>> Given the reasons above, I like the idea of "nested tests" using the "API
>>>> backed by internal API", as it is simple to start with, allows test reuse
>>>> which gives us the well-known test result format, and the internal API
>>>> allows greater flexibility for the future.
>>>>
>>>> The netperf example from introduction would look like this:
>>>>
>>>> Machine1:
>>>>
>>>>     class NetServer(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             process.run("netserver")
>>>>             self.barrier("setup", self.params.get("no_clients"))
>>>>         def test(self):
>>>>             pass
>>>>         def tearDown(self):
>>>>             self.barrier("finished", self.params.get("no_clients"))
>>>>             process.run("killall netserver")
>>>>
>>>> Machine2:
>>>>
>>>>     class NetPerf(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             self.barrier("setup", self.params.get("no_clients"))
>>>>         def test(self):
>>>>             process.run("netperf -H %s -l 60"
>>>>                         % self.params.get("server_ip"))
>>>>             self.barrier("finished", self.params.get("no_clients"))
>>>>
>>>> One would be able to run this manually (or from build systems) using:
>>>>
>>>>     avocado syncserver &
>>>>     avocado run NetServer \
>>>>         --mux-inject /plugins/sync_server:sync-server $SYNCSERVER &
>>>>     avocado run NetPerf \
>>>>         --mux-inject /plugins/sync_server:sync-server $SYNCSERVER &
>>>>
>>>> (where the --mux-inject passes the address of the "syncserver" into test
>>>> params)
>>>
>>> I think using --mux-inject should be strongly discouraged if one
>>> is not using the multiplexer. I know that's currently the only
>>> way to provide parameters to a test, but this should IMO be
>>> considered a bug. Using it in a RFC may actually *encourage*
>>> users to use it.
>>>
>> As I said, it's currently the only sane way of passing params. I can add
>> the note, but that does not change this fact.
>>
>>>>
>>>> When the code is stable one would write this multi-stream test (or multiple
>>>> variants of it) to do the above automatically:
>>>>
>>>>     class MultiNetperf(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             self.failif(len(self.streams) < 2)
>>>>         def test(self):
>>>>             self.streams[0].run_bg("NetServer",
>>>>                                    {"no_clients": len(self.streams)})
>>>>             for stream in self.streams[1:]:
>>>>                 stream.add_test("NetPerf",
>>>>                                 {"no_clients": len(self.streams),
>>>>                                  "server_ip": machines[0]})
>>>>             self.wait(ignore_failures=False)
>>>
>>> I don't understand why NestedTest is used all the time. It think
>>> it's not necessary (we could use composition instead of
>>> inheritance).
>> Yes, as mentioned earlier, NestedTest is just a convenience (and another
>> way of limiting the public API).
>>
>>>
>>> Let me give the same example using a different API implementation
>>> and you tell me if you see something *architecturally* wrong with
>>> it, or if these are just *implementation details* that still
>>> match your original idea:
>>>
>>> netperf.py:
>>>
>>> ```
>>>   import avocado
>>>   from avocado import multi
>>>   from avocado.utils import process
>>>
>>>   class NetPerf(avocado.Test):
>>>       def test(self):
>>>           s_params = ... # server parameters
>>>           c_params = ... # client parameters
>>>
>>>           server = NetServer()
>>>           client = NetClient()
>>>
>>>           m = multi.Streams()
>>>           ...
>>>           m.run_bg(server, s_params, ...)
>>>           m.run_bg(client, c_params, ...)
>>>           m.wait(ignore_errors=False)
>>>   
>>>   class NetServer(multi.TestWorker):
>>>       def test(self):
>>>           process.run("netserver")
>>>           self.barrier("server", self.params.get("no_clients"))
>>>       def tearDown(self):
>>>           self.barrier("finished", self.params.get("no_clients"))
>>>           process.run("killall netserver")
>>>
>>>   class NetClient(multi.TestWorker):
>>>       def setUp(self):
>>>           self.barrier("server", self.params.get("no_clients"))
>>>       def test(self):
>>>           process.run("netperf -H %s -l 60"
>>>                       % self.params.get("server_ip"))
>>>           self.barrier("finished", self.params.get("no_clients"))
>>> ```
>>>
>>>  $ avocado list netperf.py --> returns *1* Test (NetPerf:test)
>>>  $ avocado run netperf.py --> runs this *1* Test
>>>
>>> But given multi.TestWorker is implemented as a "class that
>>> inherits from Test", for debug purposes users could run them
>>> individually, without any warranty or expectation that they'll
>>> work consistently given it'll miss the instrumentation and
>>> parameter handling that the main test does (the code from
>>> netperf.py:NetPerf:test). Example:
>> As I told you during our chat, I'm not inclined towards a special test
>> type which is not executed by default, but is executable when specified
>> by full path. But as long as we also support running normal tests, I'm
>> fine with that. It'll just be yet another exception in assessing the
>> Test type.
> 
> Maybe we can use a decorator... Or we use a different method
> name: instead of .test(), it could be .worker(), which is not
> going to be run or listed by default - then we allow the test
> runner to run these methods if we call them explicitly.
> 
Avocado contains docstring tags and different test types for that, so we
already have the means. The only "problem" is yet another condition in
the discovery. I'm fine with that if we support discovery from different
files too.

>>
>> I'm fine with writing multi-tests in multiple files, it'd be more
>> natural to me as I tend to re-use the code whenever possible.
>> Including the TestWorker in one test would make the other test look
>> weird (consider a Migrate test asking for the `netperf.py:NetClient.test`
>> test).
> 
> I think you're turning a corner case into a primary use-case to
> justify a particular implementation detail. Let me elaborate:
> 
> You could run netperf.py:NetClient.test as a stream inside a
> migration test, but I don't understand why you would do that.
> That's not a common use-case, that's an exceptional corner case.
> 
> To me, writing multiple tests (files) is something I would do if
> running them individually was my primary use-case (and for that I
> would likely use the Job API). In this case running them as
> streams would be an exception and I would use the resolver for
> that (like your case of running NetClient inside a migration
> test).
> 
> For the most important examples we've seen so far (netperf and
> migration), I would certainly prefer to have the implementation
> inside a single file, because I usually don't have to run the
> parts individually as single tests. In this case running them
> individually as Tests at the Job level would still be possible,
> but it would be an exception (avocado run multi.py:Class:name).
> 
I'm not sure it's a corner case. One of the avocado-vt tests runs an
autotest test inside the VM and it's used extensively.

The other use case is running stress test(s) together with your test,
which is also extensively used in avocado-vt, but usually implemented
in-the-code (various implementations starting with "autotest ... ;
killall autotest" and ending with a new session which is simply closed
along with the underlying processes).

They are not multi-host, but they are really common.

The last example is not from autotest, but Jenkins. It also allows
splitting the task into jobs, and there are several plugins to build
pipelines, which is really handy. It's used to share the
preparation/compiling/gathering parts and sometimes to build complex
scenarios from different jobs. The main point is that the jobs are not
used just once, but are reused from different pipelines.

>>
>>>
>>>  $ avocado run netperf.py:NetClient
>>>    -> runs NetClient as a standalone test (in this example it
>>>    won't work unless we provide the right parameters, for
>>>    example, via the multiplexer)
>>>  $ avocado run netperf.py:NetServer
>>>    -> runs NetServer as a standalone test (same thing)
>>>
>>> The main justification I see for the existence of test.TestWorker
>>> is to prevent the test runner from discovering these tests by
>>> default ('avocado run' and 'avocado list'). Maybe we could do
>>> things differently (again, composition instead of inheritance)
>>> and get rid of test.TestWorker. Just an idea, I'm not sure.
>>>
>>>>
>>>> Executing the complex example would become:
>>>>
>>>>     avocado run MultiNetperf
>>>>
>>>> You can see that the test allows running several NetPerf tests
>>>> simultaneously, either locally, or distributed across multiple machines (or
>>>> combinations) just by changing parameters. Additionally by adding features
>>>> to the nested tests, one can use different NetPerf commands, or add other
>>>> tests to be executed together.
>>>>
>>>> The results could look like this:
>>>>
>>>>
>>>>     $ tree $RESULTDIR
>>>>       └── test-results
>>>>           └── MultiNetperf
>>>>               ├── job.log
>>>>                   ...
>>>>               ├── 1
>>>>               │   └── job.log
>>>>                       ...
>>>>               └── 2
>>>>                   └── job.log
>>>>                       ...
>>>>
>>>> Where the MultiNetperf/job.log contains combined logs of the "master" test
>>>> and all the "nested" tests and the sync server.
>> Oops, a second copy&paste error: it was supposed to be `debug.log`, and
>> there is actually yet another problem with the number of tests in each
>> stream. So let's open the nest and define it now; when the discussion
>> settles a bit I'll send the v4.
>>
>> The results should look either like this:
>>
>>     job-2016-04-15T.../
>>     ├── id
>>     ├── job.log
>>     └── test-results
>>         └── 1-MultiNetperf
>>             ├── debug.log
>>             ├── stream1
>>             │   ├── 1
>>             │   │   ├── debug.log
>>             │   │   └── whiteboard
>>             │   └── 2
>>             │       ├── debug.log
>>             │       └── whiteboard
>>             ├── stream2
>>             │   └── 1
>>             │       ├── debug.log
>>             │       └── whiteboard
>>             └── whiteboard
>>
>>
>> Where:
>>
>> * in human/xunit/json results only 1 test is listed.
>> * 1-MultiNetperf/debug.log contains all logs from both workers (and the
>> test itself)
>> * stream1, stream2 - directories containing the output of the streams
>> * stream1/[12] - means that stream1 executed 2 tests (or test
>> methods of the same class). Each should produce results in a separate
>> directory
>> * stream2/1 - executed one test, these are its results
>>
>>
>> This is a bit controversial and was not much described earlier. But
>> currently `stream1.run_bg(test)` can result in one or more tests.
>> Additionally I think it should be possible to call this multiple times,
>> so in the end stream1 should report not PASS/FAIL, but a list of the
>> results.
> 
> Now that I understand you intend to have streams running multiple
> tests, I see why you proposed this. But as I said above, I don't
> think this kind of flexibility should be available at the Test
> level. It's an unnecessary indirection that makes a test behave
> like a job. I hope you reconsider this.
> 
I don't see it as extra flexibility. A stream is:

    CONTINUOUS SERIES: a long and almost continuous series of events,
    people, objects, etc; a stream of traffic; a stream of abuse;
    a steady/constant/endless stream; "A steady stream of visitors
    came to the house."

So in my view a stream is an open channel to which users can assign work
(Test-like instances) and it will execute them in series, producing a
list of results (directories 1, 2, 3, ...) and reporting a list of
results [{'status': 'PASS', ...}, {}, {}, ...].

Users can either do it in the background and ask the stream whether it
has finished all the work, or do it in the foreground, which adds the
task at the end and returns when all tasks are finished. So far I think
"stream" is what represents that, but I can also rename it to "queue" if
you think that's better.
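
To illustrate the semantics I mean, using the hypothetical API sketched
earlier in this RFC (nothing here exists yet):

    stream = self.streams[0]                    # one open channel/worker
    stream.run_bg("NetPerf")                    # queue work, don't block
    stream.run_bg("/usr/bin/wget example.org")  # queue more work
    results = stream.wait()                     # run in series, then eg.:
    # [{'status': 'PASS', ...}, {'status': 'PASS', ...}]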

>>
>> Another solution would be to say that a stream only supports a single
>> test, or even to avoid the whole concept and only allow triggering
>> foreground/background tests. Then in the results we'd create separate
>> results for each executed test. The same example would produce:
>>
>>     job-2016-04-16T.../
>>     ├── id
>>     ├── job.log
>>     └── test-results
>>         └── 1-MultiNetperf
>>             ├── debug.log
>>             ├── whiteboard
>>             ├── 1
>>             │   ├── debug.log
>>             │   └── whiteboard
>>             ├── 2
>>             │   ├── debug.log
>>             │   └── whiteboard
>>             └── 3
>>                 ├── debug.log
>>                 └── whiteboard
>>
>> Where instead of `stream1` and `stream2` we have 3 streams, where 1 and
>> 3 will be executed one after another, and 2 concurrently with 1&3.
> 
> That's what I proposed in my previous e-mail as a directory
> structure. 1,2,3 would be a serialized version of the Test Name
> being run, maybe with a prefix to make them unique and identify
> them as streams.
> 
> Looking forward to your v4.
> 
> Thanks.
>    - Ademar
> 
I'll remove the unnecessary code and focus on your (and Cleber's)
feedback. Thank you,

Lukáš

>>
>> The difference is that in the original approach, one initializes the
>> connection once per stream and then adds work to be done
>> (background/foreground). In this other approach the user would always
>> have to specify on which machine they want to execute the test and a new
>> connection would have to be established.
>>
>> I hope it's understandable, I'll try to describe it in v4 better.
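
Maybe a tiny illustration already helps, using the hypothetical API from
this RFC (none of these calls exist yet):

    # Original approach: connect once per stream, then queue work on it.
    stream = self.streams.add_stream("machine1")   # connection set up once
    stream.run_bg("NetServer")
    stream.run_bg("NetPerf")                       # re-uses the connection

    # Alternative approach: no persistent streams; every test names its
    # machine, so a new connection is established per executed test.
    self.run_bg("NetServer", machine="machine1")
    self.run_bg("NetPerf", machine="machine1")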
>>
>>>>
>>>> Directories [12] contain results of the created (possibly even named)
>>>> streams. I think they should be in the form of a standard avocado Job to
>>>> keep the well-known structure.
>>>
>>> Only one job was executed, so there shouldn't be multiple job.log
>>> files. The structure should be consistent with what we already
>>> have:
>>>
>>>     $ tree job-2016-04-15T.../
>>>     job-2016-04-15T.../
>>>     ├── job.log
>>>     ├── id
>>>     ├── replay/
>>>     ├── sysinfo/
>>>     └── test-results/
>>>         ├── 01-NetPerf/ (the serialized Test ID)
>>>         │   ├── data/
>>>         │   ├── debug.log
>>>         │   ├── whiteboard
>>>         │   ├── ...
>>>         │   ├── NetServer/ (a name, not a Test ID)
>>>         │   │   ├── data/
>>>         │   │   ├── ...
>>>         │   │   └── debug.log
>>>         │   └── NetClient (a name, not a Test ID)
>>>         │       ├── data/
>>>         │       ├── ...
>>>         │       └── debug.log
>>>         ├── 02... (other Tests from the same job)
>>>         ├── 03... (other Tests from the same job)
>>>         ...
>>>
>>>
>>> Finally, I suggest you also cover the other cases you introduced
>>> in the beginning of this RFC.
>>>
>>> For example, if multi.run() is implemented in a flexible way, we
>>> could actually run Avocado Tests (think of Test Name) in multiple
>>> streams:
>>>
>>>     ...
>>>     from avocado import resolver
>>>     ...
>>>
>>>     t = resolver("passtest.py", strict=True)
>>>     multi.run_bg(t, env1, ...)
>>>     multi.run_bg(t, env2, ...)
>>>     multi.run_bg(t, env3, ...)
>>>     multi.wait(ignore_errors=False)
>> Yep, this is the resolver work, which is currently not available
>> externally, but generally I agree with it. I think we can support both.
>>
>>>
>>> Thanks.
>>>    - Ademar
>>>
>> Thank you for the extensive feedback (personal and email).
>>
>> Lukáš
>>
> 
> 
> 
> 

