[Avocado-devel] RFC: multi-stream test (previously multi-test) [v3]

Fri Apr 15 06:05:09 UTC 2016

Hello again,

There have been a couple of changes, plus the new Job API RFC, which
might sound similar to this RFC but covers different parts. Let's update
the multi-test RFC and fix the terminology, which might have been a bit
misleading.

Changes:

     v2: Rewritten from scratch
     v2: Added examples for the demonstration to avoid confusion
     v2: Removed the mht format (which was there to demonstrate manual
         execution)
     v2: Added 2 solutions for multi-tests
     v2: Described ways to support synchronization
     v3: Renamed to multi-stream as it befits the purpose
     v3: Improved introduction
     v3: Workers are renamed to streams
     v3: Added example which uses library, instead of new test
     v3: Multi-test renamed to nested tests
     v3: Added section regarding Job API RFC
     v3: Better description of the Synchronization section
     v3: Improved conclusion
     v3: Removed the "Internal API" section (it was a transition between
         no support and "nested test API", not a "real" solution)
     v3: Using per-test granularity in nested tests (requires plugins
         refactor from Job API, but allows greater flexibility)

The problem
===========

Allow tests to have some of their blocks of code run in separate
stream(s). We'll discuss the range of "block of code" further in the text.

One example could be a user who wants to run netperf on 2 machines,
which requires the following manual steps:

     machine1: netserver -D
     machine1: # Wait till netserver is initialized
     machine2: netperf -H $machine1 -l 60
     machine2: # Wait till it finishes and report the results
     machine1: # stop the netserver and report possible failures

The test would have to contain the code for both machine1 and machine2,
and it would execute them in two separate streams, which might or might
not be executed on the same machine.

You can see that each stream is valid even without the other, so an
additional requirement would be to allow easy sharing of those blocks of
code among other tests. Splitting the problem in two could also
sometimes help in analyzing the failures.

Some other examples might be:

1. A simple stress routine being executed in parallel (on the same or
different hosts)
2. Several code blocks being combined into complex scenario(s)
3. Running the same test along with a stress test in the background

For demonstration purposes this RFC uses a very simple example fitting
into category (1). It downloads the main page from the "example.org"
location using "wget" (almost) concurrently from several machines.

Standard python libraries
-------------------------

One can run pieces of python code directly using python's
multiprocessing library, without any need for avocado-framework
support. But there are quite a lot of cons:

+ no need for framework API
- lots of boilerplate code in each test
- each solution would be unique and therefore the logs would be hard to
analyze
- no decent way of sharing the code with other tests

Yes, it's possible to share the code by writing libraries, but that does
not scale as well as the other solutions...

Example (simplified):

     from avocado.core.remoter import Remote
     from threading import Thread
     ...
     class Wget(Thread):

         def __init__(self, machine, url):
             # Thread.__init__ must run before the thread can be started
             super(Wget, self).__init__()
             self.remoter = Remote(machine)
             self.url = url
             self.status = None

         def run(self):
             ret = self.remoter.run("wget %s" % self.url,
                                    ignore_status=True)
             self.status = ret.exit_status
     ...

     threads = []
     for machine in machines:
         threads.append(Wget(machine, "example.org"))
     for thread in threads:
         thread.start()
     for thread in threads:
         thread.join()
         # a non-zero exit status means the download failed
         self.failif(thread.status != 0, ...)
     ...

This should serve the purpose, but to be able to understand failures,
one would have to add a lot of additional debug information, and anyone
who wanted to re-use Wget in other tests would have to turn it into a
library shared with all the tests.
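
For reference, the same pattern can be sketched with nothing but the
standard library. This is only an illustration under stated assumptions:
the `Command` class and the `run_all` helper are hypothetical names, and
local subprocesses stand in for the remote machines (no avocado code is
involved).

```python
import subprocess
import sys
from threading import Thread


class Command(Thread):
    """Run one command in its own thread and remember its exit status."""

    def __init__(self, argv):
        super().__init__()
        self.argv = argv
        self.status = None

    def run(self):
        # check=False plays the role of ignore_status=True above
        ret = subprocess.run(self.argv, check=False,
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
        self.status = ret.returncode


def run_all(commands):
    """Start all commands (almost) concurrently, return exit statuses."""
    threads = [Command(argv) for argv in commands]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [thread.status for thread in threads]


if __name__ == "__main__":
    # Stand-ins for running "wget example.org" on several machines
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(3)"]
    print(run_all([passing, passing, failing]))
```

Even in this small form, everything beyond the exit status (logs,
per-stream results, timeouts) would have to be added by hand in each
test, which is the boilerplate problem listed above.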

Nested tests API
----------------

Another approach would be to say the "block of code" is a full avocado
test. The main benefits are that each avocado test provides additional
debug information in a well-established format people are used to from
normal tests, that the complex problem can be split into separate parts
(including separate development), and that existing tests (eg. stress
test, server setup, ...) can be easily shared and put together like
Lego into complex scenarios.

On the negative side, an avocado test is not the smallest piece of code
and it adds quite a bit of overhead. But for simpler code, one can
execute the code directly (threads, remoter) without framework support.

Example (simplified):

     import avocado

     class WgetExample(avocado.Test):
         def setUp(self):
             self.streams = avocado.Streams(self)
             for machine in machines:
                 self.streams.add_stream(machine)
         def test(self):
             for stream in self.streams:
                 stream.run_bg("/usr/bin/wget example.org")
             self.streams.wait(ignore_errors=False)

where `avocado.Streams` holds the streams, each of which represents a
worker (local or remote) that can run avocado tests in the foreground or
background. This should provide enough flexibility to combine existing
tests into complex tests.

Instead of using a plugin library for streams, we can develop it as
another test variant (not a new test type, only avocado.Test with some
additional initialization), called `avocado.MultiTest` or
`avocado.NestedTest`:

     import avocado

     class WgetExample(avocado.NestedTest):
         # Machines are defined via params and initialized
         # in NestedTest.setUp
         def test(self):
             for stream in self.streams:
                 stream.run_bg("/usr/bin/wget example.org")
             self.wait(ignore_errors=False)
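
To make the intended stream semantics more concrete, here is a rough
local-only sketch. It is not the proposed implementation: the `Stream`
and `Streams` classes below are hypothetical stand-ins that run plain
commands through `subprocess` instead of running avocado tests on
(possibly remote) workers, and only `add_stream`, `run_bg` and
`wait(ignore_errors=...)` are taken from the examples above.

```python
import subprocess
import sys


class Stream:
    """Hypothetical stand-in for one worker; runs commands locally."""

    def __init__(self, name):
        self.name = name
        self.procs = []

    def run_bg(self, argv):
        # Start the command and return immediately
        self.procs.append(subprocess.Popen(argv))

    def wait(self):
        # Collect exit statuses of everything started on this stream
        statuses = [proc.wait() for proc in self.procs]
        self.procs = []
        return statuses


class Streams:
    """Hypothetical container of streams, iterable like in the RFC."""

    def __init__(self):
        self._streams = []

    def add_stream(self, name):
        stream = Stream(name)
        self._streams.append(stream)
        return stream

    def __iter__(self):
        return iter(self._streams)

    def wait(self, ignore_errors=False):
        results = {stream.name: stream.wait() for stream in self._streams}
        failed = {name: sts for name, sts in results.items() if any(sts)}
        if failed and not ignore_errors:
            raise RuntimeError("streams failed: %r" % failed)
        return results


if __name__ == "__main__":
    streams = Streams()
    for machine in ("machine1", "machine2"):
        streams.add_stream(machine)
    for stream in streams:
        stream.run_bg([sys.executable, "-c", "print('fetching')"])
    print(streams.wait(ignore_errors=False))
```

A real implementation would additionally need per-stream result
directories and remote execution, which is where the two backends
discussed next differ.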

API backed by internal API
~~~~~~~~~~~~~~~~~~~~~~~~~~

_supported by Cleber in v2 and I agree now_

This would implement the nested test API using the internal API (from 
avocado.core).

+ runs native python
+ easy interaction and development
+ easily extensible, either by using the internal API (and risking
changes) or by inheriting and extending the features.
- a lot of internal API will be involved, so with almost every change
of the internal API we'd have to adjust this code to keep the NestedTest
working
- fabric/paramiko is not thread/parallel-process safe and fails badly, so
we'd first have to rewrite our remote execution code (use autotest's
worker, or aexpect+ssh)

API backed by cmdline
~~~~~~~~~~~~~~~~~~~~~

_liked by me in v2, hated by others, rejected in v3_

This would implement the nested test API by translating it into "avocado 
run" commands.

+ easy to debug as users are used to the "avocado run" syntax and issues
+ allows manual mode where users trigger the "avocado run" manually
+ cmdline args are part of public API so they should stay stable
+ no issues with fabric/paramiko as each process is separate
+ even more easily extensible, as one just needs to implement the
feature for "avocado run" and can then use it as extra_params in the
worker, or send a PR to support it in the stable environment.
- would require additional features to be available on the cmdline,
like a streamlined way of triggering tests
- only features available on the cmdline can be supported (currently not
limiting)
- relies on stdout parsing (but avocado supports machine readable output)

Synchronization
===============

Some tests do not need any synchronization; users just need to run them.
But some multi-stream tests need to be precisely synchronized or need
to exchange data.

For synchronization purposes "barriers" are usually used, where a
barrier guards the entry into a section identified by a "name" and a
"number of clients". All parties asking for entry into the section are
delayed until the "number of clients" have reached the section (or a
timeout occurs). Then they are resumed and can enter the section. Any
failure while waiting for a barrier propagates to the other waiting
parties.
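
The barrier semantics described above can be illustrated within a single
process using python3's `threading.Barrier`. This is only a sketch of
the behavior, not of the proposed API: the `party` and `run_parties`
helpers are hypothetical, the barrier "name" is omitted because only one
barrier is used, and the 5 second timeout is an arbitrary choice.

```python
import threading


def party(barrier, name, results, fail=False):
    """One party asking to enter the guarded section."""
    try:
        if fail:
            # Simulate a failure: abort so the others do not wait forever
            barrier.abort()
            results[name] = "aborted"
            return
        barrier.wait(timeout=5)
        results[name] = "entered"
    except threading.BrokenBarrierError:
        # Raised on timeout or when another party aborted the barrier
        results[name] = "broken"


def run_parties(no_clients, failing=()):
    """Run `no_clients` parties; those listed in `failing` abort."""
    barrier = threading.Barrier(no_clients)
    results = {}
    threads = [threading.Thread(target=party,
                                args=(barrier, idx, results, idx in failing))
               for idx in range(no_clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_parties(3))               # every party enters
    print(run_parties(3, failing={1}))  # the failure reaches the others
```

The second call shows the propagation rule: once one party fails, the
remaining parties get an error instead of waiting for the full timeout.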

This can all be automated inside `avocado.Streams`, which could start
listening on a free port and pass this information to the executed code
blocks. In the code blocks one simply imports `Sync`, initializes it
with the address+port, and can then use it for synchronization (or later
for data exchange).

     from avocado.plugins.sync import Sync
     # Connect the sync server on address stored in params
     # which could be injected by the multi-stream test
     # or set manually.
     sync = Sync(self, self.params.get("sync_server",
                                       "/plugins/sync_server"))
     # wait until 2 tests ask to enter "setup" barrier (60s timeout)
     sync.barrier("setup", 2, 60)

As before, it can be part of the "NestedTest" test, initialized based on
params without the need for boilerplate code. The result would be the
same: avocado listens on some port and the tests can connect to this
port and ask for a barrier/data exchange, with support for
re-connection.

For debugging purposes it might be useful to allow starting the sync
server as an avocado plugin, eg. by `--sync-server ...` (or by having
another command just to start listening, eg. `avocado syncserver`). With
that, one could spawn the multiple processes manually, without the need
to run the main multi-stream test, and let them communicate over this
manually started server. One could even debug the behavior of a single
piece of the bigger test and fake the other components by sending the
messages manually instead (eg. to see how it handles errors, timing
issues, unexpected situations).

Again, there are several ways to implement this:

Standard multiprocess API
-------------------------

Python's standard multiprocessing library supports synchronization over
TCP. The only problem is that "barriers" were introduced in python3, so
we'd have to backport them. Additionally it does not fit our needs 100%,
so we'd have to adjust it a bit (eg. to allow manual interaction).

Autotest's syncdata
-------------------

Python 2.4 friendly, supports barriers and data synchronization. On the
other hand, it's quite hackish and full of shortcuts.

Custom code
-----------

We can take inspiration from the above and create a simple
human-readable (easy to debug or interact with manually) protocol to
support barriers and data exchange via pickling. IMO that would be
easier to maintain than backporting and adjusting multiprocessing or
fixing the autotest syncdata. A proof-of-concept can be found here:

     https://github.com/avocado-framework/avocado/pull/1019

It modifies "passtest" so that it only runs when two instances of it are
executed at the same time. The proof-of-concept does not support
multi-stream tests, so one has to run "avocado run passtest" twice
against the same sync server (once with --sync-server and once with
--sync).
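
To give an idea of what such a human-readable protocol could look like,
below is a toy sketch. It is not the code from the pull request above:
the line format `barrier <name> <count> <timeout>`, the replies
(`ok`, `failed`, `error`) and the names `SyncServer`, `SyncHandler` and
`barrier` are all assumptions made for this illustration. It relies on
python3's `threading.Barrier` on the server side and leaves out data
exchange and re-connection.

```python
import socket
import socketserver
import threading


class SyncHandler(socketserver.StreamRequestHandler):
    """Answer each 'barrier <name> <count> <timeout>' line."""

    def handle(self):
        for raw in self.rfile:
            try:
                cmd, name, count, timeout = raw.decode().split()
                if cmd != "barrier":
                    raise ValueError(cmd)
                guard = self.server.get_barrier(name, int(count))
                guard.wait(float(timeout))
                reply = "ok"
            except threading.BrokenBarrierError:
                # Timeout here or failure of another waiting party
                reply = "failed"
            except ValueError:
                reply = "error"
            self.wfile.write((reply + "\n").encode())


class SyncServer(socketserver.ThreadingTCPServer):
    """Toy sync server keeping one threading.Barrier per barrier name."""
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address=("127.0.0.1", 0)):
        # Port 0 lets the OS pick a free port (see server_address)
        super().__init__(address, SyncHandler)
        self.barriers = {}
        self.lock = threading.Lock()

    def get_barrier(self, name, count):
        with self.lock:
            if name not in self.barriers:
                self.barriers[name] = threading.Barrier(count)
            return self.barriers[name]


def barrier(address, name, count, timeout):
    """Client side: ask to enter the barrier, return the server reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(("barrier %s %d %s\n" % (name, count, timeout)).encode())
        with conn.makefile() as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server = SyncServer()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    other = threading.Thread(target=barrier,
                             args=(server.server_address, "setup", 2, 60))
    other.start()
    print(barrier(server.server_address, "setup", 2, 60))
    other.join()
    server.shutdown()
    server.server_close()
```

Because the messages are plain text lines, one party can be faked by
hand (for example by typing `barrier setup 2 60` into a raw TCP
connection), which is the manual debugging scenario mentioned earlier.
Note that in this toy version a barrier that timed out stays broken for
that name.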

Job API RFC
===========

The recently introduced Job API RFC covers a topic very similar to
"nested tests", but it's not the same. The Job API enables users to
modify the job execution, eventually even to write a runner which suits
them for running groups of tests. In contrast, this RFC covers a way to
combine code-blocks/tests and reuse them within a single test. In a
hackish way, they can supplement each other, but the purpose is
different.

One of the most obvious differences is that a failed "nested" test can
be intentional (eg. reusing the NetPerf test to check that unreachable
machines cannot talk to each other), while in the Job API it's always a
failure.

I hope you see the pattern. They are similar, but on a different layer.
Internally, though, they can share some pieces, like executing the
individual tests concurrently with different params/plugins
(locally/remotely). All the needed plugin modifications would also be
useful for both of these RFCs.

Some examples:

User1 wants to run "compile_kernel" test on a machine followed by 
"install_compiled_kernel passtest failtest warntest" on "machine1 
machine2". They depend on the status of the previous test, but they 
don't create a scenario. So the user should use Job API (or execute 3 
jobs manually).

User2 wants to create a migration test, which starts the migration from
machine1 and receives the migration on machine2. It requires cooperation
and together it creates one complex usecase, so the user should use a
multi-stream test.

Conclusion
==========

Given these reasons, I like the idea of "nested tests" using the "API
backed by internal API", as it is simple to start with, allows test
reuse (which gives us the well-known test result format), and the
internal API allows greater flexibility in the future.

The netperf example from the introduction would look like this:

Machine1:

     class NetServer(avocado.NestedTest):
         def setUp(self):
             process.run("netserver")
             self.barrier("setup", self.params.get("no_clients"))
         def test(self):
             pass
         def tearDown(self):
             self.barrier("finished", self.params.get("no_clients"))
             process.run("killall netserver")

Machine2:

     class NetPerf(avocado.NestedTest):
         def setUp(self):
             self.barrier("setup", self.params.get("no_clients"))
         def test(self):
             process.run("netperf -H %s -l 60"
                         % self.params.get("server_ip"))
             self.barrier("finished", self.params.get("no_clients"))

One would be able to run this manually (or from build systems) using:

     avocado syncserver &
     avocado run NetServer --mux-inject /plugins/sync_server:sync-server $SYNCSERVER &
     avocado run NetPerf --mux-inject /plugins/sync_server:sync-server $SYNCSERVER &

(where the --mux-inject passes the address of the "syncserver" into test 
params)

When the code is stable one would write this multi-stream test (or 
multiple variants of them) to do the above automatically:

     class MultiNetperf(avocado.NestedTest):
         def setUp(self):
             self.failif(len(self.streams) < 2)
         def test(self):
             self.streams[0].run_bg("NetServer",
                                    {"no_clients": len(self.streams)})
             for stream in self.streams[1:]:
                 stream.run_bg("NetPerf",
                               {"no_clients": len(self.streams),
                                "server_ip": machines[0]})
             self.wait(ignore_errors=False)

Executing the complex example would become:

     avocado run MultiNetperf

You can see that the test allows running several NetPerf tests 
simultaneously, either locally, or distributed across multiple machines 
(or combinations) just by changing parameters. Additionally by adding 
features to the nested tests, one can use different NetPerf commands, or 
add other tests to be executed together.

The results could look like this:

     $ tree $RESULTDIR
       └── test-results
           └── MultiNetperf
               ├── job.log
                   ...
               ├── 1
               │   └── job.log
                       ...
               └── 2
                   └── job.log
                       ...

Where the MultiNetperf/job.log contains combined logs of the "master" 
test and all the "nested" tests and the sync server.

Directories "1" and "2" contain the results of the created (possibly
even named) streams. I think they should be in the form of a standard
avocado Job to keep the well-known structure.