[Avocado-devel] RFC: Multi-host tests

Cleber Rosa crosa at redhat.com
Mon Mar 28 19:49:34 UTC 2016



----- Original Message -----
> From: "Cleber Rosa" <crosa at redhat.com>
> To: "Lukáš Doktor" <ldoktor at redhat.com>
> Cc: "Amador Pahim" <apahim at redhat.com>, "avocado-devel" <avocado-devel at redhat.com>, "Ademar Reis" <areis at redhat.com>
> Sent: Monday, March 28, 2016 4:44:15 PM
> Subject: Re: [Avocado-devel] RFC: Multi-host tests
> 
> 
> 
> ----- Original Message -----
> > From: "Lukáš Doktor" <ldoktor at redhat.com>
> > To: "Ademar Reis" <areis at redhat.com>, "Cleber Rosa" <crosa at redhat.com>,
> > "Amador Pahim" <apahim at redhat.com>, "Lucas
> > Meneghel Rodrigues" <lookkas at gmail.com>, "avocado-devel"
> > <avocado-devel at redhat.com>
> > Sent: Saturday, March 26, 2016 4:01:15 PM
> > Subject: RFC: Multi-host tests
> > 
> > Hello guys,
> > 
> > Let's open a discussion regarding the multi-host tests for avocado.
> > 
> > The problem
> > ===========
> > 
> > A user wants to run netperf on 2 machines. To do it manually he does:
> > 
> >      machine1: netserver -D
> >      machine1: # Wait till netserver is initialized
> >      machine2: netperf -H $machine1 -l 60
> >      machine2: # Wait till it finishes and store the results
> >      machine1: # stop the netserver and report possible failures
> > 
> > Now, how do we support this in avocado, ideally as custom tests, and
> > ideally even across broken connections/reboots?
> > 
> > 
> > Super tests
> > ===========
> > 
> > We don't need to do anything and can leave everything to the user, who
> > is free to write code like:
> > 
> >      ...
> >      machine1 = aexpect.ShellSession("ssh $machine1")
> >      machine2 = aexpect.ShellSession("ssh $machine2")
> >      machine1.sendline("netserver -D")
> >      # wait till the netserver starts
> >      machine1.read_until_any_line_matches(["Starting netserver"], 60)
> >      output = machine2.cmd_output("netperf -H $machine1 -l $duration")
> >      # interrupt the netserver
> >      machine1.sendline("\03")
> >      # verify netserver finished
> >      machine1.cmd("true")
> >      ...
> > 
> > The problem is that it requires an active connection and the user needs
> > to handle the results manually.
> 
> And of course the biggest problem here is that it doesn't solve the
> Avocado problem: providing a framework and tools for tests that span
> multiple (Avocado) execution threads, possibly on multiple hosts.
> 
> > 
> > 
> > Triggered simple tests
> > ======================
> > 
> > Alternatively, we can say each machine/worker is nothing but yet another
> > test, which occasionally needs synchronization or data exchange. The
> > same example would look like this:
> > 
> > machine1.py:
> > 
> >     process.run("netserver")
> >     barrier("server-started", 2)
> >     barrier("test-finished", 2)
> >     process.run("killall netserver")
> > 
> > machine2.py:
> > 
> >      barrier("server-started", 2)
> >      self.log.debug(process.run("netperf -H %s -l 60"
> >                                 % params.get("server_ip")))
> >      barrier("test-finished", 2)
> > 
> > where "barrier(name, no_clients)" is a framework function which makes
> > the process wait till the specified number of processes are waiting for
> > the same barrier.
> 
> The barrier mechanism looks like an appropriate and useful utility for the
> example given.  Even though your use case example explicitly requires it,
> it's worth pointing out and keeping in mind that there may be valid use cases
> which won't require any kind of synchronization.  This may even be true for
> the executions of tests that spawn multiple *local* "Avocado runs".
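> 
> To make the discussion a bit more concrete, here's a rough sketch of the
> kind of rendezvous primitive being discussed.  This is not a proposal for
> the actual API -- the names, the bare TCP protocol and having the server
> (rather than the caller) know the number of clients are all
> simplifications of mine:
> 
>     # Illustrative sketch only: a minimal TCP rendezvous behaving roughly
>     # like the proposed barrier(name, no_clients).  A real implementation
>     # would need named barriers, timeouts and error handling.
>     import socket
> 
>     def barrier_server(port, no_clients):
>         """Wait until no_clients peers connect, then release them all."""
>         srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>         srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
>         srv.bind(("0.0.0.0", port))
>         srv.listen(no_clients)
>         waiting = [srv.accept()[0] for _ in range(no_clients)]
>         for conn in waiting:      # everyone arrived, release them
>             conn.sendall(b"go")
>             conn.close()
>         srv.close()
> 
>     def barrier(host, port):
>         """Client side: block until the sync server releases the barrier."""
>         conn = socket.create_connection((host, port))
>         conn.recv(2)              # blocks until the server sends "go"
>         conn.close()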
> 
> > 
> > The barrier needs to know which server to use for communication, so we
> > can either create a new service, or simply use one of the executions as
> > the "server" and make both processes use it for data exchange. To run the
> > above tests the user would have to execute 2 avocado commands:
> > 
> >      avocado run machine1.py --sync-server machine1:6547
> >      avocado run machine2.py --remote-hostname machine2 \
> >          --mux-inject server_ip:machine1 --sync machine1:6547
> > 
> > where:
> >      --sync-server tells avocado to listen on ip address machine1 port 6547
> >      --remote-hostname tells the avocado to run remotely on machine2
> >      --mux-inject adds the "server_ip" into params
> >      --sync tells the second avocado to connect to machine1:6547 for
> > synchronization
> 
> To be honest, apart from the barrier utility, this provides little value
> from the PoV of a *test framework* and, possibly unintentionally, competes
> and overlaps with "remote" tools such as fabric.
> 
> Also, given that the multiplexer is an optional Avocado feature, such
> a feature should not depend on it.
> 
> > 
> > Running those two tests has only one benefit compared to the previous
> > solution, and that is that it gathers the results independently and
> > allows one to re-use simple tests. For example you can create a 3rd
> > test, which uses different params for netperf, run it on "machine2" and
> > keep the same script for "machine1". Or run 2 netperf senders at the
> > same time. This would require libraries and more custom code with the
> > "Super test" approach.
> > 
> > There are additional benefits to this solution. When we introduce the
> > locking API, tests running on a remote machine will actually be executed
> > directly in avocado, therefore the locking API will work for them,
> > avoiding problems with multiple tests using the same shared resource.
> > 
> > Another future benefit would be surviving system reboots/lost connections
> > once we introduce this support for individual tests. The way it'd work is
> > that the user triggers the jobs, the master remembers the test ids and
> > polls for results until they finish/time out.
> > 
> > All of this we get for free thanks to re-using the existing (or future)
> > infrastructure, so I believe this is the right way to go, and in this
> > RFC I'm describing the details of this approach.
> >
> 
> All of the benefits listed are directly based on the fact that tests on
> remote systems would be run under the Avocado test runner and would have
> its runtime libraries available.  This is a valid point, but again it
> doesn't bring a significant change in the user experience wrt running
> tests that span multiple "Avocado runs" (possibly on remote machines).
> 
> > 
> > Triggering the jobs
> > -------------------
> > 
> > The previous example required the user to run avocado twice (once per
> > machine), sharing the same sync server. Additionally, it resulted in
> > 2 separate sets of results. Let's try to eliminate this problem.
> > 
> > 
> > Basic tests
> > ~~~~~~~~~~~
> > 
> > For basic setups, we can come up with a very simple format to describe
> > which tests should be triggered and let avocado take care of executing
> > them. The way I have in mind is to simply accept a list of "avocado run"
> > commands:
> > 
> > simple_multi_host.mht:
> > 
> >      machine1.py
> >      machine2.py --remote-hostname machine2 --mux-inject server_ip:machine1
> > 
> > Running this test:
> > 
> >      avocado run simple_multi_host.mht --sync-server 0.0.0.0
> > 
> > avocado would pick a free port and start the sync server on it. Then it
> > would prepend "avocado run" and append "--sync $sync-server
> > --job-results-dir $this-job-results" to each line in
> > "simple_multi_host.mht" and run them in parallel. Afterward it'd wait
> > till both processes finish and report pass/fail depending on the status.
> > 
> > This way users get overall results as well as individual ones, plus a
> > simple way to define static setups.
> > 
> 
> First, the given usage example would require Avocado to introduce:
> 
>  * A brand new file format
>  * A new test type (say MULTI_HOST_TEST, in addition to the SIMPLE,
>    INSTRUMENTED, etc).
> 
> Introducing a brand new file format may look like a very simple thing
> to do, but it's not.  I can predict that we'd learn very quickly that
> our original file format definition is very limited.  Then we'd either
> have to live with that, or introduce new file format versions, or just
> break the initial definition or compatibility.  These are all problems
> related to file formats, not really to your proposed file format.
> 
> Then, analogous to the "remote tools (fabric)" example I gave before,
> this looks to be outside of the problem scope of Avocado, in the sense
> that "template" tools can do it better.
> 
> Introducing a new test type, and a test resolver/loader, would be a
> mandatory step to achieve this design, but it looks like a necessary
> action only to make the use of the "MHT" file format possible.
> 
> Please note that having a design that allows users to fire multiple
> Avocado command line executions in their own scripts is a bad
> thing, but as a test framework, I believe we can deliver a better, more
> focused experience.

I meant "is *not* a bad thing".

> 
> > 
> > Contrib scripts
> > ~~~~~~~~~~~~~~~
> > 
> > The beauty of executing simple lines is that users might create contrib
> > scripts to generate the "mht" files to get even better flexibility.
> 
> Since I don't think a new file format and test type is a good thing, this
> also becomes a bad idea IMHO.
> 
> > 
> > 
> > Advanced tests
> > ~~~~~~~~~~~~~~
> > 
> > The above might still not be flexible enough. But the system underneath
> > is very simple and flexible. So how about creating instrumented tests,
> > which generate the setup? The same simple example as before:
> > 
> > multi_host.py
> > 
> >      runners = ["machine1.py"]
> >      runners.append("machine2.py --remote-hostname machine2 "
> >                     "--mux-inject server_ip:machine1")
> >      self.execute(runners)
> > 
> 
> A major plus here is that there's no attempt to define new file formats,
> test types and other items that are necessary only to fulfill a use case
> requirement.  Since Avocado's primary language of choice is Python, we
> should stick to it, given that it's expressive enough and well maintained
> enough.  This is of course a lesson we learned with Autotest itself, let's
> not forget it.
> 
> Then, a couple of things I dislike here:
> 
>  1) First runner is special/magical (sync server would be run here)
>  2) Interface with runner execution is done by command line parameters
> 
> > where the "self.execute(tests)" would take the list and does the same as
> > for basic tests. Optionally it could return the json results per each
> > tests so the test itself can react and modify the results.
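> 
> Just to ground this part of the discussion: one hypothetical reading of
> "execute" is the helper sketched below.  It only mirrors what was already
> described for the basic tests -- prepend "avocado run", append the
> sync/results options, run everything in parallel and hand back the parsed
> "results.json" of each run.  The name, signature and results-dir layout
> are assumptions of mine, not an existing interface:
> 
>     # Hypothetical sketch of what self.execute(runners) could do.
>     import json
>     import os
>     import subprocess
> 
>     def execute(runners, sync_server, results_dir):
>         procs = []
>         for idx, runner in enumerate(runners):
>             subdir = os.path.join(results_dir, str(idx + 1))
>             cmd = ("avocado run %s --sync %s --job-results-dir %s"
>                    % (runner, sync_server, subdir))
>             procs.append((subdir, subprocess.Popen(cmd, shell=True)))
>         results = []
>         for subdir, proc in procs:
>             proc.wait()
>             # assumes each run drops a "results.json" reachable through
>             # a "latest" link in its results directory
>             json_path = os.path.join(subdir, "latest", "results.json")
>             with open(json_path) as json_file:
>                 results.append(json.load(json_file))
>         return results
> 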
> > 
> > The above was just a direct translation of the previous example, but to
> > demonstrate the real power of this let's try a PingPong multi-host test:
> > 
> >      class PingPong(MultiHostTest):
> >          def test(self):
> >              hosts = self.params.get("hosts", default="").split(";")
> >              assert len(hosts) >= 2
> >              runners = ["ping_pong --remote-hostname %s" % _
> >                              for _ in hosts]
> >              # Start creating multiplex tree interactively
> >              mux = MuxVariants("variants")
> >              # add /run/variants/ping with {} values
> >              mux.add("ping", {"url": hosts[1], "direction": "ping",
> >                               "barrier": "ping1"})
> >              # add /run/variants/pong with {} values
> >              mux.add("pong", {"url": hosts[-1], "direction": "pong",
> >                               "barrier": "ping%s" % len(hosts)})
> >              # Append "--mux-inject mux-tree..." to the first command
> >              runners[0] += " --mux-inject %s" % mux.dump()
> >              for i in xrange(1, len(hosts)):
> >                  mux = MuxVariants("variants")
> >                  next_host = hosts[(i + 1) % len(hosts)]
> >                  prev_host = hosts[i-1]
> >                  mux.add("pong", {"url": prev_host, "direction": "pong",
> >                                   "barrier": "ping%s" % i})
> >                  mux.add("ping", {"url": next_host, "direction": "ping",
> >                                   "barrier": "ping%s" % (i + 1)})
> >                  runners[i] += " --mux-inject %s" % mux.dump()
> >              # Now do the same magic as in basic multihost test on
> >              # the dynamically created scenario
> >              self.execute(runners)
> > 
> > The `self.execute` generates the "simple test"-like list of "avocado
> > run" commands to be executed. But the test writer can define some
> > additional behavior. In this example it generates a
> > machine1->machine2->...->machine1 chain of ping-pong tests.
> 
> You mean that this would basically generate a "shell script like" list
> of avocado runs?  This looks to be a very strong design decision, and
> I fail to see how it would lend itself to be flexible enough and deliver
> the "test writer can define some additional behavior" requirement.
> 
> > 
> > When running "avocado run pingpong --mux-inject hosts:machine1;machine2"
> > this generates 2 jobs, both running just a single "ping_pong" test with
> > 2 multiplex variants:
> > 
> > machine1:
> > 
> >      variants: !mux
> >          ping:
> >              url: machine2
> >              direction: ping
> >              barrier: ping1
> >          pong:
> >              url: machine2
> >              direction: pong
> >              barrier: ping2
> > machine2:
> > 
> >      variants: !mux
> >          pong:
> >              url: machine1
> >              direction: pong
> >              barrier: ping1
> >          ping:
> >              url: machine1
> >              direction: ping
> >              barrier: ping2
> > 
> > The first multiplex tree for three machines looks like this:
> > 
> >      variants: !mux
> >          ping:
> >              url: machine2
> >              direction: ping
> >              barrier: ping1
> >          pong:
> >              url: machine3
> >              direction: pong
> >              barrier: ping3
> > 
> > Btw I simplified the format for the sake of this RFC. I think instead of
> > generating the strings we should support an API to specify the test,
> > multiplexer, options... and then turn them into parallel executed
> > jobs (usually remotely). But these are just details to be solved if we
> > decide to work on it.
> 
> This statement completely changes what you have proposed up to this point.
> 
> IMHO it's far from being just details, because that would define the lowest
> and commonest level of this feature set that we would advertise and support.
> The design should really be from this level up, and not from the opposite
> direction.
> 
> If external users want to define file formats (say your own MHT proposal)
> on top of our "framework for running tests that span multiple execution
> threads", they should be able to do so.
> 
> If you ask me, having sound Avocado APIs that users could use to fire
> multiple portions of their *tests* at once and have their *results*
> coalesced into a single *test* result is about what Avocado should
> focus on.
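> 
> In other words, something in the spirit of the sketch below -- purely
> hypothetical names, shown only to illustrate the level I think the design
> should start from: plain Python, portions of a single test, and a single
> coalesced result:
> 
>     # Hypothetical API sketch, not an existing Avocado interface.
>     from avocado import Test
> 
>     class MultiHostNetperf(Test):
> 
>         def test(self):
>             # "portion" and "run_portions" are made-up names standing in
>             # for whatever API ends up firing parts of *this* test on
>             # other hosts and coalescing their outcomes into this single
>             # test's result.
>             server = self.portion(host="machine1",
>                                   command="netserver -D")
>             client = self.portion(host="machine2",
>                                   command="netperf -H machine1 -l 60")
>             self.run_portions([server, client])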
> 
> > 
> > 
> > Results and the UI
> > ==================
> > 
> > The idea is that the user is free to run the jobs separately, or to
> > define the setup in a "wrapper" job. The benefits of using the "wrapper"
> > job are the results in one place and the `--sync` handling.
> > 
> > The difference is that running them individually looks like this:
> > 
> >      1 | avocado run ping_pong --mux-inject url:192.168.1.58:6001
> > --sync-server
> >      1 | JOB ID     : 6057f4ea2c99c43670fd7d362eaab6801fa06a77
> >      1 | JOB LOG    :
> > /home/medic/avocado/job-results/job-2016-01-22T05.33-6057f4e/job.log
> >      1 | SYNC       : 0.0.0.0:6001
> >      1 | TESTS      : 1
> >      1 |  (1/1) ping_pong: \
> >      2 | avocado run ping_pong --mux-inject :url::6001 direction:pong
> > --sync 192.168.1.1:6001 --remote-host 192.168.1.1
> >      2 | JOB ID     : 6057f4ea2c99c43670fd7d362eaab6801fa06a77
> >      2 | JOB LOG    :
> > /home/medic/avocado/job-results/job-2016-01-22T05.33-6057f4e/job.log
> >      2 | TESTS      : 1
> >      2 |  (1/1) ping_pong: PASS
> >      1 |  (1/1) ping_pong: PASS
> > 
> > and you have 2 results directories and 2 statuses. By running them
> > wrapped inside the simple.mht test you get:
> > 
> >      avocado run simple.mht --sync-server 192.168.122.1
> >      JOB ID     : 6057f4ea2c99c43670fd7d362eaab6801fa06a77
> >      JOB LOG    :
> > /home/medic/avocado/job-results/job-2016-01-22T05.33-6057f4e/job.log
> >      TESTS      : 1
> >       (1/1) simple.mht: PASS
> >      RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0
> >      TIME       : 0.00 s
> > 
> > And single results:
> > 
> >      $ tree $RESULTDIR
> > 
> >      └── test-results
> >          └── simple.mht
> >              ├── job.log
> >                  ...
> >              ├── 1
> >              │   └── job.log
> >                      ...
> >              └── 2
> >                  └── job.log
> >                      ...
> > 
> >      tail -f job.log:
> >      running avocado run ping pong ping pong
> >      running avocado run pong ping pong ping --remote-hostname
> > 192.168.122.53
> >      waiting for processes to finish...
> >      PASS avocado run ping pong ping pong
> >      FAIL avocado run pong ping pong ping --remote-hostname 192.168.122.53
> >      this job FAILED
> > 
> 
> I won't spend much time here, since the UI is bound to follow other design
> ideas/decisions.
> 
> > 
> > Demonstration
> > =============
> > 
> > While considering the design I developed a WIP example. You can find it
> > here:
> > 
> >      https://github.com/avocado-framework/avocado/pull/1019
> > 
> > It demonstrates the `Triggered simple tests` chapter without the
> > wrapping tests. Hopefully it helps you understand what I had in mind. It
> > contains modified "examples/tests/passtest.py" which requires 2
> > concurrent executions (for example if you want to test your server and
> > run multiple concurrent "wget" connections). Feel free to play with it,
> > change the number of connections, set different barriers, combine
> > multiple different tests...
> > 
> > 
> > Autotest
> > ========
> > 
> > Avocado was developed by people familiar with Autotest, so let's just
> > mention here that this method is not all that different from the
> > Autotest one. The way Autotest supports parallel execution is that it
> > lets users create the "control" files inside the multi-host control
> > file and then run those in parallel. For synchronization it contains a
> > master->slave barrier mechanism, extended with SyncData to send pickled
> > data to all registered runners.
> > 
> > I considered whether we should re-use the code, but:
> > 
> > 1. we do not support control files, so I just took inspiration from the
> > way it passes the params to the remote instances
> 
> One of the wonderful things about Autotest control files is that
> they're not a custom file format.  This should not be underestimated.  While
> other frameworks have had huge XML based file formats to drive their
> jobs, Autotest control files are infinitely more capable and their
> readability is a lot more scalable.
> 
> The separation of client and server test types (and control files) is
> actually what prevents control files from nearing perfection IMHO.
> 
> The server API allows you to run client control files on given hosts.
> These client control files usually need tweaking for each host.  Then
> you're suddenly doing code generation (of the control files' Python
> code). That is not nice.
> 
> I believe that, if Avocado provides such an API that allows regular Python
> code to operate similarly to server control files, while giving more control
> and granularity over what is run on the individual job executions (say
> on remote machines), and helps to coalesce the individual portions into a
> single test result, it would be a very attractive tool.
> 
> > 2. the barriers and syncdata are quite hackish master->slave
> > communication. I think the described (and demonstrated) approach does
> > the same in a less hackish way and is easy to extend
> > 
> > Using this RFC we'd be able to run autotest multi-host tests, but it'd
> > require rewriting the control files to "mht" (or contrib) files. It'd
> > probably even be possible to write a contrib script to run the control
> > file and generate the "mht" file which would run the autotest test.
> > Anyway, the good thing for us is that this does not affect "avocado-vt",
> > because all of the "avocado-vt" multi-host tests are using a single
> > "control" file, which only prepares the params for simple avocado-vt
> > executions. The only necessary thing is a custom "tests.cfg", as by
> > default it disallows multi-host tests (or we can modify the "tests.cfg"
> > and include the filter inside the "avocado-vt" loader, but these are
> > just details to be sorted out when we start running avocado-vt
> > multi-host tests).
> > 
> > Conclusion
> > ==========
> > 
> > Multi-host testing has been solved many times in history. Some frameworks
> > hardcode the communication inside the tests, but most frameworks I have
> > seen support triggering "normal/ordinary" tests and add some kind of
> > barrier mechanism (either inside the code or between the tests) to
> > synchronize the execution. I'm for flexibility and easy test sharing,
> > and that is how I described it here.
> > 
> > Kind regards,
> > Lukáš
> > 
> 



