[Avocado-devel] N(ext) Runner - The road to maturity

Mon May 18 12:08:53 UTC 2020

Hi all,

On 5/14/20 2:43 AM, Cleber Rosa wrote:
> Intro
> =====
> 
> The N(ext) Runner is an experiment started within the Avocado project,
> with the overall goal of solving some fundamental problems with the
> current architecture, and opening up the possibility of smoother
> implementation of a number of advanced features and use cases.
> 
> For a complete list of the limitations and use cases, please refer
> to::
> 
>   https://avocado-framework.readthedocs.io/en/79.0/future/core/nrunner.html#motivation

I do like the first four points in the motivation from the link:

> - Remote execution
> - Different test execution isolation models provided by the test runner (process, container, virtual machine)
> - Distributed execution of tests across a pool of any combination of processes, containers, virtual machines, etc.
> - Parallel execution of tests

I think isolation has become indispensable for reliable test validation over the last two
decades. So much so that we now have various levels of refinement: from mock modules on a
physical machine, to containers providing configurable partial isolation, to complete virtual
machines. I am convinced everyone doing QA for a product appreciates being able to discard and
completely redo an automated environment setup, and to rely on that as a way to guarantee the
validity of the tests. If Avocado could offer such increasingly refined levels of isolation, it
would also be a great way forward for other test runners that are currently a bit stuck
with Avocado VT and thus with a lot of legacy code from Autotest.

At least to me such an addition could be split into two parts:

1) The remote execution can be reused to connect to a virtual machine, a container, or
a physical machine, thus abstracting away the additional details. The next generation runner
already seems to go in a good direction by treating remotely executed code as data. In
fact, the complete nrunner article above reminds me a lot of the autotest server goals
from the past. We recently merged a utility into aexpect that aims at very similar
outcomes when it comes to remote code execution:

https://github.com/avocado-framework/aexpect/pull/59

The idea behind this door utility was similar: execute code at a remote location using
only knowledge of sessions, without demanding a specific type of isolation. I am not sure
whether we could reuse code and implementations from each other, or at least discuss this
topic further in order to unify our thoughts, as we seem to be trying to solve the same problems.
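
To make the session-only idea more concrete, here is a minimal sketch of what I mean. It is
not the aexpect door utility itself: `LocalSession`, `execute` and `run_remotely` are
hypothetical names, and the local subprocess merely stands in for an SSH, container, or VM
session that is able to start a Python interpreter.

```python
import subprocess
import sys


class LocalSession:
    """Hypothetical stand-in for a session; a real one could wrap SSH,
    a container exec, or a console connection to a virtual machine."""

    def execute(self, argv, stdin_text):
        # Run a command "inside" the session and return what it printed.
        result = subprocess.run(argv, input=stdin_text, capture_output=True,
                                text=True, check=True)
        return result.stdout


def run_remotely(session, source):
    """Ship Python source as plain data to the interpreter behind the
    session and return its standard output."""
    # The "-" argument makes the interpreter read the program from stdin.
    return session.execute([sys.executable, "-"], source)


output = run_remotely(LocalSession(), "print(6 * 7)")
```

The caller only ever touches the session object, so swapping the kind of isolation would
mean swapping the session class and nothing else.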

2) Actually preparing, providing, and removing such test environments, which is the part
where the current Avocado VT is unique (at least when it comes to the vm type of isolation). I am
not sure what your thoughts are, but my (possibly bold) thought here is that we
could try to provide such extra management of isolation in the form of an additional plugin.
This is what we are trying to do in our custom plugin

https://github.com/intra2net/avocado-i2n

which currently reuses lots of code from Avocado VT and automates complex dependency
resolution in a traversable graph by managing the vms the VT plugin provides. So far we
can only use vms, but we are planning to expand to containers and to virtual networks of vms of
various topologies. We definitely want to add depending directly on Avocado to our
wish list, even though we realize this could require quite a significant effort. However,
it also boils down to the future of Avocado VT and how much of it we could migrate to
avocado and/or optional avocado plugins.
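
As a rough illustration of what "dependency resolution in a traversable graph" means here,
the sketch below orders some setup steps so that every step runs after the ones it depends
on. It uses the standard library `graphlib` module (Python 3.9+) rather than the actual
avocado-i2n code, and the step names are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical setup steps: each key depends on the steps in its set.
DEPENDENCIES = {
    "install": set(),
    "customize": {"install"},
    "connect_network": {"customize"},
    "test_client": {"connect_network"},
    "test_server": {"connect_network"},
}


def setup_order(graph):
    """Return the steps in an order that satisfies every dependency."""
    return list(TopologicalSorter(graph).static_order())


order = setup_order(DEPENDENCIES)
```

The relative order of the two final tests is not fixed by the graph, which is exactly the
freedom a runner could use to execute them in parallel.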

> Job API
> =======
> 
> Avocado, as a "test automation framework", provides a native API that
> users can use to write tests from scratch.  These are, quite simply,
> the "Test APIs"::
> 
>   https://avocado-framework.readthedocs.io/en/79.0/api/test/avocado.html#test-apis
> 
> But, we have noticed that a large part of the test automation process
> in the real world also happens outside of individual tests.  To
> facilitate the integration of the environment and the execution of a
> number of tests, it was natural to look at the layer that encompasses
> more responsibilities.
> 
> In Avocado terminology, a "Job" is a unique execution of a number of
> tests.  An Avocado Job is traditionally created/executed by means of
> the ``avocado run`` command.  The goal of the "Job API" work is to augment
> the expressiveness of jobs, by removing the obligatory command
> line interface, and giving users the possibility of using more flexible
> job descriptions.
> 
> With the "Job API", users should be able to write advanced jobs in a
> number of complementary ways, listed here from least complex to most
> complex:
> 
> * Providing a Python Dictionary configuration to the Job() class, which
>   can be easily loaded, say, from a JSON file
> 
> * Writing Python code that creates the Python dictionary configuration
>   at run time, and gives that to the Job() class
> 
> * Subclassing the core Job() implementation, customizing specific phases
>   of the job workflow, such as the creation of the "test suite", that is
>   the definition of the tests to be executed by the job.
> 
> Because Avocado is not interested in creating a new programming
> language, most of that should be done with standard programming
> languages, such as Avocado's primary language: pure Python.

I think it is essential to keep the Job API and usage as flexible as possible as it is more
or less the second most important class of objects after Test.
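
For reference, the first of the quoted options (a dictionary loaded from JSON) could look
roughly like the sketch below. The `Job(config)` usage and the `run.references` key are
taken from the nrunner example quoted further down; `load_job_config`, `JOB_JSON` and `main`
are names I made up, and I have not checked this against every Avocado version.

```python
import json

# Hypothetical job description, as it might be stored in a JSON file.
JOB_JSON = """
{
    "run.references": ["/bin/true", "/bin/false"]
}
"""


def load_job_config(text):
    """Parse a JSON document into the dictionary handed to Job()."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("a job configuration must be a JSON object")
    return config


def main():
    # Imported here so the loader above stays usable without Avocado.
    from avocado.core.job import Job

    with Job(load_job_config(JOB_JSON)) as job:
        return job.run()
```

Calling `main()` on a machine with Avocado installed should run the two references and
return the job's exit status.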

> Currently ongoing development
> -----------------------------
> 
> There have been three major types of developments around the Job API:
> 
> 1. Setting some examples "job files", and making use of custom jobs on
>    Avocado's own code.  Users can find example job files in "examples/jobs"
>    directory in the source code::
> 
>    https://github.com/avocado-framework/avocado/tree/master/examples/jobs
> 
>    And jobs that are used during the Avocado release process::
> 
>    https://github.com/avocado-framework/avocado/tree/master/jobs
> 
> 2. Introducing and porting Avocado's code to new combined settings
>    module.  This should allow the same experience for users of the
>    command line tool giving option as command line arguments, users
>    of the Job API providing the configuration as a dictionary, or
>    users of both command line and Job API settings options with
>    configuration files.
> 
> 3. Making sure that features that work while running jobs on the
>    command line (via "avocado run"), also work the same way in custom
>    jobs.  Recently a number of issues with output not being created
>    when running custom jobs were fixed, but there are certainly many
>    more to be found.

Unifying all these settings is great and I think Beraldo has made quite a lot of progress in
this direction. It definitely makes the life of a plugin developer a lot easier too.

> Relationship with N(ext) Runner
> -------------------------------
> 
> The N(ext) Runner will only be considered "feature complete" when it
> is completely integrated into an Avocado Job.  This means that the job
> files discussed on the previous section item #1 should behave the same
> with the current runner, or the "nrunner" implementation.  The items
> #2 and #3, should, as much as possible be finished *before* that, given
> that not doing so will cast a number of questions on who the culprit
> for a bug is.  Any bug will lead to the question: is it a Job API bug
> or a N(ext) Runner bug?
> 
> The mechanism that a Job uses to select a runner implementation already
> is in place.  One can see the runner implementations available by
> running::
> 
>   $ avocado plugins
>   ...
>   Plugins that run test suites on a job (runners):
>   docker  *DEPRECATED* Runs on a Docker (or compatible) container
>   nrunner *EXPERIMENTAL* nrunner based implementation of job compliant runner
>   remote  *DEPRECATED* Runs on a remote machine using SSH
>   runner  The conventional test runner
>   vm      *DEPRECATED* Runs on a VM using SSH
> 
> And one can set a custom runner by setting the "test_runner" key
> in an Avocado job, as can be seen in "examples/jobs/nrunner.py"::
> 
>   #!/usr/bin/env python3
> 
>   import sys
>   from avocado.core.job import Job
> 
>   config = {
>       'test_runner': 'nrunner',
>       'run.references': [
>           'selftests/unit/test_resolver.py',
>           'selftests/functional/test_argument_parsing.py',
>           '/bin/true',
>           '/bin/false',
>       ],
>       }
> 
>   with Job(config) as j:
>       sys.exit(j.run())
> 
> The ability to set the runner as part of the command line has
> also been proposed (and may be merged by the time one reads this)::
> 
>   https://github.com/avocado-framework/avocado/pull/3816
> 
> But setting the configuration in a configuration file and running all
> tests should give a quick indication of the number of issues with the
> N(ext) Runner.

We have our custom runner (called "traverser", in the tradition of giving new implementations
new names ;) which inherits from the current `TestRunner` and is set in code. However,
I see this paragraph as a subpoint of the unified settings discussed above.

> Requirements Resolver
> =====================
> 
> The requirements resolver is an abstraction and evolution of the
> current "asset fetcher" utility.
> 
> Currently ongoing development
> -----------------------------
> 
> There's a blue print document describing the motivation,
> specification, etc::
> 
>   https://avocado-framework.readthedocs.io/en/79.0/blueprints/BP002.html

To me this somewhat relates to point 2) I mentioned above, so the most important question is:
what are its current capabilities? I see two implementations, `PodmanSpawner` and `ProcessSpawner`,
but not much documentation. What remains unclear to me is how much of the isolated environment
preparation can currently be set up by the spawners. I guess it is mostly downloading an
image/asset and then running it?

Indeed the next generation runner and some of this resolver/spawner functionality seem to be
very tightly integrated. As for our plugin, we use the old runner at the moment and we can
only spawn one process. What we currently do to achieve parallelism is use extra
bash scripts to manage Linux containers (LXC) and start different avocado command line runs
in different partially isolated containers. I wonder if transitioning to the next generation
runner could simplify our workflow, so that we could also participate more in providing
feedback about it.
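
What our bash scripts do is conceptually no more than the sketch below: start several
independent command lines at once and collect their exit codes. This is only an
illustration with assumptions: `run_all` is a made-up helper, and the placeholder commands
stand in for the real per-container avocado invocations, which I am not reproducing here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, workers=2):
    """Run each command as its own process, several at a time, and
    return the exit codes in the same order as the commands."""
    def run_one(argv):
        return subprocess.run(argv, check=False).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))


# Placeholder commands; in our setup each entry would instead start one
# avocado command line run inside a different container.
COMMANDS = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(1)"],
]

exit_codes = run_all(COMMANDS)
```

Everything beyond this (result merging, shared logs, scheduling) is what we would hope a
runner with native parallelism could take over.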

> Avocado-VT
> ==========
> 
> Avocado-VT, in its previous Virt-Test and Autotest incarnations
> and names, were the projects that gave birth to Avocado, so we'll
> always be fond of them.
> 
> Having said that, we have to be careful with the promise of
> compatibility for the N(ext) Runner and current Avocado-VT, especially
> about most of the Avocado-VT workflows and expectations.
> 
> For instance, Avocado-VT tests will, as a general rule, access guest
> image files in an installation wide location, say
> "~/avocado/data/avocado-vt/images/f30-x86_64.qcow2".  There's even
> a special plugin, "vt-joblock" whose goal is to prevent multiple jobs
> from running simultaneously and prevent corruption of such images.
> 
> Typical Avocado-VT usage with the libvirt backend, will usually
> require or suggest a fixed set of tests.  For instance, the first test
> in a job may be one that "defines" a libvirt guest domain, the
> following tests may be tests that operate on the guest domain defined
> by the first test, and the final test is usually one that cleans up
> the libvirt domain guest definition.
> 
> Now, thinking of N(ext) Runner, one of its main selling points is the
> ability to run tests in parallel.  While this is a benefit to most
> users, it's actually a problem to current Avocado-VT tests and
> workflows.  Corruption and general test errors and failures would be
> almost guaranteed to occur.

Using the spawners and providing more flexibility might be a good approach towards the partial
isolation provided by containers. I am not sure how feasible it will be for us to take extra
steps (possibly with another spawner class) that could extend this to virtual machines. Most
importantly, I still have a lot of questions regarding the current functionality, and in particular
how programmable and thus customizable it could get with the current concepts.

> I can think of two action items to manage Avocado-VT, which are
> complementary:
> 
> 1. Have an LTS release, say version 82.0, with the current runner
>    implementation still supported.  This LTS version can be maintained
>    for twice the time of previous LTS releases (~3 years), and the
>    possibility to extend it even further.
> 
> 2. Attempt to migrate Avocado-VT infrastructure code and tests to
>    behave well within the assumptions of the N(ext) Runner
>    architecture.

I am mostly leaning towards the second option (not excluding the possibility of the first one
of course). We very often seem to end up redeveloping code on the avocado side
that somebody already needed and merged on the VT side. Migrating more useful utilities will
both reduce the weight and the very difficult maintenance on the VT side and put them to better use
for avocado newcomers. As the approach seems to be moving more and more towards phasing out the usage
of Avocado VT in favor of a more capable next generation runner, we could only benefit if we
save as much as possible for the wider audience and reduce some future development work.

> Other?
> ======
> 
> If you're an Avocado user or developer, and you think there are
> other topics that may be impacted by the N(ext) Runner architecture
> and the deprecation and removal of the current runner architecture,
> please join the discussion and raise your point.

From all of this, the most exciting point for me remains how capable we could make the next
generation runner (coupled with the dependency resolver) in order to deprecate a lot of the
very flexible isolation capabilities VT can provide. More focus should be put on improved
isolation management (autotest-server-like behavior with some distributed computing included)
and probably less and less development effort will be needed for the actual running itself.

> Next
> ====
> 
> Following this, I'll be posting an architectural blue print proposal
> for the missing pieces of the N(ext) Runner.  Feedback on the proposal
> is much appreciated, and it's worth reminding everyone that there's no
> unworthy suggestion or question.
> 

Thanks Cleber for opening future development of such scale to discussion among all the people
using this software. I will keep an eye out for the blueprints to come.

Best,
Plamen
