[Avocado-devel] RFC: multi-stream test (previously multi-test) [v4]

Ademar Reis areis at redhat.com
Tue May 3 17:11:53 UTC 2016


On Tue, May 03, 2016 at 06:29:25PM +0200, Lukáš Doktor wrote:
> > On 3.5.2016 at 02:32, Cleber Rosa wrote:
> > 
> > 
> > On 04/29/2016 05:35 AM, Lukáš Doktor wrote:
> >> On 29.4.2016 at 00:48, Ademar Reis wrote:
> >>> On Thu, Apr 28, 2016 at 05:10:07PM +0200, Lukáš Doktor wrote:

<snip>

> >>>> Conclusion
> >>>> ==========
> >>>>
> >>>> This RFC proposes to add a simple API to allow triggering
> >>>> avocado.Test-like instances on a local or remote machine. The main
> >>>> point is that it should allow very simple code reuse and modular test
> >>>> development. I believe it'll be easier than having users handle the
> >>>> multiprocessing library, which might allow similar features, but with
> >>>> a lot of boilerplate code and even more code to handle possible
> >>>> exceptions.
> >>>>
> >>>> This concept also plays nicely with the Job API RFC: it could utilize
> >>>> most of the tasks needed for it, and together they should allow amazing
> >>>> flexibility with a known, similar structure (and therefore be easy to
> >>>> learn).
> >>>>
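
(To keep the discussion concrete: the kind of usage the RFC has in
mind looks roughly like the sketch below. The stream-related names
are hypothetical -- they come from this discussion, not from any
released Avocado API.)

    from avocado import Test

    class BackgroundStress(Test):
        def test(self):
            # hypothetical stream API: one local slot, one on a remote host
            local = self.streams.add()
            remote = self.streams.add(host="test-worker.example.com")

            # existing tests are reused as building blocks, resolved the
            # same way "avocado run" would resolve them
            task = remote.run_background("examples/tests/sleeptest.py")
            ok = local.run("examples/tests/passtest.py")

            # the wrapping test decides what the combined result means
            self.assertTrue(ok and task.wait())
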
> >>>
> >>> I see you are trying to make the definitions more clear and a bit
> >>> less strict, but at the end of the day, what you're proposing is
> >>> that a test should be able to run other tests, plain and simple.
> >>> Maybe even worse, a Test would be able to run "jobs", disguised
> >>> as streams that run multiple tests.
> >>>
> >>> This is basically what you've been proposing since the beginning
> >>> and in case it's not crystal clear yet, I'm strongly against it
> >>> because I think it's a fundamental breakage of the abstractions
> >>> present in Avocado.
> >>>
> >>> I insist on something more abstract, like this:
> >>>
> >>>    Tests can run multiple streams, which can be defined as
> >>>    different processes or threads that run parts of the test
> >>>    being executed. These parts are implemented in the form of
> >>>    classes that inherit from avocado.Test.
> >>>
> >>>    (My initial feeling is that these parts should not even have
> >>>    setUp() and tearDown() methods; or if they do, they should
> >>>    be ignored by default when the implementation is run in a
> >>>    stream. In my view, these parts should be defined as "one
> >>>    method in a class that inherits from avocado.Test", with the
> >>>    class being instantiated in the actual stream runtime
> >>>    environment. But this probably deserves some discussion; I'm
> >>>    missing some real-world use-cases here.)
> >>>
> >>>    The only runtime variable that can be configured per-stream is
> >>>    the execution location (or where it's run): a VM, a container,
> >>>    remotely, etc. For everything else, Streams are run under the
> >>>    same environment as the test is.
> >>>
> >>>    Notice Streams are not handled as tests: they are not visible
> >>>    outside of the test that is running them. They don't have
> >>>    individual variants, don't have Test IDs, don't trigger
> >>>    pre/post hooks, can't change the list of plugins
> >>>    enabled/disabled (or configure them) and their results are not
> >>>    visible at the Job level.  The actual Test is responsible for
> >>>    interpreting and interacting with the code that is run in a
> >>>    stream.
> >>>
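
(Again just to be concrete, and with entirely hypothetical names:
the constrained API I have in mind would look something like the
sketch below -- a plain method on an avocado.Test subclass, shipped
to a stream whose only configurable property is where it runs.)

    from avocado import Test

    class Payload(Test):
        # only this method is shipped to and run in the stream;
        # setUp()/tearDown() would be ignored by default
        def burn_cpu(self):
            self.log.info("stressing the CPU in the stream environment")

    class MigrationTest(Test):
        def test(self):
            # the only per-stream knob is *where* it runs; everything
            # else is inherited from the environment of this test
            stream = self.create_stream(where="vm:migration-source")
            handle = stream.run(Payload, "burn_cpu")
            # ... the actual migration logic of the test goes here ...
            # the test itself interprets whatever the stream reports
            self.assertTrue(handle.wait(timeout=60))
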
> >> So basically you're proposing to extract the method, copy it over to the
> >> other host, trigger it, and in the end copy back the results, right?
> >>
> >> That would work if nothing fails. But if anything goes wrong, you
> >> have absolutely no idea what happened, unless you prepare the code
> >> intended for execution for that. I really prefer being able to trigger
> >> real tests in a remote environment from my tests, because:
> >>
> >> 1. I need to write the test just once and either use it as one test, or
> >> combine it with other existing tests to create a complex scenario.
> >> 2. I know exactly what happened and where, because test execution
> >> follows a certain workflow. I'm used to that workflow from normal
> >> execution, so if anything goes wrong, I get quite an extensive set of
> >> information regarding the failure, without any need to adjust the test
> >> code.
> >> 3. While writing the "inner" test, I don't need to handle the results. I
> >> use the streams available to me and I get `self` containing all the
> >> information like the results dir, whiteboard, .... It's very convenient,
> >> and using just methods with some arguments (or just stdout?) would be a
> >> huge step back.
> >>
> >> I mean, for normal execution it's usable (though it loses the
> >> possibility of reusing existing tests, for example as stressers), but
> >> when the "inner" test fails, I know nothing unless I pay great attention
> >> and add a lot of debug information while writing the test.
> >>
> >>> Now let me repeat something from a previous e-mail, originally
> >>> written as feedback to v3:
> >>>
> >>> I'm convinced that your proposal breaks the abstraction and will
> >>> result in numerous problems in the future.
> >>>
> >>> To me whatever we run inside a stream is not and should not be
> >>> defined as a test.  It's simply a block of code that gets run
> >>> under the control of the actual test. The fact we can find these
> >>> "blocks of code" using the resolver is secondary. A nice and
> >>> useful feature, but secondary. The fact we can reuse the avocado
> >>> test runner remotely is purely an implementation detail. A nice
> >>> detail that will help with debugging and make our lives easier
> >>> when implementing the feature, but again, purely an
> >>> implementation detail.
> >>>
> >>> The test writer should have strict control of what gets run in a
> >>> stream, with a constrained API where the concepts are very clear.
> >>> We should not, under any circumstances, induce users to think of
> >>> streams as something that runs tests. To me this is utterly
> >>> important.
> >>>
> >>> For example, if we allow streams to run tests, or Test
> >>> References, then running `avocado run *cpuid*` and
> >>> `stream.run("*cpuid*")` will look similar at first, but with
> >>> several subtle differences in behavior, confusing users.
> >>>
> >>> Users will inevitably ask questions about these differences and
> >>> we'll end up having to revisit some concepts and refine the
> >>> documentation, a result of breaking the abstraction.
> >>>
> >>> A few examples of these differences which might not be
> >>> immediately clear:
> >>>
> >>>    * No pre/post hooks for jobs or tests get run inside a stream.
> >>>    * No per-test sysinfo collection inside a stream.
> >>>    * No per-job sysinfo collection inside a stream.
> >>>    * Per-stream, there's basically nothing that can be configured
> >>>      about the environment other than *where* it runs.
> >>>      Everything is inherited from the actual test. Streams should
> >>>      have access to the exact same APIs that *tests* have.
> >>>    * If users see streams as something that runs tests, it's
> >>>      inevitable that they will start asking for knobs
> >>>      to fine-tune the runtime environment:
> >>>      * Should there be a timeout per stream?
> >>>      * Hmm, at least support enabling/disabling gdb or wrappers
> >>>        in a stream? No? Why not!?
> >>>      * Hmm, maybe allow multiplex="file" in stream.run()?
> >>>      * Why can't I disable or enable plugins per-stream? Or at
> >>>        least configure them?
> >>>
> >> Basically just running a RAW test, without any of the features the
> >> default avocado runner provides. I'm fine with that.
> >>
> >> I slightly disagree that there is no way of modifying the environment,
> >> as the resolver resolves into a template, which contains all the params
> >> given to the test. So one could modify basically everything regarding
> >> the test. The only things one can't configure or use are the job
> >> features (like the pre/post hooks, plugins, ...).
> >>
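
(In hypothetical terms -- none of these names exist in Avocado today --
the idea above is that the resolver hands back a template, i.e. the
test class plus the parameters it would normally run with, which the
caller can tweak before handing it to a stream:)

    # illustrative sketch only; resolve()/params/run() are assumed names
    template = self.streams.resolve("netperf.py:NetServer.test")
    template.params["port"] = 16604   # adjust anything the test receives
    self.streams[0].run(template)     # job-level features still don't apply
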
> >>> And here are some other questions, which seem logical at first:
> >>>
> >>>    * Hey, you know what would be awesome? Let me upload the
> >>>      test results from a stream as if it was a job! Maybe a
> >>>      tool to convert stream test results to job results? Or a
> >>>      plugin that handles them!
> >>>    * Even more awesome: a feature to replay a stream!
> >>>    * And since I can run multiple tests in a stream, why can't I
> >>>      run a job there? It's a logical next step!
> >>>
> >>> The simple fact the questions above are being asked is a sign the
> >>> abstraction is broken: we shouldn't have to revisit previous
> >>> concepts to clarify the behavior when something is being added in
> >>> a different layer.
> >>>
> >>> Am I making sense?
> >>>
> >> IMO you're describing a different situation. We should have the Job API,
> >> which should suit users who need the features you described, so they
> >> don't need to "work around" it using this API.
> >>
> >> Other users might prefer multiprocessing, fabric or autotest's
> >> remote_commander to execute just plain, simple methods/scripts on
> >> other machines.
> >>
> >> But if you need to run something complex, you need a runner, which gives
> >> you the neat features to avoid the boilerplate code needed to produce
> >> outputs in case of failure, plus other features like streams, datadirs,
> >> etc.
> >>
> >> Therefore I believe that allowing a test to trigger tests in the
> >> background would be very useful and is the best way of solving this I
> >> can imagine. As a test writer I would not want to learn yet another way
> >> of expressing myself when splitting a task into several streams. I want
> >> the same development, I expect the same results, and yes, I don't expect
> >> the full job. Having just a raw test without any extra job features is
> >> sufficient and easily understandable.
> >>
> >> Btw the only controversial thing I can imagine is that some (me
> >> included) would have nothing against offloading multi-stream tests into
> >> a stream (so basically nesting). And yes, I'd expect it to work and
> >> create yet another directory inside the stream's results (e.g. running
> >> multi-host netperf as a stresser while running multi-host migration: I
> >> could either reference each party - netserver, netclient, migrate_from,
> >> migrate_to - or I could just say multi_netperf, multi_migrate and expect
> >> the netserver+netclient streams to be created inside the multi_netperf
> >> results, and the same for migrate). Conceptually I have no problem with
> >> that, and as a test writer I'd use the second, because putting together
> >> building blocks is IMO the way to go.
> >>
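
(A rough sketch of the nesting described above -- all names are
hypothetical, just to show the intent of composing existing
multi-stream tests as building blocks:)

    from avocado import Test

    class MigrationUnderStress(Test):
        def test(self):
            # each reference is itself a multi-stream test; its own
            # streams (netserver/netclient, migrate_from/migrate_to)
            # would create nested result dirs under this test's results
            stress = self.streams[0].run_background("multi_netperf")
            passed = self.streams[1].run("multi_migrate")
            self.assertTrue(passed and stress.wait())
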
> > 
> > I can only say that, at this time, it's very clear to me what's
> > nested-test support and what's multi-stream test support.  Let's call
> > them by different names, because they're indeed different, and decide on
> > one.
> > 
> I'm not sure what to call them. I listed what is important to me in order
> to be able to cope with this, and I think it's the (hated) nested tests.
> I don't need the plugins, tweaks, or anything else; I just want a means
> to trigger/run something whose behavior I know, collect the results and
> be done with it. I don't want to write try/except blocks and think about
> what info I'll need in case it fails like that. I've been there and I
> hated it.
> 

I think the point here is that what you're proposing is very
different from what Cleber and I have in mind when we think of a
solution to the use-cases presented.

You're proposing nested-tests as the solution, while we're
proposing multi-flow (or multi-stream) tests, with an abstraction
layer. Unfortunately we're not converging on a common solution.

Proposing nested-tests is valid (as in: nobody should prevent you
from *proposing* it) and we can certainly discuss the merits of
it as a feature in Avocado. But what's happening here is that
you're trying to introduce it masked as something different
because of the resistance against it. The result is an
inconsistent proposal that doesn't please anybody.

My suggestion:

Rewrite your proposal as "support for nested-tests". Call it by
its real name, present it as something you consider consistent
and show us the advantages and disadvantages of the idea. With
all the discussions so far, you have a clear idea of what we like
and what we don't like. It'll live or die based on its own
merits.

In parallel, Cleber and/or I will write another proposal, for
what we see as the proper solution to the use-cases presented
("multi-flow" tests). It'll also live or die based on its own
merits.

Does it make sense?

Thanks.
   - Ademar

-- 
Ademar Reis
Red Hat
