[Avocado-devel] Parameter System Overhaul

Tue Aug 8 13:21:09 UTC 2017

On Tue, Aug 08, 2017 at 01:01:26PM +0200, Lukáš Doktor wrote:
> Hello guys,
> 
> I'm sorry for such a late response, I totally forgot about this email (thanks to Ademar, your response reminded it to me).
> 
> Dne 7.8.2017 v 23:49 Ademar Reis napsal(a):
> > On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
> >> Even though Avocado has had a parameter passing system for
> >> instrumented tests almost from day one, it has been intertwined with
> >> the varianter (then multiplexer) and this is fundamentally wrong.  The
> >> most obvious example of this broken design is the `mux-inject` command
> >> line option::
> >>
> >>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
> >>                         Inject [path:]key:node values into the final
> >> multiplex
> >>                         tree.
> >>
> >> This is broken design not because such a varianter implementations can
> >> be tweaked over the command line, that's fine.  It's broken because it
> >> is the recommended way of passing parameters on the command line.
> >>
> >> The varianter (or any other subsystem) should be able to act as a
> >> parameter provider, but can not dictate that parameters must first be
> >> nodes/key/values of its own internal structure.
> > 
> > Correct. It's broken because it violates several layers. There would
> > be nothing wrong with something like "--param [prefix:]<key:value>",
> > for example (more below).
> > 
> Well I wouldn't call it broken. The implementation is fine we only lack other providers which would allow to inject just params so people are abusing `mux-inject` for that.
> 
> >>
> >> The proposed design
> >> ===================
> >>
> >> A diagram has been used on a few different occasions, to describe how
> >> the parameters and variants generation mechanism should be connected
> >> to a test and to the overall Avocado architecture.  Here it is, in its
> >> original form::
> >>
> >>    +------------------------------+
> >>    | Test                         |
> >>    +------------------------------+
> >>              |
> >>              |
> >>    +---------v---------+    +--------------------------------+
> >>    | Parameters System |--->| Variants Generation Plugin API |
> >>    +-------------------+    +--------------------------------+
> >>              ^                                ^
> >>              |                                |
> >>    +--------------------------------------+   |
> >>    | +--------------+ +-----------------+ |   |
> >>    | | avocado-virt | | other providers | |   |
> >>    | +--------------+ +-----------------+ |   |
> >>    +--------------------------------------+   |
> >>                                               |
> >>                  +----------------------------+-----+
> >>                  |                                  |
> >>                  |                                  |
> >>                  |                                  |
> >>        +--------------------+           +-------------------------+
> >>        | Multiplexer Plugin |           | Other variant plugin(s) |
> >>        +--------------------+           +-------------------------+
> >>              |
> >>              |
> >>        +-----v---------------------------+
> >>        | +------------+ +--------------+ |
> >>        | | --mux-yaml | | --mux-inject | |
> >>        | +------------+ +--------------+ |
> >>        +---------------------------------+
> >>
> >>
> >> Given that the "Parameter System" is the entry point into the parameters
> >> providers, it should provide two different interfaces:
> >>
> >>  1) An interface for its users, that is, developers writing
> >>     `avocado.Test` based tests
> >>
> >>  2) An interface for developers of additional providers, such as the
> >>     "avocado-virt" and "other providers" box on the diagram.
> >>
> >> The current state of the the first interface is the ``self.params``
> >> attribute.  Hopefully, it will be possible to keep its current interface,
> >> so that tests won't need any kind of compatibility adjustments.
> > 
> > Right. The way I envision the parameters system includes a
> > resolution mechanism, the "path" currently used in params.get().
> > This adds extra specificity to the user who requests a parameter.
> > 
> > But these parameters can be provided by any entity. In the diagram
> > above, they're part of the "Parameter System" box. Examples of
> > "other providers" could be support for a configuration file or a
> > "--param=[prefix:]<key:value>" command line option.
> > 
> > The Test API to request parameters should be shared. To a test it
> > doesn't matter where the parameter is coming from. They're always
> > accessed through an API like:
> > 
> >     params.get(<key>, [prefix], [fallback value])
> > 
> > Where:
> >   * key (mandatory) is the configuration variable the user wants.
> > 
> >   * prefix (optional) is used to find the right key and can be
> >     partially resolved, from right to left. As an example, a
> >     `params.get("bar", "name")` would be fulfilled by the parameter
> >     system if there are entries like "bar:name", "foobar:name" or
> >     "/foo/bar:name". We're using '/' in the multiplexer to
> >     separate branch levels, thus instead of "prefix", we're calling
> >     it "path", but the mechanism is the same and IMO should be
> >     generic to avoid confusion (a path at the parameter level is
> >     different than a path at the multiplexer level).
> > 
> >     params.get() can be even more flexible, supporting regexps and
> >     blobs...  The objective of [prefix] is to serve as a filter and
> >     to guarantee the caller is getting the variable he wants. It's
> >     similar to a namespace.
> > 
> >   * fallback (optional) is the value returned if the parameter
> >     system can't resolve prefix+key.
> > 
> > Users who don't want any specificity and/or have a small test base
> > with a low chance of clashes could simply ignore the prefix both
> > when creating parameters and when making calls to params.get().
> > 
> >>
> >> The second item will probably mean the definition of a new class to
> >> the ``avocado.core.plugin_interfaces`` module, together with a new
> >> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
> >>
> >> Parameters availability: local .vs. external
> >> ============================================
> >>
> >> Right now, all parameters are given to the test at instantiation time.
> >> Let's say that in this scenario, all parameters are *local* to the
> >> test.  Local parameters have the benefit that the test is self
> >> contained and doesn't need any external communication.
> >>
> >> In theory, this is also a limitation, because all parameters must be
> >> available before the test is started.  Even if other parameter system
> >> implementations are possible, with a local approach, there would be a
> >> number of limitations.  For long running tests, that may depend on
> >> parameters generated during the test, this would be a blocker.  Also,
> >> if a huge number of parameters would be available (think of a huge
> >> cloud or database of parameters) they would all have to be copied to
> >> the test at instantiation time.  Finally, it also means that the
> >> source of parameters would need to be able to iterate over all the
> >> available parameters, so that they can be copied, which can be a
> >> further limitation for cascading implementations.
> >>
> >> An external approach to parameters, would be one that the test holds a
> >> handle to a broker of parameter providers.  The parameter resolution
> >> would be done at run time.  This avoids the copying of parameters, but
> >> requires runtime communication with the parameter providers.  This can
> >> make the test execution much more fragile and dependent on the external
> >> communication.  Even by minimizing the number of communication
> >> endpoints by communicating with the test runner only, it can still add
> >> significant overhead, latency and point of failures to the test
> >> execution.
> >>
> >> I believe that, at this time, the limitations imposed by local
> >> parameter availability do *not* outweigh the problems that an external
> >> approach can bring.  In the future, if advanced use cases require an
> >> external approach to parameters availability, this can be reevaluated.
> > 
> > If I understand your point correctly, this is an implementation
> > detail. It depends on what the "contract" is between the test runner
> > (parameter provider) and the test being run (the user of
> > params.get()).
> 
> The only difference for the scope of this RFC I see is that the
> "lazy" params can't be merged with "static" params before test
> execution (otherwise it'd effectively made them static as well).
> This is quite important difference to consider further in the
> design...
> 
> > 
> > For example, should users assume parameters are dynamic and can
> > change during the lifetime of a test, an therefore two identical
> > calls to params.get() might return different values?  Should it be
> > possible to change params (something like params.put()) at runtime?
> Definitely no params changing.
> 
> > 
> > (IMO the answer should be no to both questions).

The above should be properly specified.

> > 
> > If you have something different in mind, then it would be
> > interesting to see some real use-cases.
> > 
> >>
> >> Namespaces (AKA how/if should we merge parameters)
> >> ==================================================
> >>
> >> Currently, the parameter fetching interface already contains at its
> >> core the concept of paths[1].  In theory, this is sufficient to prevent
> >> clashes of keys with the same names, but intended to configure different
> >> aspects of a test.
> >>
> >> Now, with the proposed implementation of multiple providers to the
> >> parameter system, the question of how they will be combined comes up.
> >>
> >> One way is for each implementation to claim, based on some unique
> >> attribute such as its own name, a part of a tree path.  For instance,
> >> for two implementations:
> >>
> >>  1) variants
> >>  2) plain_ini
> >>
> >> Users could access parameters explicitly defined on those by referring
> >> to paths such as:
> >>
> >>    self.params.get('speed', path='/plain_ini', default=60)
> >>
> >> or
> >>
> >>    self.params.get('speed', path='/variants/path/inside/varianter',
> >> default=60)
> >>
> >> This clearly solves the clashes, but binds the implementation to the
> >> tests, which should be avoided at all costs.
> > 
> > So you're providing this as an example of why it's a bad idea...
> > OK. :)
> > 
> Yep, don't see this as a way to go. One should be able to execute the same test with different providers (eg. json file with params generated by jenkins)
> 
> >>
> >> One other approach would be to merge everything into the tree root
> >> node.  By doing this, one parameter provider could effectively
> >> override the values in another parameter provider, given that both
> >> used the same paths for a number of parameters.
> This wouldn't be possible with "lazy" providers as it'd require iterating through all params.
> 
> >>
> >> Yet another approach would be to *not* use paths, and resort to
> >> completely separate namespaces.  A parameter namespace would be an
> >> additional level of isolation, which can quickly become exceedingly
> >> complex.
> If I understood it correctly, this is what I'd go with. The way I envision this is:
> 
> 1. varianter yields a variant, which also contains params in form of list of namespaces associated with params `variant = [{"/run/variant": {"foo": "bar"}},]`
> 2. runner attaches params from cmdline `cmdline = [{"/1": {"foo": "bar"}}, {"/2": {"foo": "bar"}}]`
> 3. runner attaches params from avocado-virt plugin `virt = [{"/virt": {"image_name": "foo"}]
> 4. runner executes test and passes params definition there
>     {"params": [("varianter", varianter), ("cmdline", cmdline), ("virt", virt)],
>      "order": ["varianter", "__ALL_", "cmdline"]}
> 
> Now the NewAvocadoParams object should create AvocadoParams per each provider (params, cmdline, virt) being able to query params from any of them. The process of getting params would get a bit more complicated for Avocado, but for user nothing changes. Let's say user issues params.get("foo", "/*"):
> 
> 1. NewAvocadoParasms looks at the "order"
> 2. varianter is asked for a value, reports "bar", as this is a valid response, no further params is queried and "bar" is returned
> 
> Now for params.get("foo", "/1/*")
> 
> 1. NewAvocadoParasms looks at the "order"
> 2. varianter is asked and reports no match
> 3. __ALL__ => means ask providers without assigned priority, which applies to "virt" in this example, which reports no match
> 4. cmdline is asked and returns "bar"
> 
> And params.get("missing")
> 
> 1. NewAvocadoParasms looks at the "order"
> 2. varianter => no match
> 3. virt => no match
> 4. cmdline => no match
> 5. default is reported (None)
> 
> 
> The "order" is important, by default I'd suggest varianter=>plugins=>params=>default_params (basically what Ademar suggested and we all agreed many times), but it should be possible to set it if necessary (eg. when user have multiple plugins and knows the order).
> 
> Now in case of clash I believe it should report the clash even though further providers might be able to report. Let's change the order to ["cmdline", "__ALL__"] and query for params.get("foo", "/*")
> 
> 1. cmdline is asked, it raises ValueError("Leaves [/1, /2] contain key "foo"\n  /1:foo:bar\n  /2:foo:bar") which is forwarded to the test.
> 
> This schema would work for "lazy" providers as it'd only be asked for specific key and only when needed.
> 

So looks like we agree here, but I'm not familiar with the
implementation details to follow everything. Cleber, can you confirm
and clarify what you had in mind?

> > 
> > I think using paths is confusing because it mixes concepts which are
> > exclusive to the multiplexer (a particular implementation of the
> > varianter) with an API that is shared by all other parameter
> > providers.
> > 
> > For example, when you say "merge everything into the tree root
> > node", are you talking about namespace paths, or paths as used by
> > the multiplexer when the "!mux" keyword is present?
> There is no difference between namespace path and path in
> multiplexer. The only difference is that multiplexer can provide a
> different slice, but the namespace is always the same, only not
> existing in some slices:

The multiplexer is just one of many possible "varianters". In the
multiplexer the path (tree) has a well defined semantics, which is
non-trivial: paths can be namespaces or used for "multiplexing" when
the "!mux" keyword is used.

I've been thinking of variants as having a prefix, but not a path (a
dictionary, not a tree).  But maybe a tree is better, assuming it
works as a good abstraction for other "varianters". We need an
authoritative specification somewhere.

> >>
> >> As can be seen from the section name, I'm not proposing one solution
> >> at this point, but hoping that a discussion on the topic would help
> >> achieve the best possible design.
> > 
> > I think this should be abstract to the Test (in other words, not
> > exposed through any API). The order, priority and merge of
> > parameters is a problem to be solved at run-time by the test runner.
> > 
> > All a test needs to "know" is that there's a parameter with the name
> > it wants.
> > 
> > In the case of clashes, specifying a prioritization should be easy.
> > We could use a similar approach to how we prioritize Avocado's own
> > configuration.
> > 
> > Example: from less important to top priorities, when resolving a
> > call to params.get():
> > 
> >    * "default" value provided to params.get() inside the test code.
> >    * Params from /etc/avocado/global-variants.ini
> >    * Params from ~/avocado/global-variants.ini
> >    * Params from "--params=<file>"
> >    * Params from "--param=[prefix:]<key:value>"
> >    * Params from the variant generated from --mux-yaml=<file> (and
> >      using --mux-inject would have the same effect of changing
> >      <file> before using it)
> > 
> > The key of this proposal is simplicity and scalability: it doesn't
> > matter if the user is running the test with the varianter, a simple
> > config file (--params=<file>) or passing some parameters by hand
> > (--param=key:value). The Test API and behavior are the same and the
> > users get a consistent experience.
>
> Yep. Basically there is a test writer and test user. Test writer
> should write a test and ask for params. Test user should provide
> params which might also include specific order, if needed (via
> cmdline or config).

ACK.

Thanks.
   - Ademar

> 
> Lukáš
> 
> > 
> > Thanks.
> >    - Ademar
> > 
> >> [1] -
> >> http://avocado-framework.readthedocs.io/en/52.0/api/core/avocado.core.html#avocado.core.varianter.AvocadoParams.get
> > 
> > 
> > 
> 
> 

-- 
Ademar Reis
Red Hat

^[:wq!