[Avocado-devel] Parameter System Overhaul

Wed Aug 16 16:58:47 UTC 2017

On Wed, Aug 16, 2017 at 12:01:08PM -0400, Cleber Rosa wrote:
> 
> 
> On 08/07/2017 05:49 PM, Ademar Reis wrote:
> > On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
> >> Even though Avocado has had a parameter passing system for
> >> instrumented tests almost from day one, it has been intertwined with
> >> the varianter (then multiplexer) and this is fundamentally wrong.  The
> >> most obvious example of this broken design is the `mux-inject` command
> >> line option::
> >>
> >>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
> >>                         Inject [path:]key:node values into the final
> >> multiplex
> >>                         tree.
> >>
> >> This is broken design not because such a varianter implementations can
> >> be tweaked over the command line, that's fine.  It's broken because it
> >> is the recommended way of passing parameters on the command line.
> >>
> >> The varianter (or any other subsystem) should be able to act as a
> >> parameter provider, but can not dictate that parameters must first be
> >> nodes/key/values of its own internal structure.
> > 
> > Correct. It's broken because it violates several layers. There would
> > be nothing wrong with something like "--param [prefix:]<key:value>",
> > for example (more below).
> > 
> >>
> >> The proposed design
> >> ===================
> >>
> >> A diagram has been used on a few different occasions, to describe how
> >> the parameters and variants generation mechanism should be connected
> >> to a test and to the overall Avocado architecture.  Here it is, in its
> >> original form::
> >>
> >>    +------------------------------+
> >>    | Test                         |
> >>    +------------------------------+
> >>              |
> >>              |
> >>    +---------v---------+    +--------------------------------+
> >>    | Parameters System |--->| Variants Generation Plugin API |
> >>    +-------------------+    +--------------------------------+
> >>              ^                                ^
> >>              |                                |
> >>    +--------------------------------------+   |
> >>    | +--------------+ +-----------------+ |   |
> >>    | | avocado-virt | | other providers | |   |
> >>    | +--------------+ +-----------------+ |   |
> >>    +--------------------------------------+   |
> >>                                               |
> >>                  +----------------------------+-----+
> >>                  |                                  |
> >>                  |                                  |
> >>                  |                                  |
> >>        +--------------------+           +-------------------------+
> >>        | Multiplexer Plugin |           | Other variant plugin(s) |
> >>        +--------------------+           +-------------------------+
> >>              |
> >>              |
> >>        +-----v---------------------------+
> >>        | +------------+ +--------------+ |
> >>        | | --mux-yaml | | --mux-inject | |
> >>        | +------------+ +--------------+ |
> >>        +---------------------------------+
> >>
> >>
> >> Given that the "Parameter System" is the entry point into the parameters
> >> providers, it should provide two different interfaces:
> >>
> >>  1) An interface for its users, that is, developers writing
> >>     `avocado.Test` based tests
> >>
> >>  2) An interface for developers of additional providers, such as the
> >>     "avocado-virt" and "other providers" box on the diagram.
> >>
> >> The current state of the the first interface is the ``self.params``
> >> attribute.  Hopefully, it will be possible to keep its current interface,
> >> so that tests won't need any kind of compatibility adjustments.
> > 
> > Right. The way I envision the parameters system includes a
> > resolution mechanism, the "path" currently used in params.get().
> > This adds extra specificity to the user who requests a parameter.
> > 
> > But these parameters can be provided by any entity. In the diagram
> > above, they're part of the "Parameter System" box. Examples of
> > "other providers" could be support for a configuration file or a
> > "--param=[prefix:]<key:value>" command line option.
> > 
> 
> OK, we're in sync.
> 
> > The Test API to request parameters should be shared. To a test it
> > doesn't matter where the parameter is coming from. They're always
> > accessed through an API like:
> > 
> >     params.get(<key>, [prefix], [fallback value])
> > 
> 
> As a reference to the current status, this is what this looks like:
> 
>       get(key, path=None, default=None)
> 
> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.get
> 
> So it matches the general signature suggested.  Now unto the details.
> 
> > Where:
> >   * key (mandatory) is the configuration variable the user wants.
> > 
> 
> OK.
> 
> >   * prefix (optional) is used to find the right key and can be
> >     partially resolved, from right to left. As an example, a
> >     `params.get("bar", "name")` would be fulfilled by the parameter
> >     system if there are entries like "bar:name", "foobar:name" or
> >     "/foo/bar:name". We're using '/' in the multiplexer to
> >     separate branch levels, thus instead of "prefix", we're calling
> >     it "path", but the mechanism is the same and IMO should be
> >     generic to avoid confusion (a path at the parameter level is
> >     different than a path at the multiplexer level).
> > 
> 
> I'm skeptical about the partial prefix matches.  For instance, a test
> calling:
> 
>     self.params.get('name', prefix='obar', default='default')
> 
> Returning the "foobar:name" value is not something I'd expect.  I agree
> that the tree representation is due to the multiplexer heritage, but I
> see a clearer definition of the prefix as a good thing (explicit is
> better than implicit IMO).
> 
> A prefix of '.*obar', using a regex(-like) syntax would fit the
> requirements of being flexible enough (allowing right to left matches
> and other) and still be precise and explicit.

I agree.

I'm uncomfortable with calling it a path (and using '/' as separator
which accepts both relative and absolute paths, the former working
as a partial match) because this is too attached to the multiplexer.

But maybe it's the other way around and "path" should be a concept
that belongs to the varianter and all configuration providers should
be conformant to it.

> 
> >     params.get() can be even more flexible, supporting regexps and
> >     blobs...  The objective of [prefix] is to serve as a filter and
> >     to guarantee the caller is getting the variable he wants. It's
> >     similar to a namespace.
> > 
> 
> Right, that's why I'd avoid the partial (implicit) prefix match.  By
> making the prefix selector more flexible, one of the prefixes (or
> namespaces) available would be selected.  The prefix (or namespace)
> would be always in its complete form.  The flexible (regex) and explicit
> version would be something like:
> 
>    >>> self.params.get('name', prefix='.*obar', default='default')
>    DEBUG| PARAMS selected prefix "foobar" out of prefix selector ".*obar"
>    DEBUG| PARAMS (key=name, prefix=foobar, default='default') => 'value'
> 
> >   * fallback (optional) is the value returned if the parameter
> >     system can't resolve prefix+key.
> > 
> 
> OK.  We've also noticed that many users would rather have defaults on
> parameter files rather than hardcoded within the test.  The *default*
> set of parameter system providers in Avocado should include one that
> allows for that use case.

I'm not sure I understand this use-case. Do you think it's covered
by the fallback mechanism I propose later in this message?

> 
> > Users who don't want any specificity and/or have a small test base
> > with a low chance of clashes could simply ignore the prefix both
> > when creating parameters and when making calls to params.get().
> > 
> 
> Yes.
> 
> >>
> >> The second item will probably mean the definition of a new class to
> >> the ``avocado.core.plugin_interfaces`` module, together with a new
> >> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
> >>
> >> Parameters availability: local .vs. external
> >> ============================================
> >>
> >> Right now, all parameters are given to the test at instantiation time.
> >> Let's say that in this scenario, all parameters are *local* to the
> >> test.  Local parameters have the benefit that the test is self
> >> contained and doesn't need any external communication.
> >>
> >> In theory, this is also a limitation, because all parameters must be
> >> available before the test is started.  Even if other parameter system
> >> implementations are possible, with a local approach, there would be a
> >> number of limitations.  For long running tests, that may depend on
> >> parameters generated during the test, this would be a blocker.  Also,
> >> if a huge number of parameters would be available (think of a huge
> >> cloud or database of parameters) they would all have to be copied to
> >> the test at instantiation time.  Finally, it also means that the
> >> source of parameters would need to be able to iterate over all the
> >> available parameters, so that they can be copied, which can be a
> >> further limitation for cascading implementations.
> >>
> >> An external approach to parameters, would be one that the test holds a
> >> handle to a broker of parameter providers.  The parameter resolution
> >> would be done at run time.  This avoids the copying of parameters, but
> >> requires runtime communication with the parameter providers.  This can
> >> make the test execution much more fragile and dependent on the external
> >> communication.  Even by minimizing the number of communication
> >> endpoints by communicating with the test runner only, it can still add
> >> significant overhead, latency and point of failures to the test
> >> execution.
> >>
> >> I believe that, at this time, the limitations imposed by local
> >> parameter availability do *not* outweigh the problems that an external
> >> approach can bring.  In the future, if advanced use cases require an
> >> external approach to parameters availability, this can be reevaluated.
> > 
> > If I understand your point correctly, this is an implementation
> > detail. It depends on what the "contract" is between the test runner
> > (parameter provider) and the test being run (the user of
> > params.get()).
> > 
> 
> It is an implementation detail, but one that will ultimately reflect on
> what is available to test writers.
> 
> > For example, should users assume parameters are dynamic and can
> > change during the lifetime of a test, an therefore two identical
> > calls to params.get() might return different values?  Should it be
> > possible to change params (something like params.put()) at runtime?
> > 
> > (IMO the answer should be no to both questions).
> > 
> > If you have something different in mind, then it would be
> > interesting to see some real use-cases.
> > 
> 
> Defining what users could expect is what I was trying to get to, but you
> were way more direct here.  I mentioned that iterating through the
> available parameters wouldn't be possible with the "external" approach,
> because test writers would be affected by those choices.
> 
> When I said that the "limitations imposed by local parameter
> availability do *not* outweigh the problems that an external approach
> can bring", I basically stood by the "local parameter availability"
> approach, which mean to users that it'd be possible to:
> 
>  1) Iterate through parameters
>  2) Change parameters
>  3) Remove parameters
>  4) Add parameters
> 
> Now, it's not because it's possible that it should be supported or
> encouraged.  For the record, #1 is already possible with the current
> implementation:
> 
> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.iteritems
> 
> And so is, to some extent #4 (and depending on how you look at it, #2):
> 
> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.Varianter.add_default_param
> 

Right. But these are avocado.core APIs (implementation details) that
should not be exposed or supported.

I've been assuming params.get() is the only params-related API
exposed to test writers. Is that correct?

> >>
> >> Namespaces (AKA how/if should we merge parameters)
> >> ==================================================
> >>
> >> Currently, the parameter fetching interface already contains at its
> >> core the concept of paths[1].  In theory, this is sufficient to prevent
> >> clashes of keys with the same names, but intended to configure different
> >> aspects of a test.
> >>
> >> Now, with the proposed implementation of multiple providers to the
> >> parameter system, the question of how they will be combined comes up.
> >>
> >> One way is for each implementation to claim, based on some unique
> >> attribute such as its own name, a part of a tree path.  For instance,
> >> for two implementations:
> >>
> >>  1) variants
> >>  2) plain_ini
> >>
> >> Users could access parameters explicitly defined on those by referring
> >> to paths such as:
> >>
> >>    self.params.get('speed', path='/plain_ini', default=60)
> >>
> >> or
> >>
> >>    self.params.get('speed', path='/variants/path/inside/varianter',
> >> default=60)
> >>
> >> This clearly solves the clashes, but binds the implementation to the
> >> tests, which should be avoided at all costs.
> > 
> > So you're providing this as an example of why it's a bad idea...
> > OK. :)
> > 
> >>
> >> One other approach would be to merge everything into the tree root
> >> node.  By doing this, one parameter provider could effectively
> >> override the values in another parameter provider, given that both
> >> used the same paths for a number of parameters.
> >>
> >> Yet another approach would be to *not* use paths, and resort to
> >> completely separate namespaces.  A parameter namespace would be an
> >> additional level of isolation, which can quickly become exceedingly
> >> complex.
> > 
> > I think using paths is confusing because it mixes concepts which are
> > exclusive to the multiplexer (a particular implementation of the
> > varianter) with an API that is shared by all other parameter
> > providers.
> > 
> 
> We can talk about "prefixes" instead, but the same problem would still
> need to be addressed.  Let's change the example completely and use the
> following parameter provider implementations:
> 
> 1) env_vars, reads `os.environ`, and provides:
> 
>  [{'prefix': '', 'key': 'PATH', 'value': '/bin:/usr/bin'}, ...]
> 
> 2) extended_ini, reads a config file:
> 
>  ## STARTS HERE
>  PATH=/usr/libexec:/bin:/usr/bin
> 
>  [STORAGE]
>  TEMP=/tmp
>  PERSISTENT=/var/lib
>  ## ENDS HERE
> 
> And provides:
> 
>  [{'prefix': '', 'key': 'PATH', 'value': '/usr/libexec:/bin:/usr/bin'},
>   {'prefix': 'STORAGE', 'key': 'TEMP', 'value': '/tmp'},
>   {'prefix': 'STORAGE', 'key': 'PERSISTENT', 'value': '/var/lib'}]
> 
> All parameters from all providers have to copied to the test (that's a
> basic premise of the "local" approach discussed earlier).  During that
> copy, we have two options:
> 
> 1) Keep the origin of the parameters in the resulting copy:
> 
>  params = {}
>  params['env_vars'] = [{'prefix': '', 'key': 'PATH', 'value':
> '/bin:/usr/bin'}, ...]
>  params['extended_ini'] = [{'prefix': '', 'key': 'PATH', 'value':
> '/usr/libexec:/bin:/usr/bin'}, ...]
> 
> 2) Discard the origin, and "flatten out" the parameters:
> 
>  params = {}
>  # process env_vars
>  params[{'prefix': '', 'key': 'PATH'}] = '/bin:/usr/bin'
>  # process extended_ini
>  params[{'prefix': '', 'key': 'PATH'}] = '/usr/libexec:/bin:/usr/bin'
> 
> Using the example above, this means that there will be clashes on
> "params.get(name='PATH', prefix='')".
> 
> > For example, when you say "merge everything into the tree root
> > node", are you talking about namespace paths, or paths as used by
> > the multiplexer when the "!mux" keyword is present?
> > 
> 
> I meant namespace(-like) paths.  The wording was chosen to match the
> current implementation (I now realize it was a bad word choice).  Using
> the example above, the "tree root node" can be thought of as the
> "params" dict itself.
> 
> >>
> >> As can be seen from the section name, I'm not proposing one solution
> >> at this point, but hoping that a discussion on the topic would help
> >> achieve the best possible design.
> > 
> > I think this should be abstract to the Test (in other words, not
> > exposed through any API). The order, priority and merge of
> > parameters is a problem to be solved at run-time by the test runner.
> > 
> 
> What you're proposing is #2, and yes, I'm fine with that.

Correct.

> 
> > All a test needs to "know" is that there's a parameter with the name
> > it wants.
> > 
> > In the case of clashes, specifying a prioritization should be easy.
> > We could use a similar approach to how we prioritize Avocado's own
> > configuration.
> > 
> 
> I think it can even be defined by the plugin ordering functionality
> already in place
> (http://avocado-framework.readthedocs.io/en/53.0/Plugins.html#default-plugin-execution-order).
> 
> > Example: from less important to top priorities, when resolving a
> > call to params.get():
> > 
> >    * "default" value provided to params.get() inside the test code.
> >    * Params from /etc/avocado/global-variants.ini
> >    * Params from ~/avocado/global-variants.ini
> >    * Params from "--params=<file>"
> >    * Params from "--param=[prefix:]<key:value>"
> >    * Params from the variant generated from --mux-yaml=<file> (and
> >      using --mux-inject would have the same effect of changing
> >      <file> before using it)
> > 
> 
> Just to be sure: do you expect this resolution to happen at *test* run time?
> 
> All the discussion so far seems to indicate that we want to merge all
> parameter system provider data and pass the merged result to the test.
> To the test, there would resolution of regarding where the parameter is
> coming from.  In fact, that information would be lost altogether.  The
> only "resolution" available to the test would be based on prefix and key
> name.

Correct, that's what I mean. It's just that to me this is an
implementation detail of how params.get() works -- and I've been
assuming params.get() is the *only* params-related API exposed to
test writers.

Thanks!
   - Ademar

> 
> > The key of this proposal is simplicity and scalability: it doesn't
> > matter if the user is running the test with the varianter, a simple
> > config file (--params=<file>) or passing some parameters by hand
> > (--param=key:value). The Test API and behavior are the same and the
> > users get a consistent experience.
> > 
> 
> Agreed.
> 
> - Cleber.
> 
> > Thanks.
> >    - Ademar
> > 
> >> [1] -
> >> http://avocado-framework.readthedocs.io/en/52.0/api/core/avocado.core.html#avocado.core.varianter.AvocadoParams.get
> > 
> > 
> > 
> 
> -- 
> Cleber Rosa
> [ Sr Software Engineer - Virtualization Team - Red Hat ]
> [ Avocado Test Framework - avocado-framework.github.io ]
> [  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
> 

-- 
Ademar Reis
Red Hat

^[:wq!