[Avocado-devel] Parameter System Overhaul

Wed Aug 16 20:34:32 UTC 2017

On Wed, Aug 16, 2017 at 02:28:47PM -0400, Cleber Rosa wrote:
> 
> 
> On 08/16/2017 12:58 PM, Ademar Reis wrote:
> > On Wed, Aug 16, 2017 at 12:01:08PM -0400, Cleber Rosa wrote:
> >>
> >>
> >> On 08/07/2017 05:49 PM, Ademar Reis wrote:
> >>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
> >>>> Even though Avocado has had a parameter passing system for
> >>>> instrumented tests almost from day one, it has been intertwined with
> >>>> the varianter (then multiplexer) and this is fundamentally wrong.  The
> >>>> most obvious example of this broken design is the `mux-inject` command
> >>>> line option::
> >>>>
> >>>>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
> >>>>                         Inject [path:]key:node values into the final
> >>>> multiplex
> >>>>                         tree.
> >>>>
> >>>> This is broken design not because such a varianter implementations can
> >>>> be tweaked over the command line, that's fine.  It's broken because it
> >>>> is the recommended way of passing parameters on the command line.
> >>>>
> >>>> The varianter (or any other subsystem) should be able to act as a
> >>>> parameter provider, but can not dictate that parameters must first be
> >>>> nodes/key/values of its own internal structure.
> >>>
> >>> Correct. It's broken because it violates several layers. There would
> >>> be nothing wrong with something like "--param [prefix:]<key:value>",
> >>> for example (more below).
> >>>
> >>>>
> >>>> The proposed design
> >>>> ===================
> >>>>
> >>>> A diagram has been used on a few different occasions, to describe how
> >>>> the parameters and variants generation mechanism should be connected
> >>>> to a test and to the overall Avocado architecture.  Here it is, in its
> >>>> original form::
> >>>>
> >>>>    +------------------------------+
> >>>>    | Test                         |
> >>>>    +------------------------------+
> >>>>              |
> >>>>              |
> >>>>    +---------v---------+    +--------------------------------+
> >>>>    | Parameters System |--->| Variants Generation Plugin API |
> >>>>    +-------------------+    +--------------------------------+
> >>>>              ^                                ^
> >>>>              |                                |
> >>>>    +--------------------------------------+   |
> >>>>    | +--------------+ +-----------------+ |   |
> >>>>    | | avocado-virt | | other providers | |   |
> >>>>    | +--------------+ +-----------------+ |   |
> >>>>    +--------------------------------------+   |
> >>>>                                               |
> >>>>                  +----------------------------+-----+
> >>>>                  |                                  |
> >>>>                  |                                  |
> >>>>                  |                                  |
> >>>>        +--------------------+           +-------------------------+
> >>>>        | Multiplexer Plugin |           | Other variant plugin(s) |
> >>>>        +--------------------+           +-------------------------+
> >>>>              |
> >>>>              |
> >>>>        +-----v---------------------------+
> >>>>        | +------------+ +--------------+ |
> >>>>        | | --mux-yaml | | --mux-inject | |
> >>>>        | +------------+ +--------------+ |
> >>>>        +---------------------------------+
> >>>>
> >>>>
> >>>> Given that the "Parameter System" is the entry point into the parameters
> >>>> providers, it should provide two different interfaces:
> >>>>
> >>>>  1) An interface for its users, that is, developers writing
> >>>>     `avocado.Test` based tests
> >>>>
> >>>>  2) An interface for developers of additional providers, such as the
> >>>>     "avocado-virt" and "other providers" box on the diagram.
> >>>>
> >>>> The current state of the the first interface is the ``self.params``
> >>>> attribute.  Hopefully, it will be possible to keep its current interface,
> >>>> so that tests won't need any kind of compatibility adjustments.
> >>>
> >>> Right. The way I envision the parameters system includes a
> >>> resolution mechanism, the "path" currently used in params.get().
> >>> This adds extra specificity to the user who requests a parameter.
> >>>
> >>> But these parameters can be provided by any entity. In the diagram
> >>> above, they're part of the "Parameter System" box. Examples of
> >>> "other providers" could be support for a configuration file or a
> >>> "--param=[prefix:]<key:value>" command line option.
> >>>
> >>
> >> OK, we're in sync.
> >>
> >>> The Test API to request parameters should be shared. To a test it
> >>> doesn't matter where the parameter is coming from. They're always
> >>> accessed through an API like:
> >>>
> >>>     params.get(<key>, [prefix], [fallback value])
> >>>
> >>
> >> As a reference to the current status, this is what this looks like:
> >>
> >>       get(key, path=None, default=None)
> >>
> >> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.get
> >>
> >> So it matches the general signature suggested.  Now unto the details.
> >>
> >>> Where:
> >>>   * key (mandatory) is the configuration variable the user wants.
> >>>
> >>
> >> OK.
> >>
> >>>   * prefix (optional) is used to find the right key and can be
> >>>     partially resolved, from right to left. As an example, a
> >>>     `params.get("bar", "name")` would be fulfilled by the parameter
> >>>     system if there are entries like "bar:name", "foobar:name" or
> >>>     "/foo/bar:name". We're using '/' in the multiplexer to
> >>>     separate branch levels, thus instead of "prefix", we're calling
> >>>     it "path", but the mechanism is the same and IMO should be
> >>>     generic to avoid confusion (a path at the parameter level is
> >>>     different than a path at the multiplexer level).
> >>>
> >>
> >> I'm skeptical about the partial prefix matches.  For instance, a test
> >> calling:
> >>
> >>     self.params.get('name', prefix='obar', default='default')
> >>
> >> Returning the "foobar:name" value is not something I'd expect.  I agree
> >> that the tree representation is due to the multiplexer heritage, but I
> >> see a clearer definition of the prefix as a good thing (explicit is
> >> better than implicit IMO).
> >>
> >> A prefix of '.*obar', using a regex(-like) syntax would fit the
> >> requirements of being flexible enough (allowing right to left matches
> >> and other) and still be precise and explicit.
> > 
> > I agree.
> > 
> > I'm uncomfortable with calling it a path (and using '/' as separator
> > which accepts both relative and absolute paths, the former working
> > as a partial match) because this is too attached to the multiplexer.
> > 
> > But maybe it's the other way around and "path" should be a concept
> > that belongs to the varianter and all configuration providers should
> > be conformant to it.
> > 
> >>
> >>>     params.get() can be even more flexible, supporting regexps and
> >>>     blobs...  The objective of [prefix] is to serve as a filter and
> >>>     to guarantee the caller is getting the variable he wants. It's
> >>>     similar to a namespace.
> >>>
> >>
> >> Right, that's why I'd avoid the partial (implicit) prefix match.  By
> >> making the prefix selector more flexible, one of the prefixes (or
> >> namespaces) available would be selected.  The prefix (or namespace)
> >> would be always in its complete form.  The flexible (regex) and explicit
> >> version would be something like:
> >>
> >>    >>> self.params.get('name', prefix='.*obar', default='default')
> >>    DEBUG| PARAMS selected prefix "foobar" out of prefix selector ".*obar"
> >>    DEBUG| PARAMS (key=name, prefix=foobar, default='default') => 'value'
> >>
> >>>   * fallback (optional) is the value returned if the parameter
> >>>     system can't resolve prefix+key.
> >>>
> >>
> >> OK.  We've also noticed that many users would rather have defaults on
> >> parameter files rather than hardcoded within the test.  The *default*
> >> set of parameter system providers in Avocado should include one that
> >> allows for that use case.
> > 
> > I'm not sure I understand this use-case. Do you think it's covered
> > by the fallback mechanism I propose later in this message?
> > 
> 
> It's not covered by the fallback mechanism.  What I mean is that Avocado
> should probably have, as one of the test parameter system providers
> which are enabled by default, one that looks for parameters in the test
> data dir.  Something like:
> 
>  1. test => examples/tests/sleeptest.py
>  2. test's datadir (current concept) => examples/tests/sleeptest.py.data
>  3. test's file-based param => examples/tests/sleeptest.py.data/params
> 
> The code handling line 3, would be one of the test parameter system
> providers.  Just one that is enabled by default.

OK, this makes sense to me and is aligned with what I propose, we're
in sync.

> 
> >>
> >>> Users who don't want any specificity and/or have a small test base
> >>> with a low chance of clashes could simply ignore the prefix both
> >>> when creating parameters and when making calls to params.get().
> >>>
> >>
> >> Yes.
> >>
> >>>>
> >>>> The second item will probably mean the definition of a new class to
> >>>> the ``avocado.core.plugin_interfaces`` module, together with a new
> >>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
> >>>>
> >>>> Parameters availability: local .vs. external
> >>>> ============================================
> >>>>
> >>>> Right now, all parameters are given to the test at instantiation time.
> >>>> Let's say that in this scenario, all parameters are *local* to the
> >>>> test.  Local parameters have the benefit that the test is self
> >>>> contained and doesn't need any external communication.
> >>>>
> >>>> In theory, this is also a limitation, because all parameters must be
> >>>> available before the test is started.  Even if other parameter system
> >>>> implementations are possible, with a local approach, there would be a
> >>>> number of limitations.  For long running tests, that may depend on
> >>>> parameters generated during the test, this would be a blocker.  Also,
> >>>> if a huge number of parameters would be available (think of a huge
> >>>> cloud or database of parameters) they would all have to be copied to
> >>>> the test at instantiation time.  Finally, it also means that the
> >>>> source of parameters would need to be able to iterate over all the
> >>>> available parameters, so that they can be copied, which can be a
> >>>> further limitation for cascading implementations.
> >>>>
> >>>> An external approach to parameters, would be one that the test holds a
> >>>> handle to a broker of parameter providers.  The parameter resolution
> >>>> would be done at run time.  This avoids the copying of parameters, but
> >>>> requires runtime communication with the parameter providers.  This can
> >>>> make the test execution much more fragile and dependent on the external
> >>>> communication.  Even by minimizing the number of communication
> >>>> endpoints by communicating with the test runner only, it can still add
> >>>> significant overhead, latency and point of failures to the test
> >>>> execution.
> >>>>
> >>>> I believe that, at this time, the limitations imposed by local
> >>>> parameter availability do *not* outweigh the problems that an external
> >>>> approach can bring.  In the future, if advanced use cases require an
> >>>> external approach to parameters availability, this can be reevaluated.
> >>>
> >>> If I understand your point correctly, this is an implementation
> >>> detail. It depends on what the "contract" is between the test runner
> >>> (parameter provider) and the test being run (the user of
> >>> params.get()).
> >>>
> >>
> >> It is an implementation detail, but one that will ultimately reflect on
> >> what is available to test writers.
> >>
> >>> For example, should users assume parameters are dynamic and can
> >>> change during the lifetime of a test, an therefore two identical
> >>> calls to params.get() might return different values?  Should it be
> >>> possible to change params (something like params.put()) at runtime?
> >>>
> >>> (IMO the answer should be no to both questions).
> >>>
> >>> If you have something different in mind, then it would be
> >>> interesting to see some real use-cases.
> >>>
> >>
> >> Defining what users could expect is what I was trying to get to, but you
> >> were way more direct here.  I mentioned that iterating through the
> >> available parameters wouldn't be possible with the "external" approach,
> >> because test writers would be affected by those choices.
> >>
> >> When I said that the "limitations imposed by local parameter
> >> availability do *not* outweigh the problems that an external approach
> >> can bring", I basically stood by the "local parameter availability"
> >> approach, which mean to users that it'd be possible to:
> >>
> >>  1) Iterate through parameters
> >>  2) Change parameters
> >>  3) Remove parameters
> >>  4) Add parameters
> >>
> >> Now, it's not because it's possible that it should be supported or
> >> encouraged.  For the record, #1 is already possible with the current
> >> implementation:
> >>
> >> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.iteritems
> >>
> >> And so is, to some extent #4 (and depending on how you look at it, #2):
> >>
> >> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.Varianter.add_default_param
> >>
> > 
> > Right. But these are avocado.core APIs (implementation details) that
> > should not be exposed or supported.
> > 
> 
> The second one is not exposed to test writers, the first one is.
> 
> > I've been assuming params.get() is the only params-related API
> > exposed to test writers. Is that correct?
> > 
> 
> No, all the methods in AvocadoParams are available:
> 
> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams
> 
> So it's possible to do things like:
> 
>  def test():
>     for o, k, v in self.params.iteritems():
>        self.log.debug("%s %s %s", o, k, v)
> 
> Now, some clarifications: I got confused and pasted the wrong link to
> add_default_param().  That is a Varianter method, so it's *not*
> available to test writers.  I got confused because I was staring at the
> AvocadoParams __init__ parameter "default_params".  And and instance of
> AvocadoParams is what you get as the "self.params" on a test.
> 
> Hopefully that last paragraph is not too confusing.

You once again linked to avocado.core documentation, but the object
is available to test writers. I imagine you understand we have a
leak here: an internal object is being exposed as part of the API to
test writers.

So let me change my question:

I've been assuming params.get() is the only params-related API we
*want* (or plan) to expose to test writers. Is that correct?

Thanks.
   - Ademar

> 
> >>>>
> >>>> Namespaces (AKA how/if should we merge parameters)
> >>>> ==================================================
> >>>>
> >>>> Currently, the parameter fetching interface already contains at its
> >>>> core the concept of paths[1].  In theory, this is sufficient to prevent
> >>>> clashes of keys with the same names, but intended to configure different
> >>>> aspects of a test.
> >>>>
> >>>> Now, with the proposed implementation of multiple providers to the
> >>>> parameter system, the question of how they will be combined comes up.
> >>>>
> >>>> One way is for each implementation to claim, based on some unique
> >>>> attribute such as its own name, a part of a tree path.  For instance,
> >>>> for two implementations:
> >>>>
> >>>>  1) variants
> >>>>  2) plain_ini
> >>>>
> >>>> Users could access parameters explicitly defined on those by referring
> >>>> to paths such as:
> >>>>
> >>>>    self.params.get('speed', path='/plain_ini', default=60)
> >>>>
> >>>> or
> >>>>
> >>>>    self.params.get('speed', path='/variants/path/inside/varianter',
> >>>> default=60)
> >>>>
> >>>> This clearly solves the clashes, but binds the implementation to the
> >>>> tests, which should be avoided at all costs.
> >>>
> >>> So you're providing this as an example of why it's a bad idea...
> >>> OK. :)
> >>>
> >>>>
> >>>> One other approach would be to merge everything into the tree root
> >>>> node.  By doing this, one parameter provider could effectively
> >>>> override the values in another parameter provider, given that both
> >>>> used the same paths for a number of parameters.
> >>>>
> >>>> Yet another approach would be to *not* use paths, and resort to
> >>>> completely separate namespaces.  A parameter namespace would be an
> >>>> additional level of isolation, which can quickly become exceedingly
> >>>> complex.
> >>>
> >>> I think using paths is confusing because it mixes concepts which are
> >>> exclusive to the multiplexer (a particular implementation of the
> >>> varianter) with an API that is shared by all other parameter
> >>> providers.
> >>>
> >>
> >> We can talk about "prefixes" instead, but the same problem would still
> >> need to be addressed.  Let's change the example completely and use the
> >> following parameter provider implementations:
> >>
> >> 1) env_vars, reads `os.environ`, and provides:
> >>
> >>  [{'prefix': '', 'key': 'PATH', 'value': '/bin:/usr/bin'}, ...]
> >>
> >> 2) extended_ini, reads a config file:
> >>
> >>  ## STARTS HERE
> >>  PATH=/usr/libexec:/bin:/usr/bin
> >>
> >>  [STORAGE]
> >>  TEMP=/tmp
> >>  PERSISTENT=/var/lib
> >>  ## ENDS HERE
> >>
> >> And provides:
> >>
> >>  [{'prefix': '', 'key': 'PATH', 'value': '/usr/libexec:/bin:/usr/bin'},
> >>   {'prefix': 'STORAGE', 'key': 'TEMP', 'value': '/tmp'},
> >>   {'prefix': 'STORAGE', 'key': 'PERSISTENT', 'value': '/var/lib'}]
> >>
> >> All parameters from all providers have to copied to the test (that's a
> >> basic premise of the "local" approach discussed earlier).  During that
> >> copy, we have two options:
> >>
> >> 1) Keep the origin of the parameters in the resulting copy:
> >>
> >>  params = {}
> >>  params['env_vars'] = [{'prefix': '', 'key': 'PATH', 'value':
> >> '/bin:/usr/bin'}, ...]
> >>  params['extended_ini'] = [{'prefix': '', 'key': 'PATH', 'value':
> >> '/usr/libexec:/bin:/usr/bin'}, ...]
> >>
> >> 2) Discard the origin, and "flatten out" the parameters:
> >>
> >>  params = {}
> >>  # process env_vars
> >>  params[{'prefix': '', 'key': 'PATH'}] = '/bin:/usr/bin'
> >>  # process extended_ini
> >>  params[{'prefix': '', 'key': 'PATH'}] = '/usr/libexec:/bin:/usr/bin'
> >>
> >> Using the example above, this means that there will be clashes on
> >> "params.get(name='PATH', prefix='')".
> >>
> >>> For example, when you say "merge everything into the tree root
> >>> node", are you talking about namespace paths, or paths as used by
> >>> the multiplexer when the "!mux" keyword is present?
> >>>
> >>
> >> I meant namespace(-like) paths.  The wording was chosen to match the
> >> current implementation (I now realize it was a bad word choice).  Using
> >> the example above, the "tree root node" can be thought of as the
> >> "params" dict itself.
> >>
> >>>>
> >>>> As can be seen from the section name, I'm not proposing one solution
> >>>> at this point, but hoping that a discussion on the topic would help
> >>>> achieve the best possible design.
> >>>
> >>> I think this should be abstract to the Test (in other words, not
> >>> exposed through any API). The order, priority and merge of
> >>> parameters is a problem to be solved at run-time by the test runner.
> >>>
> >>
> >> What you're proposing is #2, and yes, I'm fine with that.
> > 
> > Correct.
> > 
> >>
> >>> All a test needs to "know" is that there's a parameter with the name
> >>> it wants.
> >>>
> >>> In the case of clashes, specifying a prioritization should be easy.
> >>> We could use a similar approach to how we prioritize Avocado's own
> >>> configuration.
> >>>
> >>
> >> I think it can even be defined by the plugin ordering functionality
> >> already in place
> >> (http://avocado-framework.readthedocs.io/en/53.0/Plugins.html#default-plugin-execution-order).
> >>
> >>> Example: from less important to top priorities, when resolving a
> >>> call to params.get():
> >>>
> >>>    * "default" value provided to params.get() inside the test code.
> >>>    * Params from /etc/avocado/global-variants.ini
> >>>    * Params from ~/avocado/global-variants.ini
> >>>    * Params from "--params=<file>"
> >>>    * Params from "--param=[prefix:]<key:value>"
> >>>    * Params from the variant generated from --mux-yaml=<file> (and
> >>>      using --mux-inject would have the same effect of changing
> >>>      <file> before using it)
> >>>
> >>
> >> Just to be sure: do you expect this resolution to happen at *test* run time?
> >>
> >> All the discussion so far seems to indicate that we want to merge all
> >> parameter system provider data and pass the merged result to the test.
> >> To the test, there would resolution of regarding where the parameter is
> >> coming from.  In fact, that information would be lost altogether.  The
> >> only "resolution" available to the test would be based on prefix and key
> >> name.
> > 
> > Correct, that's what I mean. It's just that to me this is an
> > implementation detail of how params.get() works -- and I've been
> > assuming params.get() is the *only* params-related API exposed to
> > test writers.
> > 
> 
> It's not, but we'll work on that.
> 
> - Cleber.
> 
> > Thanks!
> >    - Ademar
> > 
> >>
> >>> The key of this proposal is simplicity and scalability: it doesn't
> >>> matter if the user is running the test with the varianter, a simple
> >>> config file (--params=<file>) or passing some parameters by hand
> >>> (--param=key:value). The Test API and behavior are the same and the
> >>> users get a consistent experience.
> >>>
> >>
> >> Agreed.
> >>
> >> - Cleber.
> >>
> >>> Thanks.
> >>>    - Ademar
> >>>
> >>>> [1] -
> >>>> http://avocado-framework.readthedocs.io/en/52.0/api/core/avocado.core.html#avocado.core.varianter.AvocadoParams.get
> >>>
> >>>
> >>>
> >>
> >> -- 
> >> Cleber Rosa
> >> [ Sr Software Engineer - Virtualization Team - Red Hat ]
> >> [ Avocado Test Framework - avocado-framework.github.io ]
> >> [  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
> >>
> > 
> > 
> > 
> > 
> 
> -- 
> Cleber Rosa
> [ Sr Software Engineer - Virtualization Team - Red Hat ]
> [ Avocado Test Framework - avocado-framework.github.io ]
> [  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
> 

-- 
Ademar Reis
Red Hat

^[:wq!