[Avocado-devel] Parameter System Overhaul

Wed Aug 16 18:28:47 UTC 2017

On 08/16/2017 12:58 PM, Ademar Reis wrote:
> On Wed, Aug 16, 2017 at 12:01:08PM -0400, Cleber Rosa wrote:
>>
>>
>> On 08/07/2017 05:49 PM, Ademar Reis wrote:
>>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
>>>> Even though Avocado has had a parameter passing system for
>>>> instrumented tests almost from day one, it has been intertwined with
>>>> the varianter (then multiplexer) and this is fundamentally wrong.  The
>>>> most obvious example of this broken design is the `mux-inject` command
>>>> line option::
>>>>
>>>>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
>>>>                         Inject [path:]key:node values into the final
>>>> multiplex
>>>>                         tree.
>>>>
>>>> This is broken design not because such a varianter implementations can
>>>> be tweaked over the command line, that's fine.  It's broken because it
>>>> is the recommended way of passing parameters on the command line.
>>>>
>>>> The varianter (or any other subsystem) should be able to act as a
>>>> parameter provider, but can not dictate that parameters must first be
>>>> nodes/key/values of its own internal structure.
>>>
>>> Correct. It's broken because it violates several layers. There would
>>> be nothing wrong with something like "--param [prefix:]<key:value>",
>>> for example (more below).
>>>
>>>>
>>>> The proposed design
>>>> ===================
>>>>
>>>> A diagram has been used on a few different occasions, to describe how
>>>> the parameters and variants generation mechanism should be connected
>>>> to a test and to the overall Avocado architecture.  Here it is, in its
>>>> original form::
>>>>
>>>>    +------------------------------+
>>>>    | Test                         |
>>>>    +------------------------------+
>>>>              |
>>>>              |
>>>>    +---------v---------+    +--------------------------------+
>>>>    | Parameters System |--->| Variants Generation Plugin API |
>>>>    +-------------------+    +--------------------------------+
>>>>              ^                                ^
>>>>              |                                |
>>>>    +--------------------------------------+   |
>>>>    | +--------------+ +-----------------+ |   |
>>>>    | | avocado-virt | | other providers | |   |
>>>>    | +--------------+ +-----------------+ |   |
>>>>    +--------------------------------------+   |
>>>>                                               |
>>>>                  +----------------------------+-----+
>>>>                  |                                  |
>>>>                  |                                  |
>>>>                  |                                  |
>>>>        +--------------------+           +-------------------------+
>>>>        | Multiplexer Plugin |           | Other variant plugin(s) |
>>>>        +--------------------+           +-------------------------+
>>>>              |
>>>>              |
>>>>        +-----v---------------------------+
>>>>        | +------------+ +--------------+ |
>>>>        | | --mux-yaml | | --mux-inject | |
>>>>        | +------------+ +--------------+ |
>>>>        +---------------------------------+
>>>>
>>>>
>>>> Given that the "Parameter System" is the entry point into the parameters
>>>> providers, it should provide two different interfaces:
>>>>
>>>>  1) An interface for its users, that is, developers writing
>>>>     `avocado.Test` based tests
>>>>
>>>>  2) An interface for developers of additional providers, such as the
>>>>     "avocado-virt" and "other providers" box on the diagram.
>>>>
>>>> The current state of the the first interface is the ``self.params``
>>>> attribute.  Hopefully, it will be possible to keep its current interface,
>>>> so that tests won't need any kind of compatibility adjustments.
>>>
>>> Right. The way I envision the parameters system includes a
>>> resolution mechanism, the "path" currently used in params.get().
>>> This adds extra specificity to the user who requests a parameter.
>>>
>>> But these parameters can be provided by any entity. In the diagram
>>> above, they're part of the "Parameter System" box. Examples of
>>> "other providers" could be support for a configuration file or a
>>> "--param=[prefix:]<key:value>" command line option.
>>>
>>
>> OK, we're in sync.
>>
>>> The Test API to request parameters should be shared. To a test it
>>> doesn't matter where the parameter is coming from. They're always
>>> accessed through an API like:
>>>
>>>     params.get(<key>, [prefix], [fallback value])
>>>
>>
>> As a reference to the current status, this is what this looks like:
>>
>>       get(key, path=None, default=None)
>>
>> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.get
>>
>> So it matches the general signature suggested.  Now unto the details.
>>
>>> Where:
>>>   * key (mandatory) is the configuration variable the user wants.
>>>
>>
>> OK.
>>
>>>   * prefix (optional) is used to find the right key and can be
>>>     partially resolved, from right to left. As an example, a
>>>     `params.get("bar", "name")` would be fulfilled by the parameter
>>>     system if there are entries like "bar:name", "foobar:name" or
>>>     "/foo/bar:name". We're using '/' in the multiplexer to
>>>     separate branch levels, thus instead of "prefix", we're calling
>>>     it "path", but the mechanism is the same and IMO should be
>>>     generic to avoid confusion (a path at the parameter level is
>>>     different than a path at the multiplexer level).
>>>
>>
>> I'm skeptical about the partial prefix matches.  For instance, a test
>> calling:
>>
>>     self.params.get('name', prefix='obar', default='default')
>>
>> Returning the "foobar:name" value is not something I'd expect.  I agree
>> that the tree representation is due to the multiplexer heritage, but I
>> see a clearer definition of the prefix as a good thing (explicit is
>> better than implicit IMO).
>>
>> A prefix of '.*obar', using a regex(-like) syntax would fit the
>> requirements of being flexible enough (allowing right to left matches
>> and other) and still be precise and explicit.
> 
> I agree.
> 
> I'm uncomfortable with calling it a path (and using '/' as separator
> which accepts both relative and absolute paths, the former working
> as a partial match) because this is too attached to the multiplexer.
> 
> But maybe it's the other way around and "path" should be a concept
> that belongs to the varianter and all configuration providers should
> be conformant to it.
> 
>>
>>>     params.get() can be even more flexible, supporting regexps and
>>>     blobs...  The objective of [prefix] is to serve as a filter and
>>>     to guarantee the caller is getting the variable he wants. It's
>>>     similar to a namespace.
>>>
>>
>> Right, that's why I'd avoid the partial (implicit) prefix match.  By
>> making the prefix selector more flexible, one of the prefixes (or
>> namespaces) available would be selected.  The prefix (or namespace)
>> would be always in its complete form.  The flexible (regex) and explicit
>> version would be something like:
>>
>>    >>> self.params.get('name', prefix='.*obar', default='default')
>>    DEBUG| PARAMS selected prefix "foobar" out of prefix selector ".*obar"
>>    DEBUG| PARAMS (key=name, prefix=foobar, default='default') => 'value'
>>
>>>   * fallback (optional) is the value returned if the parameter
>>>     system can't resolve prefix+key.
>>>
>>
>> OK.  We've also noticed that many users would rather have defaults on
>> parameter files rather than hardcoded within the test.  The *default*
>> set of parameter system providers in Avocado should include one that
>> allows for that use case.
> 
> I'm not sure I understand this use-case. Do you think it's covered
> by the fallback mechanism I propose later in this message?
> 

It's not covered by the fallback mechanism.  What I mean is that Avocado
should probably have, as one of the test parameter system providers
which are enabled by default, one that looks for parameters in the test
data dir.  Something like:

 1. test => examples/tests/sleeptest.py
 2. test's datadir (current concept) => examples/tests/sleeptest.py.data
 3. test's file-based param => examples/tests/sleeptest.py.data/params

The code handling line 3, would be one of the test parameter system
providers.  Just one that is enabled by default.

>>
>>> Users who don't want any specificity and/or have a small test base
>>> with a low chance of clashes could simply ignore the prefix both
>>> when creating parameters and when making calls to params.get().
>>>
>>
>> Yes.
>>
>>>>
>>>> The second item will probably mean the definition of a new class to
>>>> the ``avocado.core.plugin_interfaces`` module, together with a new
>>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
>>>>
>>>> Parameters availability: local .vs. external
>>>> ============================================
>>>>
>>>> Right now, all parameters are given to the test at instantiation time.
>>>> Let's say that in this scenario, all parameters are *local* to the
>>>> test.  Local parameters have the benefit that the test is self
>>>> contained and doesn't need any external communication.
>>>>
>>>> In theory, this is also a limitation, because all parameters must be
>>>> available before the test is started.  Even if other parameter system
>>>> implementations are possible, with a local approach, there would be a
>>>> number of limitations.  For long running tests, that may depend on
>>>> parameters generated during the test, this would be a blocker.  Also,
>>>> if a huge number of parameters would be available (think of a huge
>>>> cloud or database of parameters) they would all have to be copied to
>>>> the test at instantiation time.  Finally, it also means that the
>>>> source of parameters would need to be able to iterate over all the
>>>> available parameters, so that they can be copied, which can be a
>>>> further limitation for cascading implementations.
>>>>
>>>> An external approach to parameters, would be one that the test holds a
>>>> handle to a broker of parameter providers.  The parameter resolution
>>>> would be done at run time.  This avoids the copying of parameters, but
>>>> requires runtime communication with the parameter providers.  This can
>>>> make the test execution much more fragile and dependent on the external
>>>> communication.  Even by minimizing the number of communication
>>>> endpoints by communicating with the test runner only, it can still add
>>>> significant overhead, latency and point of failures to the test
>>>> execution.
>>>>
>>>> I believe that, at this time, the limitations imposed by local
>>>> parameter availability do *not* outweigh the problems that an external
>>>> approach can bring.  In the future, if advanced use cases require an
>>>> external approach to parameters availability, this can be reevaluated.
>>>
>>> If I understand your point correctly, this is an implementation
>>> detail. It depends on what the "contract" is between the test runner
>>> (parameter provider) and the test being run (the user of
>>> params.get()).
>>>
>>
>> It is an implementation detail, but one that will ultimately reflect on
>> what is available to test writers.
>>
>>> For example, should users assume parameters are dynamic and can
>>> change during the lifetime of a test, an therefore two identical
>>> calls to params.get() might return different values?  Should it be
>>> possible to change params (something like params.put()) at runtime?
>>>
>>> (IMO the answer should be no to both questions).
>>>
>>> If you have something different in mind, then it would be
>>> interesting to see some real use-cases.
>>>
>>
>> Defining what users could expect is what I was trying to get to, but you
>> were way more direct here.  I mentioned that iterating through the
>> available parameters wouldn't be possible with the "external" approach,
>> because test writers would be affected by those choices.
>>
>> When I said that the "limitations imposed by local parameter
>> availability do *not* outweigh the problems that an external approach
>> can bring", I basically stood by the "local parameter availability"
>> approach, which mean to users that it'd be possible to:
>>
>>  1) Iterate through parameters
>>  2) Change parameters
>>  3) Remove parameters
>>  4) Add parameters
>>
>> Now, it's not because it's possible that it should be supported or
>> encouraged.  For the record, #1 is already possible with the current
>> implementation:
>>
>> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams.iteritems
>>
>> And so is, to some extent #4 (and depending on how you look at it, #2):
>>
>> http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.Varianter.add_default_param
>>
> 
> Right. But these are avocado.core APIs (implementation details) that
> should not be exposed or supported.
> 

The second one is not exposed to test writers, the first one is.

> I've been assuming params.get() is the only params-related API
> exposed to test writers. Is that correct?
> 

No, all the methods in AvocadoParams are available:

http://avocado-framework.readthedocs.io/en/53.0/api/core/avocado.core.html?highlight=avocadoparams#avocado.core.varianter.AvocadoParams

So it's possible to do things like:

 def test():
    for o, k, v in self.params.iteritems():
       self.log.debug("%s %s %s", o, k, v)

Now, some clarifications: I got confused and pasted the wrong link to
add_default_param().  That is a Varianter method, so it's *not*
available to test writers.  I got confused because I was staring at the
AvocadoParams __init__ parameter "default_params".  And and instance of
AvocadoParams is what you get as the "self.params" on a test.

Hopefully that last paragraph is not too confusing.

>>>>
>>>> Namespaces (AKA how/if should we merge parameters)
>>>> ==================================================
>>>>
>>>> Currently, the parameter fetching interface already contains at its
>>>> core the concept of paths[1].  In theory, this is sufficient to prevent
>>>> clashes of keys with the same names, but intended to configure different
>>>> aspects of a test.
>>>>
>>>> Now, with the proposed implementation of multiple providers to the
>>>> parameter system, the question of how they will be combined comes up.
>>>>
>>>> One way is for each implementation to claim, based on some unique
>>>> attribute such as its own name, a part of a tree path.  For instance,
>>>> for two implementations:
>>>>
>>>>  1) variants
>>>>  2) plain_ini
>>>>
>>>> Users could access parameters explicitly defined on those by referring
>>>> to paths such as:
>>>>
>>>>    self.params.get('speed', path='/plain_ini', default=60)
>>>>
>>>> or
>>>>
>>>>    self.params.get('speed', path='/variants/path/inside/varianter',
>>>> default=60)
>>>>
>>>> This clearly solves the clashes, but binds the implementation to the
>>>> tests, which should be avoided at all costs.
>>>
>>> So you're providing this as an example of why it's a bad idea...
>>> OK. :)
>>>
>>>>
>>>> One other approach would be to merge everything into the tree root
>>>> node.  By doing this, one parameter provider could effectively
>>>> override the values in another parameter provider, given that both
>>>> used the same paths for a number of parameters.
>>>>
>>>> Yet another approach would be to *not* use paths, and resort to
>>>> completely separate namespaces.  A parameter namespace would be an
>>>> additional level of isolation, which can quickly become exceedingly
>>>> complex.
>>>
>>> I think using paths is confusing because it mixes concepts which are
>>> exclusive to the multiplexer (a particular implementation of the
>>> varianter) with an API that is shared by all other parameter
>>> providers.
>>>
>>
>> We can talk about "prefixes" instead, but the same problem would still
>> need to be addressed.  Let's change the example completely and use the
>> following parameter provider implementations:
>>
>> 1) env_vars, reads `os.environ`, and provides:
>>
>>  [{'prefix': '', 'key': 'PATH', 'value': '/bin:/usr/bin'}, ...]
>>
>> 2) extended_ini, reads a config file:
>>
>>  ## STARTS HERE
>>  PATH=/usr/libexec:/bin:/usr/bin
>>
>>  [STORAGE]
>>  TEMP=/tmp
>>  PERSISTENT=/var/lib
>>  ## ENDS HERE
>>
>> And provides:
>>
>>  [{'prefix': '', 'key': 'PATH', 'value': '/usr/libexec:/bin:/usr/bin'},
>>   {'prefix': 'STORAGE', 'key': 'TEMP', 'value': '/tmp'},
>>   {'prefix': 'STORAGE', 'key': 'PERSISTENT', 'value': '/var/lib'}]
>>
>> All parameters from all providers have to copied to the test (that's a
>> basic premise of the "local" approach discussed earlier).  During that
>> copy, we have two options:
>>
>> 1) Keep the origin of the parameters in the resulting copy:
>>
>>  params = {}
>>  params['env_vars'] = [{'prefix': '', 'key': 'PATH', 'value':
>> '/bin:/usr/bin'}, ...]
>>  params['extended_ini'] = [{'prefix': '', 'key': 'PATH', 'value':
>> '/usr/libexec:/bin:/usr/bin'}, ...]
>>
>> 2) Discard the origin, and "flatten out" the parameters:
>>
>>  params = {}
>>  # process env_vars
>>  params[{'prefix': '', 'key': 'PATH'}] = '/bin:/usr/bin'
>>  # process extended_ini
>>  params[{'prefix': '', 'key': 'PATH'}] = '/usr/libexec:/bin:/usr/bin'
>>
>> Using the example above, this means that there will be clashes on
>> "params.get(name='PATH', prefix='')".
>>
>>> For example, when you say "merge everything into the tree root
>>> node", are you talking about namespace paths, or paths as used by
>>> the multiplexer when the "!mux" keyword is present?
>>>
>>
>> I meant namespace(-like) paths.  The wording was chosen to match the
>> current implementation (I now realize it was a bad word choice).  Using
>> the example above, the "tree root node" can be thought of as the
>> "params" dict itself.
>>
>>>>
>>>> As can be seen from the section name, I'm not proposing one solution
>>>> at this point, but hoping that a discussion on the topic would help
>>>> achieve the best possible design.
>>>
>>> I think this should be abstract to the Test (in other words, not
>>> exposed through any API). The order, priority and merge of
>>> parameters is a problem to be solved at run-time by the test runner.
>>>
>>
>> What you're proposing is #2, and yes, I'm fine with that.
> 
> Correct.
> 
>>
>>> All a test needs to "know" is that there's a parameter with the name
>>> it wants.
>>>
>>> In the case of clashes, specifying a prioritization should be easy.
>>> We could use a similar approach to how we prioritize Avocado's own
>>> configuration.
>>>
>>
>> I think it can even be defined by the plugin ordering functionality
>> already in place
>> (http://avocado-framework.readthedocs.io/en/53.0/Plugins.html#default-plugin-execution-order).
>>
>>> Example: from less important to top priorities, when resolving a
>>> call to params.get():
>>>
>>>    * "default" value provided to params.get() inside the test code.
>>>    * Params from /etc/avocado/global-variants.ini
>>>    * Params from ~/avocado/global-variants.ini
>>>    * Params from "--params=<file>"
>>>    * Params from "--param=[prefix:]<key:value>"
>>>    * Params from the variant generated from --mux-yaml=<file> (and
>>>      using --mux-inject would have the same effect of changing
>>>      <file> before using it)
>>>
>>
>> Just to be sure: do you expect this resolution to happen at *test* run time?
>>
>> All the discussion so far seems to indicate that we want to merge all
>> parameter system provider data and pass the merged result to the test.
>> To the test, there would resolution of regarding where the parameter is
>> coming from.  In fact, that information would be lost altogether.  The
>> only "resolution" available to the test would be based on prefix and key
>> name.
> 
> Correct, that's what I mean. It's just that to me this is an
> implementation detail of how params.get() works -- and I've been
> assuming params.get() is the *only* params-related API exposed to
> test writers.
> 

It's not, but we'll work on that.

- Cleber.

> Thanks!
>    - Ademar
> 
>>
>>> The key of this proposal is simplicity and scalability: it doesn't
>>> matter if the user is running the test with the varianter, a simple
>>> config file (--params=<file>) or passing some parameters by hand
>>> (--param=key:value). The Test API and behavior are the same and the
>>> users get a consistent experience.
>>>
>>
>> Agreed.
>>
>> - Cleber.
>>
>>> Thanks.
>>>    - Ademar
>>>
>>>> [1] -
>>>> http://avocado-framework.readthedocs.io/en/52.0/api/core/avocado.core.html#avocado.core.varianter.AvocadoParams.get
>>>
>>>
>>>
>>
>> -- 
>> Cleber Rosa
>> [ Sr Software Engineer - Virtualization Team - Red Hat ]
>> [ Avocado Test Framework - avocado-framework.github.io ]
>> [  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
>>
> 
> 
> 
> 

-- 
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/avocado-devel/attachments/20170816/3e20521a/attachment.sig>