[Avocado-devel] Multiplexer and params retrieval

Fri Mar 6 21:04:39 UTC 2015

On Thu, Mar 05, 2015 at 02:27:08PM -0300, Ademar Reis wrote:
> On Wed, Mar 04, 2015 at 11:20:03AM +0100, Lukáš Doktor wrote:
> > On 3.3.2015 at 14:24, Ademar Reis wrote:
> > >On Tue, Mar 03, 2015 at 09:23:06AM +0100, Lukáš Doktor wrote:
> > >>On 2.3.2015 at 22:30, Ademar Reis wrote:
> > >>>On Wed, Feb 25, 2015 at 03:54:34PM +0100, Lukáš Doktor wrote:
> > >>>>Hello World,
> > >>>>
> > >>>>I'm working on the https://trello.com/c/vKxylMFN/115-multiplexer-mechanism-for-tests-to-retrieve-variables
> > >>>>card and having some design decision concerns. I'd like to ask you for your
> > >>>>opinions, examples, concerns or demands :D
> > >>>>
> 
> <snip>
> 
> I had a very productive discussion about the mux with Lukas
> yesterday on the phone and we came up with several ideas and
> clarifications. Especially regarding my "scope and resolution
> order" concept, which I identified as still being unclear and a
> source of confusion.
> 
> So as promised I'm summarizing everything we have here, to try to
> clarify once again the state of our proposal. Lukas, I believe
> we're in sync with most of it, please correct me if I say
> something wrong or if you disagree.
> 
> > >>>>
> > >>>>Clashes
> > >>>>~~~~~~~
> > >>>>
> > >>>>Simple example:
> > >>>>
> > >>>>/self/by_variable/oneliner {variable: "Single line example"}
> > >>>>/self/by_other/something {variable: 123}
> > >>>
> > >>>Just to make sure I'm following: the above is a variant, as
> > >>>available to the test. 'variable:' is the name of the leaf, which
> > >>>is duplicated in both branches.
> > >>>
> > >>Exactly.
> > >>
> > >>>>
> > >>>>params.get('variable') => Exception("Multiple occurrences found")
> > >>>>params.get('variable', '/self/by_variable') => "Single line example"
> > >>>>
> > >>>>Let's try this:
> > >>>>
> > >>>>/self {timeout: 10}     # Test's defaults
> > >>>>/self/by_duration/long {timeout: 60, sleep: 30}
> > >>>>/self/by_tool/builtin {timeout: 10}@/self {}
> > >>>
> > >>>What's '@/self {}'?
> > >>>
> > >>I should have mentioned that, I used the same transcript as in debug
> > >>multiplex to emphasize that the timeout:10 is inherited from `/self`.
> > >>
> > >>>>
> > >>>>params.get('timeout') => raises an exception as it's defined multiple
> > >>>>times, but is that what we want?
> > >
> > >OK, now I understand your question. A variable inherited should
> > >not be considered a duplicate and should not be considered a
> > >conflict (the other one, yes, it should).
> > >
> > Well it might not be that simple. For example:
> > 
> > hw:
> >     fmt:
> >         qcow2: !join
> >             type: qcow2
> >             0:
> >             v3:
> >                 extra_params: "compat=1.1"
> >     nic:
> >         rtl8139:
> >             type: rtl8139
> > 
> > this produces:
> > 1. */qcow2/0 + rtl8139
> > 2. */qcow2/v3 + rtl8139
> > 
> > By your definition, params.get('type', '/hw') would not raise an exception
> > and would return rtl8139, as type is inherited from qcow2.
> > 
> > So we can only ignore inheritance from a different multiplex group (basically
> > everything that is !joined should share the key values).
> 
> Of course.
> 
> What I mean is that when a variable is "inherited", it's in fact
> the same variable and unless it's changed (for example, extended)
> there's no conflict with another instance of itself.
> 
> <snip>
> 
> > 
> > PS: What do you think about removing `self.default_params`? I'd really
> > prefer having the defaults in place, and maybe even restricting the usage of
> > params[] (or changing it to return None in case of a missing key). It's
> > easier to read the code when you see params.get('key', 'virtio') than to
> > scroll all the way up to find the defaults in some table (or even in a
> > different file elsewhere).
> 
> I don't have a strong preference. I have not thought about it as
> much as you have, so we should be fine with whatever you think is
> best for this initial version.
> 
> <snip>
> 
> > >>>
> > >>>Can you precisely define what -m is for and how it's used?
> > >>>
> > >>-m means `avocado run -m myfile.yaml`. When you use it without any `!using`
> > >>or custom prefixes, it'd go directly into /self.
> > >
> > >I hope this is not the way you'll document it in --help, because
> > >it's way too complex for users to understand. :-)
> > >
> > Sure, before we get to the documentation phase the workflow would be
> > completely different. (eg. !join vs inverted !mux :D)
> 
> So what about this definition:
> 
>  -m [branch:]<file>
>   Merges <file> into the [branch] of the multiplexer tree. If
>   [branch] is not specified, <file> is merged into the root of
>   the tree.  <file> can add new branches, chop them, add, extend
>   or remove variables using the mux yaml syntax.
> 
>   In other words, the -m mechanism allows one to "patch(1)" a
>   multiplexer tree using yaml, in a dynamic and flexible way.
> 
>   (notice I don't say anything about /self here)
> 
> <snip>
> 
> > >>
> > >>>Still regarding the usage of local variables (the ones that don't
> > >>>start with /, or have no path=), I discussed resolution orders
> > >>>and scopes with you the other day, maybe you remember, or maybe
> > >>>my proposal was not clear, so I'll explain it here:
> > >>>
> > >>Yep, remember.
> > >>
> > >>>Just like we have scope when we're programming, we would have
> > >>>scope when looking for multiplex variables. Anything that doesn't
> > >>>start with / would be local. Then we would define the resolution
> > >>>order of the local scope and return the first occurrence inside
> > >>>that scope (and error out with duplication only if there are two
> > >>>occurrences that match the same relative path inside the
> > >>>same level).
> > >>>
> > >>>For example, the scope for local variables could be:
> > >>>
> > >>>    1. /self
> > >>>    2. default value provided by the test
> > >>>    3. ERROR (not found)
> > >>I agree with this one
> > >>
> > >>>
> > >>>OR
> > >>>
> > >>>    1. /self
> > >>>    2. /tests
> > >>>    3. /config
> > >>>    4. default value provided by the test
> > >>>    5. ERROR (NOT FOUND)
> > >>This is troublesome. /config contains multiple (I mean a humongous number
> > >>of) leaves.
> > >>/tests contains all tests, not just the executed one.
> > >
> > >My point is that this is configurable, this was just an example
> > >of a possible resolution order.
> > >
> > >What I want is to split the problem in two:
> > >
> > >    - The retriever API for tests is simple to understand and
> > >      write. You can explain its usage, the scope and resolution
> > >      order in a couple of paragraphs. No complex exceptions, no
> > >      need to understand how the mux works, no magic.
> > >
> > >    - The scope and resolution order can be fine-tuned by the test
> > >      runner (or whoever is running the tests). That's where the
> > >      magic happens. The complexity should be at this level.
> 
> OK, here's a summary of the proposal we discussed yesterday from
> my point of view. Not everything here was discussed and therefore
> not everything is endorsed by Lukas, at least not yet. ;-)
> 
>   1. When writing a test, writers don't necessarily care about
>   where the variable is coming from. It's an abstract problem,
>   which doesn't matter to the test writer. All they know is that
>   they request a key (variable) from a particular path and it
>   should work.
> 
>   2. When running the test, avocado provides a mechanism to
>   deliver the variables the test needs to run.
> 
>     2a. Right now, we are working on the multiplexer, which is one
>     of the tools that provide these variables in a peculiar way,
>     allowing tests to be run multiple times with a myriad of
>     variable combinations.
> 
>     So for the multiplexer, when running the test the tester has
>     control of how big the tree is, how it gets manipulated to
>     the point of creating a specific set of variants, what's the
>     scope of the search, etc.
> 
> (1) is the API discussion. It should be simple and convenient.
> There should be no magic or need to understand details of how the
> multiplexer works to use it. As in the requirements, it should
> not leak details of our particular multiplexer implementation.
> 
> What about this API:
> 
>    X = get(path="", var="", default="")
> 
>    Where:
> 
>      path: limits the scope of where the search happens. Very
>      similar to the way file-systems work. Using a file-system
>      analogy, path would be a path to a directory, or part of it.
>      Accepts wildcards and can be either a full path (absolute,
>      when started by /) or partial (relative, started by
>      something else).
> 
>      var: the value we want to retrieve. Using a file-system
>      analogy, var would be a file.
> 
>      default: the default to be returned in case 'var' is not
>      found under the given 'path'.
> 
>      X (returned value): the content of 'var'. In a file-system
>      analogy, this would be the content of a file pointed by
>      'var'.
> 
>    Given we match a file-system implementation well, I would
>    follow the same conventions and limitations of POSIX
>    file-system implementations at first (we may want to extend
>    the wildcard mechanism later).
> 
>    Relative paths are always resolved from the right to the left,
>    and the starting path (CWD in the fs analogy) for relative
>    paths is defined at runtime by the test runner.
> 
>    Potential examples:
> 
>    type = get("disk", "type", "ide")
> 
>    print(get("hw/*/cpu/*/model", "", "Intel Westmere"))
> 
>    print(get("/hw/*/cpu/*/model", "", "Intel Westmere"))

I made a mistake in the two lines above. I did not intend to
make 'var' (the key) optional. The lines should read:

    print(get("hw/*/cpu/*", "model", "Intel Westmere"))
    print(get("/hw/*/cpu/*", "model", "Intel Westmere"))

And since I'm replying, here are some other examples as well:

    get(var="my.very.specific.key.i.dont.care.its.global",
        default="default-value")

    get(var="my.very.specific.key.i.dont.care.its.global")
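
To make the examples above concrete, here is a minimal sketch of how
such a get() could behave. It is only an illustration under stated
assumptions, not the Avocado implementation: the variant is modelled as
a plain dict (VARIANT, with made-up paths and values), a missing value
without a default raises KeyError, and two different values matching
the same request raise ValueError. It also uses fnmatch, whose '*'
matches '/' as well, which is looser than POSIX path globbing.

```python
from fnmatch import fnmatchcase

# Hypothetical variant: absolute node path -> variables defined on that node.
VARIANT = {
    "/hw/x86/cpu/intel": {"model": "Intel Westmere"},
    "/hw/x86/disk": {"type": "virtio"},
    "/self/by_variable/oneliner": {"variable": "Single line example"},
    "/self/by_other/something": {"variable": 123},
}

_MISSING = object()  # sentinel meaning "no default supplied" (an assumption)


def get(path="", var="", default=_MISSING, tree=VARIANT):
    """Return the value of `var` found on nodes matching `path`."""
    if path.startswith("/"):
        pattern = path            # absolute: match from the root
    elif path:
        pattern = "*/" + path     # relative: anchored on the right-hand side
    else:
        pattern = "*"             # no path: search every node
    values = [leaves[var] for node, leaves in tree.items()
              if fnmatchcase(node, pattern) and var in leaves]
    # Clash: the same key was found with different values in this scope.
    if any(value != values[0] for value in values[1:]):
        raise ValueError("Multiple occurrences of %r found" % var)
    if values:
        return values[0]
    if default is _MISSING:
        raise KeyError(var)
    return default


print(get("hw/*/cpu/*", "model", "Intel Westmere"))
print(get("disk", "type", "ide"))
```

With this sketch, get(var="variable") raises because the two /self
branches disagree, while get("/self/by_variable/*", "variable") narrows
the scope and succeeds.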

> 
> As one can see, the API leaks nothing about the multiplexer. It's
> very generic and can be used to retrieve variables from a
> configuration file or simple structure.

And therefore it could be used to read Avocado configuration as
well (how much of this would be exposed in the test API is
debatable, though):

    get("config.runner.output", "colored", True)
    get("config.datadir.paths", "base_dir")
    get("config.sysinfo.collect", "enabled", True)
    get("config.sysinfo.collect", "profiler", False)
    get("config.sysinfo.collect", "profiler_commands", "")
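
As a rough illustration of the configuration case, the same call shape
can be backed by an INI-style file. This is a sketch only: config_get,
the "config." prefix handling and the sample file content are
assumptions made for the example, and values come back as strings
because no type conversion is attempted.

```python
import configparser

# Hypothetical sample content; not a copy of any real Avocado config file.
SAMPLE_CONFIG = """
[runner.output]
colored = True

[sysinfo.collect]
enabled = True
profiler = False
"""

_parser = configparser.ConfigParser()
_parser.read_string(SAMPLE_CONFIG)


def config_get(path, var, default=None, parser=_parser):
    """Look up `var` in the section named by `path` (minus 'config.')."""
    prefix = "config."
    section = path[len(prefix):] if path.startswith(prefix) else path
    # has_option() is False for an unknown section as well as an unknown key.
    if parser.has_option(section, var):
        return parser.get(section, var)
    return default


print(config_get("config.runner.output", "colored", True))
print(config_get("config.sysinfo.collect", "profiler_commands", ""))
```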

Thanks.
   - Ademar

> 
> (2a) is where the magic happens. That's when we want to limit the
> size of the mux tree to reduce the scope and therefore the number
> of resulting variants, define where to look for relative
> variables and so on. These are the mechanisms we have, either
> implemented or proposed:
> 
>   - We can specify our own mux file(s), or tell avocado to use
>     only a specific set of files at runtime.
> 
>   - We can use filter-only and filter-out to chop the tree.
> 
>   - We can specify a 'downstream' mux file (the -m you're
>     proposing) to "patch" the tree before it's multiplexed.
> 
>   - When resolving relative paths, we can define what's to the
>     left side. That's what I would call resolution order: since
>     we start resolving from the right to the left, we try to
>     resolve paths in a specific order (in other words, we would
>     try different "CWD"s, following a specific resolution order.
>     Similar to the way PYTHONPATH works for imports)
> 
>   --- Most importantly: we can mix all of the above.
> 
> Now, what's left? What else do we want to do that can't be
> accomplished by the above?
> 
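
The resolution order for relative paths described above can be sketched
as follows. This is a simplified illustration, not the proposed
implementation: resolve, TREE and RESOLUTION_ORDER are hypothetical
names, paths are matched exactly (no wildcards), and the order list
simply reuses the /self, /tests, /config example from earlier in the
thread.

```python
_MISSING = object()  # sentinel meaning "no default supplied" (an assumption)

# Example order from the thread; the test runner would make this configurable.
RESOLUTION_ORDER = ["/self", "/tests", "/config"]

# Hypothetical tree: absolute node path -> variables defined on that node.
TREE = {
    "/self/disk": {"type": "virtio"},
    "/config/disk": {"type": "ide"},
    "/config/nic": {"model": "e1000"},
}


def resolve(tree, path, var, default=_MISSING, order=RESOLUTION_ORDER):
    """Return `var` from the first candidate node that defines it."""
    if path.startswith("/"):
        candidates = [path]  # absolute paths bypass the resolution order
    else:
        # Try each "CWD" in turn, much like PYTHONPATH entries for imports.
        candidates = [(cwd + "/" + path).rstrip("/") for cwd in order]
    for node in candidates:
        leaves = tree.get(node, {})
        if var in leaves:
            return leaves[var]
    if default is _MISSING:
        raise KeyError("%s not found under %r" % (var, path))
    return default


print(resolve(TREE, "disk", "type"))
print(resolve(TREE, "disk", "type", order=["/config", "/self"]))
```

The test code is identical in both calls; only the order chosen by
whoever runs the test decides which "disk" node wins.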

-- 
Ademar Reis
Red Hat
