[Avocado-devel] Multiplexer and params retrieval

Thu Mar 5 17:27:08 UTC 2015

On Wed, Mar 04, 2015 at 11:20:03AM +0100, Lukáš Doktor wrote:
> Dne 3.3.2015 v 14:24 Ademar Reis napsal(a):
> >On Tue, Mar 03, 2015 at 09:23:06AM +0100, Lukáš Doktor wrote:
> >>Dne 2.3.2015 v 22:30 Ademar Reis napsal(a):
> >>>On Wed, Feb 25, 2015 at 03:54:34PM +0100, Lukáš Doktor wrote:
> >>>>Hello World,
> >>>>
> >>>>I'm working on the https://trello.com/c/vKxylMFN/115-multiplexer-mechanism-for-tests-to-retrieve-variables
> >>>>card and having some design decision concerns. I'd like to ask you for your
> >>>>opinions, examples, concerns or demands :D
> >>>>

<snip>

I had a very productive discussion about the mux with Lukas
yesterday on the phone and we came up with several ideas and
clarifications. Especially regarding my "scope and resolution
order" concept, which I identified as still being unclear and a
source of confusion.

So as promised I'm summarizing everything we have here, to try to
clarify once again the state of our proposal. Lukas, I believe
we're in sync with most of it, please correct me if I say
something wrong or if you disagree.

> >>>>
> >>>>Clashes
> >>>>~~~~~~~
> >>>>
> >>>>Simple example:
> >>>>
> >>>>/self/by_variable/oneliner {variable: "Single line example"}
> >>>>/self/by_other/something {variable: 123}
> >>>
> >>>Just to make sure I'm following: the above is a variant, as
> >>>available to the test. 'variable:' is the name of the leaf, which
> >>>is duplicated in both branches.
> >>>
> >>Exactly.
> >>
> >>>>
> >>>>params.get('variable') => Exception("Multiple occurrences found")
> >>>>params.get('variable', '/self/by_variable') => "Single line example"
> >>>>
> >>>>Let's try this:
> >>>>
> >>>>/self {timeout: 10}     # Test's defaults
> >>>>/self/by_duration/long {timeout: 60, sleep: 30}
> >>>>/self/by_tool/builtin {timeout: 10}@/self {}
> >>>
> >>>What's '@/self {}'?
> >>>
> >>I should have mentioned that, I used the same transcript as in debug
> >>multiplex to emphasize that the timeout:10 is inherited from `/self`.
> >>
> >>>>
> >>>>params.get(timeout) => raises exception as it's defined multiple times, but
> >>>>is it what we want?
> >
> >OK, now I understand your question. A variable inherited should
> >not be considered a duplicate and should not be considered a
> >conflict (the other one, yes, it should).
> >
> Well it might not be that simple. For example:
> 
> hw:
>     fmt:
>         qcow2: !join
>             type: qcow2
>             0:
>             v3:
>                 extra_params: "compat=1.1"
>     nic:
>         rtl8139:
>             type: rtl8139
> 
> this produces:
> 1. */qcow2/0 + rtl8139
> 2. */qcow2/v3 + rtl8139
> 
> in your definition params.get('type', '/hw') would not raise exception and
> return the rtl8139 as type is inherited from qcow2.
> 
> So we can only ignore inheritance from different multiplex group (basically
> everything what is !joined should share the key values).

Of course.

What I mean is that when a variable is "inherited", it's in fact
the same variable and unless it's changed (for example, extended)
there's no conflict with another instance of itself.
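
A minimal Python sketch of that rule, assuming each match found for
a key remembers the node that defined it (the function name and the
data layout are hypothetical, not avocado code). Matches that share
a defining node are one and the same variable; only distinct
definitions clash:

```python
def lookup(matches, default=None):
    """matches: (defining_node, value) pairs found for a single key."""
    # An inherited variable shows up again with the same defining node,
    # so keying by that node collapses it into a single definition.
    # A changed (e.g. extended) variable is assumed to carry its own node.
    definitions = {node: value for node, value in matches}
    if not definitions:
        return default
    if len(definitions) > 1:
        raise ValueError("defined in several places: %s"
                         % ", ".join(sorted(definitions)))
    return next(iter(definitions.values()))


# timeout defined in /self and merely inherited by /self/by_tool/builtin
print(lookup([("/self", 10), ("/self", 10)]))
```

Adding ("/self/by_duration/long", 60) to the list makes lookup()
raise, which is the "other one" that should still be a conflict.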

<snip>

> 
> PS: What do you think about removing the `self.default_params`, I'd really
> prefer having the defaults in place, maybe even restrict the usage of
> params[] (or change it to return None in case of missing key) It's easier to
> read the code when you read params.get('key', 'virtio') than scrolling all
> the way up to see the defaults in some table (or even different file
> elsewhere).

I don't have a strong preference. I have not thought about it as
much as you have, so we should be fine with whatever you think is
best for this initial version.

<snip>

> >>>
> >>>Can you precisely define what -m is for and how it's used?
> >>>
> >>-m means `avocado run -m myfile.yaml`. When you use it without any `!using`
> >>or custom prefixes, it'd go directly into /self.
> >
> >I hope this is not the way you'll document it in --help, because
> >it's way too complex for users to understand. :-)
> >
> Sure, before we get to the documentation phase the workflow would be
> completely different. (eg. !join vs inverted !mux :D)

So what about this definition:

 -m [branch:]<file>
  Merges <file> into the [branch] of the multiplexer tree. If
  [branch] is not specified, <file> is merged into the root of
  the tree.  <file> can add new branches, chop them, add, extend
  or remove variables using the mux yaml syntax.

  In other words, the -m mechanism allows one to "patch(1)" a
  multiplexer tree using yaml, in a dynamic and flexible way.

  (notice I don't say anything about /self here)
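
  To illustrate the "patch" idea, here is a minimal Python sketch.
  It assumes the yaml file has already been parsed into nested
  dicts; split_mux_arg() and merge() are hypothetical names, and
  only adding and overriding are covered (chopping branches and
  removing variables are left out):

```python
def split_mux_arg(arg):
    """Split '[branch:]<file>' into (branch, file); default branch is '/'."""
    branch, sep, filename = arg.partition(":")
    return (branch, filename) if sep else ("/", arg)


def _merge_into(node, patch):
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(node.get(key), dict):
            _merge_into(node[key], value)   # descend into existing branch
        else:
            node[key] = value               # add or override


def merge(tree, patch, branch="/"):
    """Merge `patch` into `tree` (in place) below `branch`."""
    node = tree
    for name in filter(None, branch.split("/")):
        node = node.setdefault(name, {})    # create the branch if missing
    _merge_into(node, patch)
    return tree


tree = {"hw": {"disk": {"type": "ide"}}}
branch, filename = split_mux_arg("/hw:extra.yaml")
# Stand-in for the parsed content of `filename`.
patch = {"disk": {"type": "virtio", "size": "10G"}}
print(merge(tree, patch, branch))
```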

<snip>

> >>
> >>>Still regarding the usage of local variables (the ones that don't
> >>>start with /, or have no path=), I discussed resolution orders
> >>>and scopes with you the other day, maybe you remember, or maybe
> >>>my proposal was not clear, so I'll explain it here:
> >>>
> >>Yep, remember.
> >>
> >>>Just like we have scope when we're programming, we would have
> >>>scope when looking for multiplex variables. Anything that doesn't
> >>>start with / would be local. Then we would define the resolution
> >>>order of the local scope and return the first occurrence inside
> >>>that scope (and error out with duplication only if there are two
> >>>occurrences that match the same relative path inside the same
> >>>level).
> >>>
> >>>For example, the scope for local variables could be:
> >>>
> >>>    1. /self
> >>>    2. default value provided by the test
> >>>    3. ERROR (not found)
> >>I agree with this one
> >>
> >>>
> >>>OR
> >>>
> >>>    1. /self
> >>>    2. /tests
> >>>    3. /config
> >>>    4. default value provided by the test
> >>>    5. ERROR (NOT FOUND)
> >>This is troublesome. /config contains multiple (I mean humongous number of)
> >>leafs.
> >>/tests contain all tests, not just the executed one.
> >
> >My point is that this is configurable, this was just an example
> >of a possible resolution order.
> >
> >What I want is to split the problem in two:
> >
> >    - The retriever API for tests is simple to understand and
> >      write. You can explain its usage, the scope and resolution
> >      order in a couple of paragraphs. No complex exceptions, no
> >      need to understand how the mux works, no magic.
> >
> >    - The scope and resolution order can be fine-tuned by the test
> >      runner (or whoever is running the tests). That's where the
> >      magic happens. The complexity should be at this level.

OK, here's a summary of the proposal we discussed yesterday from
my point of view. Not everything here was discussed and therefore
not everything is endorsed by Lukas, at least not yet. ;-)

  1. When writing a test, writers don't necessarily care about
  where the variable is coming from. It's an abstract problem,
  which doesn't matter to the test writer. All they know is that
  they request a key (variable) from a particular path and it
  should work.

  2. When running the test, avocado provides a mechanism to
  deliver the variables the test needs to run.

    2a. Right now, we are working on the multiplexer, which is one
    of the tools that provides these variables in a peculiar way,
    allowing tests to be run multiple times with a myriad of
    variable combinations.

    So for the multiplexer, when running the test the tester has
    control of how big the tree is, how it gets manipulated to
    the point of creating a specific set of variants, what's the
    scope of the search, etc.

(1) is the API discussion. It should be simple and convenient.
There should be no magic or need to understand details of how the
multiplexer works to use it. As in the requirements, it should
not leak details of our particular multiplexer implementation.

What about this API:

   X = get(path="", var="", default="")

   Where:

     path: limits the scope of where the search happens. Very
     similar to the way file-systems work. Using a file-system
     analogy, path would be a path to a directory, or part of it.
     Accepts wildcards and can be either a full path (absolute,
     when started by /) or partial (relative, started by
     something else).

     var: the name of the variable we want to retrieve. Using a
     file-system analogy, var would be a file.

     default: the default to be returned in case 'var' is not
     found under the given 'path'.

     X (returned value): the content of 'var'. In a file-system
     analogy, this would be the content of a file pointed by
     'var'.

   Given we match a file-system implementation well, I would
   follow the same conventions and limitations of POSIX
   file-system implementations at first (we may want to extend
   the wildcard mechanism later).

   Relative paths are always resolved from the right to the left,
   and the starting path (CWD in the fs analogy) for relative
   paths is defined at runtime by the test runner.

   Potential examples:

   type = get("disk", "type", "ide")

   print(get("hw/*/cpu/*/model", "", "Intel Westmere"))

   print(get("/hw/*/cpu/*/model", "", "Intel Westmere"))

As one can see, the API leaks nothing about the multiplexer. It's
very generic and can be used to retrieve variables from a
configuration file or simple structure.
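
A minimal Python sketch of such a retriever, to make the file-system
analogy concrete. The flat node-path layout, the 'tree' and 'cwd'
arguments and the error on duplicates are assumptions made for
illustration only; wildcards are matched per path component, so '*'
never crosses a '/', and 'var' is expected to be non-empty here:

```python
from fnmatch import fnmatchcase

# Hypothetical variant: absolute node path -> variables set on that node.
VARIANT = {
    "/hw/disk": {"type": "virtio"},
    "/hw/x86/cpu/intel/model": {"name": "Westmere"},
}


def _matches(pattern, node):
    """Compare path components one by one, so '*' never crosses a '/'."""
    want = pattern.strip("/").split("/")
    have = node.strip("/").split("/")
    return len(want) == len(have) and all(
        fnmatchcase(name, pat) for name, pat in zip(have, want))


def get(path="", var="", default="", tree=VARIANT, cwd="/"):
    """Return `var` from the node matching `path`, else `default`."""
    # Absolute paths are used as-is; relative ones hang off the CWD,
    # which the test runner (not the test) is assumed to choose.
    pattern = path if path.startswith("/") else cwd.rstrip("/") + "/" + path
    hits = [variables[var] for node, variables in tree.items()
            if _matches(pattern, node) and var in variables]
    if len(hits) > 1:
        raise ValueError("%r is defined %d times under %r"
                         % (var, len(hits), pattern))
    return hits[0] if hits else default


print(get("disk", "type", "ide", cwd="/hw"))               # virtio
print(get("/hw/*/cpu/*/model", "name", "Intel Westmere"))  # Westmere
print(get("nic", "type", "rtl8139", cwd="/hw"))            # rtl8139 (default)
```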

(2a) is where the magic happens. That's when we want to limit the
size of the mux tree to reduce the scope and therefore the number
of resulting variants, define where to look for relative
variables and so on. These are the mechanisms we have, either
implemented or proposed:

  - We can specify our own mux file(s), or tell avocado to use
    only a specific set of files at runtime.

  - We can use filter-only and filter-out to chop the tree.

  - We can specify a 'downstream' mux file (the -m you're
    proposing) to "patch" the tree before it's multiplexed.

  - When resolving relative paths, we can define what's to the
    left side. That's what I would call resolution order: since
    we start resolving from the right to the left, we try to
    resolve paths in a specific order (in other words, we would
    try different "CWD"s, following a specific resolution order,
    similar to the way PYTHONPATH works for imports).

  --- Most importantly: we can mix all of the above.
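
The resolution-order item can be sketched in Python as follows. This
is only an illustration: resolve() and the sample data are
hypothetical, and the order itself is assumed to be supplied by the
test runner. Each entry plays the role of a "CWD", and the first one
that provides the variable wins:

```python
# Hypothetical variant: absolute node path -> variables set on that node.
VARIANT = {
    "/self/disk": {"type": "virtio"},
    "/config/disk": {"type": "ide", "size": "10G"},
}

# Assumed to come from the test runner, never from the test itself.
RESOLUTION_ORDER = ["/self", "/tests", "/config"]


def resolve(rel_path, var, default, tree=VARIANT, order=RESOLUTION_ORDER):
    """Try each starting point in order and return the first hit."""
    for cwd in order:
        node = cwd.rstrip("/") + "/" + rel_path.strip("/")
        variables = tree.get(node, {})
        if var in variables:
            return variables[var]
    return default


print(resolve("disk", "type", "scsi"))    # virtio, found under /self
print(resolve("disk", "size", "1G"))      # 10G, falls through to /config
print(resolve("nic", "model", "e1000"))   # e1000, the test's default
```

Changing only RESOLUTION_ORDER changes what the same test receives,
without touching the test's get() calls.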

Now, what's left? What else do we want to do that can't be
accomplished by the above?

Thanks.
   - Ademar

-- 
Ademar Reis
Red Hat
