[Avocado-devel] Multiplexer and params retrieval
Lukáš Doktor
ldoktor at redhat.com
Wed Feb 25 14:54:34 UTC 2015
Hello World,
I'm working on the
https://trello.com/c/vKxylMFN/115-multiplexer-mechanism-for-tests-to-retrieve-variables
card and having some design decision concerns. I'd like to ask you for
your opinions, examples, concerns or demands :D
Multiplexer
===========
Ultimate goal
-------------
Possibility to specify test parameters, allowing simple
upstream-downstream modification, supplying different files on command
line or even injecting nodes/variables from cmdline to use different
params for different runs.
Additionally we must avoid params name clashes.
Ideally this should be simple enough to be understood without any prior
knowledge.
Tree
----
Let's say a tree is the way we want to go:
 -*- by_something -*- A
  |                \- B
  \- by_else -*- A
              \- B
Produces:
1. /by_something/A + /by_else/A
2. /by_something/A + /by_else/B
3. /by_something/B + /by_else/A
4. /by_something/B + /by_else/B
and we can ask for:
params.get(key) => can raise an error when multiple nodes match
params.get(key, path=/by_something)
params.get(key, path=/by_else)
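
The four combinations above are just the Cartesian product of the multiplexed branches. A minimal sketch, assuming the tree is held as a plain dict mapping each multiplexed node to its leaves (this representation and the name `iter_variants` are illustrative, not the multiplexer's actual data model):

```python
from itertools import product

# Hypothetical in-memory form of the tree above: every multiplexed
# node maps to the list of its leaves.
tree = {
    "/by_something": ["A", "B"],
    "/by_else": ["A", "B"],
}


def iter_variants(tree):
    """Return an iterator of variants; each variant is a tuple of leaf paths."""
    groups = [["%s/%s" % (parent, leaf) for leaf in leaves]
              for parent, leaves in tree.items()]
    # One leaf is picked from every group, in every possible combination.
    return product(*groups)


variants = list(iter_variants(tree))
for number, variant in enumerate(variants, 1):
    print("%d. %s" % (number, " + ".join(variant)))
```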
Upstream-downstream
-------------------
One way I came up with is to use default locations for multiplex files:
/etc/avocado/config.yaml => avocado config (could be parsed from INI)
$AVOCADO/global.yaml => default variables (use /etc/avocado/
instead?)
$PLUGIN => plugins should be able to specify default
mux too
$TEST_DATA/multiplex.yaml => per-test additional combinations
Not all of the files need to be processed. This should depend on the
cmdline option, which either disables multiplexation, uses only the
test's options, (maybe) uses global options but multiplexes only the
parts specified in the test, or multiplexes everything.
We might also consider using a `10_` prefix and automatically loading
all files matching the name (eg.: when files "10_multiplex.yaml" and
"20_multiplex.yaml" exist, we should include both of them. That way
extending upstream-downstream is just adding an additional file, which
shouldn't clash with our default files. Thus only fast-forward merges
are required in downstream projects). The same applies for the "-m"
cmdline option: when `-m $PATH/10_myfile.yaml` is set, we should search
for other `\d\d_myfile` files.
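
The lookup of numbered sibling files could be sketched roughly like this. The function name `find_numbered_siblings` is hypothetical, and I assume the two-digit prefix is followed by an underscore and that matching files are loaded in sorted order:

```python
import os
import re


def find_numbered_siblings(path):
    """Return all files next to `path` sharing its name after the NN_ prefix.

    When `path` has no two-digit prefix, only `path` itself is returned.
    """
    directory, filename = os.path.split(path)
    match = re.match(r"\d\d_(.+)$", filename)
    if match is None:
        return [path]
    # Same name, any two-digit prefix (eg. 10_myfile.yaml, 20_myfile.yaml)
    pattern = re.compile(r"\d\d_" + re.escape(match.group(1)) + r"$")
    names = sorted(name for name in os.listdir(directory or ".")
                   if pattern.match(name))
    return [os.path.join(directory, name) for name in names]
```

Sorting by name means "10_" is applied before "20_", so a downstream file can extend whatever the upstream one defined.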
Structure
---------
This one is tricky and it heavily affects the usage.
1. we might not care about the structure (allow everything for advanced
users)
2. we can precisely define the structure (restrict everyone, but
simplify usage)
3. we can automatically create locations (heavily affect tree modifications)
4. combine the methods
Allow everything
~~~~~~~~~~~~~~~~
This one is the most flexible but can surprise users. They are
responsible for not touching variants used elsewhere, but if they
manage that, they can do absolutely anything.
On the other hand it might be harder to write the tests as you need to
know the namespace you want to get the variable from. Also with multiple
multiplex files this can get nasty.
Precise structure
~~~~~~~~~~~~~~~~~
Let me demonstrate this with an example (+ => !join, * => !mux):
 -+- config -+- datadir --- paths
  |          +- sysinfo
  |          \- runner
  +- plugins -*- virt -*- hw -*- disk -*- virtio
  |                    |      |        \- ide
  |                    |      \- nic -*- virtio
  |                    |              \- rtl8139
  |                    \- os ...
  +- tests -*- HASH1
  |         \- HASH2
  \- self
This is a basic global structure. When we multiplex the file, we get
the following (instead of the full path I use * + leaf):
1. */paths + */sysinfo + */runner + */virtio + */virtio + */os* + /self
2. */paths + */sysinfo + */runner + */ide + */virtio + */os* + /self
...
/tests is special. It's parsed out of the cmdline's urls and/or
specified by a mapping file. For each run the current test just extends
the existing "/self". That way we can specify per-test variants.
All multiplex files go automatically into "/self" by default. When we
specify !using, we can override other parts too (?except /config).
Automatic locations
~~~~~~~~~~~~~~~~~~~
Similar to the previous one, only each file (which is not using !using)
goes into the `/self/$UID` location.
This brings one benefit and one drawback. It supports simple files:
# simple.yaml
short:
medium:
long:
Which works fine with multiple independent files:
# simple2.yaml
builtin:
bash:
But makes it impossible to extend this file from another file:
# extend.yaml
longest:
which results in (without the simple2.yaml):
1. */short + */longest
2. */medium + */longest
3. */long + */longest
People could use !using to avoid this problem, but that in a way works
around the automatic location and creates the same structure as 1 or 2.
It also causes problems for params.get(key, "/self/by_something"), as in
reality the path would become "/self/$UID/by_something". Should we
search for "*/by_something" instead? Or differentiate between
"/by_something" and "by_something" and, in case of a relative path,
prepend "/self/*"?
Combined
~~~~~~~~
Possibilities are unlimited.
Command line
~~~~~~~~~~~~
We also need to consider where -m files go (when not using !using). I'd
vote for /self by default, /self/$UID in case of automatic location.
Also we might allow -m [$location:]$file to dynamically extend the
location (I tend to prefer this to hard-coded locations inside the
file). Take another look at simple.yaml and simple2.yaml from
"Structure.Automatic locations". Instead of automatic handling we can
just let users say:
-m /self/by_duration:simple.yaml -m /self/by_tool:simple2.yaml -m
/self/by_duration:extend.yaml
Which does both: it extends simple.yaml with extend.yaml and adds a
second set of variants from simple2.yaml.
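
Parsing such arguments could look roughly like the sketch below. The helper names are hypothetical; I assume a location never contains ':' and that a relative location is placed under /self:

```python
def parse_mux_arg(arg, default="/self"):
    """Split '[$location:]$file' into a (location, file) tuple."""
    location, sep, filename = arg.partition(":")
    if not sep:
        # Plain "-m file.yaml": no location given, use the default one
        return default, arg
    if not location.startswith("/"):
        # Assumption: relative locations are injected below the default
        location = default + "/" + location
    return location, filename


def group_by_location(args):
    """Map each location to the files extending it, in cmdline order."""
    grouped = {}
    for arg in args:
        location, filename = parse_mux_arg(arg)
        grouped.setdefault(location, []).append(filename)
    return grouped


grouped = group_by_location(["/self/by_duration:simple.yaml",
                             "/self/by_tool:simple2.yaml",
                             "/self/by_duration:extend.yaml"])
print(grouped)
```

Here extend.yaml ends up in the same location as simple.yaml, so it extends it, while simple2.yaml forms an independent set of variants.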
This itself can help the basic users to create variants ad-hoc while not
tying our hands to write complex yaml files.
Getting the params
------------------
OK when we have the tree structure parsed into variants, we execute the
tests using the params extracted from variants:
Let's say our variant1 is:
/config* {...}
/plugins/virt/hw/disk/virtio {}
/plugins/virt/hw/nic/virtio {}
/plugins/virt/os/linux/fedora/21/64 {file:img.qcow2, kickstart:ks.ks, ...}
/self {file:boot.py, tag:, exec:sleep 1}
# the plugin uses
params.get_leaves(/plugins/virt/hw) => [*/disk/virtio, */nic/virtio]
params.get_leaf(/plugins/virt/hw/disk) => virtio
params.get_leaf(/plugins/virt/hw/nic) => virtio
params.get(file, /plugins/virt/os) => img.qcow2
# in test we use
params.get(exec) => sleep 1
Now imagine that second test defines multiple variants:
...
/self {file:boot.py, tag:}
/self/by_command/sleep {command:sleep 1}
Now we use params.get(command, /self/by_command) to avoid clashes with
variables from /self.
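
To make get_leaves() and get_leaf() more concrete, here is a minimal sketch. It assumes a variant is a flat dict of node path => values, which is only one possible representation, and it leaves out the /config nodes:

```python
variant = {
    "/plugins/virt/hw/disk/virtio": {},
    "/plugins/virt/hw/nic/virtio": {},
    "/plugins/virt/os/linux/fedora/21/64": {"file": "img.qcow2",
                                            "kickstart": "ks.ks"},
    "/self": {"file": "boot.py", "tag": "", "exec": "sleep 1"},
}


def get_leaves(variant, path):
    """Return the paths of all nodes located below `path`."""
    prefix = path.rstrip("/") + "/"
    return [node for node in variant if node.startswith(prefix)]


def get_leaf(variant, path):
    """Return the name of the single leaf below `path`."""
    leaves = get_leaves(variant, path)
    if len(leaves) != 1:
        raise ValueError("Expected 1 leaf below %s, found %d"
                         % (path, len(leaves)))
    return leaves[0].rsplit("/", 1)[-1]


print(get_leaves(variant, "/plugins/virt/hw"))
print(get_leaf(variant, "/plugins/virt/hw/disk"))
```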
Clashes
~~~~~~~
Simple example:
/self/by_variable/oneliner {variable: "Single line example"}
/self/by_other/something {variable: 123}
params.get('variable') => Exception("Multiple occurrences found")
params.get('variable', '/self/by_variable') => "Single line example"
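
A minimal sketch of this strict behaviour, again assuming a flat dict of node path => values (the exception class and the function signature are illustrative only):

```python
class MultipleMatches(Exception):
    """Raised when a key is found in more than one matching node."""


def get(variant, key, path=None):
    """Look up `key` in all nodes, or only in nodes at or below `path`."""
    def in_scope(node):
        if path is None:
            return True
        return node == path or node.startswith(path.rstrip("/") + "/")

    found = [values[key] for node, values in variant.items()
             if in_scope(node) and key in values]
    if len(found) > 1:
        raise MultipleMatches("Multiple occurrences of %r found" % key)
    if not found:
        raise KeyError(key)
    return found[0]


variant = {
    "/self/by_variable/oneliner": {"variable": "Single line example"},
    "/self/by_other/something": {"variable": 123},
}
print(get(variant, "variable", "/self/by_variable"))
```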
Let's try this:
/self {timeout: 10} # Test's defaults
/self/by_duration/long {timeout: 60, sleep: 30}
/self/by_tool/builtin {timeout: 10}@/self {}
params.get(timeout) => raises exception as it's defined multiple times,
but is it what we want?
+ namespaces are really separated
- we can't override defaults inside variants
We can use params.get(timeout, /self/by_duration), but that would fail
if we remove the multiplex file with /by_duration (as there would be no
*/by_duration in params).
We can make /self special and say it contains defaults. So in case path
or key are not found, we look there:
/self/by_duration/long {timeout: 60, sleep: 30}
/self/by_tool/builtin
---
/defaults {timeout: 10}
params.get(timeout) => 60 # if in other run timeout is not found, uses 10
This solution also has some holes. Users must make sure not to use keys
from /defaults multiple times inside variants:
/self/by_duration/long {timeout: 60, sleep: 30}
/self/count/many {timeout: 120, count: 3}
---
/defaults {timeout: 10, count: 1}
params.get(timeout) => Exception("Multiple occurrences")
^^ I know this example makes little sense, but it demonstrates possible
issues.
Additionally we should consider another problem. Take an example similar
to the previous one:
params.get(count, /self/count) => 3
In the next run we remove the multiplex file specifying the /self/count* leaves:
params.get(count, /self/count) => Exception("Path not found")
Should we consider supplying /defaults if the path is not found? (=> 1)
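
One possible answer, sketched under the assumption that /defaults is consulted both when the key and when the whole path is missing (the function name and signature are hypothetical):

```python
def get_with_defaults(variant, defaults, key, path="/self"):
    """Return `key` from nodes at or below `path`, else from `defaults`."""
    prefix = path.rstrip("/") + "/"
    found = [values[key] for node, values in variant.items()
             if (node == path or node.startswith(prefix)) and key in values]
    if len(found) > 1:
        raise ValueError("Multiple occurrences of %r below %s" % (key, path))
    if found:
        return found[0]
    # Neither the path nor the key matched: fall back to /defaults.
    # A KeyError is raised when there is no default either.
    return defaults[key]


defaults = {"timeout": 10, "count": 1}
with_count = {
    "/self/by_duration/long": {"timeout": 60, "sleep": 30},
    "/self/count/many": {"count": 3},
}
without_count = {
    "/self/by_duration/long": {"timeout": 60, "sleep": 30},
}
print(get_with_defaults(with_count, defaults, "count", "/self/count"))
print(get_with_defaults(without_count, defaults, "count", "/self/count"))
```

The drawback stays the same: a missing path silently yields the default, so a typo in the path would no longer be reported.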
Logging
-------
While thinking about optimization and usage: do we need to log all the
config variables? I remember how hard it is to find something when
looking at Autotest's output. So why not log only the first occurrence
of each params.get() instead?
...
DEBUG: PARAM $name ($path) => $value
...
We can log them:
1. in the main log when they are requested (we might need to use grep to
see them all)
2. in the main log after the test (in case of a panic we might not see them)
3. in a separate file (sorted alphabetically)
4. in a separate file (sorted chronologically)
5. a combination, eg: log all of them into a special file and after the
test remove the file and put only the used ones there (or in the main log).
Benefits:
1. Speed&mem (some leaves might not be processed)
2. See what variables were used and when (first request)
3. Shorter logs
Drawbacks:
1. Necessity to look into another file or at the end instead of beginning.
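
Option 1 (log on first request) could be sketched with the standard logging module. The class, the logger name and the (path, key) storage are assumptions made for the illustration only:

```python
import logging

log = logging.getLogger("avocado.sketch.params")  # hypothetical logger name


class LoggingParams(object):
    """Params wrapper which logs only the first retrieval of each key."""

    def __init__(self, values):
        self._values = values      # {(path, key): value}
        self._seen = set()         # (path, key) pairs already logged

    def get(self, key, path="/self"):
        value = self._values[(path, key)]
        if (path, key) not in self._seen:
            self._seen.add((path, key))
            log.debug("PARAM %s (%s) => %s", key, path, value)
        return value
```

Repeated calls with the same key and path return the value without producing another log line, which keeps the log short even when a test asks for a parameter in a loop.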
Results
-------
Ok, this was long, as the multiplexer is something new and very flexible
right now. Let me summarize it by providing an example of what I'd
choose; please feel free to disagree, comment and give advice. (Also,
I'm aware the previous brain-dump has big holes in it, but I want to
avoid rewriting it using what I realized while writing it.)
Structure (optional)
~~~~~~~~~~~~~~~~~~~~
/config   # /etc/avocado/... (using yaml to allow multiplexation)
/plugins  # When a plugin asks for it (eg. `test.VirtTest` would process
          # the virt's yaml file before test execution).
/self     # Default location when -m is used
/test     # Parsed from cmdline urls (or by a mapping file),
          # each variant extended by `Test.get_variants()`;
          # not available inside the test, only during test parsing...
Demonstration
~~~~~~~~~~~~~
Instead of drawing trees I'll write yaml representing the tree
# /etc/avocado/avocado.yaml
!using : /config
datadir:
    log: "/tmp/avocado"
# myfile.yaml
key: value
by_something:
    variant_a:
    variant_b:
avocado run passtest.py boot.py -m myfile.yaml
# After params parsing looks like this:
config:
    datadir:
        log: "/tmp/avocado"
self:
    key: value
    by_something:
        variant_a:
        variant_b:
tests:
    passtest.py:
        id: passtest.py
    boot.py:
        id: boot.py
# Then it starts executing tests
# passtest.py:
# 1. executes Test.get_variants()
self:
    id: passtest.py
# 2. merges this tree and the original one (without tests)
config: # unchanged
self:
    id: passtest.py
    key: value
    by_something:
        variant_a:
        variant_b:
# 3. executes all variants (2)
# boot.py:
# 1. executes Test.get_variants()
plugins:
    virt:
        hw:
            disk:
                ide:
                scsi:
self:
    id: boot.py
# 2. merges this tree and the original one (without tests)
config: # unchanged
plugins: # unchanged
self:
    id: boot.py
    key: value
    by_something:
        variant_a:
        variant_b:
# 3. executes all variants (4)
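
The "merges this tree and the original one" step is essentially a recursive dict merge. A minimal sketch, assuming the parsed yaml is held as nested dicts and that values from the test's tree win on conflict (both are my assumptions):

```python
import copy


def merge_trees(base, extra):
    """Return a new tree with `extra` merged on top of `base`."""
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are inner nodes: merge their children
            merged[key] = merge_trees(merged[key], value)
        else:
            # New node or plain value: take it from `extra`
            merged[key] = copy.deepcopy(value)
    return merged


# The parsed tree from above, without /tests
original = {
    "config": {"datadir": {"log": "/tmp/avocado"}},
    "self": {"key": "value",
             "by_something": {"variant_a": {}, "variant_b": {}}},
}
# What Test.get_variants() of passtest.py returned
from_test = {"self": {"id": "passtest.py"}}

merged = merge_trees(original, from_test)
print(sorted(merged["self"]))
```

The original tree is left untouched, so it can be reused for the next test (boot.py in the example).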
This requires that keys from /self are not reused in its children, as
all children would share them and params.get(key) would raise an
Exception. In my opinion this is sensible and it simplifies the usage.
Params
~~~~~~
params.get(key, default=None, strict=False, path=None)
* path:
    * None  => "/self[/.*]?"
    * PATH  => "/self/[/.*]?PATH[/.*]?"
    * /PATH => "/PATH[/.*]?"
    * '*'   => .* # eg: */PATH => .*/PATH[/.*]?
              # eg: PA*TH => "/self/[/.*]?PA.*TH[/.*]?"
* strict: exception on missing value?
* default: default value when not found
* Exception when multiple matches of !different! value
  In case values are the same, use them (inherited values)
params.get_leaves(path) => return all leaves matching path
params.get_leaf(path) => return single leaf or exception when 0 or 2+
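
The path rules above can be read as a translation into regular expressions. The sketch below is my interpretation of that notation (I read "[/.*]?" as "optionally followed by /anything"), using a flat dict of node path => values; it is not a finished API:

```python
import re


def path_to_regex(path=None):
    """Translate a params path into a compiled regex over node paths."""
    if path is None:
        return re.compile(r"/self(/.*)?$")
    # Escape everything literally, only '*' becomes '.*'
    body = ".*".join(re.escape(part) for part in path.split("*"))
    if path.startswith(("/", "*")):
        # Absolute path or leading wildcard: not anchored below /self
        return re.compile(body + r"(/.*)?$")
    # Relative path: anywhere below /self
    return re.compile(r"/self(/.*)?/" + body + r"(/.*)?$")


def get(variant, key, default=None, strict=False, path=None):
    """Return `key` from matching nodes; identical values do not clash."""
    regex = path_to_regex(path)
    distinct = []
    for node, values in variant.items():
        if regex.match(node) and key in values \
                and values[key] not in distinct:
            distinct.append(values[key])
    if len(distinct) > 1:
        raise ValueError("Multiple different values of %r: %r"
                         % (key, distinct))
    if distinct:
        return distinct[0]
    if strict:
        raise KeyError(key)
    return default


variant = {
    "/self": {"timeout": 10},
    "/self/by_tool/builtin": {"timeout": 10},
    "/plugins/virt/os/linux/fedora/21/64": {"file": "img.qcow2"},
}
print(get(variant, "timeout"))
print(get(variant, "file", path="/plugins/virt/os"))
```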
Locations
~~~~~~~~~
1. /etc/avocado/avocado.yaml + conf.d/* files (usually !using: /config)
2. API for plugins to inject tree (during init or Test.get_variants())
3. cmdline $test1 $test2 $test3 => /tests/HASH {url: $test1}, ...
extended by Test.get_variants() during execution.
4. "-m $path" extends /self
5. "-m $path:$file" extends $path (when $path starts with '/', extends
global, otherwise injects into /self)
6. when -m 10_file.yaml is specified, all files matching \d\d_file.yaml
are loaded.
Output
~~~~~~
1. Dump first occurrence of the params.get() in the main log with common
prefix to be able to grep for it:
...
INFO | Starting machine
DEBUG| PARAM: /plugins/virt/os/fedora/21/64 image = image.qcow2
DEBUG| Logging into machine...
DEBUG| PARAM: /plugins/virt/os/fedora/21/64 password = 123456
DEBUG| PARAM: /self exec = sleep 1
DEBUG| Executing binary "sleep 1"
...
Execution
~~~~~~~~~
--mux NOMUX|DEFAULT|ALL
where plugins and tests should look at /config/multiplex/mux value to
distinguish between variants.
NOMUX => only execute single variant of the test with defaults
DEFAULT => basic multiplexation of the file
ALL => multiplex everything
NOTE: It's possible to say --mux NOMUX -m /:$FILE to use only a custom
multiplex file (/config and /test would be merged).
NOTE2: --mux DEFAULT with virt test would mean use default
Test.get_variants() but don't parse the $virttest.yaml.
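
On the cmdline side this could be wired up roughly as below, using argparse. The program name, the DEFAULT default and the way the value is stored under /config/multiplex/mux are assumptions of this sketch, not existing avocado options:

```python
import argparse

parser = argparse.ArgumentParser(prog="avocado-sketch")  # hypothetical name
parser.add_argument("--mux", choices=["NOMUX", "DEFAULT", "ALL"],
                    default="DEFAULT",
                    help="how much of the tree should be multiplexed")
parser.add_argument("-m", dest="mux_files", action="append",
                    metavar="[LOCATION:]FILE",
                    help="multiplex file, can be given multiple times")

args = parser.parse_args(["--mux", "ALL", "-m", "myfile.yaml"])

# Expose the mode in the tree so plugins and tests can look it up
# at /config/multiplex/mux
config = {"multiplex": {"mux": args.mux}}
print(config["multiplex"]["mux"], args.mux_files)
```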
Ugh, bzzz, grrr, kukikuki, agh... please feel free to comment or just
provide examples of where it fails and why. Examples are important.
Best regards,
Lukáš