[Avocado-devel] Multiplexer: mechanism for tests to retrieve variables

Tue Jan 20 16:57:10 UTC 2015

Hi guys,

I'm struggling a bit with the inverse version of the multiplexer,
because I designed it with the `mechanism for tests to retrieve
variables` in mind. I thought about what the inverse version represents
and actually couldn't sleep last night because of it. (I know how it
works, but I can't work out what it represents, and it's always
necessary to see the big picture before choosing a path. I already have
a !nomux version in my tree, and the difference is about 10 lines, but
I can't merge it before really understanding how it affects the
concept.)

Here is an example tree (incompatible with the old version; remove the
!nomux tags to execute it with the RFC version, or remove the !multiplex
tags to use it with the inverse version):

!multiplex
hw:  !multiplex
     nic:
         rtl8139:
             type = rtl8139
         e1000:
             type = e1000
         virtio_net:
             type = virtio_net
         xennet:
             type = xennet
         spapr-vlan:
             type = spapr-vlan
         nic_custom:
             type = nic_custom
     smp:
         up:
             count = 1
         smp2:
             count = 2
     drive_format:
         ide:
             type = ide
         scsi:
             type = scsi
         sd:
             type = sd
         virtio_blk:
             type = virtio_blk
         virtio_scsi:
             type = virtio_scsi
         spapr_vscsi:
             type = spapr_vscsi
         lsi_scsi:
             type = lsi_scsi
         ahci:
             type = ahci
         usb2:
             type = usb2
         xenblk:
             type = xenblk
     image_format: !nomux
         qcow2:
             type = qcow2
             2v3:
                 params = compat=1.1
             2:
         vmdk:
             type = vmdk
         raw:
             type = raw
         raw_dd:
             type = raw_dd
         qed:
             type = qed
     pci_assignable:
         no_pci_assignable:
             type = false
         pf_assignable:
             type = pf
         vf_assignable:
             type = vf
     pagesize:
         smallpages:
         hugepages:
             type = hugepage
     9p:
         no_9p_export:
         9p_export:
             type = p9
     gluster:
         filesystem:
         gluster:
             type = gluster
     lvm:
         no_lvm_support:
         lvm_partition:
             type = lvm
         emulated_lvm:
             type = emulated

os: !multiplex
     platform:
         32:
         64:
     type: !nomux
         Windows:
             2000:
             xp:
             2003:
             7:
             8:
             10:
         Linux: !nomux
             Fedora:
                 10:
                 11:
                 12:
                 13:
                 14:
                 15:
                 16:
                 17:
                 18:
                 19:
                 20:
                 21:
             RHEL: !nomux
                 3: !nomux
                     0:
                     1:
                     2:
                     3:
                     4:
                     5:
                     6:
                     7:
                 4: !nomux
                     0:
                     1:
                     2:
                     3:
                     4:
                 5: !nomux
                     0:
                     1:
                     2:
                     3:
                 6: !nomux
                     0:
                     1:
                     2:
                     3:
                 7: !nomux
                     0:
                     1:
                     beta:

machines:
     i440fx:
     q35:
     pseries:
     arm:

Here are the problems I had in mind, separated into sections:

[Namespace issue]

Current situation:
1) Variants are created as combinations of non-sibling leaf nodes
2) In the end we pass only a dictionary, in which some values might be
overwritten by values from later nodes
3) We ask for a certain key without any namespace
=== params.get('type') returns a completely useless 'lvm'...
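
To make the flat-dictionary problem concrete, here is a minimal sketch. It is
not Avocado's actual code: the `domains` data (a trimmed subset of the tree
above) and the `variants`/`flatten` helpers are made up for illustration. It
picks one leaf per group and merges the leaves into a single dict, with later
leaves overwriting earlier ones:

```python
import itertools

# Hypothetical, trimmed-down subset of the example tree: each group maps
# leaf names to that leaf's key/value pairs.
domains = {
    "/hw/nic": {"rtl8139": {"type": "rtl8139"}, "e1000": {"type": "e1000"}},
    "/hw/smp": {"up": {"count": 1}, "smp2": {"count": 2}},
    "/hw/lvm": {"no_lvm_support": {}, "lvm_partition": {"type": "lvm"}},
}


def variants(domains):
    """Return every combination that picks exactly one leaf per group."""
    per_domain = [
        [(f"{path}/{name}", values) for name, values in leaves.items()]
        for path, leaves in domains.items()
    ]
    return itertools.product(*per_domain)


def flatten(variant):
    """Merge the leaves of one variant into a single dict (last one wins)."""
    merged = {}
    for _path, values in variant:
        merged.update(values)
    return merged


all_variants = list(variants(domains))
# Variant rtl8139 + up + lvm_partition: the nic's 'type' is overwritten.
lost = flatten(all_variants[1])
print(lost)
```

In this sketch the nic's `type` survives only in variants whose later leaves
happen not to define `type` themselves.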

How I understood the multiplexing with !multiplex:
1) We gather leaves per !multiplex domain (each child of a !multiplex
node is a separate multiplex domain)
2) We pass an object which contains the multiplex domains with the
current variant's values (/hw/cpu, /hw/disk, ...)
3) We ask for a certain key. Without a namespace it returns either the
first or the last match (needs to be decided).
4) We can ask for the value inside a given namespace, e.g.
params.get('/hw/nic', 'type'). In the first variant it returns the
value from /hw/nic/rtl8139, in the second from /hw/nic/e1000, ...
(because we know which leaf belongs to which multiplex domain).
5) Collisions might occur when a non-end multiplex domain is used to ask
for a value, e.g. params.get('/hw', 'type'). We don't know whether the
user wants 'type' from '/hw/nic' or '/hw/disk'. As people create the
structure, they should know which nodes are marked as !multiplex and
they should always use them. Then the situation is clear.
=== params.get('type') returns the useless 'lvm' as before, but we can
use params.get('/hw/nic', 'type') to get the real value
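
A rough sketch of how such a namespaced lookup could behave. The
`NamespacedParams` class and the `AmbiguousKey` exception are hypothetical
names, and raising on a collision is only one possible answer to point 5
(returning the first or last match would be another):

```python
class AmbiguousKey(Exception):
    """Raised when a key is defined by more than one matching leaf."""


class NamespacedParams:
    """Hypothetical params object holding the current variant's leaves."""

    def __init__(self, leaves):
        # leaves: {leaf path: {key: value}} for the current variant only
        self.leaves = leaves

    def get(self, namespace, key, default=None):
        """Return key from the single leaf under namespace that defines it."""
        prefix = namespace.rstrip("/") + "/"
        hits = [
            (path, values[key])
            for path, values in self.leaves.items()
            if path.startswith(prefix) and key in values
        ]
        if len(hits) > 1:
            paths = [path for path, _value in hits]
            raise AmbiguousKey(f"{key!r} found in {paths}")
        return hits[0][1] if hits else default


params = NamespacedParams({
    "/hw/nic/rtl8139": {"type": "rtl8139"},
    "/hw/drive_format/ide": {"type": "ide"},
    "/hw/smp/up": {"count": 1},
})
print(params.get("/hw/nic", "type"))
```

Here params.get('/hw', 'type') raises AmbiguousKey, because both the nic and
the drive_format leaves define 'type'.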

With !nomux we don't mark multiplex domains, so people might get
confused easily.
1) the same as !multiplex
2) similar to !multiplex, only most of the nodes are multiplexed, so
it's harder to pinpoint the end-multiplex domains (for humans; the
computer does it easily)
3) the same as !multiplex
4) similar, only this time all nodes are multiplexed. So we need to
guess which one is the end point, and we can much more easily get
multiple matching leaves.
=== params.get('type') works the same way. We can also use
params.get('/hw/nic', 'type'), but as we are lazy and don't specify
multiplex domains, we might accidentally query the wrong nodes, e.g.
params.get('/hw/image_format/qcow2', 'type'). For the first 2 variants
this succeeds ['/hw/image_format/qcow2/2', '/hw/image_format/qcow2/2v3'],
but in the third variant it fails to find the leaf (because the current
leaf is '/hw/image_format/vmdk').
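
The failure mode can be reproduced with a few lines. This is only a sketch:
the order of the leaves and the `matches` helper are assumptions, and the node
names follow the example tree:

```python
# Leaves selected below /hw/image_format in four consecutive variants
# (hypothetical order, node names taken from the example tree).
image_format_leaves = [
    "/hw/image_format/qcow2/2v3",
    "/hw/image_format/qcow2/2",
    "/hw/image_format/vmdk",
    "/hw/image_format/raw",
]


def matches(namespace, leaf):
    """True when the leaf lives somewhere under the namespace."""
    return leaf.startswith(namespace.rstrip("/") + "/")


for leaf in image_format_leaves:
    too_deep = matches("/hw/image_format/qcow2", leaf)
    domain = matches("/hw/image_format", leaf)
    print(f"{leaf}: too-deep query -> {too_deep}, domain query -> {domain}")
```

The query one level too deep only matches while the current leaf happens to be
a qcow2 one, whereas querying '/hw/image_format' matches in every variant.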

The problem of matching nodes is described in detail below.

[Matching nodes - endswith]

The leaf nodes are usually something like '/hw/nic/rtl8139' or
'/hw/nic/e1000', where the last part varies across variants. Actually
it's not only the last part, e.g. /hw/image_format/qcow2/2v3 is a sibling
of /hw/image_format/raw. So matching '/hw/image_format/qcow2/2v3' makes
no sense. We always need to match the last multiplex group (in this case
'/hw/image_format').
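
One way to find "the last multiplex group" for a leaf, assuming the set of
multiplex domains is known up front (the `domains` list and the
`owning_domain` helper are hypothetical), is a longest-prefix match:

```python
def owning_domain(leaf, domains):
    """Return the longest multiplex domain that is a prefix of leaf, or None."""
    candidates = [d for d in domains if leaf.startswith(d.rstrip("/") + "/")]
    return max(candidates, key=len, default=None)


# Hypothetical list of multiplex domains taken from the example tree.
domains = ["/hw/nic", "/hw/image_format", "/os/platform", "/os/type"]

print(owning_domain("/hw/image_format/qcow2/2v3", domains))
print(owning_domain("/hw/image_format/raw", domains))
```

Both calls resolve to '/hw/image_format', which is the namespace a test would
have to query regardless of how deep the current leaf is.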

On the other hand, when we query only for '/hw', we get
['/hw/nic/rtl8139', '/hw/cpu/smp2', '/hw/drive_format/ide', ...] and we
need to decide which key to return (for example, try params.get('/hw',
'type')).

[Matching nodes - startswith]

There is also the opposite problem with the beginning. Usually we
encourage people to use simple yaml files to multiplex tests, e.g. the
sleeptest multiplex:

     short:
         sleep_length: 0.5
     medium:
         sleep_length: 1
     long:
         sleep_length: 5
     longest:
         sleep_length: 10

The tree is:

--- short
  |- medium
  |- long
  \- longest

so the resulting leaves are:

[/short, /medium, /long, /longest]

So when writing a test for this simple version, we'd ask for
params.get('/', 'sleep_length').

But what if someone wants a more complicated version and puts this
into another branch, e.g.:

     tests:
         sleeptest:
             by_length:
                 short:
                 medium:
                 long:
                 longest:

When developing the test, he'd use
params.get('/tests/sleeptest/by_length', 'sleep_length') to obtain the
value from the correct namespace. This might cause trouble when
executing the test with the simple version (the issue is all the more
serious because most of the time it'd work fine, but when keys are
duplicated, another value might win).

This might be mitigated a bit by separating framework-related and
test-related multiplexing.

1) framework-related (plugin-related) should have a defined structure so
we can safely assume `/virt/hw/nic` defines each key only once and is
used to obtain information about the current `nic`.
2) test-related should be unstructured and should extend the `/test`
namespace. That way we don't mix in values from other namespaces (other
plugins or complex structures defined by users) and we only query for
`params.get('/test', key)` or, if we know we defined substructures, for
`params.get('/test/our_subvariant', key)`.
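
A small sketch of the `/test` idea, under the assumption that the test's own
yaml leaves get re-rooted below `/test` before the test runs. The `mount` and
`get` helpers are hypothetical, the leaf values are illustrative, and `get`
simply returns the first match:

```python
def mount(prefix, leaves):
    """Re-root every leaf path under the given prefix."""
    prefix = "/" + prefix.strip("/")
    return {prefix + path: values for path, values in leaves.items()}


def get(leaves, namespace, key, default=None):
    """Return key from the first leaf under namespace that defines it."""
    start = namespace.rstrip("/") + "/"
    for path, values in leaves.items():
        if path.startswith(start) and key in values:
            return values[key]
    return default


# One framework leaf plus the current leaf of the simple sleeptest file ...
simple = {"/virt/hw/nic/rtl8139": {"type": "rtl8139"}}
simple.update(mount("/test", {"/short": {"sleep_length": 0.5}}))

# ... and the same with the nested layout (values assumed for illustration).
nested = {"/virt/hw/nic/rtl8139": {"type": "rtl8139"}}
nested.update(mount("/test", {"/tests/sleeptest/by_length/short":
                              {"sleep_length": 0.5}}))

print(get(simple, "/test", "sleep_length"))
print(get(nested, "/test", "sleep_length"))
print(get(simple, "/test", "type"))
```

With this layout the same `get(..., '/test', 'sleep_length')` call works for
both yaml shapes, and keys from the framework namespace do not leak into
`/test` queries.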

But let me know if you know of a better solution (params.get('/', key)
returns all of the leaves, including /hw/nic/rtl8139,
/os/type/linux/Fedora/8, ..., so one can only guess what's returned).

[params.get_variant()]

Another simplification could be to provide a `params.get_variant(path)`
API, which would return the leaves currently matching the provided
path. This can simplify the yaml file, as shown in '/os' (and below),
and improve speed, since in simple cases we won't need to query the
environment, which is expensive.

Instead of `params.get('/hw/nic', 'type')`, you could use
`params.get_variant('/hw/nic')`. This returns `/hw/nic/rtl8139` (or
`params.get_variant('/hw/nic', strip=True)` => `rtl8139`). This is
sufficient information for us, so we don't need to specify `type = ...`
on every line and can focus only on the actual key=value pairs (e.g.
queues = ..., if needed).

Note: `params.get_variant('/os/type', True)` returns 'Linux/RHEL/3/7'.

On the other hand, `params.get_variant('/os', True)` returns
['platform/32', 'type/Linux/RHEL/3/7'], as multiple leaves match.
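
The behaviour described above could be sketched as follows. This is a
standalone function rather than a method on `params`, and returning a plain
string for one match but a list for several is an assumption read from the
examples in this mail:

```python
def get_variant(leaves, path, strip=False):
    """Return the current leaf (or list of leaves) found under path."""
    prefix = path.rstrip("/") + "/"
    found = [leaf for leaf in leaves if leaf.startswith(prefix)]
    if strip:
        # Drop the queried path so only the varying part remains.
        found = [leaf[len(prefix):] for leaf in found]
    if not found:
        return None
    return found[0] if len(found) == 1 else found


# Leaves of one hypothetical variant of the example tree.
current = [
    "/hw/nic/rtl8139",
    "/os/platform/32",
    "/os/type/Linux/RHEL/3/7",
]

print(get_variant(current, "/hw/nic"))
print(get_variant(current, "/hw/nic", strip=True))
print(get_variant(current, "/os/type", True))
print(get_variant(current, "/os", True))
```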

[INI config]

For safety reasons I think it might be good to reserve a '/config'
branch which would be writable only by the INI config parser. On the
other hand, INI should be able to extend any part (e.g. the default
qemu path).

[Per-test variants]

I'm still a bit troubled about per-test variants. When we execute a
single test ourselves, we can easily change --mux to a different
setting. But correct me if I'm wrong, there is currently no way to
execute several different tests and multiplex some of them with
different variants. Again there are multiple ways:

1) use a separate run for each test
2) define per-test variants in a specific path (this option was
discussed in my multiplexer RFC: put the multiplex file at
$test.data/$test.yaml and it'd extend the test's run when --mux-test is
specified)
3) have the tests as part of the multiplex tree (this is very similar to
how virttest worked); this requires mapping test names to test paths.
...

I liked the 3rd approach a lot, but as Avocado executes anything as a
test, I can't see a way to reliably map names to files (a full path
makes no sense, as paths usually differ between machines).

This leaves me with 2). In this case I'd extend the tree on the fly,
loading the tree from the `$test.data/$test.yaml` file into the `/test`
path so people can safely use `params.get('/test', key)` or
`params.get('/test/my_subvariant', 'key')`. Note that `/test` already
contains all the `/` values... Anyway, the problem is in modifying these
(one can easily filter the existing variants, but not replace the
multiplex files).

Or we can just assume people always run a single test and combine the
results themselves.

Congratulations on reading such a long mail; all ideas are welcome.

Sincerely yours, a completely exhausted Lukáš.