[Avocado-devel] RFC: Parameters as Environment Variables

Lukáš Doktor ldoktor at redhat.com
Mon Jan 22 12:33:00 UTC 2018


Dne 19.1.2018 v 20:03 Lukáš Doktor napsal(a):
> Dne 19.1.2018 v 14:50 Amador Pahim napsal(a):
>> On Wed, Jan 10, 2018 at 6:39 PM, Lukáš Doktor <ldoktor at redhat.com> wrote:
>>> Dne 9.1.2018 v 13:27 Amador Pahim napsal(a):
>>>> Motivation
>>>> ==========
>>>>
>>>> Avocado creates a tree like object to represent its parameters coming
>>>> from the varianter. But that tree object can be accessed only by
>>>> INSTRUMENTED tests. When it comes to non-INSTRUMENTED tests, test
>>>> writers have to rely on the environment variables created out of that
>>>> tree structure. The translation from tree object to environment
>>>> variables has some caveats currently and that's the main focus of this
>>>> RFC.
>>>>
>>>> Reference: https://trello.com/c/4cNAKW6I/1111-discuss-a-way-to-control-the-export-of-parameters-as-environment-variables
>>>>
>>>> Currently, the tree object is flattened, creating environment
>>>> variables with the names of the leaves, disregarding the branch path
>>>> and also overwriting preexisting environment variables with the same
>>>> name.
>>>>
>>>> Example:
>>>>
>>>>     $ cat parameters.yaml
>>>>     branch1:
>>>>         foo: bar1
>>>>     branch2:
>>>>         foo: bar2
>>>>
>>>> With the above yaml file, avocado will generate the following tree:
>>>>
>>>>     $ avocado variants -m parameters.yaml --tree -c
>>>>     Multiplex tree representation:
>>>>      ┗━━ run
>>>>           ┣━━ branch1
>>>>           ┃     → foo: bar1
>>>>           ┗━━ branch2
>>>>                 → foo: bar2
>>>>
>>>>
>>>> Even though two parameters have the same name, they are in different
>>>> branches and have different values. Both parameters can be accessed in
>>>> INSTRUMENTED tests using:
>>>>
>>>>     self.params.get('foo', path='/run/branch1/*')
>>>>
>>>> and
>>>>
>>>>     self.params.get('foo', path='/run/branch2/*')
>>>>
>>>> When it comes to non-INSTRUMENTED tests, i.e. SIMPLE tests, the tree
>>>> is flattened and the corresponding environment variables are created.
>>>> In the same example above, a bash script which used the "foo"
>>>> parameter above would look like this:
>>>>
>>>>     #!/bin/bash
>>>>     echo ${foo}
>>>>
>>>> But since we don't use the path information, the $foo is the last
>>>> parameter with that name from the tree. The previous "foo" value will
>>>> not be available. And, as mentioned before, a previous environment
>>>> variable with that name will be overwritten, with obvious
>>>> consequences.
>>>>
>>>>
>>>> Proposal
>>>> ========
>>>>
>>>> To avoid parameters collision, one option is to use the path
>>>> information to create the environment variable name. In the example
>>>> above, instead of one "foo" variable, we would create a
>>>> "run_branch1_foo" containing "bar1" and a "run_branch2_foo" containing
>>>> "bar2".
>>>>
>>>
>>> This can become really long (is there a limit of variable name length in bash?)
>>>
>>>> To avoid overwriting the preexisting environment variables, one option
>>>> is create a more unique variable name, including the "AVOCADO_" string
>>>> at the beginning. The same example above would then become
>>>> "AVOCADO_run_branch1_foo" and "AVOCADO_run_branch2_foo", with their
>>>> respective values.
>>>>
>>>
>>> I don't like this because people use this to control programs/frameworks that rely on pre-defined env variables. For example people might want to set `DEBUG=5` or `LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1` on some variants via parameters.
>>
>> This is not an Avocado feature. If people are relying on our
>> parameters system to create pre-defined variables, they are taking
>> advantage (or abusing) a side effect of our current implementation. If
>> we were to support this use case, we should create those env.
>> variables for INSTRUMENTED tests as well. Right now, we are not.
>>
> 
> I'm not certain, but Honzo, aren't/weren't you (ab)using this feature? I think I saw it somewhere and as I mentioned during last meeting I believe we should keep supporting that. Honestly I find this more useful than having the access to full tree or params, because parsing the full params is really hard and if someone wants to use clashing variables, they should probably use INSTRUMENTED tests.
> 
> Anyway as mentioned below I can imagine a way to say this variable is important I want to make it available. During the meeting I used the suffix "_ENV", which sounds IMO better than "_PASSTHROUGH" which was mentioned below.
> 
> Even better solution which I mentioned during last meeting I can imagine whitelist/blacklist approach. For example `whitelist_env_params` which would be list (or coma separated list) of variables (or path+variables?) to be exported in the environment. This could be even enabled for INSTRUMENTED tests and the whitelist approach should be really safe and fast. Blacklist approach could be implemented for the sake of backward compatibility with the current implementation but allowing one to not to override the usual commands like "PATH" (unless they change the blacklist). I think the `whitelist|blacklist_env_params` should be settable by "config" and overradeable by "params" to have sane defaults while the flexibility to disable it for certain execution.
> 
>>>
>>> What I can imagine is to do this by default, but also allow saying some variable should be passed as is, for example:
>>>
>>>     FOO=VALUE
>>>     FOO_PASSTHROUGH=1
>>>
>>> which should produce:
>>>
>>>     FOO=VALUE
>>>     AVOCADO_FOO=VALUE
>>>     AVOCADO_FOO_PASSTHROUGH=1
>>>
>>>> The proposal above would improve the situation, while keeping the
>>>> current user experience, enabling more powerful parameters API for
>>>> non-INSTRUMENTED tests. Besides the proposals above, one extra feature
>>>> that we have to consider is to create an environment variable with a
>>>> machine-parsable content with the serialized varianter object. A good
>>>> format for that would be JSON, a text format that is completely
>>>> language independent but uses conventions that are familiar to
>>>> programmers, which is already used by Avocado to serialize the
>>>> varianter object in the 'jobdata' module.
>>>>
>>>> With that in mind, using the same example, the proposal is to create a
>>>> variable called "AVOCADO_PARAMETERS" containing:
>>>>
>>>>     [{"paths": ["/run/*"],
>>>>      "variant": [["/run/first_branch", [["/run/first_branch",
>>>>                                                    "foo",
>>>>                                                    "bar1"]]],
>>>>                     ["/run/second_branch", [["/run/second_branch",
>>>>                                                         "foo",
>>>>                                                         "bar2"]]]],
>>>>      "variant_id": "first_branch-second_branch-4cd9"}]
>>>>
>>>
>>> This would be nearly impossible to use directly from BASH and so I see this as integration feature, therefor I'd wait for users to ask for it. And when that time comes we'd have better understanding of their needs (probably I'd prefer storing it in file rather than env variable, but can't really tell why...).
>>>
>>> If we decide we needed a way to get full representation, I'd go with variables available in a dir structure. On TMPFS it should be fairly quick to create it and easy to access via `find`. Anyway I still think it should be disabled by default and only available via settings or a special param.
>>
>> Getting full representation is the main point of this RFC. Using tmpfs
>> is indeed a good alternative.
>>
> 
> For me this is nice to have, but I don't see anyone asking for it, nor I see myself ever trying to consume it. If I use really complex variants writing an INSTRUMENTED test to execute it should be simpler.
> 
> So yes, JSON or tmpfile would be acceptable for me, but I don't see it as a priority and I don't think it should be enabled by default. Honestly I don't even think we should devleop it unless someone asks for it.
> 

Actually I thought about this throughout the weekend and I don't think we should spend any time on this. The way most tests (I saw) written in BASH use env variables to store things, therefor their are written in simple flatten world in mind. In python people usually use multiple variables, dicts or even objects to store stuff so the tests are written in a different manner. Yes, there might be 0.0004% of tests that could potentially benefit from this, but then the users should ask or develop it themselves (tailored to their needs, be it json, tmpfile or some other serialization).

As for the easier integration with other test types (eg. ruby tests), they should write the custom test type and provide the variables in their custom format. It's a matter of a few lines (see SimpleTest definition for example) and they'll get fine-grained control.

Anyway if we get actual reports from user, or if someone sends a PR, I won't be against. I just don't think it's worth spending any time on that.

Lukáš

PS: This addition is only about the full-structure. I still believe the flattened variables are useful and actually very important to export.

> Regards,
> Lukáš
> 
>>>
>>>> That structure would be easily parsed/used by tests written in any language.
>>>>
>>>> Note: As an additional resource, we could make public a Python API for
>>>> loading that serialized structure back in a variant object, so Python
>>>> users would have the same experience as in INSTRUMENTED tests when
>>>> writing non-INSTRUMENTED Python tests.
>>>>
>>>> Expected Results
>>>> ================
>>>>
>>>> Mitigate the cases where the parameters can not be accessed by
>>>> non-INSTRUMENTED tests and avoid collision with preexisting
>>>> environment variables.
>>>>
>>>
>>> I see only one big issue and that is accidental overriding of special variable names (like PATH="my_path"), but still our system should allow that on request (people might want to override PATH for certain variants).
>>>
>>> As for the clashes I believe people who expect env variables use simple variant files. Eventually I'd prefer the dir-like structure to provide the variables as that can be easily searched by `find` and `fgrep` but only if that's an optional feature as it'd be costly.
>>>
>>> As for the json, yes, it can be useful to integrate with other languages, but I'd wait for user request.
>>
>> Ok, let's keep this part of the RFC on hold and focus in having the
>> current issues addressed in an implementation that can fully represent
>> the parameters. I'm working on a V2 of this RFC.
>>
>>>
>>> Regards,
>>> Lukáš
>>>
>>>
>>>>
>>>> Looking forward to read your comments.
>>>> --
>>>> apahim
>>>>
>>>
>>>
> 
> 



-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 502 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/avocado-devel/attachments/20180122/8a6e7c60/attachment.sig>


More information about the Avocado-devel mailing list