[Avocado-devel] RFC: Configuration by convention

Lukáš Doktor ldoktor at redhat.com
Mon Dec 2 15:57:28 UTC 2019


Dne 02. 12. 19 v 15:30 Beraldo Leal napsal(a):
> On Thu, Nov 28, 2019 at 04:00:17PM +0100, Lukáš Doktor wrote:
>> Dne 21. 11. 19 v 22:23 Beraldo Leal napsal(a):
>>> Motivation
>>> ##########
>>>
>>> An Avocado Job is primarily executed through the `avocado run` command line.
>>> The behavior of such an Avocado Job is determined by parsing the following
>>> settings (listed in parsed order):
>>>
>>>  1) Default values in source code
>>>  2) Configuration file contents
>>>  3) Command-line options
>>>
>>
>> I'm missing in this RFC some kind of mapping the above. I think if we
>> are to do those intrusive changes, we should spend some time on
>> specifying the relations. (but maybe I just missed it)
> 
> Not sure if I understood. What kind of relations do you suggest? Maybe
> now, I'm missing something. The idea was to simply show the order that
> options are parsed and overridden, just to give some context.
> 

While talking about standardization of args/conf options I though it'd be nice to define some standard mapping to override configuration options from cmdline arguments. What I mean is eg the `--sysinfo`. Obviously it whouldn't fit everything, but would be nice to know that:

--sysinfo $VALUE

is the same as

```
[sysinfo.collect]
enabled = $VALUE
```

and maps to:

{sysinfo: $VALUE}

job param. Obviously there would be exceptions, but we could create guidelines to cover most cases. Additionally even discussing this might help identify the sections/arguments.

>>> There is no convention on the naming pattern used either on configuration files
>>> or on command-line options. Besides the name convention, there is also a lack
>>> of convention for some argument types. For instance::
>>>
>>>  $ avocado run -d
>>>
>>> and::
>>>
>>>  $ avocado run --sysinfo on
>>>
>>> Both are boolean variables, but with different "execution model" (the former
>>> doesn't need arguments and the latter needs `on` or `off` as argument).
>>>
>>
>> Actually we do follow the pattern for booleans. The "--sysinfo" is a tri-state.
> 
> I tried to see the third state but I might be missing something:
> 
> --
>   $ grep "\-\-sysinfo" *  -R
>   plugins/run.py: parser.add_argument('--sysinfo', choices=('on', 'off'),
> --
> 
> Also from `avocado run --help`:
> 
> --
>   --sysinfo {on,off}   Enable or disable system information (hardware
>                        details, profilers, etc.). Current: on
> --
> 
> And here "Current" should be "Default".
> 

The default comes from /etc. You can checkout the `avocado/plugins/run.py` for details:

        sysinfo_default = settings.get_value('sysinfo.collect',
                                             'enabled',
                                             key_type='bool',
                                             default=True)
        sysinfo_default = 'on' if sysinfo_default is True else 'off'
        parser.add_argument('--sysinfo', choices=('on', 'off'),
                            default=sysinfo_default, help="Enable or disable "
                            "system information (hardware details, profilers, "
                            "etc.). Current:  %(default)s")

1. `--sysinfo on` => force-enable
2. `--sysinfo off` => force-disable
3. no `--sysinfo` => use the value from /etc

> Could you please, point me to the third state?
> 
> Keep in mind that I used `--sysinfo` just as an example, there are few
> other cases: `--keep-tmp`, `--ignore-missing-references` and
> `--failfast`.
> 

Yes, `--keep-tmp` is wrong and should be only `bool`, so it's true we should review all options.

> 
>>> Besides the basic ones, here are some recommendations we should try to follow
>>> and pay attention to:
>>>
>>>   1. Option-arguments should not be optional (Guideline 7, from POSIX). So we
>>>      should avoid this::
>>>      
>>>         avocado run --loaders [LOADERS [LOADERS ...]]
>>>
>>
>> Well you might want to specify no loaders (to override the default),
>> although the only usecase I see is self-testing. But how about:
>>
>>     avocado run --loaders LOADERS [LOADERS ...]
>>
>> is that acceptable?
> 
> Yes, and I liked this one, it is better than my suggestions.
> 

Great, thanks and +1 to this one. Also note there is one caveat, it must not be used for main arguments, otherwise you won't be able to specify the `subcommand`. This is why the `--show` accepts only a single argument and uses custom split (`,`). Something like:

    avocado --show foo bar baz run passtest.py

is not possible, because the argparse can't decide which is the subcommand and which is the `--show` argument. For subcommand options it is possible, because one can use `--` to indicate end of the options:

    avocado run --loaders foo bar -- passtest.py

or simply add another option that consumes given amount of arguments (but it is hackish :-) ):

    avocado run --loaders foo bar --sysinfo on passtest.py

>>>   2. **Lists**: When an option argument has multiple values we should use the
>>>        space as the separator.
>>>
>>
>> This basically means:
>>
>>     avocado run --loaders LOADERS [LOADERS ...]
>>
>> right?
> 
> Yes, you are right.
> 
>>> Presentation
>>> ~~~~~~~~~~~~
>>>
>>> Finding options easily, either in the manual or in the help, favor usability
>>> and avoids chaos.
>>>
>>> We can arrange the display of these options in alphabetical order within each
>>> section.
>>>
>>
>> I'd love to (more-less), but sometimes people forget. It's hard to
>> enforce this. Also there are exceptions where we want to make some
>> options more visible, but in majority cases it should be A-Z.
> 
> I think that if we have the options in alphabetical order, this will be
> used as an example and new options will tend to be in order too.
> 
> Also, If we agree with this suggestion and the general idea of this RFC,
> I would suggest storing this RFC on something more concrete/permanent,
> such as a "blueprint" and keeping in one of ours repositories, linking
> to our dev/contributor guide.
> 
> If, during the PR, the new argument option it is not in the order that
> we expect, we can suggest this small change. But I understand that this
> should be a common agreement.
> 
> I'm pointing this here because, for a new developer and/or user, it
> might be "painful" to find the right option without any logical order.
> 

Yes and believe me, the were in order when Avocado was introduced... So either a selftest (with exceptions) has to be developed or we'll face this again.

>>> Standards for Config File Interface
>>> -----------------------------------
>>>
>>> .. note:: Many other config file options could be used here, but since that
>>>           this is another discussion, I'm assuming that we are going to keep
>>>           using `configparser` for a while.
>>>
>>> As one of the main motivations of this RFC is to create a convention to avoid
>>> chaos and make the job execution API use as straightforward as possible, I
>>> believe that the config file should be as close as possible to the dictionary
>>> that will be passed to this API.
>>>
>>> For this reason, this may be the most critical point of this RFC. We should
>>> create a pattern that is intuitive for the developer to convert from one format
>>> to another without much juggling.
>>>
>>> Nested Sections
>>> ~~~~~~~~~~~~~~~
>>>
>>> While the current `configparser` library does not support nested sections,
>>> Avocado can use the dot character as a convention for that. i.e:
>>> `[runner.output]`.
>>>
>>> This convention will be important soon, when converting a dictionary into a
>>> config file and vice-versa.
>>>
>>
>> This is the only mentioning of args->config mapping. Can you please
>> elaborate a bit more?
> 
> Sorry if I was not clear, maybe I got your point now. But it is
> straightforward:
> 
> All default values (in source code) are overridden by config file
> options, and then by the argument options (in this order). Probably we
> will not have 1:1 mapping of all config file options to argument
> options. So far I understood, the command-line argument options are the
> basic ones and the most used.
> 

Viz above (there is no 1:1 mapping, but we might suggest some recommendations)

>>> And since almost everything in Avocado is a plugin, each plugin section should
>>> **not** use the "plugins" prefix and **must** respect the reserved sections
>>> mentioned before. Currently, we have a mix of sections that start with
>>> "plugins" and sections that don't.
>>>
>>
>> So basically
>>
>> [vt]
>>
>> vt-related-option
>>
>> [vt.generic]
>>
>> generic-vt-related-option
>>
>> [runner]
>>
>> runner-related-option
>>
>>
>> yes, the plugins section seems redundant as many parts are actually
>> implemented as plugins.
> 
> Yes, you got the point.
> 
> 
>>> Plugin section name
>>> ~~~~~~~~~~~~~~~~~~~
>>>
>>> I am not quite sure here and would like to know the opinion of those who are
>>> the longest in the project. Perhaps this is a little controversial point. But I
>>> believe we can touch here to improve our convention.
>>>
>>> Most plugins currently have the same name as the python module. Example: human,
>>> diff, tap, nrun, run, journal, replay, sysinfo, etc.
>>>
>>> These are examples of "good" names.
>>>
>>> However, some other plugins do not follow this convention. Ex: runnable_run,
>>> runnable_run_recipe, task_run, task_run_recipe, archive, etc.
>>>
>>> I believe that having a convention here helps when writing more complex tests,
>>> configfiles, as well as easily finding plugins in various parts of the project,
>>> either on a manual page or during the installation procedure.
>>>
>>> I understand that the name of the plugin is different from the module name in
>>> python, but anyway, should we follow PEP8 in this case?
>>>
>>>         From PEP8: Modules should have short, all-lowercase names. Underscores
>>>         can be used in the module name if it improves readability. Python
>>>         packages should also have short, all-lowercase names, although the use
>>>         of underscores is discouraged.
>>>
>>
>> I'm not sure I understand properly this section. Can you please
>> elaborate a bit more? Is the "_" -> "-" the problem you want to
>> avoid?.
> 
> Let's get the `human` example:
> 
> Python module name: human
> Plugin name: human
> 
> Let's get the `task_run_recipe` example:
> 
> Python module name: task_run_recipe
> Plugin name: task-run-recipe
> 

Sure, but task-run-recipe is easier to read for humans and there is a clear mapping. I don't think this is a problem

> Let's get another example:
> 
> Python module name: archive
> Plugin name: zip_archive
> 

Yes, this is a problem and should have been renamed long time ago, patches are welcome :D

> It is not a *big problem* just a convention in order to avoid chaos in
> the near future. It is a bit more clear now?
> 
>>> Regarding boolean values, `getboolean()` can accept `yes/no`, `on/off`,
>>> `true/false` or `1/0`. But we should adopt one style and stick with it. I
>>> would suggest using `true/false`.
>>>
>>
>> I'd encourage people to use one, but we should attempt to accept all.
> 
> Yes, I agree. And the good news is that `getboolean()` already support
> all options.
> 
> 
>>> Presentation
>>> ------------
>>>
>>> As the avocado trend is to have more and more plugins, I believe that to make
>>> it easier for the user to find where each configuration is, we should split the
>>> file into smaller files, leaving one file for each plugin. Avocado already
>>> supports that with the conf.d directory. What do you think?
>>>
>>
>> I'd go with core+core-plugins and plugins, therefor basically the
>> current situation. I don't think we need to extract the core-plugins
>> from the core-configuration (talking about the essential set of
>> plugins like "run").
> 
> Not sure if I understood your "formula". Could you please provide an
> example? And what it is considered "core", in your option?
> 

core-plugins are all plugins that are always installed (therefor the `avocado.plugins` namespace). So basically we'd only create separate files for `optional_plugins` (yaml_loader) or the plugins that live outside of our repository (eg. avocado-vt).

>>> Security Implications
>>> #####################
>>>
>>> Avocado users should have the warranty that their jobs are running on isolated
>>> environment.
>>>
>>> We should consider this and keep in mind that any moves here should continue
>>> with this assumption.
>>>
>>
>> I don't really understand this section, can you expand it (or remove
>> it, if not necessary?)
> 
> Maybe we can remove it. It is just a disclaimer in case someone brings
> a completely new approach.
> 

Not a big deal, it's just that I don't understand it, nor I see any relation to this RFC (which doesn't mean there are none, I might miss something)

>>> How to Teach This
>>> #################
>>>
>>> We should provide a complete configuration reference guide section
>>> in ourThis section doesn't seem important to me.  User's
>>> Documentation.
>>>
>>> In the future, the Job API should also be very well detailed so sphinx could
>>> generate good documentation on our Test Writer's Guide.
>>>
>>> Besides a good documentation, there is no better way to learn than by example.
>>> If our plugins, options and settings follow a good convention it will serve as
>>> template to new plugins.
>>>
>>> If these changes are accepted by the community and implemented, this RFC could
>>> be adapted to become a section on one of our guides, maybe something like the a
>>> Python PEP that should be followed when developing new plugins.
>>>
>>
>> IIUC this section just says we should keep thinks as they are, right?
>> If so than it doesn't need to be here, does it? Maybe one thing to add
>> is that this basically follows the job API, right? And it's outcome
>> should be well defined Job "args", right? Then the outcome of this
>> should be a written standard of mandatory and optional sections of the
>> "args" and all the relations between "args" and "options". Written,
>> stable and flexible enough to suite all extra plugins needs.
> 
> I'm sorry it was not clear. I must have expressed myself wrong. This is
> not about the Job API yet. Just a preparation for that.
> 
> This RFC is about a convention for our configuration options:
> 
>   1. "implicit" (default values);
>   2. "more permanent" (config files);
>   3. "temporary/runtime" (argument options)
> 
> This section is about how to teach this new convention to new users. I
> could not find a detailed configuration reference on our docs, with all
> default options, for instance.
> 

For me the important part is to have the correct defaults in `--help` and list of all setting in `avocado config`. As for the documentation, perhaps we can generate the `avocado config` output on readthedocs.io, if you think it's important to keep them somewhere, but the runtime options are more important to me.

> It was not my intention to discuss the future job API on this RFC. This
> will be discussed on a different RFC. And I'm already taking some notes
> to do this soon.
> 

Sure, which is why I mentioned it follows (partially) the Job API initiative (meaning heading in the same direction).
> 

>> Thank you, Beraldo, for opening this discussion.
>> Lukáš
> 
> 
> Thank you for spending some time on this, I appreciate.
> 

You're welcome, more eyes see more.
Lukáš

> Regards,
> Beraldo
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/avocado-devel/attachments/20191202/2b4cff81/attachment.sig>


More information about the Avocado-devel mailing list