[Avocado-devel] RFC: Configuration by convention

Lukáš Doktor ldoktor at redhat.com
Thu Nov 28 15:00:17 UTC 2019


On 21. 11. 19 at 22:23, Beraldo Leal wrote:
> Hi all,
> 

Hello Beraldo,

I do like (ideally written) conventions as long as they don't block us.

> I am working on a card about "Configuration by convention", and I realized that
> it would be better to consult the list first regarding a few key points.
> 
> So I would like to share this RFC with you and get your feedback.
> 
> TL;DR
> #####
> 
> The number of plugins made by many people and the lack of conventions for
> names, config options, and argument types may make Avocado difficult to use.
> This also makes it challenging to create a future API for executing more
> complex jobs. I would like to discuss in this RFC some proposals to improve
> this.
> 
> And note that, since this is a relatively big change, this RFC, if agreed,
> could be broken down into smaller issues to facilitate its acceptance into the
> master branch.
> 
> Motivation
> ##########
> 
> An Avocado Job is primarily executed through the `avocado run` command line.
> The behavior of such an Avocado Job is determined by parsing the following
> settings (listed in parsed order):
> 
>  1) Default values in source code
>  2) Configuration file contents
>  3) Command-line options
> 

What I'm missing in this RFC is some kind of mapping between the above. I think that if we are to make such intrusive changes, we should spend some time specifying the relations (but maybe I just missed it).
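
To make the question concrete, here is a minimal sketch of the kind of mapping I have in mind. It assumes a flat "section.option" key shared by all three sources; `DEFAULTS`, `CONFIG_TEXT`, `merge_settings` and the option names are hypothetical and not existing Avocado code.

```python
import argparse
import configparser
from collections import ChainMap

# Hypothetical defaults; keys follow an assumed "section.option" convention.
DEFAULTS = {"sysinfo.enabled": "true", "runner.timeout": "60"}

# Hypothetical config file contents.
CONFIG_TEXT = """
[runner]
timeout = 120
"""


def merge_settings(argv):
    """Return settings where CLI beats config file, which beats defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(CONFIG_TEXT)
    from_file = {
        f"{section}.{key}": value
        for section in parser.sections()
        for key, value in parser.items(section)
    }

    cli = argparse.ArgumentParser(prog="example")
    # dest carries the dotted key, so the argument maps 1:1 to a config entry
    cli.add_argument("--runner-timeout", dest="runner.timeout")
    args = cli.parse_args(argv)
    # Only keep options the user actually passed
    from_cli = {k: v for k, v in vars(args).items() if v is not None}

    # ChainMap looks keys up left to right, so the first mapping wins
    return ChainMap(from_cli, from_file, DEFAULTS)
```

With no arguments, `merge_settings([])["runner.timeout"]` comes from the config file; passing `--runner-timeout 5` overrides it, and keys missing from both fall back to the defaults.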

> Currently, the Avocado config file is an .ini file that is parsed by Python's
> `configparser` library and this config is broken into sections. Each Avocado
> plugin has its dedicated section.
> 
> Today, the parsing of the command line options is made by `argparse` library
> and produces a dictionary that is given to the `avocado.core.job.Job()` class
> as its `config` parameter.
> 
> There is no convention on the naming pattern used either on configuration files
> or on command-line options. Besides the name convention, there is also a lack
> of convention for some argument types. For instance::
> 
>  $ avocado run -d
> 
> and::
> 
>  $ avocado run --sysinfo on
> 
> Both are boolean variables, but with different "execution model" (the former
> doesn't need arguments and the latter needs `on` or `off` as argument).
> 

Actually we do follow the pattern for booleans. The "--sysinfo" is a tri-state.
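
A small sketch of the difference, using plain `argparse`. The option definitions are illustrative assumptions (in particular the long name `--some-flag` is hypothetical), not the actual Avocado declarations:

```python
import argparse

parser = argparse.ArgumentParser(prog="example")
# Plain boolean flag: absent -> False, present -> True
parser.add_argument("-d", "--some-flag", action="store_true")
# Tri-state: absent -> None (meaning "use whatever the config says"),
# otherwise an explicit "on" or "off" from the user
parser.add_argument("--sysinfo", choices=("on", "off"), default=None)

unset = parser.parse_args([])
forced_off = parser.parse_args(["--sysinfo", "off"])
```

A flag alone cannot express "explicitly off" when the configured default is "on", which is why the third state needs an option-argument (or a separate negative option).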

> Since the Avocado trend is to have more and more plugins, we need to design a
> name convention on command-line arguments and settings to avoid chaos.
> 
> But, most important: It would be valuable for our users if Avocado provides a
> Python API in such a way that developers could write more complex jobs
> programmatically and advanced users that know the configuration entries used on
> jobs, could do a quick one-off execution on command-line.
> 
> Example::
> 
>  import sys
>  from avocado.core.job import Job
> 
>  config = {'references': ['tests/passtest.py:PassTest.test']}
> 
>  with Job(config) as j:
>    sys.exit(j.run())
> 
> Before we address this API use-case, it is important to create this convention
> so we can have an intuitive use of Avocado config options.
> 
> .. note:: We understand that plugin developers have the flexibility to
>           configure their options as desired, but inside Avocado core and
>           plugins, settings should have a good naming convention.
> 
> 
> Specification
> #############
> 
> 
> Standards for Command Line Interface
> ------------------------------------
> 
> When it comes to the command line interface, a very interesting recommendation
> is the POSIX Standard's recommendation for arguments[1]. Avocado should try to
> follow this standard and its recommendations.
> 
> This pattern does not cover long options (starting with --). For this, we should
> also embrace the GNU extension[2].
> 
> One of the goals of this extension, by introducing long options, was to make
> command-line utilities user-friendly. Also, another aim was to try to create a
> norm among different command-line utilities. Thus, --verbose, --debug,
> --version (with other options) would have the same behavior in many programs.
> Avocado should try to, where applicable, use the GNU long options table[3] as
> reference.
> 
> Many of these recommendations are obvious and already used by Avocado or
> enforced by default, thanks to libraries like `argparse`.
> 
> However, those libraries do not force the developer to follow all
> recommendations.
> 
> Besides the basic ones, here are some recommendations we should try to follow
> and pay attention to:
> 
>   1. Option-arguments should not be optional (Guideline 7, from POSIX). So we
>      should avoid this::
>      
>         avocado run --loaders [LOADERS [LOADERS ...]]
> 

Well, you might want to specify no loaders (to override the default), although the only use case I see is self-testing. But how about:

    avocado run --loaders LOADERS [LOADERS ...]

is that acceptable?

>      or::
> 
>         avocado run --store-logging-stream [STREAM[:LEVEL] [STREAM[:LEVEL] ...]]
> 
>      We can have::
> 
>         avocado run --loaders LOADER,LOADER,LOADER,...

^^ Inventing another separator usually leads to non-systematic escaping

> 
>      or::
> 
>         avocado run --loader LOADER --loader LOADER --loader LOADER

^^ this one is really verbose

I dislike both proposals; the form:

    avocado run --loaders LOADERS [LOADERS ...]

is well supported and widely used by other programs. We can argue about "nargs=*" but IMO it sometimes makes sense (when we do want to accept empty sets, like filters...)
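
For comparison, this is how the three list styles discussed here look in `argparse`. It is only a sketch; `--filters` is a hypothetical option used to show the empty-set case:

```python
import argparse

parser = argparse.ArgumentParser(prog="example")
# One or more space-separated values: --loaders LOADERS [LOADERS ...]
parser.add_argument("--loaders", nargs="+")
# Zero or more values: an empty list is a valid, explicit choice
parser.add_argument("--filters", nargs="*")
# Repeated option: --loader LOADER --loader LOADER
parser.add_argument("--loader", action="append")

spaced = parser.parse_args(["--loaders", "file", "external"])
empty = parser.parse_args(["--filters"])
repeated = parser.parse_args(["--loader", "file", "--loader", "external"])
```

One trade-off of the `nargs` variants is that they consume every following non-option word, so positional arguments have to come before the list or after another option.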

> 
>   2. Use hyphens not underscore: Long options consist of ‘--’ followed by a
>      name made of alphanumeric characters and dashes. Option names are
>      typically one to three words long, with hyphens to separate words. Users
>      can abbreviate the option names as long as the abbreviations are unique.
>      Also, an underscore sometimes gets "eaten" by a terminal border and
>      thus looks like a space.
> 

Sure, "[a-z-]*" works for me for long options. As for short "-" options, it's useful to extend it to "[a-zA-Z]", e.g. to enable/disable an option.

>   3. When naming subcommands options you don’t have to worry about name
>      conflicts outside the subcommand scope, just keep them short, simple and
>      intuitive.
> 
> Argument Types
> ~~~~~~~~~~~~~~
> 
> How to use basic types, like strings and integers, is clear. But here is a
> list of what to expect when using other types:
> 
>   1. **Booleans**: Boolean options should be expressed as "flags" args (without
>        the "option-argument"). Flags, when present, should represent a
>        True/Active value.  This will reduce the command line size. We should
>        avoid using this::
> 
>         avocado run --json-job-result {on,off}
> 
>   2. **Lists**: When an option argument has multiple values we should use the
>        space as the separator.
> 

This basically means:

    avocado run --loaders LOADERS [LOADERS ...]

right?

> 
> Presentation
> ~~~~~~~~~~~~
> 
> Finding options easily, either in the manual or in the help, favors usability
> and avoids chaos.
> 
> We can arrange the display of these options in alphabetical order within each
> section.
> 

I'd love to (more or less), but sometimes people forget. It's hard to enforce this. Also, there are exceptions where we want to make some options more visible, but in the majority of cases it should be A-Z.

> 
> Standards for Config File Interface
> -----------------------------------
> 
> .. note:: Many other config file options could be used here, but since
>           this is another discussion, I'm assuming that we are going to keep
>           using `configparser` for a while.
> 
> As one of the main motivations of this RFC is to create a convention to avoid
> chaos and make the job execution API use as straightforward as possible, I
> believe that the config file should be as close as possible to the dictionary
> that will be passed to this API.
> 
> For this reason, this may be the most critical point of this RFC. We should
> create a pattern that is intuitive for the developer to convert from one format
> to another without much juggling.
> 
> Nested Sections
> ~~~~~~~~~~~~~~~
> 
> While the current `configparser` library does not support nested sections,
> Avocado can use the dot character as a convention for that. i.e:
> `[runner.output]`.
> 
> This convention will be important soon, when converting a dictionary into a
> config file and vice-versa.
> 

This is the only mention of the args->config mapping. Can you please elaborate a bit more?
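
For instance, I would imagine the dotted sections turning into a nested dictionary roughly like this. It is a sketch under my own assumptions; `config_to_dict` is a hypothetical helper, not something Avocado provides:

```python
import configparser


def config_to_dict(text):
    """Convert dotted section names into nested dictionaries."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        node = result
        # "[vt.generic]" becomes result["vt"]["generic"]
        for part in section.split("."):
            node = node.setdefault(part, {})
        # Values stay strings; configparser does no type conversion here
        node.update(parser.items(section))
    return result


example = config_to_dict("[vt]\nname = x\n[vt.generic]\ntimeout = 5\n")
```

Note that this naive version cannot tell an option named `generic` in `[vt]` from the `[vt.generic]` subsection, which is one of the relations the convention would have to spell out.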

> And since almost everything in Avocado is a plugin, each plugin section should
> **not** use the "plugins" prefix and **must** respect the reserved sections
> mentioned before. Currently, we have a mix of sections that start with
> "plugins" and sections that don't.
> 

So basically

[vt]

vt-related-option

[vt.generic]

generic-vt-related-option

[runner]

runner-related-option


yes, the plugins section seems redundant as many parts are actually implemented as plugins.

> Plugin section name
> ~~~~~~~~~~~~~~~~~~~
> 
> I am not quite sure here and would like to know the opinion of those who are
> the longest in the project. Perhaps this is a little controversial point. But I
> believe we can touch here to improve our convention.
> 
> Most plugins currently have the same name as the python module. Example: human,
> diff, tap, nrun, run, journal, replay, sysinfo, etc.
> 
> These are examples of "good" names.
> 
> However, some other plugins do not follow this convention. Ex: runnable_run,
> runnable_run_recipe, task_run, task_run_recipe, archive, etc.
> 
> I believe that having a convention here helps when writing more complex tests,
> configfiles, as well as easily finding plugins in various parts of the project,
> either on a manual page or during the installation procedure.
> 
> I understand that the name of the plugin is different from the module name in
> python, but anyway, should we follow PEP8 in this case?
> 
>         From PEP8: Modules should have short, all-lowercase names. Underscores
>         can be used in the module name if it improves readability. Python
>         packages should also have short, all-lowercase names, although the use
>         of underscores is discouraged.
> 

I'm not sure I properly understand this section. Can you please elaborate a bit more? Is the "_" -> "-" the problem you want to avoid?

> Reserved Sections
> ~~~~~~~~~~~~~~~~~
> 
> We should reserve a few sections for Avocado's core functionalities,
> e.g. main, plugins, logs, job, etc.
> 
> Not sure here; does it make sense?
> 

If we are to remove the "plugins." namespace then yes, we should reserve some names. At least "core" to indicate core options, or all above (plus perhaps some other core parts).

> Config Types
> ~~~~~~~~~~~~
> 
> `configparser` does not guess datatypes of values in configuration files,
> always storing them internally as strings. This means that if you need other
> datatypes, you should convert on your own.
> 
> There are a few methods in this library to help us: `getboolean()`, `getint()`
> and `getfloat()`. Basic types here are also straightforward.
> 

If we are to map arguments to options then we already need to define the types somewhere. Then the config type should come from there.

> Regarding boolean values, `getboolean()` can accept `yes/no`, `on/off`,
> `true/false` or `1/0`. But we should adopt one style and stick with it. I
> would suggest using `true/false`.
> 

I'd encourage people to use one, but we should attempt to accept all.
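
Accepting all of them costs nothing, because `getboolean()` already does it. A quick illustration with a made-up `[sysinfo]` section (the option names are placeholders):

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[sysinfo]
a = yes
b = on
c = true
d = 1
e = off
f = maybe
""")

# All documented spellings are converted to real booleans
values = [parser.getboolean("sysinfo", key) for key in "abcde"]

# Anything else is rejected with ValueError rather than silently guessed
try:
    parser.getboolean("sysinfo", "f")
    rejected = False
except ValueError:
    rejected = True
```

So the documentation can recommend `true/false` while the parser stays tolerant of the other spellings.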

> 
> Presentation
> ------------
> 
> As the avocado trend is to have more and more plugins, I believe that to make
> it easier for the user to find where each configuration is, we should split the
> file into smaller files, leaving one file for each plugin. Avocado already
> supports that with the conf.d directory. What do you think?
> 

I'd go with core+core-plugins and plugins, therefore basically the current situation. I don't think we need to extract the core-plugins from the core-configuration (talking about the essential set of plugins like "run").

> 
> Backwards Compatibility
> #######################
> 
> In order to keep a good naming convention, this set of changes probably will
> rename some args and/or config file options.
> 
> While some changes proposed here are simple and do not affect Avocado's
> behavior, others are critical and may break Avocado jobs.
> 
> Command line syntax changes
> ---------------------------
> 
> If these changes are acceptable, these command-line conversions will lead to a
> "syntax error".
> 
> We can have a transition period with a "deprecated message" but it may not be
> worth it. I'm not sure yet. What do you think?
> 

I don't expect much of these, but some transition period (if possible) should be kept. The same applies to plugin name changes. As far as it's possible and feasible, keep the transition period.
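
One possible way to keep such a transition period in `argparse` is a small custom action that still accepts the old spelling but warns. This is a sketch only; `DeprecatedAlias` and both option names are hypothetical:

```python
import argparse
import warnings


class DeprecatedAlias(argparse.Action):
    """Accept an old option name, warn, and store under the new dest."""

    def __call__(self, parser, namespace, values, option_string=None):
        new_name = "--" + self.dest.replace("_", "-")
        warnings.warn(
            f"{option_string} is deprecated, use {new_name}",
            DeprecationWarning,
        )
        setattr(namespace, self.dest, values)


parser = argparse.ArgumentParser(prog="example")
# New, convention-following spelling
parser.add_argument("--job-timeout", dest="job_timeout")
# Old spelling kept for a while, hidden from --help
parser.add_argument("--job_timeout", dest="job_timeout",
                    action=DeprecatedAlias, help=argparse.SUPPRESS)
```

Both spellings end up in the same `job_timeout` attribute, so the rest of the code only needs to know the new name.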

> Plugin name changes
> -------------------
> 
> Again, if these changes are feasible, changing the module names and/or the
> 'name' attribute of plugins will require changing the config files inside
> Avocado as well. This will not break unless the user is using an old config
> file. In that case, we can also have a "deprecated message" and accept the old
> config file option for some time. Any other drawbacks that I can't see?
> 
> 
> Security Implications
> #####################
> 
> Avocado users should have the guarantee that their jobs are running in an
> isolated environment.
> 
> We should consider this and keep in mind that any moves here should continue
> with this assumption.
> 

I don't really understand this section, can you expand it (or remove it, if not necessary?)

> How to Teach This
> #################
> 
> We should provide a complete configuration reference guide section in our
> User's Documentation.

This section doesn't seem important to me.

> 
> In the future, the Job API should also be very well detailed so sphinx could
> generate good documentation on our Test Writer's Guide.
> 
> Besides good documentation, there is no better way to learn than by example.
> If our plugins, options and settings follow a good convention, they will serve
> as a template for new plugins.
> 
> If these changes are accepted by the community and implemented, this RFC could
> be adapted to become a section in one of our guides, maybe something like a
> Python PEP that should be followed when developing new plugins.
> 

IIUC this section just says we should keep things as they are, right? If so, then it doesn't need to be here, does it? Maybe one thing to add is that this basically follows the job API, right? And its outcome should be well-defined Job "args", right? Then the outcome of this should be a written standard of mandatory and optional sections of the "args" and all the relations between "args" and "options". Written, stable and flexible enough to suit all extra plugins' needs.

> Open Issues
> ###########
> 
> .. note:: Links to open issues that are related to this.
> 
> References
> ##########
> 
> [1] - https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html
> [2] - https://www.gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html
> [3] - https://www.gnu.org/prep/standards/html_node/Option-Table.html#Option-Table
> 
> Regards,
> Beraldo
> 

Thank you, Beraldo, for opening this discussion.
Lukáš
