[Avocado-devel] RFC: Configuration by convention
Cleber Rosa
crosa at redhat.com
Tue Dec 3 19:15:14 UTC 2019
On Thu, Nov 28, 2019 at 04:00:17PM +0100, Lukáš Doktor wrote:
> On 21. 11. 19 at 22:23, Beraldo Leal wrote:
> > Hi all,
> >
>
> Hello Beraldo,
>
> I do like conventions (ideally written down) as long as they don't block us.
>
> > I am working on a card about "Configuration by convention", and I realized that
> > it would be better to consult the list first regarding a few key points.
> >
> > So I would like to share this RFC with you and get your feedback.
> >
> > TL;DR
> > #####
> >
> > The number of plugins written by many different people, combined with the lack
> > of conventions for names, config options, and argument types, may hurt Avocado's usability.
> > This also makes it challenging to create a future API for executing more
> > complex jobs. I would like to discuss in this RFC some proposals to improve
> > this.
> >
> > And note that, since this is a relatively big change, this RFC, if agreed,
> > could be broken down into smaller issues to facilitate its acceptance into the
> > master branch.
> >
> > Motivation
> > ##########
> >
> > An Avocado Job is primarily executed through the `avocado run` command line.
> > The behavior of such an Avocado Job is determined by parsing the following
> > settings (listed in parsed order):
> >
> > 1) Default values in source code
> > 2) Configuration file contents
> > 3) Command-line options
> >
>
> What I'm missing in this RFC is some kind of mapping between the items above. I think if we are to make such intrusive changes, we should spend some time specifying the relations (but maybe I just missed it).
>
I'm assuming that by mapping you mean the exact convention that would
be implemented, going from a command-line option to a configuration file
entry to a default value name. Then, yes, we need a clearer definition,
and I guess Beraldo intends to provide that in the "blueprint" document.
The consequences for user experience, including deprecation and
migration plans, were briefly raised in the "Backwards
Compatibility" section.
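To make the layering concrete, here is a minimal sketch of the three
sources being resolved in order (defaults, then config file, then command
line). It is not how Avocado does it today: the setting names, the option
names and the `OPTION_TO_KEY` table are hypothetical, and the explicit
table only stands in for whatever convention the blueprint ends up
defining.

```python
import argparse
import configparser
from collections import ChainMap

# Hypothetical setting names, used only to illustrate the layering.
DEFAULTS = {"sysinfo.collect": "on", "runner.timeout": "60"}

# Explicit option -> setting mapping; a real convention could derive this.
OPTION_TO_KEY = {"sysinfo_collect": "sysinfo.collect",
                 "runner_timeout": "runner.timeout"}


def from_config_file(text):
    """Flatten an .ini document into 'section.key' entries."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {f"{section}.{key}": value
            for section in parser.sections()
            for key, value in parser.items(section)}


def from_command_line(argv):
    """Return only the settings actually given on the command line."""
    parser = argparse.ArgumentParser(prog="demo",
                                     argument_default=argparse.SUPPRESS)
    parser.add_argument("--sysinfo-collect", choices=("on", "off"))
    parser.add_argument("--runner-timeout")
    given = vars(parser.parse_args(argv))
    return {OPTION_TO_KEY[dest]: value for dest, value in given.items()}


def resolve(argv, config_text):
    """Later layers win: defaults < config file < command line."""
    return ChainMap(from_command_line(argv),
                    from_config_file(config_text),
                    DEFAULTS)
```

The point of `argparse.SUPPRESS` here is that an option that was not
given leaves no entry at all, so the lookup falls through to the config
file and then to the defaults.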
> > Currently, the Avocado config file is an .ini file that is parsed by Python's
> > `configparser` library and this config is broken into sections. Each Avocado
> > plugin has its dedicated section.
> >
> > Today, the parsing of the command line options is made by `argparse` library
> > and produces a dictionary that is given to the `avocado.core.job.Job()` class
> > as its `config` parameter.
> >
> > There is no convention on the naming pattern used either on configuration files
> > or on command-line options. Besides the name convention, there is also a lack
> > of convention for some argument types. For instance::
> >
> > $ avocado run -d
> >
> > and::
> >
> > $ avocado run --sysinfo on
> >
> > Both are boolean variables, but with different "execution model" (the former
> > doesn't need arguments and the latter needs `on` or `off` as argument).
> >
>
> Actually we do follow the pattern for booleans. The "--sysinfo" is a tri-state.
>
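For the record, the difference between a plain flag and a tri-state can be
sketched with `argparse` like this. It is a minimal illustration, not
Avocado's actual parser: the parser itself and the `resolve_sysinfo()`
helper are made up for the example.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")

# Plain boolean flag: present means True, absent means False.
parser.add_argument("-d", action="store_true")

# Tri-state: "on", "off", or not given at all (None).
parser.add_argument("--sysinfo", choices=("on", "off"), default=None)


def resolve_sysinfo(cli_value, config_value):
    """The command line wins when given; otherwise use the config value."""
    if cli_value is None:
        return config_value
    return cli_value == "on"
```

With a plain flag there is no way to say "off" on the command line when
the config file says "on", which is what the third state buys us.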
> > Since the Avocado trend is to have more and more plugins, we need to design a
> > name convention on command-line arguments and settings to avoid chaos.
> >
> > But, most important: It would be valuable for our users if Avocado provides a
> > Python API in such a way that developers could write more complex jobs
> > programmatically and advanced users that know the configuration entries used on
> > jobs, could do a quick one-off execution on command-line.
> >
> > Example::
> >
> > import sys
> > from avocado.core.job import Job
> >
> > config = {'references': ['tests/passtest.py:PassTest.test']}
> >
> > with Job(config) as j:
> >     sys.exit(j.run())
> >
> > Before we address this API use-case, it is important to create this convention
> > so we can have an intuitive use of Avocado config options.
> >
> > .. note:: We understand that plugin developers have the flexibility to
> > configure their options as desired, but inside Avocado core and its plugins,
> > settings should follow a good naming convention.
> >
> >
> > Specification
> > #############
> >
> >
> > Standards for Command Line Interface
> > ------------------------------------
> >
> > When it comes to the command line interface, a very interesting recommendation
> > is the POSIX Standard's recommendation for arguments[1]. Avocado should try to
> > follow this standard and its recommendations.
> >
> > This pattern does not cover long options (starting with --). For this, we should
> > also embrace the GNU extension[2].
> >
> > One of the goals of this extension, by introducing long options, was to make
> > command-line utilities user-friendly. Also, another aim was to try to create a
> > norm among different command-line utilities. Thus, --verbose, --debug,
> > --version (with other options) would have the same behavior in many programs.
> > Avocado should try to, where applicable, use the GNU long options table[3] as
> > reference.
> >
> > Many of these recommendations are obvious and already used by Avocado or
> > enforced by default, thanks to libraries like `argparse`.
> >
> > However, those libraries do not force the developer to follow all
> > recommendations.
> >
> > Besides the basic ones, here are some recommendations we should try to follow
> > and pay attention to:
> >
> > 1. Option-arguments should not be optional (Guideline 7, from POSIX). So we
> > should avoid this::
> >
> > avocado run --loaders [LOADERS [LOADERS ...]]
> >
>
> Well you might want to specify no loaders (to override the default), although the only usecase I see is self-testing. But how about:
>
> avocado run --loaders LOADERS [LOADERS ...]
>
> is that acceptable?
>
> > or::
> >
> > avocado run --store-logging-stream [STREAM[:LEVEL] [STREAM[:LEVEL] ...]]
> >
> > We can have::
> >
> > avocado run --loaders LOADER,LOADER,LOADER,...
>
> ^^ Inventing another separator usually leads to non-systematic escaping
>
> >
> > or::
> >
> > avocado run --loader LOADER --loader LOADER --loader LOADER
>
> ^^ this one is really verbose
>
> I dislike both proposed, the:
>
> avocado run --loaders LOADERS [LOADERS ...]
>
> is well supported and widely used by other programs. We can argue about "nargs=*" but IMO it sometimes makes sense (when we do want to accept empty sets, like filters...)
>
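The four spellings being discussed map to `argparse` as sketched below.
This is only a side-by-side comparison, assuming a standalone parser; the
`build()` helper and its `kind` names are made up for the example.

```python
import argparse


def build(kind):
    """Return a parser using one of the list conventions under discussion."""
    parser = argparse.ArgumentParser(prog="demo")
    if kind == "plus":
        # --loaders LOADERS [LOADERS ...]: at least one value required
        parser.add_argument("--loaders", nargs="+")
    elif kind == "star":
        # --loaders [LOADERS [LOADERS ...]]: an empty list is accepted
        parser.add_argument("--loaders", nargs="*")
    elif kind == "append":
        # --loader LOADER --loader LOADER: one value per occurrence
        parser.add_argument("--loader", dest="loaders", action="append")
    elif kind == "comma":
        # --loaders LOADER,LOADER: a custom separator, split by hand
        parser.add_argument("--loaders", type=lambda value: value.split(","))
    else:
        raise ValueError(f"unknown kind: {kind}")
    return parser
```

One thing to keep in mind with the `nargs` variants: the option consumes
every following word up to the next option or `--`, so positional
arguments (such as test references) placed right after it end up in the
list.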
> >
> > 2. Use hyphens not underscore: Long options consist of ‘--’ followed by a
> > name made of alphanumeric characters and dashes. Option names are
> > typically one to three words long, with hyphens to separate words. Users
> > can abbreviate the option names as long as the abbreviations are unique.
> > Also, an underscore sometimes gets "eaten" by a terminal border and
> > thus looks like a space.
> >
>
> sure "[a-z-]*" works for me for long options. As for short "-" options it's useful to extend it to "[a-zA-Z]" eg. to enable/disable an option.
>
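Both properties come for free with `argparse`, as this small sketch shows.
The parser is a standalone example that merely reuses two option names
mentioned in this thread; it is not Avocado's real parser.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--sysinfo")
parser.add_argument("--store-logging-stream")

# Hyphens in the option name become underscores in the attribute name,
# and any unique prefix of a long option is accepted.
args = parser.parse_args(["--st", "avocado.core:DEBUG"])
stream = args.store_logging_stream
```

A prefix such as `--s` matches both options, so `argparse` rejects it as
ambiguous and exits with an error. Abbreviation can be turned off with
`allow_abbrev=False` if we ever decide it causes more harm than good.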
> > 3. When naming subcommands options you don’t have to worry about name
> > conflicts outside the subcommand scope, just keep them short, simple and
> > intuitive.
> >
> > Argument Types
> > ~~~~~~~~~~~~~~
> >
> > It is clear how to use basic types, like strings and integers. But here is a
> > list of what users should expect when using other types:
> >
> > 1. **Booleans**: Boolean options should be expressed as "flags" args (without
> > the "option-argument"). Flags, when present, should represent a
> > True/Active value. This will reduce the command line size. We should
> > avoid using this::
> >
> > avocado run --json-job-result {on,off}
> >
> > 2. **Lists**: When an option argument has multiple values we should use the
> > space as the separator.
> >
>
> This basically means:
>
> avocado run --loaders LOADERS [LOADERS ...]
>
> right?
>
> >
> > Presentation
> > ~~~~~~~~~~~~
> >
> > Being able to find options easily, either in the manual or in the help, favors
> > usability and avoids chaos.
> >
> > We can arrange the display of these options in alphabetical order within each
> > section.
> >
>
> I'd love to (more or less), but sometimes people forget. It's hard to enforce this. Also, there are exceptions where we want to make some options more visible, but in the majority of cases it should be A-Z.
>
Yes, this is ideal... the tricky question is how and at what
(development) cost.
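One low-cost way to avoid relying on people remembering is to sort at
display time. Below is a minimal sketch, assuming we are willing to
subclass `argparse.HelpFormatter`, whose methods are not a documented
public API, so it may need adjusting between Python versions. The class
and function names are made up for the example.

```python
import argparse


class SortedHelpFormatter(argparse.HelpFormatter):
    """Help formatter that lists the options of each group alphabetically."""

    @staticmethod
    def _option_sort_key(action):
        # Sort by the longest option string (normally the long option),
        # ignoring leading dashes; positionals fall back to their dest.
        if action.option_strings:
            return max(action.option_strings, key=len).lstrip("-")
        return action.dest

    def add_arguments(self, actions):
        super().add_arguments(sorted(actions, key=self._option_sort_key))


def build_parser():
    parser = argparse.ArgumentParser(prog="demo", add_help=False,
                                     formatter_class=SortedHelpFormatter)
    parser.add_argument("--zeta", action="store_true", help="declared first")
    parser.add_argument("--alpha", action="store_true", help="declared last")
    return parser
```

This only reorders the option listing within each group; the usage line
keeps the declaration order, and options we want to be more visible would
still need a group of their own.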
> >
> > Standards for Config File Interface
> > -----------------------------------
> >
> > .. note:: Many other config file formats could be used here, but since
> > that is a separate discussion, I'm assuming that we are going to keep
> > using `configparser` for a while.
> >
> > As one of the main motivations of this RFC is to create a convention to avoid
> > chaos and make the job execution API use as straightforward as possible, I
> > believe that the config file should be as close as possible to the dictionary
> > that will be passed to this API.
> >
> > For this reason, this may be the most critical point of this RFC. We should
> > create a pattern that is intuitive for the developer to convert from one format
> > to another without much juggling.
> >
> > Nested Sections
> > ~~~~~~~~~~~~~~~
> >
> > While the current `configparser` library does not support nested sections,
> > Avocado can use the dot character as a convention for that, e.g.
> > `[runner.output]`.
> >
> > This convention will be important soon, when converting a dictionary into a
> > config file and vice-versa.
> >
>
> This is the only mention of the args->config mapping. Can you please elaborate a bit more?
>
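As a rough illustration of what the dot convention buys us, here is a
minimal sketch of the round trip between dotted section names and a nested
dictionary. The function names and the option names in the usage example
are hypothetical, and all values stay strings, as `configparser` returns
them.

```python
import configparser


def config_to_dict(text):
    """Turn dotted section names into nested dictionaries."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        node = result
        for part in section.split("."):
            node = node.setdefault(part, {})
        node.update(parser.items(section))
    return result


def dict_to_sections(tree, prefix=""):
    """Flatten a nested dictionary back into {section name: options}."""
    sections = {}
    for key, value in tree.items():
        if isinstance(value, dict):
            name = f"{prefix}.{key}" if prefix else key
            sections.update(dict_to_sections(value, name))
        else:
            sections.setdefault(prefix, {})[key] = value
    return sections
```

For instance, a file with `[runner]` holding `timeout = 60` and
`[runner.output]` holding `colored = true` becomes
`{"runner": {"timeout": "60", "output": {"colored": "true"}}}`. The sketch
does not handle an option whose name collides with a subsection (an
`output` key inside `[runner]` next to a `[runner.output]` section), which
is exactly the kind of case a written convention would have to rule out.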
> > And since almost everything in Avocado is a plugin, each plugin section should
> > **not** use the "plugins" prefix and **must** respect the reserved sections
> > mentioned before. Currently, we have a mix of sections that start with
> > "plugins" and sections that don't.
> >
>
> So basically
>
> [vt]
>
> vt-related-option
>
> [vt.generic]
>
> generic-vt-related-option
>
> [runner]
>
> runner-related-option
>
>
> yes, the plugins section seems redundant as many parts are actually implemented as plugins.
>
Yes, agreed. The "plugins" prefix can go.
> > Plugin section name
> > ~~~~~~~~~~~~~~~~~~~
> >
> > I am not quite sure here and would like to know the opinion of those who have
> > been in the project the longest. Perhaps this is a slightly controversial point,
> > but I believe we can address it here to improve our convention.
> >
> > Most plugins currently have the same name as the python module. Example: human,
> > diff, tap, nrun, run, journal, replay, sysinfo, etc.
> >
> > These are examples of "good" names.
> >
> > However, some other plugins do not follow this convention. Ex: runnable_run,
> > runnable_run_recipe, task_run, task_run_recipe, archive, etc.
> >
> > I believe that having a convention here helps when writing more complex tests,
> > configfiles, as well as easily finding plugins in various parts of the project,
> > either on a manual page or during the installation procedure.
> >
> > I understand that the name of the plugin is different from the module name in
> > python, but anyway, should we follow PEP8 in this case?
> >
> > From PEP8: Modules should have short, all-lowercase names. Underscores
> > can be used in the module name if it improves readability. Python
> > packages should also have short, all-lowercase names, although the use
> > of underscores is discouraged.
> >
>
> I'm not sure I properly understand this section. Can you please elaborate a bit more? Is the "_" -> "-" conversion the problem you want to avoid?
>
> > Reserved Sections
> > ~~~~~~~~~~~~~~~~~
> >
> > We should reserve a few sections for Avocado's core
> > functionality, e.g. main, plugins, logs, job, etc.
> >
> > I'm not sure here; does it make sense?
> >
>
> If we are to remove the "plugins." namespace then yes, we should reserve some names. At least "core" to indicate core options, or all above (plus perhaps some other core parts).
>
How can we tell if we have reserved *enough* sections? If we know that we
need a section such as "logs", and use it, this is a de facto reservation.
What worries me is preventive reservation, because it will probably be
speculative. In a programming language, reserved words have a use, and
thus variables and other statements can't use them. But imagine a reserved
word that is never used...
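To illustrate what I mean by de facto reservation, here is a minimal
sketch in which a section becomes reserved the moment some component
starts using it, instead of being listed up front. The `SectionRegistry`
class and the owner names are hypothetical; nothing like this exists in
Avocado today.

```python
class SectionRegistry:
    """Track which component first claimed each top-level section."""

    def __init__(self):
        self._owners = {}

    def register(self, section, owner):
        # Only the top-level name is claimed: "logs.rotation" belongs
        # to whoever owns "logs".
        top_level = section.split(".")[0]
        current = self._owners.setdefault(top_level, owner)
        if current != owner:
            raise ValueError(
                f"section '{top_level}' is already used by '{current}'")
        return top_level
```

With this approach the core registers "logs" when it actually needs it,
and a plugin that later tries to claim the same name gets an error, with
no speculative list of reserved names to maintain.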
- Cleber.