[libvirt-ci PATCH 06/12] lcitool: Introduce methods to load and validate the YAML config

Erik Skultety eskultet at redhat.com
Mon May 11 10:50:00 UTC 2020


On Thu, May 07, 2020 at 06:20:46PM +0200, Andrea Bolognani wrote:
> On Wed, 2020-05-06 at 14:05 +0200, Erik Skultety wrote:
> >  class Config:
> >  
> > +    def __init__(self):
> > +
> > +        # Load the template config containing the defaults first, this must
> > +        # always succeed.
> > +        # NOTE: we should load this from /usr/share once we start packaging
> > +        # lcitool
> > +        base = Util.get_base()
> > +        with open(os.path.join(base, "config.yaml"), "r") as fp:
> > +            self.values = yaml.load(fp, Loader=yaml.Loader)
> 
> We're use yaml.safe_load(fp) everywhere else. Any reason why we
> should need the non-safe version? Why is it necessary to explicitly
> provide it with a Loader?

"YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated"
...but this was a completely wrong choice of API on my end, since rather then
grepping the code base and copy-pasting, I read the documentation and missed
the safe_load API which also doesn't suffer from the deprecation warning.

> 
> > +        try:
> > +            with open(self._get_config_file("config.yaml"), "r") as fp:
> > +                user_config = yaml.load(fp, Loader=yaml.Loader)
> > +        except Exception as e:
> > +            print("Missing or invalid config.yaml file: {}".format(e),
> > +                  file=sys.stderr)
> > +            sys.exit(1)
> 
> As mentioned last time, this is not the correct place or way to
> handle this exception. I know you don't like how we process failures
> right now :) but the way to address that is to post a patch that
> eradicates the current implementation and replaces it with a better
> one in one fell swoop, not to introduce alternative ways to do error
> handling in a few select places.
> 
> I have posted a patch[1] that implements the suggestion I gave last
> time, and that allows you to simply raise an Exception here.
> 
> > +        # Validate the user provided config and use it to override the default
> > +        # settings
> > +        self._validate(user_config)
> > +        self._update(user_config)
> 
> Incidentally, since these two functions can also fail, with your
> approach you'd have reported a nice error message if the
> configuration file is missing altogether but *not* if it exists but
> is invalid.
> 
> > +    @staticmethod
> > +    def _validate_section(config, section, keys):
> > +
> > +        if config.get(section, None) is None:
> > +            raise KeyError("Section '{}' not found".format(section))
> 
> This will raise the same error message both when the section is
> actually missing and when it's present but doesn't contain anything,
> so the error handling should be
> 
>   raise Exception("Missing or empty section '{}'.format(section))
> 
> Note the use of Exception instead of KeyError - again, if you want
> to change the way we do error handling by all means post patches
> towards that goal, but until then please stick with the current
> approach.
> 
> As for the check itself, I think I would like it to look like
> 
>   if section not in config or config[section] is None:

Well, if it was only about key existence, the first part of your boolean
predicate would be preferable, but once you need to check both existence and
value, .get() is the recommended way.

> 
> I'm no Pythonista, but this it seems to follow the mantra "explicit

As it turned out, I even was explicit :P since 'None' is the default value
returned on key non-existence

> is better than implicit" more closely... You've written more Python
> than me, though, so if you think using get() is superior I don't have
> a problem leaving the code as is :)
> 
> > +        for key in keys:
> > +            if config.get(section).get(key, None) is None:
> > +                raise KeyError("Missing mandatory key '{}.{}'".format(section,
> > +                                                                      key))
> 
> The very same considerations apply here.
> 
> One additional thing: since YAML is quite flexible, I think it would
> be sensible (if a little paranoid) to also have something like
> 
>   if not (instanceof(config[section][key], str) or
>           instanceof(config[section][key], int)):
>       raise Exception(
>           "Invalid type for key '{}.{}'".format(section, key)
>       )

Yes, this is a good idea, I'll incorporate that into the next revision,
although I'll do it a bit differently.

> 
> otherwise we'd happily accept something like
> 
>   ---
>   install:
>     root_password:
>       - let
>       - me
>       - in
> 
> as valid.

Yep, that's bad.

> 
> > +    @staticmethod
> > +    def _validate(config):
> 
> The first thing you need to check here is whether config is not None,
> otherwise an empty configuration file will result in
> 
>   Traceback (most recent call last):
>     File "./lcitool", line 1090, in <module>
>       Application().run()
>     File "./lcitool", line 452, in __init__
>       self._config = Config()
>     File "./lcitool", line 156, in __init__
>       self._validate(user_config)
>     File "./lcitool", line 195, in _validate
>       Config._validate_section(config, "install", ["root_password"])
>     File "./lcitool", line 183, in _validate_section
>       if config.get(section, None) is None:
>   AttributeError: 'NoneType' object has no attribute 'get'

Damn, this is bad too, I'll fix it.

> 
> > +        # verify the [install] section and its mandatory options
> > +        Config._validate_section(config, "install", ["root_password"])
> > +
> > +        # we need to check flavor here, because later validations depend on it
> > +        flavor = config.get("install").get("flavor", "test")
> 
> This hardcodes the default value for flavor in the script instead of
> reading it from the default configuration file. It should be
> 
>    .get("flavor", self.values["install"]["flavor"])

I'm not hardcoding anything. What's actually being done is we're only verifying
the user-provided data, so we should not interact with the default values at
this point yet, so instead, either '' or None should be returned from the
.get() method.

-- 
Erik Skultety




More information about the libvir-list mailing list