[Pulp-list] configuration files, validation and standards

Jeff Ortel jortel at redhat.com
Tue May 22 15:57:53 UTC 2012


All,

So, I got derailed on something yesterday (related to this, but not 
this) and thought "why waste a perfectly hosed up day", so I decided to 
finish up the day on something that has been bugging me about pulp v1. 
Jay, don't kill me .. I mostly worked on this last night ;)

One area in which Pulp is inconsistent is in configuration file handling 
(reading/parsing/accessing/converting).  The server uses a module based 
on ConfigParser and the client (originally) took the route of classes 
derived from INIConfig.  These subclasses represented the configuration 
file and defined a schema used for self validation.  The validation is 
provided in common/config.py.

As we move forward in v2, I would like to consolidate on a single 
approach: passing configuration as (dict) objects instead of 
INIConfig or ConfigParser objects.  Further, I would like us to 
consistently leverage the validation that we invested in long ago but 
only adopted (completely) in the client.  This approach has several 
benefits.

1. Configuration file parsing/formatting is independent of how 
configuration is used by our code.

2. By representing/passing configuration as a dictionary (actually a 
dict of dict), cobbling up configuration for testing is easier.

3. Consistent usage of (existing) configuration validation provides the 
following benefits:
   - define/document configuration in one place.
   - validate and report errors in one place with consistent error
     messages instead of splattering validation everywhere the
     configuration(s) are used.
   - combined usage of validation and companion type conversion functions
     makes for safe (and easy) conversion.  Eg: values for bool
     properties are consistently verified and converted.

4. A dict is well documented and understood.  It can be easily pickled 
and converted to json.
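
To illustrate points 2 and 4, here is a minimal sketch (not Pulp code; 
the section and property names are made up for the example) of cobbling 
up a configuration for a test and round-tripping it through json and 
pickle:

```python
import json
import pickle

# Hypothetical test configuration: a dict of dicts,
# section -> property -> value.
config = {
    'server': {'host': 'localhost', 'port': '443'},
    'logging': {'level': 'debug'},
}

# Plain dicts survive both json and pickle round trips unchanged.
assert json.loads(json.dumps(config)) == config
assert pickle.loads(pickle.dumps(config)) == config

# Overriding one property for a single test is an ordinary dict update
# on a copy, so the shared configuration is left untouched.
test_config = {section: dict(props) for section, props in config.items()}
test_config['server']['port'] = '8443'
```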

In GC, content handlers (in the agent) follow this approach.  When 
loaded, the handler configuration is passed as a dictionary and the 
loader uses validation to ensure that /standard/ parts of the 
configuration (descriptor) are valid.  I would suggest the same for 
server plugin configurations.
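
As a rough sketch of what "validate only the standard parts" could look 
like for a descriptor passed as a dict: the schema layout, the 
ValidationError class and the validate()/to_bool() helpers below are all 
hypothetical and are not the ones in common/config.py.

```python
# Hypothetical schema: section -> {property: (required, type)}.
SCHEMA = {
    'main': {'enabled': (True, 'bool'), 'name': (False, 'str')},
}

TRUE_VALUES = ('1', 'true', 'yes', 'on')
FALSE_VALUES = ('0', 'false', 'no', 'off')


class ValidationError(Exception):
    pass


def to_bool(value):
    """Convert an INI-style string to bool, rejecting anything else."""
    text = str(value).strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(value)


def validate(config, schema=SCHEMA):
    """Check the standard sections only; extra sections are left alone."""
    errors = []
    for section, properties in schema.items():
        found = config.get(section, {})
        for name, (required, kind) in properties.items():
            if name not in found:
                if required:
                    errors.append('[%s] %s: required' % (section, name))
                continue
            if kind == 'bool':
                try:
                    to_bool(found[name])
                except ValueError:
                    errors.append('[%s] %s: must be a bool' % (section, name))
    # All problems are reported together, in one place.
    if errors:
        raise ValidationError('; '.join(errors))


# The 'custom' section is handler specific and is not checked.
descriptor = {'main': {'enabled': 'yes'}, 'custom': {'anything': 'goes'}}
validate(descriptor)
enabled = to_bool(descriptor['main']['enabled'])
```

The same to_bool() is used for checking and for conversion, which is the 
point of pairing validation with the conversion functions.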

I suppose I'm guilty of adding yet one more way of dealing with 
configuration files but unlike some of the one-off solutions that have 
been popping up (I think), this is an attempt to standardize on a single 
approach.  Just something to consider as we move forward.

--------------------


To help facilitate (and demonstrate) this approach:

In common/gc_config.py, I added a Config class that can be used to 
easily read INI config files into a dict (graph).  It supports 
construction with a variety of inputs that are merged together to 
provide: a composite representation; property/section overrides; easy 
defaults.  I also converted the validation to work on dictionaries. 
Further, I added a method that renders an object graph representation of 
the dict (graph) for easy access using (.) dot notation for those of you 
that liked INIConfig for this reason.
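
For anyone wondering how little is needed for the dict and the dot 
notation, here is a minimal sketch using only the standard library.  It 
is not the code in common/gc_config.py: read_ini() and Node are 
hypothetical names, and it is written for Python 3 (configparser).

```python
import configparser


def read_ini(text):
    """Parse INI text into a plain dict of dicts."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {s: dict(parser.items(s)) for s in parser.sections()}


class Node:
    """Read-only attribute view of a dict; nested dicts become nodes."""

    def __init__(self, data, strict=False):
        self._data = data
        self._strict = strict

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            value = self._data[name]
        except KeyError:
            if self._strict:
                raise
            return None
        if isinstance(value, dict):
            return Node(value, self._strict)
        return value


config = read_ini("[server]\nhost=myhost\nport=80\n")
obj = Node(config)
```

Note that values read from an INI file are strings until something 
(validation, in this proposal) converts them.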

A Few Examples:

This just shows validation and simple dict access to the configuration.

<snip>
[server]
host=myhost
port=80
</snip>

 >>> config = Config(path)
 >>> config.validate(schema)
 >>> print config['server']['port']
80

# graph

 >>> config = Config(path)
 >>> obj = config.graph()
 >>> print obj.server.port
80

# not defined

 >>> print obj.server.notdefined
None

 >>> obj = config.graph(strict=True)
 >>> print obj.server.notdefined
KeyError('notdefined')

----

This example shows configuration defaults (dict) overlaid with the 
file at path (A) and then the file at path (B).  Precedence is defined 
by ordering: later inputs override earlier ones.

defaults = { 'server' : { 'port' : 443 } }

pathA = /etc/pulp/admin/admin.conf
<snip>
[server]
host=myhost
</snip>

pathB = ~/.pulp/admin.conf
<snip>
[server]
host=redhat.com
</snip>

 >>> config = Config(defaults, pathA, pathB)
 >>> config.validate(schema)
 >>> server = config['server']
 >>> print server['host']
redhat.com

 >>> print server['port']
443
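
The overlay itself is simple enough to sketch.  This is only an 
illustration of the precedence rule, not the Config class: load() and 
merge() are hypothetical helpers, the defaults are assumed to be keyed 
by section, and temporary files stand in for the two paths above.

```python
import configparser
import os
import tempfile


def load(source):
    """Return a dict of dicts from a dict or the path of an INI file."""
    if isinstance(source, dict):
        return source
    parser = configparser.ConfigParser(interpolation=None)
    with open(source) as fp:
        parser.read_file(fp)
    return {s: dict(parser.items(s)) for s in parser.sections()}


def merge(*sources):
    """Overlay sources left to right; later sources win per property."""
    merged = {}
    for source in sources:
        for section, properties in load(source).items():
            merged.setdefault(section, {}).update(properties)
    return merged


def write(directory, name, text):
    """Write text to a file in directory and return its path."""
    path = os.path.join(directory, name)
    with open(path, 'w') as fp:
        fp.write(text)
    return path


with tempfile.TemporaryDirectory() as directory:
    path_a = write(directory, 'a.conf', '[server]\nhost=myhost\n')
    path_b = write(directory, 'b.conf', '[server]\nhost=redhat.com\n')
    defaults = {'server': {'port': 443}}
    config = merge(defaults, path_a, path_b)
```

Merging per property (rather than replacing whole sections) is what 
lets the default port survive while the host is overridden twice.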

----

Section filtering is provided but not demonstrated.



