[Freeipa-devel] parameter validation

Thu Apr 5 21:55:58 UTC 2012

Jason was exceptional at documenting everything he wrote, especially in 
the framework. However the correct use of validation in the Param class 
is not discussed at all. I tried to figure it out by looking at what we 
currently do in our code base. Given the inconsistencies I presume I'm 
not the only one who wasn't quite sure how to do this right (or rather 
what the intent was). Below are some issues/questions, if you can shed 
light on any them it would be helpful and I'll try to summarize and get 
best practice documented.

Every param object is passed a name and then zero or more validation 
functions as positional arguments. Each validation function has this 
signature:

validate(ugettext, value)

It's supposed to return None if there was no validation error. If there 
was a validation error it's supposed to return an error string.

1) Most of our validation functions do not follow the calling convention 
in parameter.py for a validator. Instead of returning either None or an 
error string they raise a ValidationError.

Here is an example of how the code in parameter.py calls a validation 
function:

         for rule in self.all_rules:
             error = rule(ugettext, value)
             if error is not None:
                 raise ValidationError(
                     name=self.get_param_name(),
                     value=value,
                     index=index,
                     error=error,
                     rule=rule,
                 )

But there are a number of places in the code where we raise a 
ValidationError and substitute a hardcoded name. Directly raising a 
ValidationError by-passes the above and prevents correctly filling in 
all the information. Speaking of all the information see the next point.

The direct use of ValidationError occurs in other places besides 
validation functions. There are places where that makes sense, perhaps 
the error only occurs in the presence of another option, or perhaps it's 
buried deep in some other code, but for validation functions that can 
validate without external context we shouldn't be raising 
ValidationError directly, but rather returning an error string, right?

2) Currently ValidationError only returns this string:

     format = _('invalid %(name)r: %(error)s')

Note only the name and error keywords are used, omitted are value, index 
and rule. Shouldn't the ValidationError test for presence of the other 
keywords and return a different format string including those keywords 
if they are present? Seems to me we're dropping useful information which 
would be helpful to the user on the floor.

3) I'm confused as to why the validator signature has ugettext as a 
parameter. None of our code uses it. I suspect the intent was to use it 
to obtain the l10n translation for the error string. But as far as I can 
tell the i18n strings used in the validators via _().

# current usage, believed to be correct
validate(ugettext, value)
     return _("Must not contain spaces")

But if the validator used the ugettext parameter, it would have to be 
done in one of two fashions

# This does not work because the string is not marked for translation 
via _(), ugettext will never find the string.
validate(ugettext, value)
     return ugettext("Must not contain spaces")

# This is just silly, the _() returns the l10n translation (by calling
# ugettext), which is then passed to ugettext a second time which does
# nothing because it can't find the translation for a string already
# translated, it's a no-op.
validate(ugettext, value)
     return ugettext(_("Must not contain spaces"))

Also, by passing only a pointer to the ugettext function the use of 
plural forms is prohibited because that requires a different function. 
But perhaps that's O.K. because the validator is only called on scalar 
values.

Anyway, we don't use the ugettext parameter and I don't see the purpose. 
Do you?

Detail: In parameter.py ugettext is called on class variables 
initialized with _(). This is correct. Why? Class variables are 
evaluated at compile time, the _() function will return the msgid 
unmodified. The string must be marked for translation with _(). Thus at 
runtime ugettext needs to be invoked again on the string. But this is a 
consequence of it being evaluated at compile time. None of our 
validation functions use i18n strings which are evaluated at compile 
time, thus there is no need to pass ugettext to them. If for some reason 
they did, they should call the proper version o fugettext from text.py, 
thus I still don't see the need or value of always passing ugettext as a 
parameter to the validator, do you?

-- 
John Dennis <jdennis at redhat.com>

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/