[Pulp-dev] 3.0 Validation update

Fri Apr 7 15:08:49 UTC 2017

On 04/06/2017 06:40 PM, Austin Macdonald wrote:
> tldr;
>     1. Serializer errors won't be caught by serializer validation.
>     2. DRF does not use full_clean(), so if data passes serializer
> validation, it gets into the db.
>     3. We can use full_clean for process level validation.
> 
> During a lengthy discussion on IRC, I incorrectly asserted that full_clean
> did not work with DRF.
> 
> My thinking was partly based on the incorrect assumption that the data was
> valid. This seemed reasonable given that I created the data from the REST
> API, so the data was validated by the serializers. After a closer look, I
> fixed an issue with the serializer and was able to use full_clean() as
> expected.
> 
> The serializer formatted the REST data and then validated the data against
> its own expectations. Of course it thought it was right!

Here's what I was getting at yesterday:

https://paste.fedoraproject.org/paste/T2rjzc4RXrvD8DtXc5y3MF5M1UNdIGYhyRLivL9gydE=

full_clean is an interface that Model instances provide to work with ModelForm.

From https://docs.djangoproject.com/en/1.10/ref/models/instances/#validating-objects:

"""

Model.full_clean(exclude=None, validate_unique=True)[source]¶

This method calls Model.clean_fields(), Model.clean(), and Model.validate_unique()
(if validate_unique is True), in that order and raises a ValidationError that has a
message_dict attribute containing errors from all three stages.
"""

The interface is well-defined, and consists of more than just "full_clean". Reusing
it for our purposes, with the claim of "supporting the Django way of doing things"
doesn't really work, since we're so heavily using Django REST Framework, and this is
explicitly *not* the way to do things in DRF. Not only does DRF not use this interface,
they consciously made the choice to discontinue its use:

http://www.django-rest-framework.org/topics/3.0-announcement/#differences-between-modelserializer-validation-and-modelform

Some of their justification is documented here:

http://www.django-rest-framework.org/api-guide/validators/#validation-in-rest-framework

I agree with their major points, particularly as they relate to a separation of
concerns. I think that for input validation, either using a model's full_clean
interface or using its related ModelSerializer should be done, not both. Since
we need Model Serializers for the API, it makes sense to use the serializers
for this, and to never use full_clean, so that we don't provide or implement
two interfaces for the same job.

Any "process validation" (I term I made up to try to distinguish what we're
talking about from input validation, which is data validation done "at the
edges") should be done by whatever process needs "valid" data. As far as I know,
the interface for it does not exist, since it's for validation for a very pulp-
specific bit of process.

An interface should be implemented or created to address the "process validation"
concern, but reusing either the DRF serializer or Django's full_clean interface to
provide this behavior is a recoupling of concerns that should be avoided.

Finally:
Discussing this has been very difficult as a result of our overloading of the term
"validation", so I recommend getting clearly defined use-cases and settling on a
shared vocabulary before spending more time thinking and talking about this.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 866 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20170407/3932137b/attachment.sig>