[Pulp-dev] 3.0 Validation update

Fri Apr 7 16:08:30 UTC 2017

Thanks @smyers and asmacdo. This convinces me that we should just use the
DRF serializer for all validation and expect that full_clean() is not the
authority on what it means to be validated. I was really holding out on
full_clean, but I can see now that we should go wholesale into DRF
serializers.

Regarding vocabulary, @asmacdo suggested using "validation" and "business
logic" as two separate terms. Those make sense to me. Here is a quick
definition, maybe others could refine:

* Validation:  Ensuring that all data accepted is correct as defined by the
serializer. This is solely the responsibility of a DRF serializer and not
other Django facilities like full_clean() or clean().
* Business Logic: Logic that determines how data can be created, stored, or
changed.

== questions ==
* Where should ^ terms be documented?

* Take the case of a sync which has overrides provided? This isn't in the
MVP, but in the future it could be. In that case, does the serializer
associated with the importer validate the database data with the overrides
"added" on top of it?

* For plugin writers writing a serializer for a subclassed Importer, do
they also need to express validations for fields defined on the base
Importer?

* The database still rejects data that doesn't adhere to the data layer
definition right? That occurs even without the DRF serializer correct?

* In cases where data is created in the backend, do we need to validate
that as a general practice? If we do, do we call the DRF serializer
regularly in the backend or just let the database reject "bad" data at the
db level?

Thanks for all the discussion on this topic.

-Brian

On Fri, Apr 7, 2017 at 11:08 AM, Sean Myers <sean.myers at redhat.com> wrote:

> On 04/06/2017 06:40 PM, Austin Macdonald wrote:
> > tldr;
> >     1. Serializer errors won't be caught by serializer validation.
> >     2. DRF does not use full_clean(), so if data passes serializer
> > validation, it gets into the db.
> >     3. We can use full_clean for process level validation.
> >
> > During a lengthy discussion on IRC, I incorrectly asserted that
> full_clean
> > did not work with DRF.
> >
> > My thinking was partly based on the incorrect assumption that the data
> was
> > valid. This seemed reasonable given that I created the data from the REST
> > API, so the data was validated by the serializers. After a closer look, I
> > fixed an issue with the serializer and was able to use full_clean() as
> > expected.
> >
> > The serializer formatted the REST data and then validated the data
> against
> > its own expectations. Of course it thought it was right!
>
> Here's what I was getting at yesterday:
>
> https://paste.fedoraproject.org/paste/T2rjzc4RXrvD8DtXc5y3MF5M1UNdIG
> YhyRLivL9gydE=
>
> full_clean is an interface that Model instances provide to work with
> ModelForm.
>
> From https://docs.djangoproject.com/en/1.10/ref/models/
> instances/#validating-objects:
>
> """
>
> Model.full_clean(exclude=None, validate_unique=True)[source]¶
>
> This method calls Model.clean_fields(), Model.clean(), and
> Model.validate_unique()
> (if validate_unique is True), in that order and raises a ValidationError
> that has a
> message_dict attribute containing errors from all three stages.
> """
>
> The interface is well-defined, and consists of more than just
> "full_clean". Reusing
> it for our purposes, with the claim of "supporting the Django way of doing
> things"
> doesn't really work, since we're so heavily using Django REST Framework,
> and this is
> explicitly *not* the way to do things in DRF. Not only does DRF not use
> this interface,
> they consciously made the choice to discontinue its use:
>
> http://www.django-rest-framework.org/topics/3.0-announcement/#differences-
> between-modelserializer-validation-and-modelform
>
> Some of their justification is documented here:
>
> http://www.django-rest-framework.org/api-guide/validators/#validation-in-
> rest-framework
>
> I agree with their major points, particularly as they relate to a
> separation of
> concerns. I think that for input validation, either using a model's
> full_clean
> interface or using its related ModelSerializer should be done, not both.
> Since
> we need Model Serializers for the API, it makes sense to use the
> serializers
> for this, and to never use full_clean, so that we don't provide or
> implement
> two interfaces for the same job.
>
> Any "process validation" (I term I made up to try to distinguish what we're
> talking about from input validation, which is data validation done "at the
> edges") should be done by whatever process needs "valid" data. As far as I
> know,
> the interface for it does not exist, since it's for validation for a very
> pulp-
> specific bit of process.
>
> An interface should be implemented or created to address the "process
> validation"
> concern, but reusing either the DRF serializer or Django's full_clean
> interface to
> provide this behavior is a recoupling of concerns that should be avoided.
>
> Finally:
> Discussing this has been very difficult as a result of our overloading of
> the term
> "validation", so I recommend getting clearly defined use-cases and
> settling on a
> shared vocabulary before spending more time thinking and talking about
> this.
>
>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20170407/61210b5d/attachment.htm>