[Pulp-dev] Required fields for models at the DB level

Thu Apr 30 21:22:36 UTC 2020

Recently a group of developers investigated if pulpcore needed to make any
changes regarding the concerns discussed in this thread from a while ago.
Here are the conclusions we reached, please let us know if you have other
ideas or concerns:

* We lack documentation advising plugin writers that they need to use DRF
serializers to validate models. This includes cases where the data didn't
come from users but from a remote source during sync. That docs addition is
being tracked here https://pulp.plan.io/issues/5927
* No pulpcore code changes are needed since Pulpcore does not create any
objects, e.g. Content, it only saves objects already created by plugin code
in the Stages API first stage.

We'll proceed with the docs and that should resolve this issue. Please let
us know if you have concerns, questions, or ideas.

Thanks to the devs who collaborated in this effort.

-Brian

On Tue, Dec 10, 2019 at 3:39 AM Tatiana Tereshchenko <ttereshc at redhat.com>
wrote:

> +1 to have additional serializer for validation content fields and use
> current serializers for upload case
> +1 to validate during sync
>
> On Fri, Dec 6, 2019 at 8:31 PM David Davis <daviddavis at redhat.com> wrote:
>
>> I talked to @ttereshc this afternoon and a couple questions came up.
>>
>> First, in pulp_rpm we have a PackageSerializer[0] that handles creating
>> and displaying packages. The problem is that the package field values are
>> derived from the artifact and not the user so most fields are readonly and
>> not required. I'm imagining we'll have to split this serializer in two (ie
>> a PackageSerializer and a PackageUploadSerializer) in order to use it to
>> validate package data. Does that make sense or is there a better way?
>>
>> Secondly, I don't think we're validating data during sync. What if a user
>> syncs data from a remote that has bad data? I think we'll need to have the
>> stages somehow use serializers as well? If others agree, I can file an
>> issue.
>>
>> [0]
>> https://github.com/pulp/pulp_rpm/blob/326d189bbae9267d59282d62ada1fa36467a2d8f/pulp_rpm/app/serializers.py#L69
>>
>> David
>>
>>
>> On Thu, Dec 5, 2019 at 6:32 AM David Davis <daviddavis at redhat.com> wrote:
>>
>>> This makes sense to me. I wonder if we should document what you've
>>> outlined in the plugin writers guide? If so, then maybe we should repurpose
>>> 5828?
>>>
>>> David
>>>
>>>
>>> On Tue, Dec 3, 2019 at 12:14 PM Brian Bouterse <bmbouter at redhat.com>
>>> wrote:
>>>
>>>> After some IRC discussion during open floor, I think the consensus is
>>>> that we should not have BaseModel call full_clean(). Plugin writers doing a
>>>> lot of object creation without involving DRF serializers should call
>>>> full_clean() either in application code or their specific model's save()
>>>> method. Here's some recap about the motivations for this:
>>>>
>>>> * By having full_clean() called with all model save it gives Pulp "two
>>>> validation layers" when there only needs to be one. DRF's validation layer
>>>> is the serializer, and it isn't prepared to do error handling from a
>>>> "second" layer. This is partly an echo of the concerns from Tom Christie's
>>>> writing.
>>>> * DRF is primarily designed to have data flow through its serializers.
>>>> This issue is more problematic in cases where data is not flowing through
>>>> the serializer. Thus we should try to include the serializer whenever
>>>> possible.
>>>> * If we cannot include the serializer, those are cases where we would
>>>> specifically benefit from calling full_clean manually.
>>>>
>>>> Other thoughts and concerns are welcome. If nothing major comes up then
>>>> we can close https://pulp.plan.io/issues/5828 as WONTFIX.
>>>>
>>>>
>>>>
>>>> On Mon, Dec 2, 2019 at 6:52 PM Simon Baatz <gmbnomis at gmail.com> wrote:
>>>>
>>>>> On Mon, Dec 02, 2019 at 03:53:06PM -0500, Brian Bouterse wrote:
>>>>> >    If anyone has concerns with us enabling Model validation by
>>>>> default on
>>>>> >    all models please let us know soon.
>>>>>
>>>>> I don't know (yet) if I have concerns, but DRF seems to have, see
>>>>>
>>>>> https://www.django-rest-framework.org/community/3.0-announcement/#differences-between-modelserializer-validation-and-modelform
>>>>>
>>>>> According to DRF's design, all validation logic should be at one
>>>>> place, which is the serializer.  This seems to be a controversial
>>>>> issue, see e.g.
>>>>> https://github.com/encode/django-rest-framework/issues/3144.  From
>>>>> that issue:
>>>>>
>>>>> What happens in the case where in your models you are forcing a
>>>>> full_clean (perhaps by including it in the save method)?  Will
>>>>> serializers know how to handle an exception from an explicit
>>>>> full_clean?
>>>>>
>>>>> And Tom Christie's answer:
>>>>>
>>>>> I'd recommend that application level validation should generally
>>>>> happen prior to save, not during it.  Personally I'd avoid
>>>>> full_clean, and instead ensure that state changing operations on
>>>>> model instances are only ever made via method calls that can provide
>>>>> a boundary that ensures that only valid state changes may ever be
>>>>> made by the rest of the application.
>>>>>
>>>>>
>>>>>
>>>>> We need to be aware that we are leaving the path recommended by DRF
>>>>> if we implement this proposal and mix Django validation and DRF
>>>>> validation.  Unfortunately, I don't know what the alternative is.
>>>>> Using DRF serializers to construct all model instances looks clumsy
>>>>> when it comes to relations.
>>>>>
>>>>> _______________________________________________
>>>> Pulp-dev mailing list
>>>> Pulp-dev at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>
>>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20200430/3b4d4b79/attachment.htm>