[Pulp-dev] modularity upload

Brian Bouterse bmbouter at redhat.com
Thu Oct 3 20:10:25 UTC 2019

On Wed, Oct 2, 2019 at 9:55 AM Pavel Picka <ppicka at redhat.com> wrote:

> On Wed, Oct 2, 2019 at 2:42 PM Brian Bouterse <bmbouter at redhat.com> wrote:
>> The additional endpoint sounds fine, especially since uploading the
>> modules.yaml file will produce two types of content units, ModuleMD
>> and ModuleMDDefaults (as I understand it).
>> What's important to me is that our users have similar functionality
>> anytime they upload. It's not as important to me that the URL is always a
>> POST on the content type's resource, especially given the production of
>> mixed resources ^. The functionality I think we want to offer is:
>> * users can always reference an Artifact they created using the
>> chunked/parallel uploader. Even in the case of smaller files, not having
>> this work the same way is a consistency problem (to me).
> I would like to keep this possibility for sure.
>> * users can always associate those units directly with a repository with
>> the same call
> Agree.
>> * users can also provide the Artifact as part of the call itself
> Agree here; covering both possibilities is on my implementation list for
> upload.
>> To reduce the burden on plugin writers I think core's
>> SingleArtifactUploadSerializer will serve you the best still. Let me know
>> if you see any gaps or issues there.
> Here I would disagree, as upload returns a task, and parsing modulemds is
> designed to run in a task too (a lot of snippet files are created, so they
> can all be removed at once when the job is done), so a task after a task
> doesn't look nice to me.
> Or maybe I missed something? I'm not sure if I can inject more
> functionality into the task from core.

I want to share my understanding of this a bit more. I expect there would
be only one task, and by using the core code you could avoid writing one
yourself. The steps below are described in the order they would be called.

* If you use the SingleArtifactContentUploadViewSet and wire it to the URL
you want it served at, it will provide the features described above ^
(file provided in the call, file provided ahead of time, repo association).
It handles receiving the file, creating the Artifact if one is provided now,
and dispatching the call to general_create with the correct locks. The first
round of validation occurs in this viewset.

* general_create will call back into your plugin code for "in-task"
validation, which actually means calling deferred_validate. This is a
second validation pass for when you need to do a long-running operation,
like opening a large tarball and validating the data inside it (as
pulp_ansible does).

* general_create will call back into your plugin code again "in-task" to
create the model(s), which actually means calling create.
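The single-task call order described above can be mocked in plain Python.
This is an illustrative sketch only, not pulpcore code; all function names
are simplified stand-ins for the real viewset, task, and serializer hooks.

```python
# Mock of the flow: viewset handles the request, dispatches exactly ONE
# task (general_create), which calls back into plugin serializer hooks.
calls = []

def deferred_validate():
    # plugin hook: long-running "in-task" validation
    calls.append("deferred_validate")

def create():
    # plugin hook: create the model(s) "in-task"
    calls.append("create")

def general_create():
    # core task: the only task that runs; it calls back into plugin code
    deferred_validate()
    create()

def handle_upload():
    # viewset: receive the file, create the Artifact, dispatch the task
    calls.append("viewset")
    general_create()

handle_upload()
```

The point of the sketch is that the plugin never dispatches a task of its
own; both plugin hooks run inside the one core-dispatched task.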

So in this way you don't have to "inject" code into core; it will always
call back into your plugin code on the Serializer.

There are two possible issues I see since you need to create a few models:

1) The CreateResource in general_create is only prepared to handle a
single model. I filed that as an issue here
<https://pulp.plan.io/issues/5539>. It will still work, but the created
objects won't be reported to the user.
2) Say the create() code already exists and is its own RQ task. The
solution, I think, would be to call that task as a function
synchronously (RQ allows this).
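Point 2 relies on the fact that an RQ task is just a plain Python function,
so it can be invoked directly instead of enqueued. A minimal sketch, with
hypothetical names (the real parsing task would look different):

```python
def parse_modulemds(text):
    """Imagine this already exists and is registered as its own RQ task."""
    return [line for line in text.splitlines() if line.startswith("document:")]

def upload_task(text):
    # Instead of q.enqueue(parse_modulemds, text), which would chain a
    # second task onto the first, call the function synchronously:
    return parse_modulemds(text)
```

This keeps everything inside the single task that core dispatches, so the
user never sees a "task after task".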

>> One gotcha to be aware of is that the viewset saves the Artifact and we
>> no longer know its filename. When the general_base_create() task runs in
>> core it doesn't know the filename. So one design that would be especially
>> hard is an "upload any kind of rpm content" endpoint. It's useful to rely
>> on the fact that uploads to a dedicated endpoint can be reliably parsed as a
>> modules.yaml file, versus a comps.xml file, etc. Otherwise you'll be in a position where
>> you have the data but you'll have to figure out what kind of file it is
>> from the file itself without its filename. Let me know if this doesn't make
>> sense.
> The idea of a 'unified' upload is very nice, but I think it is a very hard
> approach for now. In that case I would implement one more 'type' argument
> for upload, as deciding only by filename can be misleading, e.g. my
> modules.yaml may be named 'my_devs.yml'.
> But as mentioned, it will IMO take a lot of time.
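The 'type' argument idea above amounts to dispatching on an explicit type
supplied by the caller rather than guessing from the filename. A rough
sketch, with hypothetical parser names:

```python
# Stand-in parsers keyed by an explicit, caller-supplied content type.
PARSERS = {
    "modulemd": lambda text: ("modulemd", text),
    "comps": lambda text: ("comps", text),
}

def upload(text, type_):
    """Route an uploaded file to a parser chosen by the explicit type."""
    if type_ not in PARSERS:
        raise ValueError(f"unknown content type: {type_!r}")
    return PARSERS[type_](text)
```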
>> On Wed, Oct 2, 2019 at 7:41 AM Pavel Picka <ppicka at redhat.com> wrote:
>>> It is a good point. Personally, as I get deeper into the implementation,
>>> I am starting to think that an additional endpoint will be helpful, and
>>> that we should also disable the possibility of creating modular content
>>> via the 'modulemd' and 'modulemd-defaults' endpoints.
>>> Thank you for the answer.
>>> On Wed, Oct 2, 2019 at 1:01 PM Ina Panova <ipanova at redhat.com> wrote:
>>>> When we sync, do we assign an artifact to modulemd-defaults? If yes,
>>>> then your idea of creating modulemd-defaults by providing arguments
>>>> will introduce an inconsistency.
>>>> I would pivot to a custom endpoint for modules upload that accepts a
>>>> file, which would be parsed, and based on what's found there, modulemd
>>>> and/or modulemd-defaults content would be created respectively.
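The parsing step in this proposal splits one uploaded modules.yaml stream
into documents and decides, per document, which unit type to create. A
rough sketch, not real pulp_rpm code; real code would use PyYAML or
libmodulemd rather than string handling:

```python
def classify_documents(text):
    """Return the 'document:' type of each YAML document in the stream."""
    types = []
    # modules.yaml is a multi-document YAML stream separated by "---"
    for chunk in text.split("---"):
        for line in chunk.splitlines():
            line = line.strip()
            if line.startswith("document:"):
                types.append(line.split(":", 1)[1].strip())
    return types
```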
>>>> --------
>>>> Regards,
>>>> Ina Panova
>>>> Senior Software Engineer| Pulp| Red Hat Inc.
>>>> "Do not go where the path may lead,
>>>>  go instead where there is no path and leave a trail."
>>>> On Tue, Oct 1, 2019 at 1:34 PM Pavel Picka <ppicka at redhat.com> wrote:
>>>>> [pulp-dev] modularity upload
>>>>> I was happy to see the new unified way of uploading already merged,
>>>>> but I think I have a special use case for modularity.
>>>>> The upload from core expects one file for one content unit. In my case
>>>>> the user will upload one file containing many content units, and two
>>>>> content types can appear. I would like to discuss some ideas here on
>>>>> how to proceed in this case.
>>>>> - Possibly this could be a different endpoint, as more than one
>>>>> content unit will be uploaded
>>>>>  - and possibly two content types
>>>>>  - I discussed this with Brian B.; using the original endpoint
>>>>> (../content/rpm/modulemd/) for upload can have the advantage of fewer
>>>>> endpoints to maintain, with the same logic but different behaviour
>>>>>   - users should (though this may not always be true) be aware of
>>>>> modularity types when storing them
>>>>>   - it will still be documented
>>>>>  - the disadvantage is that every other endpoint with upload uses only
>>>>> one content unit (an inconsistency)
>>>>>   - because the uploaded file serves both endpoints (modulemd and
>>>>> modulemd-defaults)
>>>>>   - and the name will need some discussion
>>>>> - I would still like to keep chunked upload
>>>>>  - even the official modules.yaml from the Fedora 30 modular repo is
>>>>> ~500K
>>>>> Summary:
>>>>> In my opinion I would use the same endpoint but override the 'create'
>>>>> method, so that '../rpm/modulemd' will parse the whole file and create
>>>>> many modulemd and even modulemd-defaults content units.
>>>>> And highlight this in the documentation.
>>>>> For the second content type and endpoint, 'modulemd-defaults', I would
>>>>> not allow uploading a file, but would let the user create
>>>>> modulemd-defaults with arguments (as there are three), so the user just
>>>>> calls
>>>>> http:.../rpm/modulemd-defaults/ module=foo stream="1" profiles="1:default".
>>>>> If there is a file with more defaults, they can use the previous
>>>>> endpoint for both.
>>>>> What do you think, or what would you like to see there?
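The argument-based creation proposed above could build a defaults record
directly from the three fields. A minimal sketch with hypothetical names
and a hypothetical "stream:profile" format for the profiles argument:

```python
def make_modulemd_defaults(module, stream, profiles):
    """Build a minimal defaults record from the three upload arguments.

    profiles is a comma-separated list of "stream:profile" pairs,
    e.g. "1:default".
    """
    parsed = dict(p.split(":", 1) for p in profiles.split(","))
    return {"module": module, "stream": stream, "profiles": parsed}
```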
>>>>> --
>>>>> Pavel Picka
>>>>> Red Hat
>>>>> _______________________________________________
>>>>> Pulp-dev mailing list
>>>>> Pulp-dev at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>> --
>>> Pavel Picka
>>> Red Hat
> --
> Pavel Picka
> Red Hat
