[Pulp-dev] modularity upload

Brian Bouterse bmbouter at redhat.com
Thu Oct 3 20:10:25 UTC 2019

On Wed, Oct 2, 2019 at 9:55 AM Pavel Picka <ppicka at redhat.com> wrote:

> On Wed, Oct 2, 2019 at 2:42 PM Brian Bouterse <bmbouter at redhat.com> wrote:
>> The additional endpoint sounds fine, especially since uploading the
>> modules.yaml file will produce two types of content units, ModuleMD
>> and ModuleMDDefaults (as I understand it).
>> What's important to me is that our users have similar functionality
>> anytime they upload. It's not as important to me that the URL is always a
>> POST on the content type's resource, especially given the production of
>> mixed resources ^. The functionality I think we want to offer is:
>> * users can always reference an Artifact they created using the
>> chunked/parallel uploader. Even in the case of smaller files, not having
>> this work the same way is a consistency problem (to me).
> I would like to keep this possibility for sure.
>> * users can always associate those units directly with a repository with
>> the same call
> Agree.
>> * users can also provide the Artifact as part of the call itself
> Agree here; covering both possibilities is on my implementation list for
> upload.
>> To reduce the burden on plugin writers I think core's
>> SingleArtifactUploadSerializer will serve you the best still. Let me know
>> if you see any gaps or issues there.
> Here I would disagree, as upload returns a task, and parsing modulemds is
> designed to run in a task too (a lot of snippet files are created, so they
> can all be removed at once when the job is done), so a task after a task
> doesn't look nice to me.
> Or maybe I missed something? I'm not sure if I can inject more
> functionality into the task from core.

I want to share my understanding of this a bit more. I expect there would
be only one task, and by using the core code you could avoid writing one
yourself. The steps below are described in the order they would be called.

* If you use the SingleArtifactContentUploadViewSet and wire it to the URL
you want it served at, it will provide the features described above ^
(file provided in the call, file provided ahead of time, repo association).
It handles receiving the file, creating the Artifact if one is provided now,
and dispatching the call to general_create with the correct locks. The first
round of validation occurs in this viewset.

* general_create will call back into your plugin code for "in-task"
validation, which actually means calling deferred_validate. This is a
second validation pass for when you need to do a long-running operation,
like opening a large tarball and validating the data inside it (as
pulp_ansible does).

* general_create will call back into your plugin code again "in-task" to
create the model(s), which actually means calling create.
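The single-task call order described above can be mocked in plain Python.
This is an illustrative sketch only, not pulpcore code; all function names
are simplified stand-ins for the real viewset, task, and serializer hooks.

```python
# Mock of the flow: viewset handles the request, dispatches exactly ONE
# task (general_create), which calls back into plugin serializer hooks.
calls = []

def deferred_validate():
    # plugin hook: long-running "in-task" validation
    calls.append("deferred_validate")

def create():
    # plugin hook: create the model(s) "in-task"
    calls.append("create")

def general_create():
    # core task: the only task that runs; it calls back into plugin code
    deferred_validate()
    create()

def handle_upload():
    # viewset: receive the file, create the Artifact, dispatch the task
    calls.append("viewset")
    general_create()

handle_upload()
```

The point of the sketch is that the plugin never dispatches a task of its
own; both plugin hooks run inside the one core-dispatched task.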

So in this way you don't have to "inject" code into core; it will always
call back into your plugin code on the Serializer.

There are two possible issues I see since you need to create a few models:

1) The CreateResource in general_create is only prepared to handle a
single model. I filed that as an issue here
<https://pulp.plan.io/issues/5539>. It will still work, but the created
objects won't be reported to the user.
2) Say the create() code already exists and is its own RQ task. The
solution, I think, would be to call that task as a function
synchronously (RQ allows this).
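Point 2 relies on the fact that an RQ task is just a plain Python function,
so it can be invoked directly instead of enqueued. A minimal sketch, with
hypothetical names (the real parsing task would look different):

```python
def parse_modulemds(text):
    """Imagine this already exists and is registered as its own RQ task."""
    return [line for line in text.splitlines() if line.startswith("document:")]

def upload_task(text):
    # Instead of q.enqueue(parse_modulemds, text), which would chain a
    # second task onto the first, call the function synchronously:
    return parse_modulemds(text)
```

This keeps everything inside the single task that core dispatches, so the
user never sees a "task after task".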

>> One gotcha to be aware of is that the viewset saves the Artifact and we
>> no longer know its filename. When the general_base_create() task runs in
>> core it doesn't know the filename. So one design that would be especially
>> hard is an "upload any kind of rpm content" endpoint. It's useful to rely
>> on the fact that uploads to a dedicated endpoint can be reliably parsed as a
>> modules.yaml file, versus a comps.xml file, etc. Otherwise you'll be in a position where
>> you have the data but you'll have to figure out what kind of file it is
>> from the file itself without its filename. Let me know if this doesn't make
>> sense.
> The idea of a 'unified' upload is very nice, but I think it is a very hard
> approach for now. In that case I would implement one more 'type' argument
> for upload, as deciding only by filename can be misleading, e.g. my
> modules.yaml may be named 'my_devs.yml'.
> But as mentioned, it will IMO take a lot of time.
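The 'type' argument idea above amounts to dispatching on an explicit type
supplied by the caller rather than guessing from the filename. A rough
sketch, with hypothetical parser names:

```python
# Stand-in parsers keyed by an explicit, caller-supplied content type.
PARSERS = {
    "modulemd": lambda text: ("modulemd", text),
    "comps": lambda text: ("comps", text),
}

def upload(text, type_):
    """Route an uploaded file to a parser chosen by the explicit type."""
    if type_ not in PARSERS:
        raise ValueError(f"unknown content type: {type_!r}")
    return PARSERS[type_](text)
```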
>> On Wed, Oct 2, 2019 at 7:41 AM Pavel Picka <ppicka at redhat.com> wrote:
>>> It is a good point. Personally, as I get deeper into the implementation,
>>> I am starting to think that an additional endpoint will be helpful, and
>>> that we should also disable the possibility of creating modular content
>>> via the 'modulemd' and 'modulemd-defaults' endpoints.
>>> Thank you for the answer.
>>> On Wed, Oct 2, 2019 at 1:01 PM Ina Panova <ipanova at redhat.com> wrote:
>>>> When we sync, do we assign an artifact to modulemd-defaults? If yes,
>>>> then your idea of creating modulemd-defaults by providing arguments
>>>> will introduce an inconsistency.
>>>> I would pivot to a custom endpoint for modules upload that accepts a
>>>> file, which would be parsed, and based on what's found there, modulemd
>>>> and/or modulemd-defaults content would be created respectively.
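The parsing step in this proposal splits one uploaded modules.yaml stream
into documents and decides, per document, which unit type to create. A
rough sketch, not real pulp_rpm code; real code would use PyYAML or
libmodulemd rather than string handling:

```python
def classify_documents(text):
    """Return the 'document:' type of each YAML document in the stream."""
    types = []
    # modules.yaml is a multi-document YAML stream separated by "---"
    for chunk in text.split("---"):
        for line in chunk.splitlines():
            line = line.strip()
            if line.startswith("document:"):
                types.append(line.split(":", 1)[1].strip())
    return types
```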
>>>> --------
>>>> Regards,
>>>> Ina Panova
>>>> Senior Software Engineer| Pulp| Red Hat Inc.
>>>> "Do not go where the path may lead,
>>>>  go instead where there is no path and leave a trail."
>>>> On Tue, Oct 1, 2019 at 1:34 PM Pavel Picka <ppicka at redhat.com> wrote:
>>>>> [pulp-dev] modularity upload
>>>>> I was happy to see the new unified way of uploading already merged,
>>>>> but I think I have a special use case for modularity.
>>>>> The upload from core expects one file for one content unit. In my case
>>>>> the user will upload one file containing many content units, and two
>>>>> content types can appear. I would like to discuss some ideas here on
>>>>> how to proceed in this case.
>>>>> - Possibly this could be a different endpoint, as more than one
>>>>> content unit will be uploaded
>>>>>  - and possibly two content types
>>>>>  - I discussed this with Brian B.; using the original endpoint
>>>>> (../content/rpm/modulemd/) for upload can have the advantage of fewer
>>>>> endpoints to maintain, with the same logic but different behaviour
>>>>>   - users should (though this may not always be true) be aware of
>>>>> modularity types when storing them
>>>>>   - it will still be documented
>>>>>  - the disadvantage is that every other endpoint with upload uses only
>>>>> one content unit (an inconsistency)
>>>>>   - because the uploaded file serves both endpoints (modulemd and
>>>>> modulemd-defaults)
>>>>>   - and the name will need some discussion
>>>>> - I would still like to keep chunked upload
>>>>>  - even the official modules.yaml from the Fedora 30 modular repo is
>>>>> ~500K
>>>>> Summary:
>>>>> In my opinion I would use the same endpoint but override the 'create'
>>>>> method, so that '../rpm/modulemd' will parse the whole file and create
>>>>> many modulemd and even modulemd-defaults content units.
>>>>> And highlight this in the documentation.
>>>>> For the second content type and endpoint, 'modulemd-defaults', I would
>>>>> not allow uploading a file, but would let the user create
>>>>> modulemd-defaults with arguments (as there are three), so the user just
>>>>> calls
>>>>> http:.../rpm/modulemd-defaults/ module=foo stream="1" profiles="1:default".
>>>>> If there is a file with more defaults, they can use the previous
>>>>> endpoint for both.
>>>>> What do you think, or what would you like to see there?
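The argument-based creation proposed above could build a defaults record
directly from the three fields. A minimal sketch with hypothetical names
and a hypothetical "stream:profile" format for the profiles argument:

```python
def make_modulemd_defaults(module, stream, profiles):
    """Build a minimal defaults record from the three upload arguments.

    profiles is a comma-separated list of "stream:profile" pairs,
    e.g. "1:default".
    """
    parsed = dict(p.split(":", 1) for p in profiles.split(","))
    return {"module": module, "stream": stream, "profiles": parsed}
```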
>>>>> --
>>>>> Pavel Picka
>>>>> Red Hat
>>>>> _______________________________________________
>>>>> Pulp-dev mailing list
>>>>> Pulp-dev at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>> --
>>> Pavel Picka
>>> Red Hat
> --
> Pavel Picka
> Red Hat
