[Pulp-dev] Changes in the Pulp 3 Upload story

Justin Sherrill jsherril at redhat.com
Fri Feb 22 18:05:35 UTC 2019


On 2/22/19 12:07 PM, Brian Bouterse wrote:
>
>
> On Fri, Feb 22, 2019 at 9:36 AM Justin Sherrill <jsherril at redhat.com> wrote:
>
>
>     On 2/18/19 2:41 PM, Austin Macdonald wrote:
>>     Originally, our upload story was as follows:
>>     The user will upload a new file to Pulp via POST to /artifacts/
>>     (provided by core)
>>     The user will create a new plugin specific Content via POST to
>>     /path/to/plugin/content/, referencing whatever artifacts that are
>>     contained, and whatever fields are expected for the new content.
>>     The user will add the new content to a repository via POST to
>>     /repositories/1/versions/
>>
>>     However, this is somewhat cumbersome to the user with 3 API calls
>>     to accomplish something that only took one call in Pulp 2.
>
>     How would you do this with one call in pulp2?
>     https://docs.pulpproject.org/dev-guide/integration/rest-api/content/upload.html
>     seems to suggest 3-4 calls.
>
> Some plugins implemented the pulp2 equivalent of a one-shot uploader.
> Those docs cover pulp2's core only; they don't include the plugins' docs.
>
>>
>>     There are a couple of different paths plugins have taken to
>>     improve the user experience:
>>     The Python plugin follows the above workflow, but reads the
>>     Artifact file to determine the values for the fields. The RPM
>>     plugin has gone even farther and created a new endpoint for "one
>>     shot" upload that performs all of this in a single call. I think
>>     it is likely that the Python plugin will move more in the "one
>>     shot" direction, and other plugins will probably follow.
>
>     How does the RPM one shot api work?  Will it be compatible with
>     whatever solution https://pulp.plan.io/issues/4196 arrives at?
>
> You would upload the Artifact as binary data along with what content 
> type it is and what relative path it uses and Pulp creates the 
> Artifact, Content unit, ContentArtifact. It should be compatible with 
> issue 4196 because django's binary form data should allow for parallel 
> uploading before calling the view handler. It may take 2 calls though. 
> The issue to me isn't so much the number of calls as it is the client
> data payload complexity.
If I'm having to chunk up data, I already have quite a bit of client
data payload complexity.  In Pulp 2 this was most of the complexity!
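
To illustrate the kind of client-side bookkeeping I mean, here is a minimal
sketch of splitting a payload into chunks. It assumes a chunked upload API
that accepts one request per chunk with a standard HTTP Content-Range header;
that is only an assumption for illustration, not what Pulp 2 did or what
issue 4196 will settle on. The name iter_chunks is hypothetical.

```python
from typing import Dict, Iterator, Tuple


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, chunk) pairs that cover `data` in order.

    Assumption: the server wants a Content-Range header of the form
    "bytes <first>-<last>/<total>" on every chunk request.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(data)
    for start in range(0, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # byte ranges are inclusive
        headers = {"Content-Range": f"bytes {start}-{end}/{total}"}
        yield headers, chunk


if __name__ == "__main__":
    # Each pair would become one HTTP request in a real client.
    for headers, chunk in iter_chunks(b"x" * 25, 10):
        print(headers["Content-Range"], len(chunk))
```

Even in this toy form the client has to track offsets and totals, and a real
client would also need retries and a final "commit" step of some kind.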
>
>     I would hate for all our plugins to move to One shot methods which
>     users can't even rely on.
>
> I don't think we're taking the "generic" uploading away. You can 
> always rely on that. The issue w/ one-shot is that it's not possible 
> (literally) for many content types, e.g. Artifact-less content. It's 
> also hard for multi-artifact Content so that would probably still be 
> something plugin writers would provide as a custom thing for their 
> content type. Regardless it's just not possible to have consistency in 
> this area.

Why is it not possible to create a one-shot upload for artifact-less
content?  (Maybe we're defining what a one-shot upload actually is
differently; I'm reading it as something that combines multiple steps
into one.)

Why is consistency not possible? I guess I don't see a huge variation of
upload scenarios beyond:

1.  Upload zero or more files as artifacts.

2.  Provide some metadata about those artifacts, or let the plugin parse
it out itself (or maybe even a combination of the two).

3.  Import that unit into a repository.

I can see it being difficult as a user to go through all of those steps
(even if 2 & 3 were combined into one), and the desire is to simplify
the process, but uploading arbitrary files is not simple.  Why do I
need to give up the plugin's ability to parse the unit's details just
because I'm using the consistent api?
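
As a rough sketch of what I mean by a consistent process, the three steps
could be planned the same way for every content type. The paths are the ones
from Austin's original mail; the payload keys ("file", "artifacts",
"add_content") and the function name plan_upload are hypothetical, and
nothing is actually sent to a server.

```python
from typing import Dict, List, Optional, Tuple

# (HTTP method, path, payload) describing one API call.
Call = Tuple[str, str, dict]


def plan_upload(filenames: List[str],
                metadata: Optional[Dict[str, str]] = None,
                repo_versions_path: Optional[str] = None) -> List[Call]:
    """Return the ordered API calls for a generic upload.

    All payload keys are placeholders for illustration only.
    """
    calls: List[Call] = []
    # Step 1: upload zero or more files as artifacts.
    for name in filenames:
        calls.append(("POST", "/artifacts/", {"file": name}))
    # Step 2: create the content unit. Metadata is optional here, on the
    # assumption that the plugin may parse missing fields from the files.
    payload: dict = dict(metadata or {})
    payload["artifacts"] = list(filenames)
    calls.append(("POST", "/path/to/plugin/content/", payload))
    # Step 3: optionally add the new unit to a repository.
    if repo_versions_path:
        calls.append(("POST", repo_versions_path,
                      {"add_content": "<href of the new content unit>"}))
    return calls


if __name__ == "__main__":
    for call in plan_upload(["a.rpm"], {"name": "demo"},
                            "/repositories/1/versions/"):
        print(call)
```

With an empty file list the plan still contains the content call, which is
how I would picture artifact-less content fitting the same flow.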

Keep in mind all my questions are coming from a very ignorant
perspective with respect to Pulp 3 internals, and more from a user
perspective.

>     My problem with single api calls to upload files is that we cannot
>     reliably use them due to limitations in request sizes.  We have to
>     be prepared to use multiple calls to upload files regardless.
>     Maybe if a user is using some plugin that never has super large
>     files (ansible?) you could be confident you would never hit a
>     request size limitation.   But file, docker, and yum all would
>     require multiple calls to get the physical data to the server.
>
> I believe arbitrarily large files can be uploaded either through 
> multi-part form data or through the django-chunked interface. We'll 
> see what happens with 4196, but I expect arbitrary payload size to be 
> a requirement for Pulp users.
>
>     I care more about having a consistent method for uploading files
>     than having fewer api calls.   If we need some content-specific
>     api, that's fine, but please make it a consistent part of the
>     process.
>
> It sounds like the 4-call interface is the only choice then if 
> consistency is a must. There isn't a way to offer consistency for 
> one-shot uploaders. Is it ok that Katello will have to fill out all of 
> the field data when you post the content type? What could be better?

I'll reserve my comments here based on the discussion above.

Thanks!

Justin


>     I feel like we may be chasing the wrong goal here (fewer calls vs
>     a more consistent experience).
>
>>
>>     That said, I think we should discuss this as a community to
>>     encourage plugins to behave similarly, and because there may also
>>     be a possibility for sharing some of code. It is my hope that a
>>     "one shot upload" could do 2 things: 1) Upload and create
>>     Content. 2) Optionally add that content to repositories.
>>
>>     _______________________________________________
>>     Pulp-dev mailing list
>>     Pulp-dev at redhat.com
>>     https://www.redhat.com/mailman/listinfo/pulp-dev