[Pulp-dev] Changes in the Pulp 3 Upload story
Justin Sherrill
jsherril at redhat.com
Fri Feb 22 18:05:35 UTC 2019
On 2/22/19 12:07 PM, Brian Bouterse wrote:
>
>
> On Fri, Feb 22, 2019 at 9:36 AM Justin Sherrill <jsherril at redhat.com> wrote:
>
>
> On 2/18/19 2:41 PM, Austin Macdonald wrote:
>> Originally, our upload story was as follows:
>> 1. The user uploads a new file to Pulp via POST to /artifacts/
>>    (provided by core)
>> 2. The user creates a new plugin-specific Content via POST to
>>    /path/to/plugin/content/, referencing whatever artifacts are
>>    contained and whatever fields are expected for the new content
>> 3. The user adds the new content to a repository via POST to
>>    /repositories/1/versions/
>>
>> However, this is somewhat cumbersome to the user with 3 API calls
>> to accomplish something that only took one call in Pulp 2.
>
> How would you do this with one call in pulp2?
> https://docs.pulpproject.org/dev-guide/integration/rest-api/content/upload.html
> seems to suggest 3-4 calls.
>
> Some plugins implemented the pulp2 equivalent of a one-shot uploader.
> Those docs are for pulp2's core, which doesn't include the plugins' docs.
>
>>
>> There are a couple of different paths plugins have taken to
>> improve the user experience:
>> The Python plugin follows the above workflow, but reads the
>> Artifact file to determine the values for the fields. The RPM
>> plugin has gone even further and created a new endpoint for "one
>> shot" upload that performs all of this in a single call. I think
>> it is likely that the Python plugin will move more in the "one
>> shot" direction, and other plugins will probably follow.
>
> How does the RPM one shot api work? Will it be compatible with
> whatever solution https://pulp.plan.io/issues/4196 arrives at?
>
> You would upload the Artifact as binary data, along with what content
> type it is and what relative path it uses, and Pulp creates the
> Artifact, Content unit, and ContentArtifact. It should be compatible with
> issue 4196 because Django's binary form data should allow for parallel
> uploading before calling the view handler. It may take 2 calls though.
> The issue to me isn't so much the number of calls as it is the client
> data payload complexity.
If I'm having to chunk up data, I already have quite a bit of client
data payload complexity. In Pulp 2 this was most of the complexity!
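To make that concrete, here is a minimal sketch of the bookkeeping a
chunking client ends up doing before any content metadata is involved. The
helper name iter_chunks and the Content-Range-style labels are assumptions
for illustration only; they are not what Pulp or issue 4196 specifies.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Tuple


def iter_chunks(
    stream: BinaryIO, total_size: int, chunk_size: int
) -> Iterator[Tuple[str, bytes]]:
    """Yield (content_range, data) pairs that cover the stream in order.

    Hypothetical helper: the "bytes start-end/total" label follows the
    generic HTTP Content-Range format, not a documented Pulp contract.
    """
    offset = 0
    while offset < total_size:
        data = stream.read(chunk_size)
        if not data:
            raise IOError("stream ended before total_size bytes were read")
        end = offset + len(data) - 1
        yield f"bytes {offset}-{end}/{total_size}", data
        offset += len(data)


# The client also has to track a checksum across chunks so the server
# (or the client itself) can verify the reassembled file.
payload = b"x" * 25
digest = hashlib.sha256()
ranges = []
for content_range, data in iter_chunks(io.BytesIO(payload), len(payload), 10):
    digest.update(data)
    ranges.append(content_range)
```

Each (range, data) pair would become one request, so offsets, ordering and
checksum tracking all live on the client regardless of how many calls the
content-creation step takes afterwards.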
>
> I would hate for all our plugins to move to One shot methods which
> users can't even rely on.
>
> I don't think we're taking the "generic" uploading away. You can
> always rely on that. The issue w/ one-shot is that it's not possible
> (literally) for many content types, e.g. Artifact-less content. It's
> also hard for multi-artifact Content so that would probably still be
> something plugin writers would provide as a custom thing for their
> content type. Regardless it's just not possible to have consistency in
> this area.
Why is it not possible to create a one-shot upload for artifact-less
content? (Maybe we're defining a one-shot upload differently; I'm
reading it as something that combines multiple steps into one.)
Why is consistency not possible? I guess I don't see a huge variation of
upload scenarios beyond:
1. Upload zero or more files as artifacts.
2. Provide some metadata about those artifacts, or let the plugin parse
it out itself (or maybe even a combination of the two).
3. Import that unit into a repository.
I can see it being difficult as a user to go through all of those steps
(even if 2 & 3 were combined into one), and the desire is to simplify
the process, but uploading arbitrary files is not simple. Why do I
need to give up the plugin's ability to parse the unit's details just
because I'm using the consistent API?
Keep in mind all my questions are coming from a very ignorant
perspective with respect to Pulp 3 internals, and more from a user
perspective.
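As a rough sketch of what a consistent flow covering those three steps
could look like from the client side, the function below only builds the
ordered list of requests and sends nothing. The endpoint paths are the ones
quoted earlier in this thread; plan_upload and every body field name
(artifacts, metadata, parse_artifacts, add_content_units) are hypothetical
placeholders, not the actual Pulp 3 API.

```python
from typing import Any, Dict, List, Optional


def plan_upload(
    file_names: List[str],
    metadata: Optional[Dict[str, Any]],
    repository_href: str,
) -> List[Dict[str, Any]]:
    """Return the ordered requests a client would make; nothing is sent."""
    # Step 1: one POST per file. An empty list models artifact-less content.
    steps: List[Dict[str, Any]] = [
        {"method": "POST", "path": "/artifacts/", "file": name}
        for name in file_names
    ]
    # Step 2: create the content unit. If no metadata is supplied, ask the
    # plugin to parse it from the artifacts (hypothetical flag).
    steps.append({
        "method": "POST",
        "path": "/path/to/plugin/content/",
        "body": {
            "artifacts": list(file_names),
            "metadata": metadata or {},
            "parse_artifacts": metadata is None,
        },
    })
    # Step 3: add the new unit to a repository by creating a new version.
    steps.append({
        "method": "POST",
        "path": repository_href + "versions/",
        "body": {"add_content_units": ["<href returned by step 2>"]},
    })
    return steps


# Artifact-less content: metadata only, no files.
artifactless = plan_upload([], {"name": "errata"}, "/repositories/1/")
# Multi-artifact content where the plugin parses the details itself.
multi = plan_upload(["a.rpm", "b.rpm"], None, "/repositories/1/")
```

The shape of the plan is the same whether there are zero, one or many
files, which is the kind of consistency being asked about; only the number
of step 1 requests changes.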
> My problem with single api calls to upload files is that we cannot
> reliably use them due to limitations on request sizes. We have to
> be prepared to use multiple calls to upload files regardless.
> Maybe if a user is using some plugin that never has super large
> files (ansible?) you could be confident you would never hit a
> request size limitation. But file, docker, and yum all would
> require multiple calls to get the physical data to the server.
>
> I believe arbitrarily large files can be uploaded either through
> multi-part form data or through the django-chunked interface. We'll
> see what happens with 4196, but I expect arbitrary payload size to be
> a requirement for Pulp users.
>
> I care more about having a consistent method for uploading files
> than having fewer API calls. If we need some content-specific
> API, that's fine, but please make it a consistent part of the
> process.
>
> It sounds like the 4-call interface is the only choice then if
> consistency is a must. There isn't a way to offer consistency for
> one-shot uploaders. Is it ok that Katello will have to fill out all of
> the field data when you post the content type? What could be better?
I'll reserve my comments here based on the discussion above.
Thanks!
Justin
> I feel like we may be chasing the wrong goal here (fewer calls vs
> a more consistent experience).
>
>>
>> That said, I think we should discuss this as a community to
>> encourage plugins to behave similarly, and because there may also
>> be a possibility for sharing some code. It is my hope that a
>> "one shot upload" could do 2 things: 1) Upload and create
>> Content. 2) Optionally add that content to repositories.
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com <mailto:Pulp-dev at redhat.com>
>> https://www.redhat.com/mailman/listinfo/pulp-dev