[Pulp-dev] proposing changes to pulp 3 upload API

Bryan Kearney bkearney at redhat.com
Tue Jun 27 14:38:48 UTC 2017


If the uploaded artifact is a 2.5 GB ISO, how does that affect the choice?

-- bk

On 06/27/2017 09:36 AM, Brian Bouterse wrote:
> I thought that we pulled chunked uploads out of the MVP. IIRC,
> @jortel and I thought that, since that use case was for high-performance
> (parallel) uploads, it should be on the 3.1+ page.
> 
> +1 to just sending data without having a file handle. If the entire file 
> is delivered in one request then having a file ID to upload to in a 
> second request is just cumbersome.
> +1 to having the handler that receives the file make it an Artifact()
> right away. This will work better with how Django handles file uploads.
> 
> I also think we can skip making one Artifact from another. I don't think
> that will be a common use case. With that use case and chunking
> removed, the use cases would be:
> 
>   * As an authenticated user, I can upload a file which becomes an
>     Artifact. At the end of the upload, the server returns the JSON
>     representation of the created Artifact.
>   * As an authenticated user, I can create a content unit by providing
>     the content type, its Artifacts using IDs for each Artifact, and the
>     metadata supplied in the POST body. This call is atomic: the content
>     unit is created in the database and on the filesystem, or not at all.
> 
> I think the biggest reason to make this adjustment is that it aligns
> with users' desire for uploads to take fewer calls. This removes at
> least two calls from the workflow. It also avoids saving the data
> multiple times, which I don't think is practical.
> 
> Thoughts or ideas?
> 
> -Brian
> 
> On Tue, Jun 27, 2017 at 8:55 AM, Dennis Kliban <dkliban at redhat.com> wrote:
> 
>     My motivations for writing this email include recent discussion
>     about the Pulp 2 upload API in #pulp and Django's documentation on
>     file uploads.
> 
>     Files uploaded to Django are initially stored in memory (if under
>     2.5 MB) or written to the /tmp/ directory using Python's tempfile
>     module. The file created in /tmp is deleted when and if the last
>     file handle is closed.
> 
>     If we implement the upload API as described in the MVP doc[0], then
>     according to Django docs[1] we will be performing a write to disk 2
>     or 3 times for each upload. In cases where a file is bigger than
>     2.5 MB, it will first be written to /tmp. The same file will
>     then be written to /var/lib/pulp/uploads (or similar location) when
>     the FileUpload model is saved. A third write will occur when an
>     artifact is created using the FileUpload. This third write will
>     likely be a move though.
> 
>     I propose that we eliminate writing the uploaded file to
>     /var/lib/pulp/uploads and go directly to creating an artifact. The
>     use cases can then be rewritten as follows:
> 
>       * As an authenticated user, I can upload a file with an optional
>         chunk size and an optional offset. At the end of the upload,
>         the server returns the JSON representation of the artifact.
> 
> 
>       * As an authenticated user, I can create a new artifact by
>         specifying an existing artifact id.
> 
> 
>       * As an authenticated user, I can create a content unit by
>         providing the content type, its Artifacts using IDs for each
>         Artifact, and the metadata supplied in the POST body. This call
>         is atomic: the content unit is created in the database and on
>         the filesystem, or not at all.
> 
> 
> 
> 
>     [0]
>     https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product#Upload-amp-Copy
>     [1]
>     https://docs.djangoproject.com/en/1.9/topics/http/file-uploads/#handling-uploaded-files-with-a-model
> 
>     _______________________________________________
>     Pulp-dev mailing list
>     Pulp-dev at redhat.com <mailto:Pulp-dev at redhat.com>
>     https://www.redhat.com/mailman/listinfo/pulp-dev
>     <https://www.redhat.com/mailman/listinfo/pulp-dev>
> 
> 
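For reference, the threshold behaviour described in the quoted text can be sketched roughly as below. This is a simplification, not Django's actual handler code: the function name and return values are made up for illustration. The 2621440-byte figure is Django's documented default for the FILE_UPLOAD_MAX_MEMORY_SIZE setting. Under that default, a 2.5 GB ISO lands in a temporary file before anything else happens to it.

```python
# Django's documented default for FILE_UPLOAD_MAX_MEMORY_SIZE (2.5 MB).
DEFAULT_MAX_MEMORY_SIZE = 2621440


def initial_upload_storage(size_bytes, max_memory_size=DEFAULT_MAX_MEMORY_SIZE):
    """Return where an upload of this size would first be held.

    Hypothetical helper: it only models the size check, not Django's
    upload handler classes.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Uploads at or below the threshold stay in memory; larger ones are
    # streamed to a temporary file on disk.
    return "memory" if size_bytes <= max_memory_size else "tempfile"


if __name__ == "__main__":
    iso_size = int(2.5 * 1024 ** 3)  # a 2.5 GiB ISO
    print(initial_upload_storage(iso_size))
```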

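The "go directly to creating an artifact" idea from the thread could look something like the following sketch, which uses only the Python standard library. It is not Pulp code: the function name, the choice of SHA-256, and naming the stored file after its digest are all assumptions made for illustration. The point is that the bytes are written once, in chunks, and the final step is a rename within the same directory rather than a second copy.

```python
import hashlib
import os
import tempfile


def store_stream_once(stream, dest_dir, chunk_size=64 * 1024):
    """Write an incoming stream into dest_dir once, hashing as it goes.

    Hypothetical helper. Returns (final_path, hex_digest).
    """
    digest = hashlib.sha256()
    # Create the temporary file inside dest_dir so the final rename stays
    # on one filesystem and does not copy the data again.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                digest.update(chunk)
                out.write(chunk)
        final_path = os.path.join(dest_dir, digest.hexdigest())
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a partial file behind if anything fails.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path, digest.hexdigest()
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, which matters for something like a multi-gigabyte ISO.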