[Pulp-dev] proposing changes to pulp 3 upload API

Dennis Kliban dkliban at redhat.com
Tue Jun 27 12:55:59 UTC 2017

My motivations for writing this email are a recent discussion about the
Pulp 2 upload API in #pulp and Django's documentation on file uploads.

Files uploaded to Django are initially stored in memory if they are under
2.5 MB; otherwise Python's tempfile module is used to write them to the
/tmp directory. A file created in /tmp is deleted if and when the last file
handle is closed.
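
To make the behaviour above concrete, here is a minimal standard-library
sketch. It does not use Django itself: storage_for() and tempfile_lifetime()
are hypothetical helper names, and the exact boundary comparison is an
assumption. The 2621440-byte threshold is Django's default
FILE_UPLOAD_MAX_MEMORY_SIZE.

```python
import os
import tempfile

# Django's default FILE_UPLOAD_MAX_MEMORY_SIZE is 2621440 bytes (2.5 MB).
MAX_MEMORY_SIZE = 2621440


def storage_for(size_in_bytes):
    """Return where an upload of this size would initially be held.

    Assumption: sizes up to and including the threshold stay in memory.
    """
    return "memory" if size_in_bytes <= MAX_MEMORY_SIZE else "tempfile"


def tempfile_lifetime():
    """Show that a named temporary file disappears once it is closed."""
    handle = tempfile.NamedTemporaryFile(suffix=".upload")
    handle.write(b"example payload")
    handle.flush()
    path = handle.name
    existed_while_open = os.path.exists(path)
    handle.close()  # closing the handle deletes the file
    return existed_while_open, os.path.exists(path)
```

The second helper is the reason an upload that has only been written to /tmp
has to be copied or moved somewhere permanent before the request finishes.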

If we implement the upload API as described in the MVP doc[0], then
according to Django docs[1] we will be performing a write to disk 2 or 3
times for each upload. In cases where a file is bigger than 2.5 MB in size,
it will be first written to /tmp. The same file will then be written to
/var/lib/pulp/uploads (or similar location) when the FileUpload model is
saved. A third write will occur when an artifact is created using the
FileUpload. This third write will likely be a move though.
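
A small sketch of why that last step can be cheap, assuming a POSIX
filesystem and that both directories live on the same filesystem. The
directory and file names are stand-ins, not Pulp's actual layout. When the
source and destination are on different filesystems, shutil.move falls back
to copying the data and then removing the source, which is a real write.

```python
import os
import shutil
import tempfile


def move_keeps_inode():
    """Move a file between two directories on one filesystem."""
    root = tempfile.mkdtemp()
    uploads = os.path.join(root, "uploads")      # stand-in for /var/lib/pulp/uploads
    artifacts = os.path.join(root, "artifacts")  # hypothetical artifact storage
    os.mkdir(uploads)
    os.mkdir(artifacts)

    source = os.path.join(uploads, "package.rpm")
    with open(source, "wb") as upload:
        upload.write(b"x" * 4096)
    inode_before = os.stat(source).st_ino

    # On the same filesystem this is a rename: the data blocks are not rewritten.
    destination = shutil.move(source, os.path.join(artifacts, "package.rpm"))
    inode_after = os.stat(destination).st_ino

    shutil.rmtree(root)
    return inode_before == inode_after
```

An unchanged inode number indicates that the file was renamed rather than
copied.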

I propose that we eliminate writing the uploaded file to
/var/lib/pulp/uploads and go directly to creating an artifact. The use cases
can then be rewritten as the following:

   - As an authenticated user, I can upload a file with an optional chunk
   size, and an optional offset. At the end of the upload the server
   returns the JSON representation of the artifact.

   - As an authenticated user, I can create a new artifact by specifying an
   existing artifact id.

   - As an authenticated user, I can create a content unit by providing the
   content type, its Artifacts using IDs for each Artifact, and the metadata
   supplied in the POST body. This call is atomic: the content unit is
   created in the database and on the filesystem, or not at all.
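
The first use case could be sketched roughly as follows. This is only an
illustration under stated assumptions, not a proposed implementation:
ArtifactUpload, write_chunk() and finish() are hypothetical names, naming the
final file by its SHA-256 digest is an assumption, and the returned dict
stands in for the JSON representation of the artifact.

```python
import hashlib
import os
import tempfile


class ArtifactUpload:
    """Hypothetical helper: assemble chunks inside the artifact directory."""

    def __init__(self, artifact_dir):
        self.artifact_dir = artifact_dir
        # The partial file lives on the same filesystem as the final artifact,
        # so finishing the upload is a rename rather than another write.
        fd, self.partial_path = tempfile.mkstemp(dir=artifact_dir, suffix=".partial")
        os.close(fd)

    def write_chunk(self, data, offset=0):
        """Write one chunk at the given byte offset."""
        with open(self.partial_path, "r+b") as partial:
            partial.seek(offset)
            partial.write(data)

    def finish(self):
        """Hash the assembled file and rename it into place."""
        digest = hashlib.sha256()
        with open(self.partial_path, "rb") as partial:
            for block in iter(lambda: partial.read(1024 * 1024), b""):
                digest.update(block)
        final_path = os.path.join(self.artifact_dir, digest.hexdigest())
        os.replace(self.partial_path, final_path)  # atomic on one filesystem
        return {
            "sha256": digest.hexdigest(),
            "size": os.path.getsize(final_path),
            "path": final_path,
        }
```

Chunks may arrive in any order as long as each one carries its offset, for
example write_chunk(b"world", offset=5) followed by write_chunk(b"hello").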
