[Pulp-dev] proposing changes to pulp 3 upload API

Brian Bouterse bbouters at redhat.com
Tue Jun 27 18:56:29 UTC 2017


Picking up from @jortel's observations...

+1 to allowing Artifacts to have an optional FK.

If we have an Artifacts endpoint, then we can allow deleting a single
Artifact as long as it has no FK. I think we want to disallow removing an
Artifact that has a foreign key. Filtering should also make it possible to
clean up all unassociated Artifacts in a single operation, by searching for
FK=None or similar.
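
A rough sketch of the delete rule and the bulk cleanup, using an in-memory
list in place of the real model. The names Artifact, content_id,
delete_artifact and delete_unassociated are made up for illustration; they
are not the Pulp 3 API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Artifact:
    # Hypothetical stand-in for the real model; content_id plays the optional FK.
    id: int
    relative_path: str
    content_id: Optional[int] = None


class ArtifactInUse(Exception):
    """Raised when a caller tries to delete an Artifact that has an FK."""


def delete_artifact(store: List[Artifact], artifact_id: int) -> None:
    """Delete one Artifact, but only if it is not associated with content."""
    for artifact in store:
        if artifact.id == artifact_id:
            if artifact.content_id is not None:
                raise ArtifactInUse(f"artifact {artifact_id} has an FK")
            store.remove(artifact)
            return
    raise KeyError(artifact_id)


def delete_unassociated(store: List[Artifact]) -> int:
    """Remove every Artifact whose FK is None; return how many were removed."""
    orphans = [a for a in store if a.content_id is None]
    for artifact in orphans:
        store.remove(artifact)
    return len(orphans)
```

In a REST API the first function would map to a DELETE on one Artifact and
the second to a filtered delete, but that mapping is only an assumption here.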

Yes, the single call that delivers a file will also need to specify the
relative path, size, checksums, etc. Since the POST body contains binary
data, we either need to accept this metadata as GET-style (query string)
params or use a multi-part MIME upload [0]. Note that creating an Artifact
this way does not change the repository contents and can therefore be
handled synchronously, outside of the tasking system.
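
To make the two options concrete, here is a minimal sketch of how a client
could shape the same upload either way, using only the standard library. The
endpoint path, the field names and the choice of multipart/form-data as the
multipart flavor are assumptions for illustration, not a proposed API.

```python
import uuid
from urllib.parse import urlencode


def as_query_params(base_url, metadata, payload):
    """Option 1: metadata in the query string, raw file bytes as the body."""
    url = f"{base_url}?{urlencode(metadata)}"
    headers = {"Content-Type": "application/octet-stream"}
    return url, headers, payload


def as_multipart(base_url, metadata, filename, payload, boundary=None):
    """Option 2: metadata and file bytes as parts of one multipart body."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # One text part per metadata field.
    for name, value in metadata.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # One binary part carrying the file itself.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + payload
        + b"\r\n"
    )
    # Closing delimiter.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return base_url, headers, b"".join(parts)
```

Either way the file and its metadata arrive in one request; the multipart
form keeps the URL clean at the cost of a slightly more involved body.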

+1 to performing validation when an Artifact is saved.
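
Something like the following is what I have in mind for validate-on-save: a
sketch only, with a dict standing in for the DB and SHA-256 picked as an
example checksum. The names validate_artifact and save_artifact are
hypothetical.

```python
import hashlib


class ValidationError(Exception):
    """Raised when uploaded bytes do not match the declared metadata."""


def validate_artifact(payload, declared_size, declared_sha256):
    """Check size and SHA-256 of the payload against what the client declared."""
    if len(payload) != declared_size:
        raise ValidationError(
            f"size mismatch: got {len(payload)}, expected {declared_size}"
        )
    actual = hashlib.sha256(payload).hexdigest()
    if actual != declared_sha256.lower():
        raise ValidationError("sha256 mismatch")


def save_artifact(db, relative_path, payload, declared_size, declared_sha256):
    """Validate first; only a payload that passes is added to the (fake) DB."""
    validate_artifact(payload, declared_size, declared_sha256)
    db[relative_path] = {"size": declared_size, "sha256": declared_sha256.lower()}
    return db[relative_path]
```

Because validation runs before the insert, a failed upload leaves no record
behind, which matches the requirement that an invalid artifact must not be
inserted into the DB.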

[0]: https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html

-Brian

On Tue, Jun 27, 2017 at 2:44 PM, Dennis Kliban <dkliban at redhat.com> wrote:

> On Tue, Jun 27, 2017 at 1:24 PM, Michael Hrivnak <mhrivnak at redhat.com>
> wrote:
>
>>
>> On Tue, Jun 27, 2017 at 11:27 AM, Jeff Ortel <jortel at redhat.com> wrote:
>>
>>>
>>> - The artifact FK to a content unit would need to become optional.
>>>
>>> - Need to add use cases for cleaning up artifacts not associated with a
>>> content unit.
>>>
>>> - The upload API would need additional information needed to create an
>>> artifact.  Like relative path, size,
>>> checksums etc.
>>>
>>> - Since (I assume) you are proposing uploading/writing directly to
>>> artifact storage (not staging in a working
>>> dir), the flow would need to involve (optional) validation.  If
>>> validation fails, the artifact must not be
>>> inserted into the DB.
>>
>>
>> Perhaps a decent middle ground would be to stick with the plan of keeping
>> uploaded (or partially uploaded) files as a separate model until they are
>> ready to be turned into a Content instance plus artifacts, and save their
>> file data directly to somewhere within /var/lib/pulp/. It would be some
>> path distinct from where Artifacts are stored. That's what I had imagined
>> we would do anyway. Then as Dennis pointed out, turning that into an
>> Artifact would only require a move operation on the same filesystem, which
>> is super-cheap.
>>
>>
>> Would that address all the concerns? We'd write the data just once, and
>> then move it once on the same filesystem. I haven't looked at django's
>> support for this recently, but it seems like it should be doable.
>>
> I was just looking at the dropbox API and noticed that they provide two
> separate API endpoints for regular file uploads[0] (< 150mb) and large file
> uploads[1]. It is the latter that supports chunking and requires using an
> upload id. For the most common case they support uploading a file with one
> API call. Our original proposal requires 2 for the same use case. Pulp API
> users would appreciate having to only make one API call to upload a file.
>
> [0] https://www.dropbox.com/developers-v1/core/docs#files_put
> [1] https://www.dropbox.com/developers-v1/core/docs#chunked-upload
>
>
>
>> --
>>
>> Michael Hrivnak
>>
>> Principal Software Engineer, RHCE
>>
>> Red Hat
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>