[Pulp-dev] [pulp 3] proposed change to publishing REST api

Tue Oct 24 13:43:39 UTC 2017

On 10/23/2017 06:14 PM, Dennis Kliban wrote:
> On Mon, Oct 23, 2017 at 3:20 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:
> 
> 
> 
>     On Mon, Oct 23, 2017 at 12:30 PM, Dennis Kliban <dkliban at redhat.com> wrote:
> 
>         On Mon, Oct 23, 2017 at 10:56 AM, Jeff Ortel <jortel at redhat.com> wrote:
> 
>             This is interesting.
> 
>             Some thoughts:
> 
>             If adopted, I propose that the publication task create the publication and pass it to the
>             publisher, which would require a change in the plugin API - Publisher.publish(publication).
>             If the publisher fails, I think the publication should be deleted.
> 
> 
>         The ViewSet would create the publication, dispatch a publish task with the publication id as an
>         argument, update the publication with the task id, and return a serialized Publication to the API
>         user. The user is responsible for deleting any publication that is not created successfully.
> 
> 
>     For me, your wording illustrates the problem well. Why should a user have to delete a resource that was
>     never created?
> 
>     This sounds like we'd be introducing a partially-created state for publications. There would be some kind
>     of placeholder representation that could be referenced as a location where a real publication *might or
>     might not* eventually appear. And this representation would live side-by-side in a "publications/"
>     endpoint with representations of actual publications? How would a user know which are which? It seems like
>     this just shifts the async problem onto the publication model.
> 
>     I go back to this: When creation of a resource is requested, the response should either be 201 if the
>     resource was created, or 202 if creation is deferred. We should not attempt partial creation.
> 
>  
> 
>     It's easy to lose sight of this, so maybe it's worth also observing that a resource is not just a DB
>     record or some JSON. The existence of a resource representation requires that the resource itself exists
>     in every way that is necessary for it to make sense. We should be careful not to misrepresent the
>     existence of a publication.
> 
>  
> The description of issue 3033[0] does not clearly establish what a serialized version of a Publication looks
> like. In our current design, I imagine that it will contain three fields: _href, created, and publisher.
> @jortel, do you have the same vision?

Yes.

> 
> If we start associating tasks with Publications, then the serialized publication would have 4 fields: _href,
> created, publisher, task. The API would then allow filtering based on the status of the associated task. e.g.
> publications/?task__status=successful to retrieve all publications that are successfully created.
> 
> We could also add validation on the Distribution that will check whether the publication being associated with
> the Distribution has a task associated with it, and if so that it successfully completed.

I don't think we should store broken publications in the DB.
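
To make the alternative concrete, here is a minimal sketch of the delete-on-failure flow from my earlier
message, assuming an in-memory list stands in for the DB.  Only Publisher.publish(publication) comes from the
proposal; publish_task, the store argument, and the two example publishers are hypothetical names used for
illustration, not existing Pulp code.

```python
class Publication:
    """Hypothetical in-memory stand-in for the Publication model."""

    def __init__(self, store, publisher_name):
        self.store = store
        self.publisher_name = publisher_name
        store.append(self)  # "saving" the record

    def delete(self):
        self.store.remove(self)


def publish_task(store, publisher):
    """Create the publication, pass it to the publisher, delete it on failure."""
    publication = Publication(store, publisher.name)
    try:
        publisher.publish(publication)
    except Exception:
        # A failed publish must not leave a broken publication behind.
        publication.delete()
        raise
    return publication


class GoodPublisher:
    """Hypothetical publisher that always succeeds."""
    name = "good"

    def publish(self, publication):
        pass


class BrokenPublisher:
    """Hypothetical publisher that always fails."""
    name = "broken"

    def publish(self, publication):
        raise RuntimeError("publish failed")
```

With this shape, only publications whose publish step completed are ever left in the store, so a listing
never shows a placeholder.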

> 
> A POST to /publications/ could return a 202 and a serialized version of the publication. This lets the user
> know that the task of creating a publication was accepted. Any GET requests to /publications/<publication_id>
> would return 202 until the publication task has completed. Once the publication task is complete a GET request
> to /publications/<publication_id> would return 200 if the task finished successfully or 410 (gone) if it did
> not complete successfully. 
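
For reference, the status-code behaviour proposed in the quoted paragraph can be sketched as a small mapping.
This is only an illustration of that proposal, not an endorsement of it.  The function name and the
"waiting"/"running" state names are assumptions; "successful" is the value used in the filter example above.

```python
from http import HTTPStatus


def publication_get_status(task_state):
    """Map the state of the publish task to the proposed GET response code."""
    if task_state in ("waiting", "running"):  # assumed names for unfinished states
        return HTTPStatus.ACCEPTED            # 202: creation still pending
    if task_state == "successful":
        return HTTPStatus.OK                  # 200: publication exists
    return HTTPStatus.GONE                    # 410: task did not complete successfully
```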

My main objection to storing the task_id on the publication is that task_id is only meaningful to the user for
a very short period: just long enough to make subsequent API calls, but nothing further unless the user writes
it down with a note giving it meaning.  But imagine a user listing publications later, trying to select one to
associate with a distribution or to delete.  The task ID would be meaningless.  The natural key
Publication.name was an attempt to give the user something meaningful for all use cases.  After further
consideration, I'm not convinced that adding "name" is the best solution either.

I wonder if versioned repositories aren't the real answer.  If the repository were versioned, then publications
would be naturally versioned as well.  The serialized publication could include the repository "version"
number.  This would be meaningful to the user for all use cases.
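
As a rough sketch of what that might look like, assuming the three fields discussed above (_href, created,
publisher) plus a hypothetical repository_version field.  The field name, the select_by_version helper, and
any sample values are made up for illustration; nothing here is an existing Pulp API.

```python
from dataclasses import asdict, dataclass


@dataclass
class Publication:
    """Hypothetical serialized publication carrying the repository version."""
    _href: str
    created: str
    publisher: str
    repository_version: int  # assumed field; not part of the current design


def select_by_version(publications, version):
    """Return the publications made from a given repository version."""
    return [p for p in publications if p.repository_version == version]


def serialize(publication):
    """Return the publication as a plain dict, ready to be dumped as JSON."""
    return asdict(publication)
```

A user listing publications later could then pick one by repository version rather than by a task ID that
has long since lost its meaning.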

> 
> 
> [0] https://pulp.plan.io/issues/3033
> 
>      
>     -- 
> 
>     Michael Hrivnak
> 
>     Principal Software Engineer, RHCE 
> 
>     Red Hat
> 
> 
