[Pulp-dev] [pulp 3] proposed change to publishing REST api

Tue Oct 24 18:11:58 UTC 2017

Thanks everyone for all the discussion! I'll try to recap the problem and
some of the solutions I've heard. I'll also share my perspective on them.

What problem are we solving?
When a user calls "publish" (the action API endpoint) they get a 202 w/ a
link to the task. That task will produce a publication. How can the user
find the publication that was produced by the task? How can the user be
sure the publication is fully complete?

What are our options?
1) Start linking to created objects from task status. I believe it's been
clearly stated why we can't do this. If it's not clear, or if there
are other things we should consider, let's talk about it. Acknowledging or
establishing agreement on this is crucial because a change like this would
bring back a lot of the user pain from pulp2. I believe the HAL suggestion
falls into this area.

2) Have the user find the publication via a query that sorts on time and
filters for a specific publisher. This could be fragile: in a multi-user
system with no hard references between publications and tasks, it is hard to
answer "which publication is mine?" because another user could have submitted
a publish at about the same time. While not perfect, this could work.
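
To make the fragility in option 2 concrete, here is a rough Python sketch.
It is only an illustration, not Pulp code: the Publication class, the
latest_publication() helper, and the hrefs and publisher names are all made
up, and the list stands in for whatever the REST API would return.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Publication:
    # Hypothetical stand-in for a serialized publication.
    href: str
    publisher: str
    created: float  # creation time, in seconds


def latest_publication(publications: List[Publication],
                       publisher: str) -> Optional[Publication]:
    """Filter by publisher, then pick the most recently created one."""
    candidates = [p for p in publications if p.publisher == publisher]
    return max(candidates, key=lambda p: p.created, default=None)


# Users A and B both publish with the same publisher at about the same time.
store = [
    Publication("/publications/1/", "pub-a", created=100.0),  # user A's task
    Publication("/publications/2/", "pub-a", created=100.5),  # user B's task
    Publication("/publications/3/", "pub-b", created=101.0),
]

# User A asks for "the newest publication for pub-a" and gets user B's.
found_by_user_a = latest_publication(store, "pub-a")
print(found_by_user_a.href)
```

Nothing in the query ties the result back to the task user A submitted, which
is the gap a hard reference would close.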

3) Have the user create a publication directly like any other REST
resource, and help the user understand the state of that resource over
time. I believe the proposal at the start of this thread is recommending
this solution. I'm also +1 on this solution.
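
For option 3, one possible way to expose the state of the resource is the
status-code scheme Dennis describes in the quoted message below: a GET
returns 202 while the publish task is still in progress, 200 once it finished
successfully, and 410 if it did not. The sketch below just encodes that
mapping. TaskState, its member names, and publication_get_status() are
assumptions for illustration, not Pulp's actual task model or API.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical task states; the real names may differ.
    WAITING = "waiting"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def publication_get_status(task_state: TaskState) -> int:
    """Map the publish task's state to the HTTP status of a GET request."""
    if task_state is TaskState.COMPLETED:
        return 200  # publication exists and is complete
    if task_state is TaskState.FAILED:
        return 410  # gone: the publication was never fully created
    return 202  # accepted: creation is still in progress


for state in TaskState:
    print(state.value, publication_get_status(state))
```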

As an aside, I don't think considering versioned repos as a possible
solution is helping us with this problem. The scope of the current problem
is relatively small and the scope of planning for versioned repos is large.

On Tue, Oct 24, 2017 at 9:43 AM, Jeff Ortel <jortel at redhat.com> wrote:

>
>
> On 10/23/2017 06:14 PM, Dennis Kliban wrote:
> > On Mon, Oct 23, 2017 at 3:20 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:
> >
> >
> >
> >     On Mon, Oct 23, 2017 at 12:30 PM, Dennis Kliban <dkliban at redhat.com> wrote:
> >
> >         On Mon, Oct 23, 2017 at 10:56 AM, Jeff Ortel <jortel at redhat.com> wrote:
> >
> >             This is interesting.
> >
> >             Some thoughts:
> >
> >             If adopted, I propose the publication task create the publication
> >             and pass to the publisher which would require a change in the
> >             plugin API - Publisher.publish(publication).  If the publisher
> >             fails, I think the publication should be deleted.
> >
> >
> >         The ViewSet would create the publication, dispatch a publish task
> >         with the publication id as an argument, update the publication with
> >         the task id, return a serialized Publication to the API user. The
> >         user is responsible for deleting any publication that is not created
> >         successfully.
> >
> >
> >     For me, your wording illustrates the problem well. Why should a user have
> >     to delete a resource that was never created?
> >
> >     This sounds like we'd be introducing a partially-created state for
> >     publications. There would be some kind of placeholder representation that
> >     could be referenced as a location where a real publication *might or
> >     might not* eventually appear. And this representation would live
> >     side-by-side in a "publications/" endpoint with representations of actual
> >     publications? How would a user know which are which? It seems like this
> >     just shifts the async problem onto the publication model.
> >
> >     I go back to this: When creation of a resource is requested, the response
> >     should either be 201 if the resource was created, or 202 if creation is
> >     deferred. We should not attempt partial creation.
> >
> >
> >
> >     It's easy to lose sight of this, so maybe it's worth also observing that
> >     a resource is not just a DB record or some JSON. The existence of a
> >     resource representation requires that the resource itself exists in every
> >     way that is necessary for it to make sense. We should be careful not to
> >     misrepresent the existence of a publication.
> >
> >
> > The description of issue 3033[0] does not clearly establish what a serialized
> > version of a Publication looks like. In our current design, I imagine that it
> > will contain three fields: _href, created, and publisher. @jortel, do you have
> > the same vision?
>
> Yes.
>
> >
> > If we start associating tasks with Publications, then the serialized
> > publication would have 4 fields: _href, created, publisher, task. The API
> > would then allow filtering based on the status of the associated task. e.g.
> > publications/?task__status=successful to retrieve all publications that are
> > successfully created.
> >
> > We could also add validation on the Distribution that will check whether the
> > publication being associated with the Distribution has a task associated with
> > it, and if so that it successfully completed.
>
> I don't think we should store broken publications in the DB.
>
> >
> > A POST to /publications/ could return a 202 and a serialized version of the
> > publication. This lets the user know that the task of creating a publication
> > was accepted. Any GET requests to /publications/<publication_id> would return
> > 202 until the publication task has completed. Once the publication task is
> > complete a GET request to /publications/<publication_id> would return 200 if
> > the task finished successfully or 410 (gone) if it did not complete
> > successfully.
>
> My main objection to storing the task_id on the publication is that task_id
> is only meaningful to the user for a very short period.  Just long enough to
> make subsequent API calls but nothing further unless the user writes it down
> with a note giving it meaning.  But imagine a user listing publications
> later, trying to select one to associate with a distribution.  Or to be
> deleted.  The task ID would be meaningless.  The natural key Publication.name
> was an attempt to give the user something meaningful for all use cases.
> After further consideration, I'm not convinced that adding "name" is the best
> solution either.
>
> I wonder if versioned repositories isn't the real answer.  If the repository
> was versioned then publications would be naturally versioned as well.  The
> serialized publication could include the repository "version" number.  This
> would be meaningful to the user for all use cases.
>
> >
> >
> > [0] https://pulp.plan.io/issues/3033
> >
> >
> >     --
> >
> >     Michael Hrivnak
> >
> >     Principal Software Engineer, RHCE
> >
> >     Red Hat
> >
> >
>
>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
>