[Pulp-dev] [pulp 3] proposed change to publishing REST api
David Davis
daviddavis at redhat.com
Tue Oct 24 20:07:32 UTC 2017
I was just reviewing the Task search API in Pulp 3 that we designed a few
months ago. Two of the requirements [0] were "As a user of the task search API
I want to search for all tasks that operated on repo zoo" and "As a user of
the task search API I want to search all publish tasks performed by a
particular publisher." My question: will users be allowed to search for tasks
using a publication href, or for a publication using a task href, as they
would with a repo/publisher/etc? If so, that could be the link between task
and publication[1].
[0] https://pulp.plan.io/issues/2482
[1] https://pulp.plan.io/issues/2890
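To make the idea concrete, here is a rough sketch of what a task/publication
link could allow, combined with the 202/200/410 behavior proposed in the quoted
thread below. This is not Pulp code: the Registry class, the method names, the
status values, and the example hrefs are all assumptions made up for
illustration, and the choice to return 200 for a publication with no task is a
guess.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    href: str
    status: str  # assumed values: "running", "successful", "failed"


@dataclass
class Publication:
    href: str
    publisher: str
    task: Optional[Task] = None  # hypothetical link to the creating task


class Registry:
    """In-memory stand-in for the database; illustrative only, not Pulp code."""

    def __init__(self) -> None:
        self._publications: Dict[str, Publication] = {}

    def add(self, publication: Publication) -> None:
        self._publications[publication.href] = publication

    def publication_for_task(self, task_href: str) -> Optional[Publication]:
        """Find the publication produced by a given task, if any."""
        for publication in self._publications.values():
            if publication.task is not None and publication.task.href == task_href:
                return publication
        return None

    def task_for_publication(self, publication_href: str) -> Optional[Task]:
        """Find the task that produced a given publication, if any."""
        publication = self._publications.get(publication_href)
        return publication.task if publication is not None else None

    def filter_by_task_status(self, status: str) -> List[Publication]:
        """Rough equivalent of publications/?task__status=<status>."""
        return [
            p for p in self._publications.values()
            if p.task is not None and p.task.status == status
        ]


def get_status_code(publication: Publication) -> int:
    """HTTP status for a GET on a publication under the 202/200/410 proposal."""
    if publication.task is None or publication.task.status == "successful":
        return 200
    if publication.task.status == "failed":
        return 410  # Gone: the publish task did not complete successfully
    return 202  # Accepted: the publish task has not finished yet
```

With something like this, a client holding only the task href from the 202
response could call publication_for_task() to find its own publication,
instead of sorting by time and hoping no other user published in between.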
David
On Tue, Oct 24, 2017 at 2:11 PM, Brian Bouterse <bbouters at redhat.com> wrote:
> Thanks everyone for all the discussion! I'll try to recap the problem and
> some of the solutions I've heard. I'll also share some of my perspective on
> them too.
>
> What problem are we solving?
> When a user calls "publish" (the action API endpoint) they get a 202 w/ a
> link to the task. That task will produce a publication. How can the user
> find the publication that was produced by the task? How can the user be
> sure the publication is fully complete?
>
>
> What are our options?
> 1) Start linking to created objects from task status. I believe it's been
> clearly stated why we can't do this. If it's not clear, or if there
> are other things we should consider, let's talk about it. Acknowledging or
> establishing agreement on this is crucial because a change like this would
> bring back a lot of the user pain from pulp2. I believe the HAL suggestion
> falls into this area.
>
> 2) Have the user find the publication via query that sorts on time and
> filters only for a specific publisher. This could be fragile because with a
> multi-user system and no hard references between publications and tasks,
> answering the question "which is the publication for me" is hard because
> another user could have submitted a publish too. While not totally perfect,
> this could work.
>
> 3) Have the user create a publication directly like any other REST
> resource, and help the user understand the state of that resource over
> time. I believe the proposal at the start of this thread is recommending
> this solution. I'm also +1 on this solution.
>
>
> As an aside, I don't think considering versioned repos as a possible
> solution is helping us with this problem. The scope of the current problem
> is relatively small and the scope of planning for versioned repos is large.
>
>
> On Tue, Oct 24, 2017 at 9:43 AM, Jeff Ortel <jortel at redhat.com> wrote:
>
>>
>>
>> On 10/23/2017 06:14 PM, Dennis Kliban wrote:
>> > On Mon, Oct 23, 2017 at 3:20 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:
>> >
>> >
>> >
>> > On Mon, Oct 23, 2017 at 12:30 PM, Dennis Kliban <dkliban at redhat.com> wrote:
>> >
>> > On Mon, Oct 23, 2017 at 10:56 AM, Jeff Ortel <jortel at redhat.com> wrote:
>> >
>> > This is interesting.
>> >
>> > Some thoughts:
>> >
>> > If adopted, I propose the publication task create the publication and pass
>> > it to the publisher, which would require a change in the plugin API:
>> > Publisher.publish(publication). If the publisher fails, I think the
>> > publication should be deleted.
>> >
>> >
>> > The ViewSet would create the publication, dispatch a publish task with the
>> > publication id as an argument, update the publication with the task id, and
>> > return a serialized Publication to the API user. The user is responsible for
>> > deleting any publication that is not created successfully.
>> >
>> >
>> > For me, your wording illustrates the problem well. Why should a user have to
>> > delete a resource that was never created?
>> >
>> > This sounds like we'd be introducing a partially-created state for
>> > publications. There would be some kind of placeholder representation that
>> > could be referenced as a location where a real publication *might or might
>> > not* eventually appear. And this representation would live side-by-side in a
>> > "publications/" endpoint with representations of actual publications? How
>> > would a user know which are which? It seems like this just shifts the async
>> > problem onto the publication model.
>> >
>> > I go back to this: When creation of a resource is requested, the response
>> > should either be 201 if the resource was created, or 202 if creation is
>> > deferred. We should not attempt partial creation.
>> >
>> >
>> >
>> > It's easy to lose sight of this, so maybe it's worth also observing that a
>> > resource is not just a DB record or some JSON. The existence of a resource
>> > representation requires that the resource itself exists in every way that is
>> > necessary for it to make sense. We should be careful not to misrepresent the
>> > existence of a publication.
>> >
>> >
>> > The description of issue 3033[0] does not clearly establish what a serialized
>> > version of a Publication looks like. In our current design, I imagine that it
>> > will contain three fields: _href, created, and publisher. @jortel, do you
>> > have the same vision?
>>
>> Yes.
>>
>> >
>> > If we start associating tasks with Publications, then the serialized
>> > publication would have 4 fields: _href, created, publisher, task. The API
>> > would then allow filtering based on the status of the associated task, e.g.
>> > publications/?task__status=successful to retrieve all publications that are
>> > successfully created.
>> >
>> > We could also add validation on the Distribution that will check whether the
>> > publication being associated with the Distribution has a task associated with
>> > it, and if so that it successfully completed.
>>
>> I don't think we should store broken publications in the DB.
>>
>> >
>> > A POST to /publications/ could return a 202 and a serialized version of the
>> > publication. This lets the user know that the task of creating a publication
>> > was accepted. Any GET requests to /publications/<publication_id> would return
>> > 202 until the publication task has completed. Once the publication task is
>> > complete a GET request to /publications/<publication_id> would return 200 if
>> > the task finished successfully or 410 (gone) if it did not complete
>> > successfully.
>>
>> My main objection to storing the task_id on the publication is that
>> task_id is only meaningful to the user for
>> a very short period. Just long enough to make subsequent API calls but
>> nothing further unless the user writes
>> it down with a note giving it meaning. But imagine a user listing
>> publications later, trying to select one to associate with a distribution
>> or to delete. The task ID would be meaningless. The natural key
>> Publication.name was an attempt to give the user something meaningful for
>> all use cases. After further
>> consideration, I'm not convinced that adding "name" is the best solution
>> either.
>>
>> I wonder if versioned repositories aren't the real answer. If the
>> repository was versioned then publications
>> would be naturally versioned as well. The serialized publication could
>> include the repository "version"
>> number. This would be meaningful to the user for all use cases.
>>
>> >
>> >
>> > [0] https://pulp.plan.io/issues/3033
>> >
>> >
>> > --
>> >
>> > Michael Hrivnak
>> >
>> > Principal Software Engineer, RHCE
>> >
>> > Red Hat
>> >
>> >
>>
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>