[Pulp-dev] [pulp 3] proposed change to publishing REST api

Jeff Ortel jortel at redhat.com
Wed Nov 1 14:58:49 UTC 2017



On 11/01/2017 09:16 AM, Brian Bouterse wrote:
> Thanks for the response. Let's not move forward until we have more agreement in this area. I've written some
> responses inline.
> 
> On Wed, Nov 1, 2017 at 9:05 AM, Jeff Ortel <jortel at redhat.com> wrote:
> 
>     I'm not yet convinced about the proposed URL change for publishing.  Can you help me understand why a POST to
>     the publications collection is more appropriate than a POST to a publisher?
> 
> 
> I believe the thinking is: REST suggests that POSTing to a resource is expected to create a new resource of
> that type. So assume a user knows REST and knows they want to get a Publication created in Pulp; they
> know exactly how to do that just by knowing REST. In the case of a POST to a publisher url with a special
> 'publish' keyword on the end of it (the controller endpoint), the only way a user could know to do that is by
> reading our docs. Both approaches would work, but I believe the former is more aligned with REST, which means
> users can do more without having to read Pulp docs.

I believe that if the user is experienced with REST, they will be expecting a 201 and an href to the created
resource to be returned instead of a 202 and a task href.  Further, they will expect the resource to be
created as defined in the POST body.  I think the side-effect of running the publisher to actually create the
resource will be unexpected.  The user would still need to read the docs to understand what's happening.  Not
saying this is altogether wrong, just challenging the assertion that this would be more intuitive to the user.
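
The mismatch can be illustrated with a rough Python sketch of how a generic REST
client might interpret the two kinds of response. This is not Pulp code: the
function name and the hrefs are made up for illustration, and only the meaning of
the 201 and 202 status codes is taken from HTTP itself.

```python
from typing import Optional


def created_href(status: int, headers: dict) -> Optional[str]:
    """Return the href of the newly created resource, or None if none exists yet."""
    if status == 201:
        # Conventional create: the resource exists and Location points at it.
        return headers.get("Location")
    if status == 202:
        # Accepted for asynchronous processing: nothing has been created yet,
        # so the client has to follow the task and look for the result there.
        return None
    raise ValueError(f"unexpected status code: {status}")


# What a REST-savvy client expects after POSTing to a collection
# (hypothetical href, for illustration only).
expected = created_href(201, {"Location": "/api/v3/publications/1/"})
# What the proposed endpoint would answer with: a task, not a publication.
proposed = created_href(202, {})
print(expected, proposed)
```

A client written only from general REST knowledge handles the first case and
has nothing to work with in the second, which is where the docs come back in.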

>  
> 
> 
>     A POST to the publications/ collection means the POST body should define the publication to be created.
>     Right?  What about options that need to be passed to the publisher?
> 
>  
> Yes, if we look at the fields of the publication (link below), there are only two fields: 'created' and
> 'publisher'. Since 'created' is set on the server automatically, the user would specify only the href to the
> publisher in the POST body. For the MVP, we don't accept one-time options, and all other options are
> configured on the publisher which is a different url call from both the publish controller and the publication
> resource. So for the MVP this approach would work well. The future case also is better with this approach (I
> think). When we do introduce one-time options, where will we store them? 

Options are not stored.  That's the point of them being one-time options vs. publisher attributes. My concern
is that making this choice means that we will never support one-time publishing options.  What do others think
of that?
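
The distinction between publisher attributes and one-time options can be sketched
as follows. This is illustrative Python only: the `Publisher` class, its `publish`
method, and the option names are hypothetical and are not the pulpcore API.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    # Attributes are stored on the publisher and apply to every publish.
    attributes: dict = field(default_factory=dict)

    def publish(self, **one_time_options) -> dict:
        # One-time options override the stored attributes for this call only;
        # they are never written back to the publisher.
        effective = {**self.attributes, **one_time_options}
        return effective  # stand-in for actually running the publisher


publisher = Publisher(attributes={"checksum_type": "sha256"})
first = publisher.publish(skip_metadata=True)  # one-time option applies here
second = publisher.publish()                   # and is gone on the next call
```

With a POST to the publications collection, the request body describes the
publication, so there is no obvious place for an argument like `skip_metadata`
unless it is also stored on the publication.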

> We will probably store them on the
> publication too, and that makes sense, because we can't store N, one-time publish options on 1 publisher
> instance, but we can store N, one-time publish options on N publications.
> 
> https://github.com/pulp/pulp/blob/15857fb0831c0998219a32e8d6ba52abdba20888/platform/pulpcore/app/models/publication.py#L6
> 
> 
> 
>     On 10/31/2017 03:13 PM, Brian Bouterse wrote:
>     > @dkliban, I'm +1 on that.
>     >
>     > @all, Please jump in if this is not the best direction for us to go.
>     >
>     > On Tue, Oct 31, 2017 at 3:55 PM, Dennis Kliban <dkliban at redhat.com> wrote:
>     >
>     >     On Tue, Oct 31, 2017 at 3:52 PM, Brian Bouterse <bbouters at redhat.com> wrote:
>     >
>     >         Would that return the 202 w/ a link to the task because the publication hasn't been created yet? Then
>     >         using the created_resources they can see what was created, and in the event of failure the task fails
>     >         and there are no created_resources.
>     >
>     >         @dkliban is ^ the idea?
>     >
>     >
>     >     Yes, the response would be the same as it is for the /publish URL right now. This is just a change in the URL
>     >     that is used to make the request.
>     >
>     >
>     >
>     >         On Tue, Oct 31, 2017 at 3:48 PM, Dennis Kliban <dkliban at redhat.com> wrote:
>     >
>     >
>     >
>     >             On Tue, Oct 31, 2017 at 3:40 PM, Brian Bouterse <bbouters at redhat.com> wrote:
>     >
>     >                 +1 to updating #3033 to have a created_resources attribute which would be a list of
>     >                 GenericForeignKeys. It also needs docs, but I'm not entirely sure where.
>     >
>     >                 If we're going to introduce the above attribute, I think having the controller endpoint as-is
>     >                 would be the most usable. @dkliban do you see value in changing the URL structure if the
>     >                 created_resources attribute is introduced?
>     >
>     >
>     >             This API call creates a publication resource. A POST to publishers/<id>/publications/ seems most
>     >             appropriate for creating new publication resources.
>     >
>     >                 I can help review/groom these if that is helpful.
>     >
>     >                 -Brian
>     >
>     >
>     >                 On Tue, Oct 31, 2017 at 1:39 PM, David Davis <daviddavis at redhat.com> wrote:
>     >
>     >                     Personally I am not opposed to the url endpoint you suggest.
>     >
>     >                     It also seems like there is some consensus around adding a ‘created resources’
>     >                     relationship to Task or at least prototyping that out to see what it would look like.
>     >
>     >                     If no one disagrees, should I update issue #3033 with those two items?
>     >
>     >
>     >                     David
>     >
>     >                     On Wed, Oct 25, 2017 at 1:23 PM, Dennis Kliban <dkliban at redhat.com> wrote:
>     >
>     >                         On Wed, Oct 25, 2017 at 11:24 AM, David Davis <daviddavis at redhat.com> wrote:
>     >
>     >                             I don’t know that the ambiguity around whether a task has a publication or not is
>     >                             a big deal. If I call the publication endpoint, I’d expect a publication task
>     >                             which either has 1 publication or 0 (if the publication failed) attached to it.
>     >
>     >                             In terms of ambiguity, I see a worse problem around adding a task_id field to
>     >                             publications. As a user, I don’t know if a publication failed or not when I get
>     >                             back a publication object. Instead, I have to look up the task to see if it is a
>     >                             real (or successful) publication. Moreover, since we allow users to remove/clean
>     >                             up tasks, that task may not even exist anymore.
>     >
>     >
>     >                         I agree that the ephemeral nature of tasks makes the originally proposed solution
>     >                         non-deterministic. I am open to associating 'resources created' with a task instead.
>     >
>     >                         However, I still think there is value in changing the rest API endpoint for starting a
>     >                         publish task to POST
>     >                         /api/v3/repositories/<repo-id>/publishers/<type>/<name>/publications/. However, I will
>     >                         start a separate thread for that discussion.
>     >
>     >                          - Dennis
>     >
>     >
>     >
>     >                             David
>     >
>     >                             On Wed, Oct 25, 2017 at 11:03 AM, Brian Bouterse <bbouters at redhat.com> wrote:
>     >
>     >
>     >
>     >                                 On Tue, Oct 24, 2017 at 10:00 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:
>     >
>     >
>     >
>     >                                     On Tue, Oct 24, 2017 at 2:11 PM, Brian Bouterse <bbouters at redhat.com> wrote:
>     >
>     >                                         Thanks everyone for all the discussion! I'll try to recap the problem
>     >                                         and some of the solutions I've heard. I'll also share some of my
>     >                                         perspective on them too.
>     >
>     >                                         What problem are we solving?
>     >                                         When a user calls "publish" (the action API endpoint) they get a 202
>     >                                         w/ a link to the task. That task will produce a publication. How can
>     >                                         the user find the publication that was produced by the task? How can
>     >                                         the user be sure the publication is fully complete?
>     >
>     >
>     >                                         What are our options?
>     >                                         1) Start linking to created objects from task status. I believe it's
>     >                                         been clearly stated why we can't do this. If it's not clear, or
>     >                                         if there are other things we should consider, let's talk about it.
>     >                                         Acknowledging or establishing agreement on this is crucial because a
>     >                                         change like this would bring back a lot of the user pain from pulp2. I
>     >                                         believe the HAL suggestion falls into this area.
>     >
>     >
>     >                                     I may have missed something, but I do not think this is clear. I know that
>     >                                     Pulp 2's API included a lot of unstructured data, but that is not at all
>     >                                     what I'm suggesting here.
>     >
>     >                                     It is standard and recommended practice for REST API responses to include
>     >                                     links to resources along with information about what type of resource each
>     >                                     link references. We could include a reference to the created resource and
>     >                                     an identifier for what type of resource it is, and that would be well
>     >                                     within the bounds of good REST API design. HAL is just one of several ways
>     >                                     to accomplish that, and I'm not pitching any particular solution there. In
>     >                                     any case, I'm not sure what the problem would be with this approach.
>     >
>     >
>     >                                 I agree it is a standard practice for a resource to include links to other
>     >                                 resources, but the proposal to include "generic" links is different and
>     >                                 creates a different user experience. I believe referencing the task from the
>     >                                 publication will be easier for users and clients. When a user looks up a
>     >                                 publication, they will always know they'll get between 0 and 1 links to a
>     >                                 task. They can use that to check the state of the publication. If we link to
>     >                                 "generic" resources (like a publication) from a task, then if I ask a user "do
>     >                                 you expect task ede3af3e-d5cf-4e18-8c57-69ac4d4e4de6 to contain a link to a
>     >                                 publication or not?" they can't know until they query it. I think that ambiguity
>     >                                 was a pain point in Pulp2. I don't totally reject this solution, but this is
>     >                                 an undesirable property (I think).
>     >
>     >
>     >
>     >
>     >                                         2) Have the user find the publication via query that sorts on time and
>     >                                         filters only for a specific publisher. This could be fragile because
>     >                                         with a multi-user system and no hard references between publications
>     >                                         and tasks, answering the question "which is the publication for me" is
>     >                                         hard because another user could have submitted a publish too. While
>     >                                         not totally perfect, this could work.
>     >
>     >
>     >                                     In theory if a user queried for a publication from a specific publisher
>     >                                     that was created between the start and end times of the task, that should
>     >                                     unambiguously identify the correct publication. But depending on
>     >                                     timestamps is not a particularly robust nor confidence-inspiring way to
>     >                                     reference a resource.
>     >
>     >                                 Agreed and Agreed
>     >
>     >
>     >
>     >
>     >                                         3) Have the user create a publication directly like any other REST
>     >                                         resource, and help the user understand the state of that resource over
>     >                                         time. I believe the proposal at the start of this thread is
>     >                                         recommending this solution. I'm also +1 on this solution.
>     >
>     >
>     >                                     I think the problem with this is that a user cannot create a publication.
>     >                                     A user can only ask a plugin to create a publication. Until the plugin
>     >                                     creates the publication, there is no publication.
>     >
>     >
>     >                                 Note a publication is an object, but really we mean a publication and its
>     >                                 related PublishedArtifact, PublishedMetadata, etc. objects. It would be
>     >                                 straightforward for a user to create a publication using the viewset and have
>     >                                 the task associated with it call the publisher to build out the associated
>     >                                 PublishedArtifact, PublishedContent, PublishedMetadata, etc. We should explore
>     >                                 if this is good or not, but it is possible.
>     >
>     >                                 As an aside, this is related to a problem everyone should be aware of: the
>     >                                 existence of a publication does not guarantee that publication is finished
>     >                                 publishing. Even with option 1, where the task creates the publication and links
>     >                                 to it in the task status, while the publisher is running it must save the
>     >                                 Publication so that the PublishedArtifact, etc. can link to it. So for any
>     >                                 given publication, in order to know if it's "fully finished and consistent"
>     >                                 you must be able to check the status of the associated task that produced it.
>     >
>     >
>     >
>     >                                         As an aside, I don't think considering versioned repos as a possible
>     >                                         solution is helping us with this problem. The scope of the current
>     >                                         problem is relatively small and the scope of planning for versioned
>     >                                         repos is large.
>     >
>     >
>     >                                     Versioned repos is a potential solution. In that scenario, a user would
>     >                                     request publication of a specific repo version (perhaps defaulting to the
>     >                                     latest), the publication would be linked to that version, and that is an
>     >                                     easy mechanism for the user to find the publication they want. Ultimately
>     >                                     the user is interested in working with a specific content set anyway. They
>     >                                     get a repo to a state where it has the content they want, and then they
>     >                                     publish that content set. No matter what we do with publications, users
>     >                                     will think of them in terms of related content sets. A repo version is
>     >                                     that immutable content set they can work with confidently.
>     >
>     >
>     >                                 It's neat to me that versions are snapshots of content and publications
>     >                                 are snapshots of content. Publications already provide much of the value
>     >                                 proposition of versioned repos. They allow you to work with
>     >                                 specific content sets like you describe. Also they allow for rollback. So that
>     >                                 is all great for our users. For this thread, I want to bring the conversation
>     >                                 back to where it started, solving a small problem about linking two resources
>     >                                 that already exist.
>     >
>     >
>     >                                     It helps the rollback scenario a lot as well. Versioning repos allows a
>     >                                     user to see what the differences are between two content sets, and thus
>     >                                     two different publications, which informs them about when and how far back
>     >                                     they should roll back a distribution.
>     >
>     >
>     >                                     - user discovers a horrible flaw in a piece of content
>     >                                     - user queries for which version of the repo introduced that piece of content
>     >                                     - user updates the distribution to serve the publication that came before
>     >                                       the one which introduced the piece of content, optionally re-publishing
>     >                                       that version in case its publication was deleted or had never been made in
>     >                                       the first place.
>     >
>     >                                     --
>     >
>     >                                     Michael Hrivnak
>     >
>     >                                     Principal Software Engineer, RHCE
>     >
>     >                                     Red Hat
>     >
>     >
>     >
>     >                                 _______________________________________________
>     >                                 Pulp-dev mailing list
>     >                                 Pulp-dev at redhat.com
>     >                                 https://www.redhat.com/mailman/listinfo/pulp-dev
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
> 
> 
> 
> 
