[Pulp-dev] proposed changes to Pulp 3 auto generated docs

Thu Jul 26 16:13:50 UTC 2018

bmbouter, dkliban, daviddavis, jsherril and I just chatted about this in
person. Here are the minutes from that meeting:

We have tools (API, docs, bindings+3rd party applications, CLI) that needs
to refer to references somehow, and we need to figure out whether these
references should be the href or a combination of the href and the id.
Dkliban has opened a PR updating the schemas to only use the hrefs, but
this schema isn't valid. [0]
Bmbouter has proposed some solutions to using both hrefs and ids, and
option 2 is by far the most popular [1]

If we go the hybrid href and id route there are some concerns:
    - How do we serialize createdresource that can be either publications
or repositoryversion?
        - additionalproperties field in openapi should take care of it
    - repository has to be referenced with 2 parameters (repo id and
version number) instead of just an id like all other resources, this isn't
consistent
        - maybe we should unnest repoversion, this also allows us to search
for content across repository versoins
        - probably should leave as is, since it works, and can be described
in openapiv2
    - we need to restructure our API, and this could be time consuming
        - changes needed to hyperlinked related field, created resource
schema, content added api endpoint, need to validate openapi schema is
compliant for all requests and responses

If we merge dkliban's PR, and use only hrefs there's also some concerns:
    - noncompliant schema
        - we aren't in a position to ship and support the tool chain for a
noncompliant schema, unlike google
        - following spec gives some peace of mind for future support

We decided on the following action items:
    - bizhang will take some time to write out a hybrid href and id POC
next week
    - dkliban will reach out to openapi, to see what the status is of the
href work, and maybe get an estimation of when that will be accepted into
the specification

[0] https://github.com/pulp/pulp/pull/3561#issuecomment-407496845
[1] https://github.com/pulp/pulp/pull/3561#issuecomment-407888652

On Wed, Jul 25, 2018 at 4:43 PM, Brian Bouterse <bbouters at redhat.com> wrote:

> I'm exploring the changes required to use IDs and hrefs on the PR here:
> https://github.com/pulp/pulp/pull/3561#issuecomment-407888652
>
> On Wed, Jul 25, 2018 at 4:24 PM, David Davis <daviddavis at redhat.com>
> wrote:
>
>> I know we don’t support things like accepting hrefs as references to
>> resources but if I remember correctly we do return hrefs alongside ids in
>> responses in Pulp 2. Is that not correct?
>>
>> David
>>
>>
>> On Wed, Jul 25, 2018 at 4:17 PM Dennis Kliban <dkliban at redhat.com> wrote:
>>
>>> I don't think we support both hrefs and ids in Pulp 2. The Pulp 2 REST
>>> API does not accept HREFs as references to resources. In Pulp 2's REST API
>>> we do not even have resources that have relationships to other resources.
>>> The relationships between resources are established by nesting them under
>>> one another. e.g.: /pulp/api/v2/repositories/<repo_id>/ and
>>> /pulp/api/v2/repositories/<repo_id>/importers/<importer_id>/. In Pulp
>>> 2, if a user wanted to reference content units in a request, the API
>>> requires writing a filter that uses Mongodb syntax.
>>>
>>> Pulp 3's REST API has a resources called Task that has a
>>> 'created_resource' attribute. This resource is a reference to either a
>>> repository version or a publication at this time. Pulp 3's REST API also
>>> supports users specifying references to content units that should be added
>>> or removed from a repository. These needs do not exist in Pulp 2's REST
>>> API.
>>>
>>>
>>> On Wed, Jul 25, 2018 at 3:55 PM, David Davis <daviddavis at redhat.com>
>>> wrote:
>>>
>>>> Correct me if I’m wrong but Pulp 2 supported @bizhang’s model of
>>>> providing both hrefs and ids. Was that a source of problems or complaints
>>>> by Pulp 2 users?
>>>>
>>>> David
>>>>
>>>>
>>>> On Wed, Jul 25, 2018 at 3:08 PM Dennis Kliban <dkliban at redhat.com>
>>>> wrote:
>>>>
>>>>> For everyone following along, the conversation has moved to Github -
>>>>> on the PR[0] with the proposed changes.
>>>>>
>>>>> [0] https://github.com/pulp/pulp/pull/3561
>>>>>
>>>>> On Tue, Jul 24, 2018 at 11:15 AM, Bihan Zhang <bizhang at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> @dkliban I've tried out your PR and left a question:
>>>>>> https://github.com/pulp/pulp/pull/3561#issuecomment-407425172
>>>>>>
>>>>>> Won't it be problematic with the openapi definitions causing us to
>>>>>>> have two schemas? Accepting the data in two forms is one thing, but using
>>>>>>> openapi to describe it both ways is something I don't understand well. Are
>>>>>>> we going to ship and test two?
>>>>>>>
>>>>>>
>>>>>>  I don't think we'll be defining the data in two different ways in
>>>>>> openapi. We need to pass a {repository identifier} to /sync/, openapi
>>>>>> expects a string, what we do with that string is up to us. (In the
>>>>>> following example the format is "uri" but this isn't actually used for
>>>>>> validation at all, since it's not defined by the swagger specification [0],
>>>>>> we can also clear out the format field, since format is only there to
>>>>>> support documentation needs)
>>>>>>
>>>>>>    - RepositorySyncURL:
>>>>>>    {
>>>>>>       - required:
>>>>>>       [
>>>>>>          - "repository"
>>>>>>          ],
>>>>>>       - type: "object",
>>>>>>       - properties:
>>>>>>       {
>>>>>>          - repository:
>>>>>>          {
>>>>>>             - title: "Repository",
>>>>>>             - description: "A URI of the repository to be
>>>>>>             synchronized.",
>>>>>>             - type: "string",
>>>>>>             - format: "uri"
>>>>>>             }
>>>>>>          }
>>>>>>       },
>>>>>>
>>>>>>
>>>>>> I can see why some users would want to refer to things in the api
>>>>>>> using ID not an href. I think about the case that when calling publish and
>>>>>>> referring to a RepositoryVersion with id=827561, num=3, and for
>>>>>>> repository=1234. With an ID alternately accepted, you could call publish
>>>>>>> and submit repo_version=827561 instead of repo_version='repositories/1234/version/3/'.
>>>>>>> I can see that benefit, but it comes with downsides. Saving/storing a url I
>>>>>>> know feels a little strange, but I do see several upsides...
>>>>>>>
>>>>>>
>>>>>>> Doing it only with hrefs, ensures those benefits (nice recap btw)
>>>>>>> will always be true. Having to submit the references using something like
>>>>>>> 'repositories/1234/version/3/' will cause any client to store them that
>>>>>>> way. I think that's a good thing because someone troubleshooting their
>>>>>>> scripts or in katello's db will instead have 'repositories/1234/version/3/',
>>>>>>> which they can directly use with HTTP. I think this is valuable. Otherwise
>>>>>>> they would have repo version 827561, which now they have to do extra work
>>>>>>> to start interacting with that object via HTTP. Storing urls removes the
>>>>>>> "templating" step from the troubleshooter's responsibilities so we're
>>>>>>> making their job easier. Spacewise, I don't think the clients benefit
>>>>>>> hugely from storing 827561 instead of 'repositories/1234/version/3/',
>>>>>>> but humans do.
>>>>>>>
>>>>>> Why don't we provide the ability to use both href and id as
>>>>>> identifiers, and katello can choose the route that is right for them based
>>>>>> on the points you bring up?
>>>>>>
>>>>>>>
>>>>>>> I don't know much about the CLI, but if we want to enable a specific
>>>>>>> user experience, I think we can find a way to make that work. Overall I
>>>>>>> think users should be able to specify things in the most intuitive way
>>>>>>> possible, and I don't see how API data formats directly influence that. For
>>>>>>> example I think referring to a repository by it's name is the most natural;
>>>>>>> it's more natural than 1234 or repositories/1234.
>>>>>>>
>>>>>> +1 the CLI can resolve name to identifiers (either id or href), so
>>>>>> I'm not too concerned with that.
>>>>>>
>>>>>> [0] https://github.com/OAI/OpenAPI-Specification/blob/master
>>>>>> /versions/2.0.md#data-types
>>>>>>
>>>>>> On Mon, Jul 23, 2018 at 9:51 PM, Dennis Kliban <dkliban at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I've made a work in progress PR[0] that demonstrates the changes I
>>>>>>> was suggesting.
>>>>>>>
>>>>>>> [0] https://github.com/pulp/pulp/pull/3561
>>>>>>>
>>>>>>> On Mon, Jul 23, 2018 at 3:50 PM, Brian Bouterse <bbouters at redhat.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Having two ways to refer to objects in the API makes me nervous. I
>>>>>>>> have some questions/concerns/ideas. I'm also interested to see what
>>>>>>>> dkliban's bindings produce in terms of a resolution of the swagger issues.
>>>>>>>>
>>>>>>>> Won't it be problematic with the openapi definitions causing us to
>>>>>>>> have two schemas? Accepting the data in two forms is one thing, but using
>>>>>>>> openapi to describe it both ways is something I don't understand well. Are
>>>>>>>> we going to ship and test two?
>>>>>>>>
>>>>>>>> I can see why some users would want to refer to things in the api
>>>>>>>> using ID not an href. I think about the case that when calling publish and
>>>>>>>> referring to a RepositoryVersion with id=827561, num=3, and for
>>>>>>>> repository=1234. With an ID alternately accepted, you could call publish
>>>>>>>> and submit repo_version=827561 instead of repo_version='repositories/1234/version/3/'.
>>>>>>>> I can see that benefit, but it comes with downsides. Saving/storing a url I
>>>>>>>> know feels a little strange, but I do see several upsides...
>>>>>>>>
>>>>>>>> Doing it only with hrefs, ensures those benefits (nice recap btw)
>>>>>>>> will always be true. Having to submit the references using something like
>>>>>>>> 'repositories/1234/version/3/' will cause any client to store them that
>>>>>>>> way. I think that's a good thing because someone troubleshooting their
>>>>>>>> scripts or in katello's db will instead have 'repositories/1234/version/3/',
>>>>>>>> which they can directly use with HTTP. I think this is valuable. Otherwise
>>>>>>>> they would have repo version 827561, which now they have to do extra work
>>>>>>>> to start interacting with that object via HTTP. Storing urls removes the
>>>>>>>> "templating" step from the troubleshooter's responsibilities so we're
>>>>>>>> making their job easier. Spacewise, I don't think the clients benefit
>>>>>>>> hugely from storing 827561 instead of 'repositories/1234/version/3/',
>>>>>>>> but humans do.
>>>>>>>>
>>>>>>>> I don't know much about the CLI, but if we want to enable a
>>>>>>>> specific user experience, I think we can find a way to make that work.
>>>>>>>> Overall I think users should be able to specify things in the most
>>>>>>>> intuitive way possible, and I don't see how API data formats directly
>>>>>>>> influence that. For example I think referring to a repository by it's name
>>>>>>>> is the most natural; it's more natural than 1234 or repositories/1234.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jul 19, 2018 at 8:30 AM, Daniel Alley <dalley at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Keep in mind that as of yesterday, unless we revert the change, we
>>>>>>>>> are using Integers IDs instead of UUIDs
>>>>>>>>>
>>>>>>>>> https://github.com/pulp/pulp/pull/3549
>>>>>>>>>
>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 PM, Bihan Zhang <bizhang at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 18, 2018 at 1:05 PM, Dennis Kliban <
>>>>>>>>>> dkliban at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I was asked on IRC to state what problems the proposed changes
>>>>>>>>>>> are trying to address. There are two problems I see with the current
>>>>>>>>>>> OpenAPI 2.0 schema Pulp's REST API provides.
>>>>>>>>>>>
>>>>>>>>>>>  - The path parameters in the schema don't reflect the
>>>>>>>>>>> parameters our users should be using for identifying the resources
>>>>>>>>>>> available via REST API.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm not convinced that we should use hrefs as the sole
>>>>>>>>>> identifiers for the resources.
>>>>>>>>>>
>>>>>>>>>> Here are the reasons I see that we use hrefs as identifiers in a
>>>>>>>>>> REST API context:
>>>>>>>>>>     1. Each href provides full context into the resource it
>>>>>>>>>> identifies. When given a href you would know exactly which resource it is
>>>>>>>>>> referencing and would never run into the issue of: what is this {uuid}
>>>>>>>>>> because you know it is a 'repositories/{uuid}'
>>>>>>>>>>     2. discoverability, you know exactly how to access resources
>>>>>>>>>> from hitting the root url (and in a webui can just click)
>>>>>>>>>>     3. You would not need to construct urls from templates
>>>>>>>>>>
>>>>>>>>>> But things are different if we look at it from a bindings/client
>>>>>>>>>> context. The difference is mainly due to how discoverability is done: in
>>>>>>>>>> the REST API context the user has little prior knowledge to what resources
>>>>>>>>>> are available, and how to access theses resoruces. But the bindings/client
>>>>>>>>>> are generated from the schema, which defines exactly how resources are
>>>>>>>>>> structured, and what the context of each {uuid} is.
>>>>>>>>>>
>>>>>>>>>>     1. Given an {uuid} the client/bindings knows exactly what
>>>>>>>>>> resource this {uuid} refers to.  With hrefs there would be redundant
>>>>>>>>>> information pulp.repositories('repositories/{uuid}') (why do I
>>>>>>>>>> need to specify repositories twice?)
>>>>>>>>>>     2. Discoverability is done with the schema which contains all
>>>>>>>>>> the information about available resources/endpoints
>>>>>>>>>>     3. URL construction is done by the client, so the user would
>>>>>>>>>> also never need to do any url construction themselves (unless we continue
>>>>>>>>>> to force href only identifiers, in which case they might have to do some
>>>>>>>>>> url construction to pass as arguments)
>>>>>>>>>>
>>>>>>>>>> I don't think hrefs and uuid identifiers are mutually exclusive.
>>>>>>>>>> I propose that we extend HyperlinkedRelatedFields to accept either href or
>>>>>>>>>> uuid, and map these HyperlinkedRelatedFields to each other in the schema
>>>>>>>>>> with openapi definition objects [0].
>>>>>>>>>>
>>>>>>>>>> [0] https://github.com/OAI/OpenAPI-Specification/blob/master/
>>>>>>>>>> versions/2.0.md#responses-definitions-object
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  - The path parameters don't have a description in the schema.
>>>>>>>>>>>
>>>>>>>>>>> +1 to updating the schema descriptions for these parameters
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Do others agree with these problem statements?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:31 AM, Dennis Kliban <
>>>>>>>>>>> dkliban at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I am working on improving the OpenAPI 2.0 schema for Pulp 3. I
>>>>>>>>>>>> would like to get some input on the improvements I am proposing. The schema
>>>>>>>>>>>> is used to generate our REST API documentation as well as the bindings with
>>>>>>>>>>>> swagger-codegen.
>>>>>>>>>>>>
>>>>>>>>>>>> The docs generated from our current schema look something like
>>>>>>>>>>>> this:
>>>>>>>>>>>>
>>>>>>>>>>>> GET /repositories/{repository_pk}/versions/{number}/content/
>>>>>>>>>>>> <https://docs.pulpproject.org/en/3.0/nightly/integration-guide/rest-api/index.html#get--repositories-repository_pk-versions-number-content->
>>>>>>>>>>>> Parameters:
>>>>>>>>>>>>
>>>>>>>>>>>>    - *number* (*integer*) –
>>>>>>>>>>>>    - *repository_pk* (*string*) –
>>>>>>>>>>>>
>>>>>>>>>>>> Status Codes:
>>>>>>>>>>>>
>>>>>>>>>>>>    - 200 OK
>>>>>>>>>>>>    <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1>
>>>>>>>>>>>>    –
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Since Pulp identifies resources using their HREFs, I am
>>>>>>>>>>>> proposing that the schema produce documentation that states:
>>>>>>>>>>>>
>>>>>>>>>>>> GET /{repository_version_href}/content/
>>>>>>>>>>>> Parameters:
>>>>>>>>>>>>
>>>>>>>>>>>>    - *repository_version_href* (string) – HREF for the
>>>>>>>>>>>>    repository version
>>>>>>>>>>>>
>>>>>>>>>>>> Status Codes:
>>>>>>>>>>>>
>>>>>>>>>>>>    - 200 OK
>>>>>>>>>>>>    <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1>
>>>>>>>>>>>>    –
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thoughts? Ideas? All feedback is welcome. Thank you!
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Pulp-dev mailing list
>>>>>>>>>>> Pulp-dev at redhat.com
>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Pulp-dev mailing list
>>>>>>>>>> Pulp-dev at redhat.com
>>>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Pulp-dev mailing list
>>>>>>>>> Pulp-dev at redhat.com
>>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pulp-dev mailing list
>>>>>>>> Pulp-dev at redhat.com
>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pulp-dev mailing list
>>>>>>> Pulp-dev at redhat.com
>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Pulp-dev mailing list
>>>>> Pulp-dev at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>
>>>>
>>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20180726/8314682b/attachment.htm>