[Pulp-dev] Webserver owning the entire url namespace?

Dennis Kliban dkliban at redhat.com
Wed Nov 8 16:05:37 UTC 2017


Please see my comments inline.

On Tue, Nov 7, 2017 at 3:28 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:

>
>
> On Mon, Nov 6, 2017 at 9:34 AM, Brian Bouterse <bbouters at redhat.com>
> wrote:
>
>> Yes, the REST API can be scoped to a base path. Pulp can also serve
>> content even if it's scoped to a base path. So Pulp itself will work great
>> even if scoped to a base path.
>>
>> The issue is 100% around the "content serving apps" like Crane, Forge,
>> etc. I call those things "live content APIs". The current plan AIUI is that
>> "live content APIs" will be satisfied using a custom viewset so the plugin
>> developer does not need to package+ship+version+configure a separate app,
>> e.g. crane, forge, etc.
>>
>
> That may work in some cases, but I don't think it's a good fit for cases
> like the docker registry API.
>
> The registry API has enough path complexity that a viewset would not be
> sufficient, so it would need to provide a mix of routers and viewsets. It's
> an entire app worth of routes and views, including its own auth and search.
> DRF is not a great tool for that job, and it's valuable to enable plugin
> writers to use whatever tools/frameworks/languages make sense. For example,
> right now there is an effort underway to replace crane with an app that
> uses the "docker distribution" code to serve the API, but can still read
> crane's data files and serve Pulp publications. That level of flexibility
> is important.
>

I believe you are suggesting that a Pulp backend could be built for a
Docker registry. This backend would know how to consume information about
docker content published by Pulp. This would indeed be a separate
application. However, until such a registry backend exists, it would be
good to allow the Docker plugin authors to provide a docker API as part of
the same application.


> From a deployment perspective, it's been a key use case to deploy crane at
> the perimeter, rsync published image files out to a file or CDN service,
> and run the rest of Pulp on a well-protected internal network.
>

Pulp can also be installed at the perimeter. Core should support a setting
that enables/disables the REST API. Each plugin could support a setting
that enables/disables its content API.
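
As a rough sketch of what such settings could look like (the setting names
and URL prefixes below are hypothetical and chosen only for illustration;
nothing like them exists in Pulp today), a perimeter install would turn the
REST API off and leave only the needed content APIs on:

```python
# Hypothetical setting names; neither core nor any plugin defines these today.
SETTINGS = {
    "REST_API_ENABLED": False,  # perimeter host: no management API
    "CONTENT_APIS_ENABLED": {"docker": True, "puppet": False},
}


def enabled_prefixes(settings):
    """Return the URL prefixes this deployment would serve.

    The prefixes ("/api/", "/<plugin>/") are placeholders, not real routes.
    """
    prefixes = []
    # Assumption: the REST API defaults to enabled when the setting is absent.
    if settings.get("REST_API_ENABLED", True):
        prefixes.append("/api/")
    for plugin, enabled in sorted(settings.get("CONTENT_APIS_ENABLED", {}).items()):
        if enabled:
            prefixes.append("/" + plugin + "/")
    return prefixes


print(enabled_prefixes(SETTINGS))
```

With these example values only the docker content API prefix is served.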


>
>
>>
>> So we want to simplify the common cases and allow for complex cases to
>> still work. To me that is:
>>
>> * allow plugin developers to deliver live content APIs in the form of
>> viewsets. They are free to root them anywhere in the url namespace they
>> want to, because their requirements demand it.
>> * Recommend that Pulp be run not scoped to a base path (simplest). If
>> users follow this recommendation 100% of their live APIs will work.
>>
>> Then for allowing scoping Pulp to a base path:
>>
>> * Pulp can be scoped to a base path and it will work without any extra
>> config. The docs should state this is possible, but that "live APIs" may
>> not work.
>> * Users will need to figure out how to make the live APIs work. That's really
>> between plugin writers and users at that point.
>>
>> Note that currently one WSGI process is serving the REST API, the
>> Content APIs, and the "live content APIs". I don't see a use case to
>> separate them at this point. If there is a belief that we will have
>> more than 1 WSGI process, please share that belief and the reasons for it.
>>
>
> We should definitely keep the REST API separate from content serving, as
> it is in Pulp 2. They are very different services with different goals,
> needs and characteristics. The streamer is a third independent service that
> likely makes sense to keep separate.
>
> The REST API and content apps have different resource needs. Content
> serving can use read-only access to a DB and filesystem, and it does not
> need message broker access. We could probably get away with only giving it
> access to a few tables in the DB. It does not need access to much of the
> config or secrets that the REST API needs. The REST API app probably needs
> a lot more memory and CPU than the content app.
>
> They have different audience/access needs also. A small group of humans
> and/or automation need to infrequently use the REST API to manage what
> content Pulp makes available. A much larger audience of content consumers
> needs to access publications. The two audiences often exist on different
> networks. More downtime can be tolerated from the REST API than the content
> app.
>
> Related to the access differences, the two apps have different scalability
> needs. The amounts of traffic likely to be handled by the REST API and the
> content app are very different. And on the uptime issue, we definitely have
> a use case for continuing to serve publications while Pulp is being
> upgraded or is otherwise down for maintenance.
>
> All of that said, there's no reason why a user couldn't use a web server
> like httpd to run all three WSGI apps in the same process, multiplied
> across its normal pool of processes. We should make the apps available as
> separate WSGI apps, and users can deploy them in whatever combinations meet
> their needs.
>


As mentioned above, Pulp should use configuration settings to disable and
enable the REST API and the individual content APIs. Separate WSGI
applications make the deployment process more complicated.
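
For comparison, here is a minimal sketch of the alternative quoted above,
where the services are separate WSGI callables that a user combines per
deployment. It uses only the standard library. The three apps are trivial
stand-ins rather than Pulp code, and the mount prefixes are assumptions:

```python
from wsgiref.util import setup_testing_defaults


def make_app(name):
    """Build a trivial WSGI app that reports which service handled the request."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [f"{name}:{environ['PATH_INFO']}".encode()]
    return app


# Stand-ins for the three independently deployable services.
rest_api = make_app("rest")
content = make_app("content")
streamer = make_app("streamer")


def combine(mounts):
    """Serve several WSGI apps from one process, selected by path prefix."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in mounts:
            if path == prefix or path.startswith(prefix + "/"):
                # Shift the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    return dispatcher


def call(app, path):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    statuses = []
    body = b"".join(app(environ, lambda status, headers: statuses.append(status)))
    return statuses[0], body.decode()


# One process running everything, and a perimeter host serving content only.
all_in_one = combine([("/api", rest_api), ("/content", content), ("/streamer", streamer)])
perimeter = combine([("/content", content)])

print(call(all_in_one, "/api/repos/"))
print(call(perimeter, "/api/repos/"))
```

The same callables cover both deployments. The cost is that someone has to
write and maintain the mount list for each host, which is the extra
deployment complexity in question.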


>
> For example, Pulp 2 defaults to running the REST API as a separate set of
> daemon processes within httpd (see WSGIDaemonProcess for details) to
> isolate them from the rest of the httpd processes, which serve content (and
> potentially other apps like katello).
>
>
>>
>> In Pulp2 we matched on /api/v2/ and maybe /content/ and just those two
>> urls. This required plugin developers who need live APIs (docker, puppet,
>> etc) to ship a separate application (crane, forge, etc).
>>
>> There is a middle ground where we recommend Pulp run from / but they can
>> bury it deeper in the url structure if they want, but their stuff may not
>> work. Overall though, if we are bundling live APIs as plugin viewsets then I
>> don't see how it will work if we don't recommend owning /.
>>
>
> If we advocate that plugin writers add endpoints somewhere to support
> type-specific content access APIs, that should go in the content-serving
> app. It's important that such APIs only serve content that is part of an
> active publication, which is a role well-matched to the content app. The
> access, scalability and reliability needs are also a match.
>


I don't see that there is a real difference in these needs. Pulp should be
scalable and reliable.


>
> A challenge with that pattern is tracking what path space is claimed by a
> plugin's live API, and making sure other Distributions don't use that path
> space. I'm sure that could be done, but it adds complexity that's worth
> thinking through.
>
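
One way the path-space tracking described above could work is a small
registry that refuses any claim overlapping an earlier one. This is only a
sketch: the class name, method names and owner labels are hypothetical, and
the example paths are illustrative.

```python
def normalize(path):
    """Reduce a URL path to a tuple of its non-empty segments."""
    return tuple(segment for segment in path.split("/") if segment)


def overlaps(a, b):
    """True if one path is a segment-wise prefix of the other."""
    sa, sb = normalize(a), normalize(b)
    shared = min(len(sa), len(sb))
    return sa[:shared] == sb[:shared]


class PathRegistry:
    """Hypothetical record of which URL path space each owner has claimed."""

    def __init__(self):
        self._claims = {}  # claimed path -> owner label

    def claim(self, path, owner):
        """Record a claim, or raise ValueError if it overlaps an existing one."""
        for existing, existing_owner in self._claims.items():
            if overlaps(existing, path):
                raise ValueError(
                    f"{path!r} ({owner}) collides with {existing!r} ({existing_owner})"
                )
        self._claims[path] = owner


registry = PathRegistry()
registry.claim("/v2/", "docker live API")
registry.claim("/pub/rpm/", "distribution")
try:
    registry.claim("/v2/library/", "distribution")
except ValueError as exc:
    print("rejected:", exc)
```

Note that under this rule a claim on / overlaps every other path, so a
plugin rooted at / would block all Distributions.
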
> --
>
> Michael Hrivnak
>
> Principal Software Engineer, RHCE
>
> Red Hat
>