[Pulp-dev] Webserver owning the entire url namespace?

Dennis Kliban dkliban at redhat.com
Wed Nov 8 16:05:37 UTC 2017

Please see my comments inline.

On Tue, Nov 7, 2017 at 3:28 PM, Michael Hrivnak <mhrivnak at redhat.com> wrote:

> On Mon, Nov 6, 2017 at 9:34 AM, Brian Bouterse <bbouters at redhat.com>
> wrote:
>> Yes, the REST API can be scoped to a base path. Pulp can also serve
>> content even if it's scoped to a base path. So Pulp itself will work great
>> even if scoped to a base path.
>> The issue is 100% around the "content serving apps" like Crane, Forge,
>> etc. I call those things "live content APIs". The current plan AIUI is that
>> "live content APIs" will be satisfied using a custom viewset so the plugin
>> developer does not need to package+ship+version+configure a separate app,
>> e.g. crane, forge, etc.
> That may work in some cases, but I don't think it's a good fit for cases
> like the docker registry API.
> The registry API has enough path complexity that a viewset would not be
> sufficient, so it would need to provide a mix of routers and viewsets. It's
> an entire app worth of routes and views, including its own auth and search.
> DRF is not a great tool for that job, and it's valuable to enable plugin
> writers to use whatever tools/frameworks/languages make sense. For example,
> right now there is an effort underway to replace crane with an app that
> uses the "docker distribution" code to serve the API, but can still read
> crane's data files and serve Pulp publications. That level of flexibility
> is important.

I believe you are suggesting that a Pulp backend could be built for a
Docker registry. This backend would know how to consume information about
docker content published by Pulp. This would indeed be a separate
application. However, until such a registry backend exists, it would be
good to allow the Docker plugin authors to provide a docker API as part of
the same application.

> From a deployment perspective, it's been a key use case to deploy crane at
> the perimeter, rsync published image files out to a file or CDN service,
> and run the rest of Pulp on a well-protected internal network.

Pulp can also be installed at the perimeter. Core should support a setting
that enables/disables the REST API. Each plugin could support a setting
that enables/disables its content API.
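
As a rough sketch of the enable/disable idea above (not an existing Pulp
feature), the following shows how one settings mapping could decide which
services a node exposes. The setting names REST_API_ENABLED and
CONTENT_APIS_ENABLED, and the function enabled_services, are hypothetical and
chosen only for illustration.

```python
# Hypothetical setting names: neither Pulp core nor any plugin is known to
# define them. They only illustrate the proposal.
DEFAULT_SETTINGS = {
    "REST_API_ENABLED": True,
    "CONTENT_APIS_ENABLED": {"docker": True, "puppet": True},
}


def enabled_services(settings):
    """Return the names of the services a node should expose."""
    services = []
    if settings.get("REST_API_ENABLED", True):
        services.append("rest_api")
    # Sort plugin names so the result is stable regardless of dict order.
    for plugin, enabled in sorted(settings.get("CONTENT_APIS_ENABLED", {}).items()):
        if enabled:
            services.append(plugin + "_content_api")
    return services


# A perimeter node: REST API off, only the docker content API on.
PERIMETER_SETTINGS = {
    "REST_API_ENABLED": False,
    "CONTENT_APIS_ENABLED": {"docker": True, "puppet": False},
}

if __name__ == "__main__":
    print(enabled_services(DEFAULT_SETTINGS))
    print(enabled_services(PERIMETER_SETTINGS))
```

With something like this, an internal node and a perimeter node would run the
same application and differ only in configuration.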

>> So we want to simplify the common cases and allow for complex cases to
>> still work. To me that is:
>> * allow plugin developers to deliver live content APIs in the form of
>> viewsets. They are free to root them anywhere in the url namespace they
>> want to. Their requirements require that.
>> * Recommend that Pulp be run not scoped to a base path (simplest). If
>> users follow this recommendation 100% of their live APIs will work.
>> Then for allowing scoping Pulp to a base path:
>> * Pulp can be scoped to a base path and it will work without any extra
>> config. The docs should state this is possible, but that "live APIs" may
>> not work.
>> * Users will need to figure out how to make the live APIs work. That's really
>> between plugin writers and users at that point.
>> Note that currently one WSGI process is serving the REST API, the
>> Content APIs, and the "live content APIs". I don't see a use case to
>> separate them at this point. If there is a belief that (a) we will have
>> more than 1 WSGI process and (b) why, please share those thoughts.
> We should definitely keep the REST API separate from content serving, as
> it is in Pulp 2. They are very different services with different goals,
> needs and characteristics. The streamer is a third independent service that
> likely makes sense to keep separate.
> The REST API and content apps have different resource needs. Content
> serving can use read-only access to a DB and filesystem, and it does not
> need message broker access. We could probably get away with only giving it
> access to a few tables in the DB. It does not need access to much of the
> config or secrets that the REST API needs. The REST API app probably needs
> a lot more memory and CPU than the content app.
> They have different audience/access needs also. A small group of humans
> and/or automation need to infrequently use the REST API to manage what
> content Pulp makes available. A much larger audience of content consumers
> needs to access publications. The two audiences often exist on different
> networks. More downtime can be tolerated from the REST API than the content
> app.
> Related to the access differences, the two apps have different scalability
> needs. The amounts of traffic likely to be handled by the REST API and the
> content app are very different. And on the uptime issue, we definitely have
> a use case for continuing to serve publications while Pulp is being
> upgraded or is otherwise down for maintenance.
> All of that said, there's no reason why a user couldn't use a web server
> like httpd to run all three WSGI apps in the same process, multiplied
> across its normal pool of processes. We should make the apps available as
> separate WSGI apps, and users can deploy them in whatever combinations meet
> their needs.

As mentioned above, Pulp should use configuration settings to disable and
enable the REST API and the individual content APIs. Separate WSGI
applications make the deployment process more complicated.
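
For reference, here is a minimal sketch of what "separate WSGI apps deployed
in whatever combination" could look like using only plain WSGI callables. The
helper names (make_app, dispatch_by_prefix, call) are hypothetical, and the
stand-in apps only return their own name. The /api/v2 and /content prefixes
are borrowed from the Pulp 2 layout mentioned in this thread purely as
examples.

```python
def make_app(name):
    """Build a trivial WSGI app that identifies itself (stand-in for a real service)."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [name.encode("utf-8")]
    return app


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def dispatch_by_prefix(mounts, default=not_found):
    """Combine several WSGI apps into one, choosing by URL path prefix.

    Simplification: PATH_INFO is passed through unchanged; a real dispatcher
    would usually also shift the matched prefix into SCRIPT_NAME.
    """
    # Try longer prefixes first so the more specific mount wins.
    ordered = sorted(mounts.items(), key=lambda item: len(item[0]), reverse=True)

    def combined(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return app(environ, start_response)
        return default(environ, start_response)

    return combined


def call(app, path):
    """Invoke a WSGI app for one path and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], body


# One process serving everything, and a perimeter process serving content only.
all_in_one = dispatch_by_prefix({
    "/api/v2": make_app("rest_api"),
    "/content": make_app("content"),
})
perimeter = dispatch_by_prefix({"/content": make_app("content")})

if __name__ == "__main__":
    print(call(all_in_one, "/api/v2/repositories/"))
    print(call(perimeter, "/api/v2/repositories/"))
```

The trade-off under discussion is whether that composition happens in the web
server configuration or inside a single application driven by settings.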

> For example, Pulp 2 defaults to running the REST API as a separate set of
> daemon processes within httpd (see WSGIDaemonProcess for details) to
> isolate them from the rest of the httpd processes, which serve content (and
> potentially other apps like katello).
>> In Pulp2 we matched on /api/v2/ and maybe /content/ and just those two
>> urls. This required plugin developers who need live APIs (docker, puppet,
>> etc) to ship a separate application (crane, forge, etc).
>> There is a middle ground where we recommend Pulp run from / but they can
>> bury it deeper in the url structure if they want, but their stuff may not
>> work. Overall though, if we are bundling live APIs as plugin viewsets then I
>> don't see how it will work if we don't recommend owning /.
> If we advocate that plugin writers add endpoints somewhere to support
> type-specific content access APIs, that should go in the content-serving
> app. It's important that such APIs only serve content that is part of an
> active publication, which is a role well-matched to the content app. The
> access, scalability and reliability needs are also a match.

I don't see that there is a real difference in these needs. Pulp should be
scalable and reliable.

> A challenge with that pattern is tracking what path space is claimed by a
> plugin's live API, and making sure other Distributions don't use that path
> space. I'm sure that could be done, but it adds complexity that's worth
> thinking through.
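
One way to picture the path-space tracking described above is a small
registry that refuses any claim equal to, or nested under, an existing one.
This is only a sketch: the PathRegistry class and paths_conflict function are
hypothetical and not part of Pulp, and the example paths are made up.

```python
def _segments(path):
    """Split a URL path into its non-empty segments."""
    return [part for part in path.split("/") if part]


def paths_conflict(a, b):
    """True if one path is equal to, or nested under, the other."""
    seg_a, seg_b = _segments(a), _segments(b)
    shorter = min(len(seg_a), len(seg_b))
    # Compare whole segments so "/v2/" does not conflict with "/v20/".
    return seg_a[:shorter] == seg_b[:shorter]


class PathRegistry:
    """Tracks which owner has claimed which part of the URL namespace."""

    def __init__(self):
        self._claims = {}  # base path -> owner name

    def claim(self, base_path, owner):
        """Record a claim, or raise ValueError if it overlaps an existing one."""
        for existing, existing_owner in self._claims.items():
            if paths_conflict(existing, base_path):
                raise ValueError(
                    "%r (%s) conflicts with %r (%s)"
                    % (base_path, owner, existing, existing_owner)
                )
        self._claims[base_path] = owner


if __name__ == "__main__":
    registry = PathRegistry()
    registry.claim("/v2/", "docker live API")
    registry.claim("/content/rpm/", "distribution A")
    try:
        registry.claim("/v2/library/", "distribution B")
    except ValueError as error:
        print("rejected:", error)
```

Note that under this rule a claim on / conflicts with every other claim,
which is the "owning the entire url namespace" case in miniature.
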
> --
> Michael Hrivnak
> Principal Software Engineer, RHCE
> Red Hat
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev