[Pulp-dev] Pulp3 Applicability Design thoughts (and Katello)

Fri Jan 17 22:45:06 UTC 2020

tl;dr: at least for the initial Katello and Pulp3 integration, I think
moving applicability to Katello is the best outcome for both communities.

What I'm reading (correct me if not accurate) is that Katello's calls to
Pulp for applicability calculation have to send so much data that having
Pulp provide the applicability "answer" is not worth it. This is a concern
for both performance and integration complexity. If this is
accurate, then I am +1 to have Katello handle the last call along with all
the other data Katello indexes today. This would remove applicability from
Pulp entirely.
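
To make concrete what calculating applicability on the Katello side would
involve, here is a minimal sketch, not Katello's or Pulp's actual code. The
names (Package, applicable_updates) and the tuple-based version field are
hypothetical, and the version comparison is deliberately simplified: real
RPM version ordering, obsoletes, modules and errata are all ignored, so
this will not match `dnf check-update` in general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    # Hypothetical, simplified package record (not a Pulp or Katello model).
    name: str
    arch: str
    # (epoch, version, release) as tuples of ints, e.g. (0, (5, 1), (1,)).
    # Assumption: plain tuple ordering stands in for real RPM comparison.
    evr: tuple


def applicable_updates(installed, available):
    """Return the newest available package for each installed package
    that has a strictly newer candidate with the same name and arch."""
    newest = {}
    for pkg in available:
        key = (pkg.name, pkg.arch)
        if key not in newest or pkg.evr > newest[key].evr:
            newest[key] = pkg

    updates = []
    for pkg in installed:
        candidate = newest.get((pkg.name, pkg.arch))
        if candidate is not None and candidate.evr > pkg.evr:
            updates.append(candidate)
    return updates
```

Both inputs here (the host's installed package profile and the repo
contents) are data Katello would hold in this design, which is why no
per-host data would need to be sent to Pulp.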

In terms of Pulp wanting to offer applicability to its own users, I agree
w/ @mhrivnak that since Pulp3 is not "consumer-aware", that would be a
large change at the very least. Going after a feature set like that would
require significant planning, because Pulp3 would have to become
consumer-aware again.

On Fri, Jan 17, 2020 at 3:22 PM Michael Hrivnak <mhrivnak at redhat.com> wrote:

> A little context from the past may help, though I'm not in-the-loop on
> recent planning (for a broad definition of "recent" ;) ).
>
> Applicability was added to pulp at a time when pulp-agent existed, a
> process that ran on every managed host to both watch what packages were
> installed and carry out install/update/remove actions as directed.
> Pulp-agent was deprecated and removed long ago based on broad agreement
> that there are better tools for managing what's installed on each host, and
> that it made more sense to focus on utilizing those. During the time when
> pulp-agent existed, pulp owned both the repo data and consumer data, so it
> made sense for pulp to do applicability calculations.
>
> As already mentioned, pulp 3 is not "consumer-aware". That was a
> deliberate choice, and applicability was expected back then to be
> implemented as a separate service. There are potentially many additional
> advantages to having it separate, such as scaling independently, deploying
> it only for those who want it, using a technology stack best-suited to the
> job (graph DB? different language? dedicated cache? closely-utilize dnf's
> own code?), etc.
>
> One of the major disadvantages of pulp 2's applicability features is that
> it re-implements the decision-making that takes place on a host as
> calculated by yum or dnf (you basically want the output of `dnf
> check-update`), and it's hard to produce exactly the same results. It could
> be amazing if applicability calculation re-used the same code as dnf, but
> that kind of tight integration is easier done as a separate code base.
>
> Another disadvantage in pulp 2 is that it was hard to know if there was a
> difference between the content in the repo vs. what was currently published
> and visible to the consumers. Pulp 3's repo versions and publications will
> make that a non-issue and open up much easier opportunities for caching
> applicability results.
>
> I'm sure some assumptions and requirements have changed since we
> originally established the fundamentals of pulp 3, but hopefully it's
> useful to have a little encouragement from the past that keeping
> applicability separate was at that point considered both reasonable and
> desirable. In any case, it will be exciting to see what you all come up
> with.
>
> --
>
> Michael Hrivnak
>
> Principal Software Engineer, RHCE
>
> Red Hat