[Pulp-dev] Thoughts needed on Yet Another Advisory-Merge Oddness

Grant Gainey ggainey at redhat.com
Wed Apr 21 12:13:31 UTC 2021

On Wed, Apr 21, 2021 at 8:03 AM Tanya Tereshchenko <ttereshc at redhat.com>

> Hey Grant,

[brilliant investigation happens]

And this is exactly what I needed - thanks Tanya, that clears up All The
Things for me.

I'm going to amend the PR to use Pulp2's naming convention, and add some
commentary in the collection-merge code so the next person won't have to go
through this exercise.

This was a great help, thanks so much!


> Let's start with thoughts without modularity :)
> I think the main reasons for the existing behavior are semantical and also
> aim to preserve collections as they are.
> The situation you describe (when I change only pkglist and nothing else at
> all) usually means one of the following:
>  1. A user wants to combine the same advisory which came from different
> repos, e.g RHEL base and RHEL debuginfo.
>  2. A user tries to fix an advisory in a bad way. This should  never
> rarely happen. If they change advisory, they need to update the `updated`
> timestamp and or version, and no merge will be done.
> For situation #1, if it happens that collection names are the same (which
> is often not the case in the wild), we are not modifying the existing one,
> and you can clearly see that it was combined, it also preserves grouping of
> packages in it which might be relevant, e.g. debuginfo vs others.
> For situation #2, I agree it looks a bit weird but they should not do it
> in the first place.
> Having said that, there is no problem functionality wise, one or two
> collections will work fine if their names are different.
> I personally would leave it as is. It gives me more confidence that we are
> not modifying a collection package list but just adding a new one and
> feeding it to createrepo_c as is. Plus if there is a problem, it's easier
> to see that it was a merge.
> Now the modularity part :)
> This might require some experimenting with dnf or just asking them how the
> situation with multiple collections for the same modules is handled.
> Here is how modular collection looks:
> Collection
>     - name_0
>     - module NSVCA_0
>     - list of packages
> If we look at 2 modular collections with the same name but different
> modules, those can not be merged.
> Current behaviour covers that case. If we choose to put all packages into
> one collection at the merge time, we need to check the modularity aspect of
> the collection.
> If we look at 2 modular collections with the same name and same modules,
> we technically can merge them.
> And here is the unknown part for me, if we leave 2 collections with the
> same modules and make names different, will dnf handle it correctly, or
> does it expect only one collection per module.
> I'd think that dnf doesn't care for module uniqueness if the collection
> names are different but it's worth checking.
> Thanks for sorting it all out,
> Tanya
> P.S. +1 to give a better name to the unnamed collections, fwiw, in pulp 2
> it was something like "collection_0", collection_1", etc.
> On Tue, Apr 20, 2021 at 3:03 PM Grant Gainey <ggainey at redhat.com> wrote:
>> Hey gang,
>> In the process of fixing issue 7282 <https://pulp.plan.io/issues/7282>,
>> I have encountered a behavior from advisory-merge that appears to be the
>> deliberately-desired behavior - and I'm not sure it's really what we want
>> to do? I need some thoughts on this. Figured I'd start in email, because
>> it's advisory-merge, and complicated, and trying to follow a verbal
>> description is confusing. More confusing.
>> The PR with the proposed fix is
>> https://github.com/pulp/pulp_rpm/pull/1973
>> The scenario is as follows:
>> 1) Upload an advisory that has a package-list with one RPM, say
>> 'bear', to a repository.
>> 2) Upload an *identical* advisory, except for having a different package
>> list with one RPM, say 'camel', to the same repository.
>> 3) The resulting advisory-conflict-resolution falls into the "date and
>> version identical, pkg-list disjoint" case, which calls for an advisory
>> merge here:
>> https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/advisory.py#L222
>> 4) advisory_merge() finds two 'collections' - one for each
>> pkglist.packages in each advisory, neither of which is named.
>> 5) advisory_merge() results in an advisory with two collections. One has
>> no name, and is a package-list containing 'bear'. The other is named
>> "None_0"[0], and is a package-list containing "camel".
>> You can see where this decision is made here:
>> https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/advisory.py#L273-L284
>> That result was surprising to me. My assumption going in was, if we were
>> merging two advisories, we would merge their same-named collections into
>> one collection with elements from both. But we're very carefully choosing
>> to not-do that.
>> So, my question here is, can someone explain to me the rationale for the
>> current behavior? I'm sure it has something to do with modularity, because
>> half of the weird things I run into in RPM-land are because of modularity
>> :) But I'd like to understand why we're doing what we're doing, so I can
>> add some commentary in the test or in the merge_advisories() code itself.
>> If this turns out to be undesirable emergent behavior, then I can address
>> it as part of this PR.
>> The test I added for all this fun is here:
>> https://github.com/pulp/pulp_rpm/pull/1973/files#diff-781abba15eb7e61b9ae644ae112bf1292dfd7eb78e7162b74830d4ddb43e1868R274
>> Thoughts very much appreciated!
>> G
>> [0] "None_0" is Rude. Even if this is the right behavior, we prob could
>> stand to notice 'unnamed' collections and do something a little less
>> painful...
>> --
>> Grant Gainey
>> Principal Software Engineer, Red Hat System Management Engineering

Grant Gainey
Principal Software Engineer, Red Hat System Management Engineering
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20210421/9982deb6/attachment.htm>

More information about the Pulp-dev mailing list