[Pulp-list] duplicate rpms

JASON STELZER jasonstelzer at boomi.com
Thu May 14 17:20:36 UTC 2020


Sorry this is a little long, but there's a lot to understand.

I have a few repositories that I'm mirroring. All of them are public.
Mostly this is a shim so we can promote changes from one env to another
over time without getting different packages installed at different times,
which is a clumsy workaround for a different problem. Anyway...

I'd like to expose them as one versioned logical endpoint to the client
machines.

I'm working on an upgrade from an old RC version to the latest version of
Pulp. What I'm observing are failed sync tasks.
For example:
                "description": "Cannot create repository version. Path is duplicated: mysql-utilities-1.3.6-1.el7.noarch.rpm.",

I'm omitting the traceback because ultimately it's not interesting. In this
case the package above exists in more than one upstream.

On the client side, when you yum install something, the client has to decide
which package to use and doesn't (generally) just explode in the face of
ambiguity. And, in the wild west, you can add as many .repo files as you
need.

So I did some digging and some package name set intersections to get a
count of the number of packages that are overlapping. For example:
Conflicts between rhui-REGION-rhel-server-releases vs rhui-REGION-rhel-server-extras are 2
Conflicts between rhui-REGION-rhel-server-releases vs rhui-REGION-rhel-server-optional are 191
Conflicts between rhui-REGION-rhel-server-rhscl vs rhui-REGION-rhel-server-releases are 6
Conflicts between rhui-REGION-rhel-server-rhscl vs rhui-REGION-rhel-server-optional are 8
Conflicts between mysql-tools-community vs mysql-connectors-community are 1
Conflicts between rhui-REGION-rhel-server-rh-common vs epel are 1
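
For reference, a count like the one above could be produced along these
lines. This is only a minimal sketch, not the script actually used: it
assumes each mirrored repo sits in its own directory under a common mirror
root, and it treats two packages as overlapping when their rpm file names
match. The function names rpm_names and count_conflicts are made up for
illustration.

```python
from itertools import combinations
from pathlib import Path


def rpm_names(repo_dir):
    """Return the set of .rpm file names found anywhere under repo_dir."""
    return {p.name for p in Path(repo_dir).rglob("*.rpm")}


def count_conflicts(mirror_root):
    """Map each pair of repo directories to the number of shared rpm names.

    Assumption: every immediate subdirectory of mirror_root is one repo.
    Pairs with no overlap are left out of the result.
    """
    repos = {
        d.name: rpm_names(d)
        for d in sorted(Path(mirror_root).iterdir())
        if d.is_dir()
    }
    conflicts = {}
    for (a, names_a), (b, names_b) in combinations(repos.items(), 2):
        shared = names_a & names_b  # set intersection of file names
        if shared:
            conflicts[(a, b)] = len(shared)
    return conflicts
```

Calling count_conflicts("/tmp/mirror/path") and printing each pair with its
count would give one line per overlapping pair of repos.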


Now, depending on a bunch of things, I could just order the repos by some
sort of consistent precedence and take the whole collection and do
something like:

rsync --include='*.rpm' --exclude='*' /tmp/mirror/path /some/flattened/namespace

And then run createrepo and import the de-duplicated and overwritten files.
My question to you all is: is this expected behavior? Are there better
alternatives?
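
To make the precedence idea concrete, here is a rough sketch of the
flattening step. It is an illustration under assumptions, not a tested
workflow: the repos are passed in precedence order, the first repo that
provides a given file name wins (later duplicates are skipped rather than
overwritten), and running createrepo and importing into Pulp are left as
separate steps. The function name flatten is hypothetical.

```python
import shutil
from pathlib import Path


def flatten(repo_dirs, dest):
    """Copy rpms from repo_dirs into one flat directory.

    repo_dirs is ordered by precedence: the earliest repo that contains
    a given file name wins and later copies are skipped.
    Returns a dict mapping each rpm file name to the repo it came from.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    chosen = {}
    for repo in repo_dirs:
        repo = Path(repo)
        for rpm in sorted(repo.rglob("*.rpm")):
            if rpm.name in chosen:
                continue  # a higher-precedence repo already supplied it
            shutil.copy2(rpm, dest / rpm.name)
            chosen[rpm.name] = repo.name
    return chosen
```

The returned mapping doubles as a record of which upstream each package was
taken from, which helps when a duplicate turns out not to be identical.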

I would really prefer to not juggle 1:1 mappings of public repos to
internal repos because it gets time-consuming and error-prone fast.

And most of these seem to be genuinely the same content, just in different
places:
 find . -name glusterfs-api-3.7.1-16.el7.x86_64.rpm|xargs sha1sum
fa74c6e6350da38304b09dd200fba8bc33c7d4b0  ./rhui-REGION-rhel-server-releases/Packages/g/glusterfs-api-3.7.1-16.el7.x86_64.rpm
fa74c6e6350da38304b09dd200fba8bc33c7d4b0  ./rhui-REGION-rhel-server-rh-common/Packages/g/glusterfs-api-3.7.1-16.el7.x86_64.rpm

I'm in the middle of writing a little linter to make sure the checksums are
indeed the same and that we don't have a package 'foo' with the same
version but different contents for some crazy reason, if only for my own
clarification.
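
A first cut of such a check might look roughly like this. It is a sketch
only: it assumes that the same rpm file name implies the same package and
version, it uses SHA-1 simply to match the sha1sum output above, and the
names sha1_of and find_mismatches are placeholders.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(mirror_root):
    """Find rpm file names that appear with more than one checksum.

    Returns {file_name: {sha1: [paths...]}} containing only the names
    whose copies do not all hash to the same value.
    """
    by_name = defaultdict(lambda: defaultdict(list))
    for rpm in Path(mirror_root).rglob("*.rpm"):
        by_name[rpm.name][sha1_of(rpm)].append(str(rpm))
    return {
        name: dict(digests)
        for name, digests in by_name.items()
        if len(digests) > 1
    }
```

An empty result from find_mismatches would mean every duplicated file name
under the mirror root has identical contents in all of its locations.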

-- 
J.