[Pulp-dev] the "relative path" problem

Daniel Alley dalley at redhat.com
Mon Apr 27 22:08:57 UTC 2020


There is a video call scheduled to discuss this issue tomorrow (Tuesday
April 28th) at 13:30 UTC (please convert to your local time).
https://meet.google.com/scy-csbx-qiu

On Sat, Apr 25, 2020 at 7:02 AM David Davis <daviddavis at redhat.com> wrote:

> I had a chance to think about this some more yesterday and wanted to email
> out my thoughts. I also think that this change sounds scary and will have a
> big impact on plugin writers so I thought of a couple alternatives:
>
> First, we could add a relative_path field to RepositoryContent instead of
> moving it there. This would be an optional field. It would be up to plugins
> to manage this field and they would still need to populate the
> relative_path field on ContentArtifact. But plugins could use this optional
> field to store relative paths per repository and then use this field when
> generating publications.
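
A minimal sketch of this first alternative, in plain Python rather than real pulpcore models. The two dataclasses are simplified stand-ins for ContentArtifact and RepositoryContent, and `published_path` is a hypothetical helper name; the only idea taken from the email is that the per-repository field is optional and the ContentArtifact path stays populated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentArtifact:
    """Stand-in: the plugin still populates relative_path here."""
    content_id: str
    relative_path: str


@dataclass(frozen=True)
class RepositoryContent:
    """Stand-in: links a content unit to a repository."""
    repository: str
    content_id: str
    relative_path: Optional[str] = None  # optional per-repository path


def published_path(repo_content: RepositoryContent, artifact: ContentArtifact) -> str:
    """Prefer the per-repository path; fall back to the artifact's path."""
    if repo_content.content_id != artifact.content_id:
        raise ValueError("artifact does not belong to this content unit")
    if repo_content.relative_path is not None:
        return repo_content.relative_path
    return artifact.relative_path
```

With this shape, a plugin that never sets the optional field keeps its current behavior, while a plugin that needs per-repository paths can set it when building a publication.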
>
> The second alternative is one that is already laid out in the original
> email but to call it out again: it would be to not solve this in pulpcore.
> RPM would create its own object that would map content in a repository to
> relative_paths.
>
> David
>
>
> On Tue, Apr 21, 2020 at 9:22 AM Quirin Pamp <pamp at atix.de> wrote:
>
>> Hi,
>>
>>
>> I am not currently very well versed in the classes involved, but moving
>> relative_path around sounds slightly scary with the potential to break
>> things.
>>
>>
>> As such, I would be interested to be kept in the loop as this moves
>> forward. (Mailing list once there is some movement is entirely sufficient
>> 😉)
>>
>>
>> Thanks,
>>
>> Quirin Pamp
>> ------------------------------
>> *From:* pulp-dev-bounces at redhat.com <pulp-dev-bounces at redhat.com> on
>> behalf of Ina Panova <ipanova at redhat.com>
>> *Sent:* 21 April 2020 14:07:13
>> *To:* Daniel Alley <dalley at redhat.com>
>> *Cc:* Pulp-dev <pulp-dev at redhat.com>
>> *Subject:* Re: [Pulp-dev] the "relative path" problem
>>
>> Daniel,
>>
>> how about setting up a meeting to brainstorm the alternatives, pros/cons
>> there?
>>
>>
>> --------
>> Regards,
>>
>> Ina Panova
>> Senior Software Engineer| Pulp| Red Hat Inc.
>>
>> "Do not go where the path may lead,
>>  go instead where there is no path and leave a trail."
>>
>>
>> On Fri, Apr 17, 2020 at 5:57 PM Daniel Alley <dalley at redhat.com> wrote:
>>
>> Bump, this item needs to move forwards soon.  Does anyone have any
>> thoughts?
>>
>> On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppicka at redhat.com> wrote:
>>
>> Hi,
>> I'd like to add one more question to this topic. Do you think it is a
>> blocker for PRs [0] & [1]? While testing [2] these features, I haven't run
>> into a real-world example where two packages with the very same name appear.
>> I think this is a 'must have' feature, but until we solve/decide it, we can
>> have the two features working, maybe with a warning in the docs for users
>> that this can happen in some 'special' repositories.
>>
>> To address the topic directly: I like the proposed move to
>> 'RepositoryContent' and adding it to its uniqueness constraint (if I
>> understand correctly).
>>
>> [0] https://github.com/pulp/pulp_rpm/pull/1657
>> [1] https://github.com/pulp/pulp_rpm/pull/1642
>> [2] tested with centos 7, 8, opensuse and SLE repositories
>>
>> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dalley at redhat.com> wrote:
>>
>> We'd like to start a discussion on the "relative path problem" identified
>> recently.
>>
>> Problem:
>>
>> Currently, a relative_path is tied to content in Pulp. This means that if
>> a content unit exists in two places within a repository or across
>> repositories, it has to be stored as two separate content units. This
>> creates redundant data and potential confusion for users.
>>
>> As a specific example, we need to support mirroring content in pulp_rpm
>> <https://pulp.plan.io/issues/6353>. Currently, for each location at
>> which a single package is stored, we’ll need to create a content unit. We
>> could end up with several records representing a single package. Users may
>> be confused about why they see multiple records for a package, and they may
>> have trouble, for example, deciding which content unit to copy.
>>
>> Proposed Solution:
>>
>> Move “relative_path” from its current location on ContentArtifact to
>> RepositoryContent. This will require a sizable data migration. In rare
>> cases, repository versions may change slightly due to deduplication.
>>
>> A repository-version-wide uniqueness constraint will be present on
>> “relative_path”, independently of any other repository uniqueness
>> constraints (repo_key_fields) defined by the plugin writer.
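
A rough illustration of what such a constraint would reject, written as a plain-Python check rather than a database constraint. The function name `find_path_conflicts` and the `(content_id, relative_path)` tuple shape are assumptions made for this sketch; it also assumes the constraint means that no two different content units may claim the same path within one repository version.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_path_conflicts(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return each relative_path claimed by more than one content unit.

    `entries` holds (content_id, relative_path) pairs for a single
    repository version.
    """
    by_path = defaultdict(set)
    for content_id, relative_path in entries:
        by_path[relative_path].add(content_id)
    # A path is in conflict only when different content units share it.
    return {path: sorted(ids) for path, ids in by_path.items() if len(ids) > 1}
```

Note that one content unit appearing at two different paths is not a conflict under this reading, which is the mirroring case the proposal wants to allow.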
>>
>> Modify the Stages API so that the relative_path can be processed in the
>> correct location. Instead of “DeclarativeArtifact”, it will likely need to
>> go on “DeclarativeContent”.
>>
>> Remove “location_href” from the RPM Package content model. It was never
>> a true part of the RPM (file) metadata; it is derived from the repository
>> metadata, so storing it as part of the Content unit doesn’t entirely make
>> sense.
>>
>> Alternatives
>>
>> In most cases, a content unit will have a single relative path. Creating
>> a general solution to solve a one-off problem is usually not a good idea.
>> As an alternative, we could look at another
>> solution for mirroring content. One example might be to create a new object
>> (e.g. RpmRepoMirrorContentMapping) that maps content to specific paths
>> within a repo or repo version.
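
A sketch of what such a mapping object could look like, assuming a plugin-level record that ties one content unit to a path within a repository version. `MirrorContentMapping`, its fields, and `paths_for` are hypothetical names for illustration, not an existing pulp_rpm API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MirrorContentMapping:
    """Hypothetical record: one content unit at one path in a repo version."""
    repository_version: str
    content_id: str
    relative_path: str


def paths_for(
    mappings: Iterable[MirrorContentMapping],
    repository_version: str,
    content_id: str,
) -> List[str]:
    """List every path at which a content unit appears in a repo version."""
    return sorted(
        m.relative_path
        for m in mappings
        if m.repository_version == repository_version
        and m.content_id == content_id
    )
```

The point of this design is that a package stored at several locations stays a single content unit; only the mapping rows multiply, and pulpcore itself is left unchanged.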
>>
>> Questions
>>
>>    - How do we handle this in pulp_file? How are content units
>>    identified in pulp_file without relative_path?
>>       - Checksum?
>>       - How was this problem handled in Pulp 2?
>>
>>
>> Please weigh in if you have any input on potential problems with the
>> proposal, potential alternate solutions, or other insights or questions!
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>>
>> --
>> Pavel Picka
>> Red Hat
>>
>