[Pulp-dev] the "relative path" problem

Ina Panova ipanova at redhat.com
Tue Apr 21 12:07:13 UTC 2020


Daniel,

How about setting up a meeting to brainstorm the alternatives and their
pros/cons?


--------
Regards,

Ina Panova
Senior Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."


On Fri, Apr 17, 2020 at 5:57 PM Daniel Alley <dalley at redhat.com> wrote:

> Bump, this item needs to move forwards soon.  Does anyone have any
> thoughts?
>
> On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppicka at redhat.com> wrote:
>
>> Hi,
>> I'd like to add one more question to this topic. Do you think it is a
>> blocker for PRs [0] & [1]? While testing these features [2], I haven't run
>> into a real-world example where two packages with the same name appear.
>> I think this is a 'must have' feature, but until we decide how to solve it
>> we could have the two features working, perhaps with a warning in the docs
>> that this can happen in some 'special' repositories.
>>
>> To follow the topic directly: I like the proposed move to
>> 'RepositoryContent' and adding it to its uniqueness constraint (if I
>> understand correctly).
>>
>> [0] https://github.com/pulp/pulp_rpm/pull/1657
>> [1] https://github.com/pulp/pulp_rpm/pull/1642
>> [2] tested with centos 7, 8, opensuse and SLE repositories
>>
>> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dalley at redhat.com> wrote:
>>
>>> We'd like to start a discussion on the "relative path problem"
>>> identified recently.
>>> Problem:
>>>
>>> Currently, a relative_path is tied to content in Pulp. This means that
>>> if a content unit exists in two places within a repository or across
>>> repositories, it has to be stored as two separate content units. This
>>> creates redundant data and potential confusion for users.
>>>
>>> As a specific example, we need to support mirroring content in pulp_rpm
>>> <https://pulp.plan.io/issues/6353>. Currently, for each location at
>>> which a single package is stored, we’ll need to create a content unit. We
>>> could end up with several records representing a single package. Users may
>>> be confused about why they see multiple records for a package, and they may
>>> have trouble, for example, deciding which content unit to copy.
>>> Proposed Solution:
>>>
>>> Move “relative_path” from its current location on ContentArtifact, to
>>> RepositoryContent. This will require a sizable data migration. In rare
>>> cases, repository versions may change slightly due to deduplication.
>>>
>>> A repository-version-wide uniqueness constraint will be present on
>>> “relative_path”, independently of any other repository uniqueness
>>> constraints (repo_key_fields) defined by the plugin writer.
>>>
>>> Modify the Stages API so that the relative_path can be processed in the
>>> correct location – instead of “DeclarativeArtifact” it will likely need to
>>> go on “DeclarativeContent”.
>>>
>>> Remove “location_href” from the RPM Package content model – it was never
>>> a true part of the RPM (file) metadata, it is derived from the repository
>>> metadata. So storing it as a part of the Content unit doesn’t entirely make
>>> sense.
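
A minimal sketch of the idea above, in plain Python rather than Django models. The class names (Content, RepositoryVersion, DuplicatePathError) and the example paths are hypothetical and are not pulpcore's actual API; the only point is that the path lives on the repository-version membership, so one content unit can appear at several paths while each path stays unique within the version.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Content:
    """Hypothetical content unit identified only by checksum (no path attached)."""
    sha256: str


class DuplicatePathError(ValueError):
    """Raised when a relative_path is already taken by different content."""


@dataclass
class RepositoryVersion:
    """Hypothetical stand-in for the RepositoryContent rows of one version."""
    # Maps relative_path -> Content; the dict key models the proposed
    # repository-version-wide uniqueness constraint on relative_path.
    paths: dict = field(default_factory=dict)

    def add(self, content: Content, relative_path: str) -> None:
        existing = self.paths.get(relative_path)
        if existing is not None and existing != content:
            raise DuplicatePathError(relative_path)
        self.paths[relative_path] = content

    def content_units(self) -> set:
        """Distinct content units, however many paths point at them."""
        return set(self.paths.values())


# One package stored once, exposed at two (illustrative) locations.
pkg = Content(sha256="abc123")
version = RepositoryVersion()
version.add(pkg, "Packages/f/foo-1.0-1.noarch.rpm")
version.add(pkg, "mirror/foo-1.0-1.noarch.rpm")
```

With this shape, mirroring a package at a second location adds a path entry, not a second content record.
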
>>> Alternatives
>>>
>>> In most cases, a content unit will have a single relative path. Creating
>>> a general solution to solve a one-off problem is
>>> usually not a good idea. As an alternative, we could look at another
>>> solution for mirroring content. One example might be to create a new object
>>> (e.g. RpmRepoMirrorContentMapping) that maps content to specific paths
>>> within a repo or repo version.
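
For comparison, a rough sketch of this alternative, again in plain Python. Only the name RpmRepoMirrorContentMapping comes from the text above; its fields, the Package class, and the published_paths helper are assumptions made for illustration. It also assumes the content unit keeps its own relative_path and that the mapping only adds extra locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Content unit that keeps its own relative_path, as it does today."""
    nevra: str
    relative_path: str


@dataclass(frozen=True)
class RpmRepoMirrorContentMapping:
    """Extra path for a package within one repository (hypothetical fields)."""
    repository: str
    package: Package
    mirror_path: str


def published_paths(package, repository, mappings):
    """Return the package's own path plus any mapped paths in this repository."""
    extra = [
        m.mirror_path
        for m in mappings
        if m.repository == repository and m.package == package
    ]
    return [package.relative_path] + extra


foo = Package("foo-1.0-1.noarch", "Packages/foo-1.0-1.noarch.rpm")
mappings = [
    RpmRepoMirrorContentMapping("centos-8", foo, "mirror/f/foo-1.0-1.noarch.rpm"),
]
paths = published_paths(foo, "centos-8", mappings)
```

The trade-off is that the general content model stays untouched, but every consumer that publishes or copies RPM content has to know about the extra mapping table.
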
>>> Questions
>>>
>>>    - How do we handle this in pulp_file? How are content units
>>>    identified in pulp_file without relative_path?
>>>       - Checksum?
>>>       - How was this problem handled in Pulp 2?
>>>
>>>
>>> Please weigh in if you have any input on potential problems with the
>>> proposal, potential alternate solutions, or other insights or questions!
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> Pulp-dev at redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>
>>
>> --
>> Pavel Picka
>> Red Hat
>>