[Pulp-dev] the "relative path" problem

Matthias Dellweg mdellweg at redhat.com
Tue Apr 28 14:29:52 UTC 2020


That is only used for passthrough publication afaik. If you publish each
content unit "by hand", you create a new relative path for each published
artifact. That is why it can be empty and the content can still be
published.
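
A rough sketch of that distinction, in plain Python rather than pulpcore's real models. The classes and the `served_paths` function are simplified stand-ins chosen for illustration, and the lookup logic is an assumption about how the two modes differ, not the actual implementation:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ContentArtifact:
    """Simplified stand-in: an artifact attached to a content unit."""
    artifact_id: str
    relative_path: Optional[str]  # may be empty when publishing by hand


@dataclass(frozen=True)
class PublishedArtifact:
    """Simplified stand-in: an explicit path chosen at publish time."""
    relative_path: str
    content_artifact: ContentArtifact


def served_paths(
    content_artifacts: List[ContentArtifact],
    published_artifacts: List[PublishedArtifact],
    pass_through: bool,
) -> Dict[str, str]:
    """Map each served path to an artifact id."""
    if pass_through:
        # Passthrough: the path stored on the content artifact is used as-is,
        # so artifacts without one cannot be served.
        return {
            ca.relative_path: ca.artifact_id
            for ca in content_artifacts
            if ca.relative_path
        }
    # By hand: every published artifact carries its own path, so the path
    # on the content artifact is never consulted.
    return {
        pa.relative_path: pa.content_artifact.artifact_id
        for pa in published_artifacts
    }


# Hypothetical example data: an artifact with no relative_path of its own.
rpm = ContentArtifact(artifact_id="sha256:abc", relative_path=None)
by_hand = [PublishedArtifact("Packages/f/foo-1.0.rpm", rpm)]
print(served_paths([rpm], by_hand, pass_through=False))
```

With `pass_through=True` the same artifact would not be served at all, which is the only case where the empty field matters.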

On Tue, Apr 28, 2020 at 4:09 PM Daniel Alley <dalley at redhat.com> wrote:

> We realized in our discussion that the original proposal described in my
> email will not work, because "relative_path" ultimately describes the path
> of the published *artifacts* (not content), and for content types with
> multiple artifacts, storing this information in a field on
> RepositoryContent would not be possible.
>
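
To illustrate the multi-artifact limitation described above, here is a minimal sketch in plain Python. The `Content` and `RepositoryContent` classes are simplified stand-ins, not the pulpcore models, and `to_repository_content` and the example data are hypothetical:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Content:
    """Simplified stand-in: a content unit and the path of each of its artifacts."""
    name: str
    artifact_paths: Dict[str, str]  # artifact id -> published relative path


@dataclass
class RepositoryContent:
    """Simplified stand-in carrying the single field from the original proposal."""
    content: Content
    relative_path: str


def to_repository_content(content: Content) -> RepositoryContent:
    """Build the association, failing when one field cannot hold every path."""
    paths = set(content.artifact_paths.values())
    if len(paths) != 1:
        raise ValueError(
            f"{content.name}: {len(paths)} artifact paths do not fit in one field"
        )
    return RepositoryContent(content=content, relative_path=paths.pop())


# A single-artifact unit fits the proposal.
package = Content("foo", {"sha256:aaa": "Packages/f/foo-1.0.rpm"})
print(to_repository_content(package).relative_path)

# A multi-artifact unit (hypothetical example data) does not.
multi = Content(
    "bar", {"sha256:bbb": "bar/metadata.json", "sha256:ccc": "bar/payload.bin"}
)
try:
    to_repository_content(multi)
except ValueError as exc:
    print(exc)
```
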
> On Mon, Apr 27, 2020 at 6:08 PM Daniel Alley <dalley at redhat.com> wrote:
>
>> There is a video call scheduled to discuss this issue tomorrow (Tuesday
>> April 28th) at 13:30 UTC (please convert to your local time).
>> https://meet.google.com/scy-csbx-qiu
>>
>> On Sat, Apr 25, 2020 at 7:02 AM David Davis <daviddavis at redhat.com>
>> wrote:
>>
>>> I had a chance to think about this some more yesterday and wanted to
>>> email out my thoughts. I also think that this change sounds scary and will
>>> have a big impact on plugin writers so I thought of a couple alternatives:
>>>
>>> First, we could add a relative_path field to RepositoryContent instead
>>> of moving it there. This would be an optional field. It would be up to
>>> plugins to manage this field and they would still need to populate the
>>> relative_path field on ContentArtifact. But plugins could use this optional
>>> field to store relative paths per repository and then use this field when
>>> generating publications.
>>>
>>> The second alternative is one that is already laid out in the original
>>> email but to call it out again: it would be to not solve this in pulpcore.
>>> RPM would create its own object that would map content in a repository to
>>> relative_paths.
>>>
>>> David
>>>
>>>
>>> On Tue, Apr 21, 2020 at 9:22 AM Quirin Pamp <pamp at atix.de> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> I am not currently very well versed in the classes involved, but moving
>>>> relative_path around sounds slightly scary with the potential to break
>>>> things.
>>>>
>>>>
>>>> As such, I would be interested to be kept in the loop as this moves
>>>> forward. (Mailing list once there is some movement is entirely sufficient
>>>> 😉)
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Quirin Pamp
>>>> ------------------------------
>>>> *From:* pulp-dev-bounces at redhat.com <pulp-dev-bounces at redhat.com> on
>>>> behalf of Ina Panova <ipanova at redhat.com>
>>>> *Sent:* 21 April 2020 14:07:13
>>>> *To:* Daniel Alley <dalley at redhat.com>
>>>> *Cc:* Pulp-dev <pulp-dev at redhat.com>
>>>> *Subject:* Re: [Pulp-dev] the "relative path" problem
>>>>
>>>> Daniel,
>>>>
>>>> how about setting up a meeting to brainstorm the alternatives and
>>>> their pros and cons there?
>>>>
>>>>
>>>> --------
>>>> Regards,
>>>>
>>>> Ina Panova
>>>> Senior Software Engineer| Pulp| Red Hat Inc.
>>>>
>>>> "Do not go where the path may lead,
>>>>  go instead where there is no path and leave a trail."
>>>>
>>>>
>>>> On Fri, Apr 17, 2020 at 5:57 PM Daniel Alley <dalley at redhat.com> wrote:
>>>>
>>>> Bump, this item needs to move forwards soon.  Does anyone have any
>>>> thoughts?
>>>>
>>>> On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppicka at redhat.com> wrote:
>>>>
>>>> Hi,
>>>> I'd like to add one more question to this topic. Do you think it is a
>>>> blocker for PRs [0] & [1]? While testing [2] these features, I haven't run
>>>> into a real-world example where two packages with exactly the same name
>>>> appear. I think this is a 'must have' feature, but until we decide how to
>>>> solve it, we could ship the two features, perhaps with a warning in the
>>>> docs that this can happen in some 'special' repositories.
>>>>
>>>> To follow the topic directly: I like the proposed move to
>>>> 'RepositoryContent' and adding it to its uniqueness constraint (if I
>>>> understand correctly).
>>>>
>>>> [0] https://github.com/pulp/pulp_rpm/pull/1657
>>>> [1] https://github.com/pulp/pulp_rpm/pull/1642
>>>> [2] tested with centos 7, 8, opensuse and SLE repositories
>>>>
>>>> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dalley at redhat.com> wrote:
>>>>
>>>> We'd like to start a discussion on the "relative path problem"
>>>> identified recently.
>>>>
>>>> Problem:
>>>>
>>>> Currently, a relative_path is tied to content in Pulp. This means that
>>>> if a content unit exists in two places within a repository or across
>>>> repositories, it has to be stored as two separate content units. This
>>>> creates redundant data and potential confusion for users.
>>>>
>>>> As a specific example, we need to support mirroring content in pulp_rpm
>>>> <https://pulp.plan.io/issues/6353>. Currently, for each location at
>>>> which a single package is stored, we’ll need to create a content unit. We
>>>> could end up with several records representing a single package. Users may
>>>> be confused about why they see multiple records for a package, and they may
>>>> have trouble, for example, deciding which content unit to copy.
>>>>
>>>> Proposed Solution:
>>>>
>>>> Move “relative_path” from its current location on ContentArtifact to
>>>> RepositoryContent. This will require a sizable data migration. In rare
>>>> cases, repository versions may change slightly due to deduplication.
>>>>
>>>> A repository-version-wide uniqueness constraint will be present on
>>>> “relative_path”, independently of any other repository uniqueness
>>>> constraints (repo_key_fields) defined by the plugin writer.
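
A minimal sketch of what such a repository-version-wide check could look like, in plain Python. The function name `find_path_conflicts`, the tuple layout, and the example data are assumptions made for illustration; a real implementation would presumably live in the database layer:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_path_conflicts(
    entries: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Return every relative_path claimed by more than one content unit.

    `entries` holds (content_id, relative_path) pairs for one repository version.
    """
    by_path: Dict[str, List[str]] = defaultdict(list)
    for content_id, relative_path in entries:
        by_path[relative_path].append(content_id)
    return {path: ids for path, ids in by_path.items() if len(ids) > 1}


# Hypothetical repository version with one colliding path.
version = [
    ("pkg-a", "Packages/a/a-1.0.rpm"),
    ("pkg-b", "Packages/b/b-1.0.rpm"),
    ("pkg-c", "Packages/a/a-1.0.rpm"),  # collides with pkg-a
]
print(find_path_conflicts(version))
```

Note that the same content unit appearing at two different paths is not a conflict under this check; only two units claiming the same path is.
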
>>>>
>>>> Modify the Stages API so that the relative_path can be processed in the
>>>> correct location – instead of “DeclarativeArtifact”, it will likely need to
>>>> go on “DeclarativeContent”.
>>>>
>>>> Remove “location_href” from the RPM Package content model – it was
>>>> never a true part of the RPM (file) metadata, it is derived from the
>>>> repository metadata. So storing it as a part of the Content unit doesn’t
>>>> entirely make sense.
>>>>
>>>> Alternatives
>>>>
>>>> In most cases, a content unit will have a single relative path.
>>>> Creating a general solution to solve a one-off problem is
>>>> usually not a good idea. As an alternative, we could look at another
>>>> solution for mirroring content. One example might be to create a new object
>>>> (e.g. RpmRepoMirrorContentMapping) that maps content to specific paths
>>>> within a repo or repo version.
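
One possible shape for such a mapping object, sketched in plain Python. Only the name `RpmRepoMirrorContentMapping` comes from the text above; the fields, the `paths_for` helper, and the example data are hypothetical:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RpmRepoMirrorContentMapping:
    """Hypothetical shape: one location of a content unit inside one repository."""
    repository: str
    content_id: str
    relative_path: str


def paths_for(
    mappings: Iterable[RpmRepoMirrorContentMapping],
    repository: str,
    content_id: str,
) -> List[str]:
    """List every path at which the content unit appears in the repository."""
    return sorted(
        m.relative_path
        for m in mappings
        if m.repository == repository and m.content_id == content_id
    )


# Hypothetical example data: one package stored at two locations in one repo.
mappings = [
    RpmRepoMirrorContentMapping("centos-8", "foo-1.0", "BaseOS/Packages/foo-1.0.rpm"),
    RpmRepoMirrorContentMapping("centos-8", "foo-1.0", "AppStream/Packages/foo-1.0.rpm"),
    RpmRepoMirrorContentMapping("opensuse", "foo-1.0", "rpm/foo-1.0.rpm"),
]
print(paths_for(mappings, "centos-8", "foo-1.0"))
```

The content unit itself stays a single record; only the mapping rows multiply.
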
>>>>
>>>> Questions
>>>>
>>>>    - How do we handle this in pulp_file? How are content units
>>>>    identified in pulp_file without relative_path?
>>>>       - Checksum?
>>>>       - How was this problem handled in Pulp 2?
>>>>
>>>>
>>>> Please weigh in if you have any input on potential problems with the
>>>> proposal, potential alternate solutions, or other insights or questions!
>>>> _______________________________________________
>>>> Pulp-dev mailing list
>>>> Pulp-dev at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>
>>>>
>>>>
>>>> --
>>>> Pavel Picka
>>>> Red Hat
>>>>