[Pulp-dev] the "relative path" problem

David Davis daviddavis at redhat.com
Tue Apr 28 15:40:18 UTC 2020


Yes, that's correct. During our meeting we discussed two options. The first
was to extend RepositoryContent to store a relative path per ContentArtifact,
since storing a relative_path per Content won't work for multi-Artifact
Content units.

An alternative that I pitched was to have plugins (or maybe even core
someday) store this information outside RepositoryContent and then use it
during publishing to set relative_path on PublishedArtifacts. We'd have to
modify the content app if we wanted to support passthrough publications,
but I think asking plugins to use published artifacts in this case is
warranted. That said, I don't think anyone else was keen on this idea.
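
As a rough illustration of this second option, here is a minimal sketch in
plain Python. The dataclasses are stand-ins for the pulpcore models named
above, not the real Django models, and `publish` and `path_overrides` are
hypothetical names for wherever a plugin would keep its per-repository paths.

```python
from dataclasses import dataclass


# Plain-Python stand-ins for the pulpcore models named in this thread.
# They are NOT the real Django models; only the fields needed here are shown.
@dataclass(frozen=True)
class ContentArtifact:
    content_id: str
    relative_path: str  # path stored alongside the content itself


@dataclass(frozen=True)
class PublishedArtifact:
    relative_path: str
    content_artifact: ContentArtifact


def publish(content_artifacts, path_overrides):
    """Build PublishedArtifacts, preferring plugin-stored per-repo paths.

    path_overrides is a hypothetical plugin-side mapping of
    ContentArtifact -> relative_path for a single repository.
    """
    return [
        PublishedArtifact(
            relative_path=path_overrides.get(ca, ca.relative_path),
            content_artifact=ca,
        )
        for ca in content_artifacts
    ]


ca = ContentArtifact(content_id="foo-1.0", relative_path="foo-1.0.rpm")
published = publish([ca], {ca: "Packages/f/foo-1.0.rpm"})
print(published[0].relative_path)
```

The point of the sketch is only that the per-repository path is resolved at
publish time, so the stored content record does not have to change.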

David


On Tue, Apr 28, 2020 at 10:30 AM Matthias Dellweg <mdellweg at redhat.com>
wrote:

> That is only used for passthrough publication afaik. If you publish each
> content unit "by hand", you create a new relative path for each published
> artifact. That is why it can be empty and the content can still be
> published.
>
> On Tue, Apr 28, 2020 at 4:09 PM Daniel Alley <dalley at redhat.com> wrote:
>
>> We realized in our discussion that the original proposal described in my
>> email will not work, because "relative_path" ultimately describes the path
>> of the published *artifacts* (not content), and for content types with
>> multiple artifacts, storing this information in a field on
>> RepositoryContent would not be possible.
>>
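
To make the multi-artifact limitation concrete, here is a small sketch using
a plain Python stand-in (not the real pulpcore model). The `artifact_paths`
field and the artifact names are hypothetical; they only show that a content
unit with two artifacts needs two paths, which a single string field cannot
hold.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryContent:
    """Stand-in for the pulpcore model; only illustrative fields are shown."""

    repository: str
    content: str
    # Hypothetical: one path per artifact of the content unit. A single
    # relative_path string here could describe only one of them.
    artifact_paths: dict = field(default_factory=dict)


# A content unit made of two artifacts needs two published paths.
rc = RepositoryContent(repository="repo-a", content="multi-artifact-unit")
rc.artifact_paths["manifest"] = "unit/manifest.json"
rc.artifact_paths["blob"] = "unit/blob.bin"

for name, path in sorted(rc.artifact_paths.items()):
    print(name, "->", path)
```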
>> On Mon, Apr 27, 2020 at 6:08 PM Daniel Alley <dalley at redhat.com> wrote:
>>
>>> There is a video call scheduled to discuss this issue tomorrow (Tuesday
>>> April 28th) at 13:30 UTC (please convert to your local time).
>>> https://meet.google.com/scy-csbx-qiu
>>>
>>> On Sat, Apr 25, 2020 at 7:02 AM David Davis <daviddavis at redhat.com>
>>> wrote:
>>>
>>>> I had a chance to think about this some more yesterday and wanted to
>>>> email out my thoughts. I also think that this change sounds scary and will
>>>> have a big impact on plugin writers so I thought of a couple alternatives:
>>>>
>>>> First, we could add a relative_path field to RepositoryContent instead
>>>> of moving it there. This would be an optional field. It would be up to
>>>> plugins to manage this field and they would still need to populate the
>>>> relative_path field on ContentArtifact. But plugins could use this optional
>>>> field to store relative paths per repository and then use this field when
>>>> generating publications.
>>>>
>>>> The second alternative is one that is already laid out in the original
>>>> email but to call it out again: it would be to not solve this in pulpcore.
>>>> RPM would create its own object that would map content in a repository to
>>>> relative_paths.
>>>>
>>>> David
>>>>
>>>>
>>>> On Tue, Apr 21, 2020 at 9:22 AM Quirin Pamp <pamp at atix.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>> I am not currently very well versed in the classes involved, but
>>>>> moving relative_path around sounds slightly scary with the potential to
>>>>> break things.
>>>>>
>>>>>
>>>>> As such, I would be interested to be kept in the loop as this moves
>>>>> forward. (Mailing list once there is some movement is entirely sufficient
>>>>> 😉)
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Quirin Pamp
>>>>> ------------------------------
>>>>> *From:* pulp-dev-bounces at redhat.com <pulp-dev-bounces at redhat.com> on
>>>>> behalf of Ina Panova <ipanova at redhat.com>
>>>>> *Sent:* 21 April 2020 14:07:13
>>>>> *To:* Daniel Alley <dalley at redhat.com>
>>>>> *Cc:* Pulp-dev <pulp-dev at redhat.com>
>>>>> *Subject:* Re: [Pulp-dev] the "relative path" problem
>>>>>
>>>>> Daniel,
>>>>>
>>>>> how about setting up a meeting and brainstorming the alternatives and
>>>>> their pros/cons there?
>>>>>
>>>>>
>>>>> --------
>>>>> Regards,
>>>>>
>>>>> Ina Panova
>>>>> Senior Software Engineer| Pulp| Red Hat Inc.
>>>>>
>>>>> "Do not go where the path may lead,
>>>>>  go instead where there is no path and leave a trail."
>>>>>
>>>>>
>>>>> On Fri, Apr 17, 2020 at 5:57 PM Daniel Alley <dalley at redhat.com>
>>>>> wrote:
>>>>>
>>>>> Bump, this item needs to move forwards soon.  Does anyone have any
>>>>> thoughts?
>>>>>
>>>>> On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppicka at redhat.com> wrote:
>>>>>
>>>>> Hi,
>>>>> I'd like to add one more question to this topic. Do you think it is a
>>>>> blocker for PRs [0] & [1]? While testing [2] these features, I haven't
>>>>> run into a real-world example where two packages with exactly the same
>>>>> name appear. I think this is a 'must have' feature, but until we
>>>>> solve/decide it we can have the two features working, maybe with a
>>>>> warning in the docs that this can happen in some 'special'
>>>>> repositories.
>>>>>
>>>>> To follow the topic directly: I like the proposed move to
>>>>> 'RepositoryContent' and adding it to its uniqueness constraint (if I
>>>>> understand correctly).
>>>>>
>>>>> [0] https://github.com/pulp/pulp_rpm/pull/1657
>>>>> [1] https://github.com/pulp/pulp_rpm/pull/1642
>>>>> [2] tested with centos 7, 8, opensuse and SLE repositories
>>>>>
>>>>> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dalley at redhat.com> wrote:
>>>>>
>>>>> We'd like to start a discussion on the "relative path problem"
>>>>> identified recently.
>>>>>
>>>>> Problem:
>>>>>
>>>>> Currently, a relative_path is tied to content in Pulp. This means that
>>>>> if a content unit exists in two places within a repository or across
>>>>> repositories, it has to be stored as two separate content units. This
>>>>> creates redundant data and potential confusion for users.
>>>>>
>>>>> As a specific example, we need to support mirroring content in
>>>>> pulp_rpm <https://pulp.plan.io/issues/6353>. Currently, for each
>>>>> location at which a single package is stored, we’ll need to create a
>>>>> content unit. We could end up with several records representing a single
>>>>> package. Users may be confused about why they see multiple records for a
>>>>> package, and they may have trouble, for example, deciding which content
>>>>> unit to copy.
>>>>>
>>>>> Proposed Solution:
>>>>>
>>>>> Move “relative_path” from its current location on ContentArtifact, to
>>>>> RepositoryContent. This will require a sizable data migration. In rare
>>>>> cases, repository versions may change slightly due to deduplication.
>>>>>
>>>>> A repository-version-wide uniqueness constraint will be present on
>>>>> “relative_path”, independently of any other repository uniqueness
>>>>> constraints (repo_key_fields) defined by the plugin writer.
>>>>>
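
One way to picture the repository-version-wide constraint is as a check that
no relative_path occurs twice among the paths of a single repository version.
The sketch below is plain Python rather than a database constraint, and
`conflicting_paths` is a hypothetical helper name, not part of pulpcore.

```python
from collections import Counter


def conflicting_paths(relative_paths):
    """Return the paths that appear more than once in one repository version."""
    counts = Counter(relative_paths)
    return sorted(path for path, n in counts.items() if n > 1)


# Two content units claiming the same path would violate the constraint.
version_paths = ["Packages/foo-1.0.rpm", "Packages/bar-2.0.rpm", "Packages/foo-1.0.rpm"]
print(conflicting_paths(version_paths))
```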
>>>>> Modify the Stages API so that the relative_path can be processed in
>>>>> the correct location – instead of “DeclarativeArtifact” it will likely need
>>>>> to go on “DeclarativeContent”.
>>>>>
>>>>> Remove “location_href” from the RPM Package content model – it was
>>>>> never a true part of the RPM (file) metadata, it is derived from the
>>>>> repository metadata. So storing it as a part of the Content unit doesn’t
>>>>> entirely make sense.
>>>>>
>>>>> Alternatives
>>>>>
>>>>> In most cases, a content unit will have a single relative path.
>>>>> Creating a general solution to solve a one-off problem is
>>>>> usually not a good idea. As an alternative, we could look at another
>>>>> solution for mirroring content. One example might be to create a new object
>>>>> (e.g. RpmRepoMirrorContentMapping) that maps content to specific paths
>>>>> within a repo or repo version.
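
A minimal sketch of what such a mapping object could look like, written as a
plain Python dataclass rather than a Django model. The class name comes from
the example above; the field names and the `paths_for` helper are assumptions
made for illustration and do not reflect anything implemented in pulp_rpm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpmRepoMirrorContentMapping:
    """Illustrative stand-in; field names are assumptions, not pulp_rpm's."""

    repository: str
    content_id: str
    relative_path: str


def paths_for(mappings, repository, content_id):
    """Return every path at which one content unit appears in a repository."""
    return sorted(
        m.relative_path
        for m in mappings
        if m.repository == repository and m.content_id == content_id
    )


# One content unit, several locations, and no duplicated content record.
mappings = [
    RpmRepoMirrorContentMapping("repo-a", "foo-1.0", "Packages/foo-1.0.rpm"),
    RpmRepoMirrorContentMapping("repo-a", "foo-1.0", "updates/foo-1.0.rpm"),
    RpmRepoMirrorContentMapping("repo-b", "foo-1.0", "foo-1.0.rpm"),
]
print(paths_for(mappings, "repo-a", "foo-1.0"))
```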
>>>>>
>>>>> Questions
>>>>>
>>>>>    - How do we handle this in pulp_file? How are content units
>>>>>    identified in pulp_file without relative_path?
>>>>>       - Checksum?
>>>>>       - How was this problem handled in Pulp 2?
>>>>>
>>>>>
>>>>> Please weigh in if you have any input on potential problems with the
>>>>> proposal, potential alternate solutions, or other insights or questions!
>>>>> _______________________________________________
>>>>> Pulp-dev mailing list
>>>>> Pulp-dev at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Pavel Picka
>>>>> Red Hat
>>>>>