[Pulp-dev] Repo version implementation

Brian Bouterse bbouters at redhat.com
Mon Dec 11 21:38:06 UTC 2017


Thanks so much for bringing this up, David. I'm replying right away because
I think this is becoming the most important work on the critical path.

I'm +1 to the first proposal, to use direct association, and for the same
reasons. Modeling it directly will create many more database records, but
they will be very small join records. A direct relationship is a win for
developers and plugin writers because they get to work with the Django ORM
instead of methods we wrote. It's a win for core devs because we don't have
to maintain code that duplicates what the Django ORM already provides. It's
also a win for user filtering, which is simpler over a direct database
relationship.
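
A minimal sketch of what the direct association could look like at the
database level, using Python's built-in sqlite3 module rather than Django so
that it runs on its own. The table names (repo_version, content,
repo_version_content) and the helpers content_of and delete_version are
hypothetical, chosen for illustration; they are not Pulp's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE repo_version (id INTEGER PRIMARY KEY);
CREATE TABLE content (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Hypothetical join table: one small row per (version, content unit) pair.
CREATE TABLE repo_version_content (
    version_id INTEGER NOT NULL REFERENCES repo_version(id) ON DELETE CASCADE,
    content_id INTEGER NOT NULL REFERENCES content(id),
    PRIMARY KEY (version_id, content_id)
);
""")

conn.executemany("INSERT INTO content (id, name) VALUES (?, ?)",
                 [(1, "a.rpm"), (2, "b.rpm"), (3, "c.rpm")])
conn.executemany("INSERT INTO repo_version (id) VALUES (?)", [(1,), (2,)])
# Version 2 repeats the two units of version 1 and adds a third.
conn.executemany("INSERT INTO repo_version_content VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3)])


def content_of(version_id):
    """List the content of a version by filtering on the version id."""
    rows = conn.execute(
        "SELECT c.name FROM content c "
        "JOIN repo_version_content rvc ON rvc.content_id = c.id "
        "WHERE rvc.version_id = ? ORDER BY c.name",
        (version_id,),
    )
    return [name for (name,) in rows]


def delete_version(version_id):
    """Delete a version; its join rows go with it via ON DELETE CASCADE."""
    conn.execute("DELETE FROM repo_version WHERE id = ?", (version_id,))
```

The trade-off is visible in the sample data: version 2 stores three join rows
even though only one unit changed, but listing and deleting a version are
each a single statement.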

On Mon, Dec 11, 2017 at 3:36 PM, David Davis <daviddavis at redhat.com> wrote:

> tl;dr - do we want to store a direct association between a repo version
> and a content unit or just the changes between each version?
>
> There has been discussion about how to implement the association between a
> repo version and a content unit, and two implementations have been
> proposed. The first would store a direct relation between a repo version
> and a content unit. The other proposal is currently captured in
> @mhrivnak’s PR[0] and involves storing changes, that is, the addition or
> removal of a content unit. I see some benefits to both approaches, so let
> me outline them.
>
> I find the first proposal (the direct association) more intuitive and
> simpler. To get a list of content for a repo version, you would simply
> filter by the repo version id. Deleting a version is also trivial: we
> would just need to remove the version from the database. By contrast,
> deleting a repo version when you are storing changes means having to
> squash the changes.
>
> The second proposal of storing the differences in each version (compared
> to the previous version) has the advantage of storing less data in the
> database. For example, if you added a single content unit to a version,
> you’d only need to store one record for the new version no matter how many
> content units a repository has. Filtering is a little harder (you need to
> select the repo contents with version_added lte version and
> version_removed either NULL or gt version) but not overly complicated.
> It’s a somewhat more efficient solution, but perhaps slightly harder to
> grok.
>
> If we want to support the use case of allowing users to create a new
> version from a base version (as opposed to just the latest version) then
> the second proposal is a bit tricky, I think. Either we have to store the
> changes in the context of the base case (which makes the filtering
> algorithm I described harder) or we have to re-compute the changes between
> the new version and the latest version when we store them in the db.
>
> During our meetings, it sounded like most people were in favor of the
> first proposal. I'm interested to get people’s thoughts, especially if
> anyone is -1 on the first proposal.
>
> Thanks.
>
> [0] https://github.com/pulp/pulp/pull/3228
>
> David
>
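
For comparison, the filtering that the second (change-based) proposal needs
can be sketched the same way. The table and column layout here is
hypothetical, and the sketch reads the condition as "added at or before the
requested version, and not yet removed as of that version", which is an
assumption about the intent of the quoted message, not something taken from
the PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per content unit, recording the version that
# added it and, if applicable, the version that removed it.
conn.execute("""
CREATE TABLE repo_content (
    content_name TEXT NOT NULL,
    version_added INTEGER NOT NULL,
    version_removed INTEGER
)""")
conn.executemany("INSERT INTO repo_content VALUES (?, ?, ?)", [
    ("a.rpm", 1, None),  # added in version 1, never removed
    ("b.rpm", 1, 3),     # added in version 1, removed in version 3
    ("c.rpm", 2, None),  # the only new row needed to create version 2
])


def content_of(version):
    """Content visible in `version`: already added and not yet removed."""
    rows = conn.execute(
        "SELECT content_name FROM repo_content "
        "WHERE version_added <= ? "
        "AND (version_removed IS NULL OR version_removed > ?) "
        "ORDER BY content_name",
        (version, version),
    )
    return [name for (name,) in rows]
```

Creating version 2 costs a single new row here, which is the storage
advantage described above, at the price of a range condition on every read.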