[Pulp-dev] versioned repositories

Michael Hrivnak mhrivnak at redhat.com
Wed May 24 17:59:06 UTC 2017

On Wed, May 24, 2017 at 11:26 AM, Dennis Kliban <dkliban at redhat.com> wrote:

> I noticed that the REST API examples don't mention anything about deleting
> a particular version of a repository. This is a use case that we need to
> support.
> -Dennis

Great point. I was hoping we could avoid that need in the short term, but
in IRC yesterday, Justin Sherrill brought up an important, but hopefully
rare, use case. If a user accidentally adds something secret to a repo and
needs to remove it entirely from Pulp, we need to provide a way to
accomplish that. In Pulp 2 that is cumbersome to do, but at least possible.
It requires you to remove it from every repo it's in, re-publish them all,
and do an orphan removal.

Since a version's content is immutable, I think that *any* time you change a
repo, it creates a new version with the next new-and-never-before-used
version number.

Consider a repo that has versions 1-16, and I create 17 while accidentally
adding a secret. The right next step for a user is to make whatever changes
are necessary to the repo to remove their secret. So far, this is the same
as what you would do with Pulp 2. No particular version awareness is necessary
yet. By making that change, which presumably removes your secret content,
Pulp creates version 18.

Version 17 is now thankfully the only version in history that contains your
secret. Removing that version will be like tearing a page out of a history
book. You can understand the history before, and you can understand the
history after. You can even understand what changed about the world since
the page before to the one after. But you also plainly see that there is a
gap in records, so exactly what happened on that page will never again be
known. But that's ok.

Data Details

With this model, removing a version just squashes its history into the next
version. It's quite simple, so let's dig into the associations for a moment.

Any association that was added in 17 and removed in 18 just gets deleted.

Any other association added or removed in version 17 has the number 17
replaced with 18.

This approach can squash arbitrary ranges of versions whether a user wants
to just trim history, or deliberately remove something sensitive.
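
The squash rules above can be sketched in a few lines of Python. This is only
an illustration under assumptions: the `Association` record and its `added` /
`removed` fields are hypothetical stand-ins for however the associations end
up being stored, and `squash` and `content_at` are made-up helper names, not
Pulp APIs. A unit is treated as present in version v when added <= v and it
has not been removed at or before v.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Association:
    """Hypothetical record linking one content unit to a repo."""
    content: str
    added: int                     # version in which the unit was added
    removed: Optional[int] = None  # version in which it was removed, if any


def squash(associations: List[Association], first: int, last: int,
           survivor: int) -> List[Association]:
    """Squash versions first..last (inclusive) into the surviving version.

    `survivor` is assumed to be the next existing version after `last`.
    """
    def remap(version: Optional[int]) -> Optional[int]:
        # Any reference to a squashed version now points at the survivor.
        if version is not None and first <= version <= last:
            return survivor
        return version

    result = []
    for assoc in associations:
        added, removed = remap(assoc.added), remap(assoc.removed)
        if added == removed:
            # Added and removed entirely within the squashed span,
            # e.g. added in 17 and removed in 18: just delete it.
            continue
        result.append(Association(assoc.content, added, removed))
    return result


def content_at(associations: List[Association], version: int) -> Set[str]:
    """Return the content present in a given version (assumed semantics)."""
    return {
        a.content for a in associations
        if a.added <= version and (a.removed is None or a.removed > version)
    }


# The scenario from this thread: the secret is added in 17, removed in 18.
history = [
    Association("a.rpm", added=1),
    Association("old.rpm", added=3, removed=17),
    Association("new.rpm", added=17),
    Association("secret.rpm", added=17, removed=18),
]
squashed = squash(history, first=17, last=17, survivor=18)
```

After the call, no record mentions version 17 or the secret, while
`content_at` gives the same answers for versions 16 and 18 as it did before.
The same function handles a range, for example `squash(history, 5, 9, 10)`.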


Ok, finally to your question! Three options come to mind. There are likely
more, so please speak up if you'd like to add a favorite. I assume that in
any approach we take, the actual version deletion would happen in a task
that locks the repo.

1) Allow a DELETE call on a version. It's simple and intuitive. The only
downside is that you can't specify a range to remove in one operation.
Maybe that doesn't matter too much; it'll be quick. I think this would be a
good starting point.

DELETE /api/v3/repositories/foo/versions/17/

2) Add a squash endpoint for the collection. The endpoint would in some way
take a range of versions to squash. Maybe we use filtering syntax along
these lines:

DELETE /api/v3/repositories/foo/versions/?num__gte=5&num__lte=10

or an action endpoint on the repo

POST /api/v3/repositories/foo/squash_versions/
  {'start': 5, 'end': 10}

3) Add an action endpoint on the surviving version:

POST /api/v3/repositories/foo/versions/10/squash_since/
  {'since': 5}

Of these, I lean toward starting with the first option. It's simple and
intuitive, and it can accomplish all known use cases, even if it may be
inefficient for deleting a range of versions. We could add something
resembling options 2 or 3 in a later 3.y release if necessary.
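
To make the "inefficient for a range" point concrete, here is a rough sketch
of what a client would do with only option 1: one DELETE per version. The URL
layout is the one proposed above and may change; the host name and the helper
`version_delete_requests` are hypothetical. The requests are only built here,
not sent.

```python
from typing import List
from urllib.request import Request


def version_delete_requests(base_url: str, repo: str,
                            first: int, last: int) -> List[Request]:
    """Build one DELETE request per version in first..last (inclusive)."""
    return [
        Request(
            f"{base_url}/api/v3/repositories/{repo}/versions/{number}/",
            method="DELETE",
        )
        for number in range(first, last + 1)
    ]


# Removing versions 5 through 10 takes six separate calls with option 1.
requests_to_send = version_delete_requests(
    "https://pulp.example.com", "foo", 5, 10
)
```

Each of those calls would presumably start its own task that locks the repo,
which is the overhead a range-aware endpoint (options 2 and 3) would avoid.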

But, I could also see an argument for making just one endpoint that
facilitates a range, and it accommodates all use cases efficiently out of
the gate. Then we'd never need option 1. It's the "There should be one--
and preferably only one --obvious way to do it." approach.

Other Considerations

One other factor to consider is publications. In an ideal world, Pulp would
also delete any publications associated with a version that's being
deleted. That's definitely not something Pulp 2 does (you can remove
content without re-publishing, potentially leaving broken links if content
is orphan-removed). If we don't tackle this in 3.0, we'll probably want to
tackle Publications as a first-class thing some time in 3.y. In any case,
keeping this in mind as a possible side-effect of version deletion is

So what do you all think? Preferences among these? Other ideas? More