[Pulp-dev] pulp3: Publishing Proposal

Thu Jun 29 03:37:17 UTC 2017

On Wed, Jun 28, 2017 at 4:52 PM, Jeff Ortel <jortel at redhat.com> wrote:

>
> I considered storing the base path in the Publication. But I don't see how
> the query using the /path/ component of the URL could be indexed if the
> path is split between the Publication and the LinkedArtifact.

Ah yes! When I was hurriedly writing earlier, I knew there was some
algorithmic problem related to paths but couldn't remember what it was.
This is it. It's solvable, but at some point you need all the base paths in
a tree structure that the serving app can use to find the correct match to
a full path, similar to traversing directories in a filesystem. Or you need
some other equivalent algorithm.
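
To make the tree idea concrete, here is a minimal in-memory sketch. The
class name BasePathTrie and its methods are hypothetical, not anything in
Pulp. It relies on the rule that no base path resides within another, so
the first node carrying a publication is the only possible match.

```python
class BasePathTrie:
    """Hypothetical tree of base paths, one node per path segment."""

    def __init__(self):
        self.children = {}
        self.publication = None  # set only on the node that ends a base path

    def insert(self, base_path, publication):
        node = self
        for segment in base_path.strip('/').split('/'):
            node = node.children.setdefault(segment, BasePathTrie())
        node.publication = publication

    def match(self, full_path):
        """Return (publication, relative_path), or None if nothing matches."""
        segments = full_path.strip('/').split('/')
        node = self
        for i, segment in enumerate(segments):
            node = node.children.get(segment)
            if node is None:
                return None
            if node.publication is not None:
                # Everything after the base path is relative to the publication.
                return node.publication, '/'.join(segments[i + 1:])
        return None
```

The serving app would walk the requested path one segment at a time, much
like resolving directories in a filesystem, and stop at the first base path
it reaches.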

Quickly brainstorming on that though...

Assume we have a restriction that for any given base path, it must not
reside within any other base path. We have that restriction today with yum
repos.

Given a path with n directory segments (for example, /a/b/c/foo.rpm has 3),
the serving app could break it up into n candidate base paths:
/a/, /a/b/, /a/b/c/. It can do one database search on publication base
paths for all n candidates, and should get only one or zero results. If
that field is indexed, this could be a quick way to find the right
publication. Something like:

Publication.objects.filter(base_path__in=['/a/', '/a/b/', '/a/b/c/']).first()
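
Building that list of candidates from the request path could look roughly
like the sketch below. The helper name candidate_base_paths is made up for
illustration, and it assumes the last segment of the path is always a file
name rather than a directory.

```python
def candidate_base_paths(url_path):
    """Return every directory prefix of url_path, shortest first."""
    # Drop the final segment, which is assumed to be the file name.
    segments = url_path.strip('/').split('/')[:-1]
    paths = []
    current = '/'
    for segment in segments:
        current += segment + '/'
        paths.append(current)
    return paths


print(candidate_base_paths('/a/b/c/foo.rpm'))
# prints ['/a/', '/a/b/', '/a/b/c/']
```

The returned list is what would be handed to the base_path__in lookup. Note
that a file sitting directly at the root, such as /foo.rpm, yields an empty
list, so the serving app could skip the query entirely in that case.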

Granted, then you must do a second DB query for the published artifact
within that publication. But this should all be a very small portion of the
total time it takes to actually transmit the file itself, and I could
easily believe this is as fast as reading a symlink off of an NFS mount.

-- 

Michael Hrivnak

Principal Software Engineer, RHCE

Red Hat