[Pulp-list] Package path enhancements in pulp
John Morris
john at zultron.com
Thu Mar 29 22:07:30 UTC 2012
Hi Pradeep,
> We recently ran into an issue where in some situations package paths in
> pulp could collide. The relevant bug is here #798656. Due to this we
> decided to change the package path location to include the whole package
> checksum instead of first three characters. Though the change sounds
> simple, the path for migration is involved. The following wiki page
> illustrates the changes in detail
I'm curious: under what situations might that collision occur?
One nice thing about the old directory structure is the
%{name}/%{version}/%{release}/%{arch} pattern matches koji's. I
previously had some thoughts on why the same structure for both would be
beneficial, but now I've forgotten. :P
There's another minor concern about the directory structure in general,
with or without the extra level. The use of symlinks to point from the
repo packages directory into grinder's multi-level structure takes a lot
of disk activity to do any sort of scan that does a stat() on each RPM.
Following each symlink requires traversing 4 or 5 directories that are
unlikely to be in the fs cache. For example, compare times of '/bin/ls'
and '/bin/ls -l' in the repo packages directory.
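A rough way to put numbers on that comparison is sketched below. It times a names-only listing against a pass that calls stat() on every entry, following each symlink to its target. The function name and its argument are hypothetical and not part of pulp or grinder, and this is not an exact replica of what 'ls -l' does internally.

```python
import os
import time


def compare_scan_times(packages_dir):
    """Return (entry_count, list_seconds, stat_seconds) for one directory.

    Hypothetical helper for illustration; not part of pulp or grinder.
    """
    # Pass 1: read the directory entries only, with no per-file lookups.
    start = time.perf_counter()
    names = os.listdir(packages_dir)
    list_seconds = time.perf_counter() - start

    # Pass 2: stat() each entry. os.stat() follows symlinks, so every
    # call has to walk the directories leading to the link's target.
    start = time.perf_counter()
    for name in names:
        try:
            os.stat(os.path.join(packages_dir, name))
        except OSError:
            pass  # dangling symlink or entry removed mid-scan
    stat_seconds = time.perf_counter() - start

    return len(names), list_seconds, stat_seconds
```

The difference should be most visible on a cold cache, since a second run will likely find the target directories already cached.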
Daily grinder syncs of large repos, like Fedora, can take quite a long
time even when there are few changes. I suspect this to be a
contributing factor. Has there been any thought about making this more
efficient, perhaps by creating hard links, or by updating a database
with grinder's sync status?
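To make the hard link idea concrete, here is a minimal sketch that swaps one symlink in a repo packages directory for a hard link to the same file. The function name is hypothetical and this is not how pulp or grinder currently work. It assumes the repo directory and the package store are on the same filesystem, since hard links cannot cross filesystems.

```python
import os


def replace_symlink_with_hardlink(link_path):
    """Replace a symlink with a hard link to its target.

    Hypothetical helper for illustration; not part of pulp or grinder.
    Returns True if the symlink was replaced, False if it was skipped.
    """
    if not os.path.islink(link_path):
        return False  # already a regular file or missing

    target = os.path.realpath(link_path)
    if not os.path.isfile(target):
        return False  # dangling link or non-regular target

    # Create the hard link under a temporary name, then rename it over
    # the symlink so the package name never disappears from the repo.
    tmp_path = link_path + ".hardlink-tmp"
    os.link(target, tmp_path)  # raises OSError across filesystems
    os.replace(tmp_path, link_path)
    return True
```

With hard links, a stat() on the repo entry reads the inode directly instead of resolving a path through several directories, and the de-duplication benefit is kept because both names share one copy of the data.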
The list archives don't have anything on the thinking behind this
structure. File de-duping and bandwidth savings are clear benefits, but
I'd like to hear thoughts on whether others have this same concern, or
more likely whether I'm just not doing something right. ;)
John