fedorahosted git repo too large
Nigel Jones
dev at nigelj.com
Wed Aug 6 10:53:32 UTC 2008
On Wed, 2008-08-06 at 11:23 +0200, Jeroen van Meeuwen wrote:
> Nigel Jones wrote:
> > On Tue, 2008-08-05 at 23:44 -0400, Todd Zullinger wrote:
> >> Yuan Yijun wrote:
> >>> I just tried to download revisor git with this command "git pull
> >>> http://git.fedorahosted.org/git/revisor master". I have to repeat
> >>> 4-5 times since it breaks during downloading. The .git folder is
> >>> about 58MB. After "git gc --aggressive" it becomes only 6MB.
> >>>
> >>> Anyone please run gc on server?
> >> Perhaps better would be repack. There was a recent thread on the git
> >> list and one of the developers pointed out an older mail from Linus
> >> where he described gc --aggressive as "mostly dumb" and recommended
> >> that using something like "repack -a -d -f --depth=250 --window=250"
> >> instead.
> >>
> >> http://article.gmane.org/gmane.comp.gcc.devel/94613
> > That's actually a very useful article and the methods/reasons behind it
> > sound quite sane and it could be a useful approach for us.
> >
> > I'll try this out on one of the smaller repos (a copy of course) and see
> > what happens.
> >
>
> We've ended up doing this live as well and I'm happy with the few stabs
> I took at seeing if everything still works.
>
> Feel free to make this a regular thing on the revisor repo and I'll
> report if anything breaks, so that if it doesn't, this could maybe
> become a regular thing to do on all repos?
Okay, from a server POV the repack shrank the 116MB folder down to just 7MB in
less than two minutes (based on a trial run in my homedir), which is
pretty sweet.

A trial with system-config-firewall.git went from ~20M to ~4M.
I also did a trial run of anaconda.git and anaconda-images.git:

anaconda.git:
  183M (97745 objects) -> 64M (a third of the original size)
  real 26m18.050s
  user 23m9.395s
  sys  0m6.568s

anaconda-images.git:
  54M (1482 objects) -> 41M (didn't expect much here)
  real 1m57.944s
  user 1m43.466s
  sys  0m0.848s
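
For anyone wanting to repeat a trial like the ones above, here is a minimal sketch. It assumes a bare repository and works on a copy, never the live repo. The function name `repack_trial` and the example path are made up for illustration; the repack flags are the ones quoted earlier in the thread.

```shell
#!/bin/sh
# Sketch only: repack a COPY of a bare repository and report its size
# (in KB) before and after. repack_trial is a hypothetical helper name.
repack_trial() {
    src=$1
    work=$(mktemp -d) || return 1

    # Work on a copy so the original repository is never touched.
    cp -a "$src" "$work/trial.git" || return 1
    before=$(du -sk "$work/trial.git" | cut -f1)

    # Flags as quoted earlier in the thread; -q only silences progress output.
    git --git-dir="$work/trial.git" repack -q -a -d -f --depth=250 --window=250 || return 1

    after=$(du -sk "$work/trial.git" | cut -f1)
    echo "before=${before}K after=${after}K"
    rm -rf "$work"
}
```

It could be called as `repack_trial /path/to/some-repo.git` (a placeholder path). Expect a large repo to take a while, as the anaconda.git timings show.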
Maybe we should run git repack on the big repos every two to three
months, and git gc (which is very fast: under a minute on the anaconda
repo, for example) on a monthly basis.
- Nigel
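
The schedule proposed above could be scripted roughly as follows. This is a sketch under assumptions, not what is deployed: the helper name `maintain_repos`, the repository root, and the cron script path are all hypothetical, and it assumes every repository is a bare `*.git` directory under one root.

```shell
#!/bin/sh
# Sketch only: run "gc" or the aggressive "repack" on every bare
# repository under a root directory. maintain_repos is a hypothetical name.
maintain_repos() {
    root=$1
    mode=$2
    status=0

    case $mode in
        gc|repack) ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac

    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue
        if [ "$mode" = gc ]; then
            # Fast housekeeping, suitable for a monthly run.
            git --git-dir="$repo" gc --quiet
        else
            # Slow full repack with the flags quoted earlier in the thread.
            git --git-dir="$repo" repack -q -a -d -f --depth=250 --window=250
        fi || { echo "failed: $repo" >&2; status=1; }
    done
    return $status
}

# Possible cron entries (script path and repository root are placeholders):
#   0 3 1 *   *  /usr/local/bin/git-maintain /path/to/repos gc
#   0 4 1 */3 *  /usr/local/bin/git-maintain /path/to/repos repack
```

Keeping the two modes separate means the slow repack can be limited to the big repos while the cheap gc runs everywhere.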