Very slow directory traversal

Ross Boylan ross at biostat.ucsf.edu
Mon Oct 15 17:41:54 UTC 2007


On Wed, 2007-10-10 at 23:37 -0700, Ross Boylan wrote:
> On Wed, 2007-10-10 at 09:59 -0600, Andreas Dilger wrote:
> > On Oct 06, 2007  00:10 -0700, Ross Boylan wrote:
> > > My last full backup of my Cyrus mail spool had 1,393,569 files and
> > > consumed about 4G after compression. It took over 13 hours.  Some
> > > investigation led to the following test:
> > >  time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/debian/user/
> > 
> > FYI - "tar cf /dev/null" actually skips reading any file data.  The
> > code special cases /dev/null and skips the read entirely.
> > 
> > > That took 15 minutes the first time it ran, and 32 seconds when run
> > > immediately thereafter.  There were 355,746 files. This is typical of
> > > what I've been seeing: initial run is slow; later runs are much faster.
> > 
> > I'd expect this is because on the initial run the on-disk inode ordering 
> > causes a lot of seeks, and later runs come straight from memory.  Probably
> > not a lot you can do directly, but e.g. pre-reading the inode table would
> > be a good start.
> Judging from your comments and the thread you reference below, the
> problem is that the order returned from readdir is not inode order.  But
> if tar, in this special case (/dev/null), doesn't actually read from the
> file, why should it be so slow?  Does it do something (stat?) that makes
> it have to fetch the inode anyway?
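My current guess: even though tar skips reading the file data when the
output is /dev/null, it still has to stat() every entry to fill in the
archive headers, and each stat() pulls the inode off disk.  A rough,
hypothetical sketch of that access pattern (not tar's actual code):

/* rough sketch of the access pattern, not tar's actual code */
#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	const char *dir = argc > 1 ? argv[1] : ".";
	char path[4096];
	struct dirent *de;
	struct stat st;
	DIR *d = opendir(dir);

	if (!d) { perror("opendir"); return 1; }
	while ((de = readdir(d)) != NULL) {
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (stat(path, &st) == 0)	/* the inode is fetched here */
			printf("%10lld  %s\n", (long long)st.st_size, path);
	}
	closedir(d);
	return 0;
}

If readdir() hands the names back in hash order rather than inode order,
those stat()s turn into seeks all over the inode table, which would
explain the 15 minute cold-cache run.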
> > 
> > 
> > > I found some earlier posts on similar issues, although they mostly
> > > concerned apparently empty directories that took a long time.  Theodore
> > > Tso had a comment that seemed to indicate that hashing conflicts with
> > > Unix requirements.  I think the implication was that you could end up
> > > with linearized, or partly linearized searches under some scenarios.
> > > Since this is a mail spool, I think it gets lots of sync()'s.
> > 
> > There was an LD_PRELOAD library that Ted wrote that may also help:
> > http://marc.info/?l=mutt-dev&m=107226330912347&w=2
> > 
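For anyone else following along: as I understand it, the trick in that
preload library is to read each directory in full and hand the entries
back sorted by inode number, so the stat()s that follow walk the inode
table roughly in order instead of jumping around.  A rough sketch of the
idea (not Ted's actual code, and with essentially no error handling):

/* rough sketch of the sort-by-inode idea; not the actual spd_readdir code */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <sys/types.h>

struct ent { ino_t ino; char name[256]; };

static int by_ino(const void *a, const void *b)
{
	const struct ent *x = a, *y = b;
	return (x->ino > y->ino) - (x->ino < y->ino);
}

int main(int argc, char **argv)
{
	DIR *d = opendir(argc > 1 ? argv[1] : ".");
	struct dirent *de;
	struct ent *v = NULL;
	size_t n = 0, cap = 0, i;

	if (!d) { perror("opendir"); return 1; }

	/* slurp the whole directory first ... */
	while ((de = readdir(d)) != NULL) {
		if (n == cap)
			v = realloc(v, (cap = cap ? cap * 2 : 1024) * sizeof(*v));
		v[n].ino = de->d_ino;
		snprintf(v[n].name, sizeof(v[n].name), "%s", de->d_name);
		n++;
	}
	closedir(d);

	/* ... then process the names in inode order, so any stat()s that
	   follow hit the inode table mostly sequentially */
	qsort(v, n, sizeof(*v), by_ino);
	for (i = 0; i < n; i++)
		printf("%lu %s\n", (unsigned long)v[i].ino, v[i].name);
	free(v);
	return 0;
}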
> I got the code, but am not having much luck making it work.  I've tried
> various things.  The most recent is
> cc -shared -fpic -o libsd_readdir.so spd_readdir.c # as me
> # rest as root
> # export LD_LIBRARY_PATH=./
> # export LD_PRELOAD=libsd_readdir.so
> # ldconfig -v -n $(pwd)
> /usr/local/src/kernel/ext3-patch:
> 	libsd_readdir.so -> libsd_readdir.so
> corn:/usr/local/src/kernel/ext3-patch# date; time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/pol/asdnet/
> Wed Oct 10 23:16:44 PDT 2007
> tar: Removing leading `/' from member names
> Segmentation fault
Even stranger, when I try the same thing with a little test program that
calls readdir, it works.
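The test program is nothing fancy, roughly a bare opendir/readdir loop
like this:

/* minimal readdir loop, roughly what I mean by a little test program */
#include <stdio.h>
#include <dirent.h>

int main(int argc, char **argv)
{
	DIR *d = opendir(argc > 1 ? argv[1] : ".");
	struct dirent *de;
	long count = 0;

	if (!d) { perror("opendir"); return 1; }
	while ((de = readdir(d)) != NULL)	/* the call the preload wraps */
		count++;
	closedir(d);
	printf("%ld entries\n", count);
	return 0;
}

So whatever tar is tripping over, readdir() by itself seems fine under
the preload.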

I tried running tar as myself, but got the same segfault (the first test
I reported was run as root).  tar doesn't look as if it's setuid:
# ls -l /bin/tar
-rwxr-xr-x 1 root root 231188 2007-09-05 02:42 /bin/tar

> 
> I don't know how to build a library for LD_PRELOAD; can anyone give any
> hints?
> 
> Should the module I'm attempting to load have any effect on the 15
> minute time noted above for tar to /dev/null, or is it only relevant if
> I am pulling data off the disk files?
> 
> Would there be any value in having some other program traverse the
> directories before I do the backup, or would cache limits likely mean
> the stuff from the start would be gone from the cache by the time I got
> to the end, so that the backup would basically be starting fresh?
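If pre-traversal is worth trying, I imagine even something as dumb as
the untested sketch below would do: walk the tree with nftw(), which
stat()s every entry, and throw the results away, purely to pull the
dentries and inodes into cache before the real backup.  It can only
help if all that metadata actually fits in RAM.

/* untested sketch: stat everything to warm the caches before a backup */
#define _XOPEN_SOURCE 500	/* for nftw() */
#include <stdio.h>
#include <sys/stat.h>
#include <ftw.h>

/* nftw() has already stat()ed the entry for us; discard the result,
   the point is only to warm the dentry and inode caches */
static int visit(const char *path, const struct stat *st,
		 int type, struct FTW *ftwbuf)
{
	return 0;
}

int main(int argc, char **argv)
{
	const char *root = argc > 1 ? argv[1] : "/var/spool/cyrus/mail";

	/* 64 open fds at a time; FTW_PHYS = don't follow symlinks */
	if (nftw(root, visit, 64, FTW_PHYS) == -1) {
		perror("nftw");
		return 1;
	}
	return 0;
}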
> 
> 
> Thanks.
> Ross



