Problems under Redhat EL3 and ext3

Ulf Zimmermann ulf at
Thu Jul 20 00:00:19 UTC 2006

I am running into performance issues with ext3. Historically we had our
image files (pictures of cars, currently 5.3 million) subdivided into a
directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the
first 4 letters/numbers of the file name and use them to place the file
in this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc.
As long as the file names were based on the VINs of vehicles, that
wasn't a problem. But then our developers changed the image file names
to use a vehicle ID from the database. Once we rolled over 1,000,000 in
vehicle IDs, large numbers of files ended up in the same directories,
and the files are no longer well distributed.

So we changed the method to use [0-9a-f]/[0-9a-f]/[0-9a-f] and an md5 of
the file name, using the first 3 characters of the hash to file it away.
In initial testing this worked well: the distribution across the
directories was even, so we could split this onto separate file systems
or disks.
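
For readers who want to see the hashed layout concretely, here is a
minimal sketch in Python. It is not the code we run in production; the
function names, the ".jpg" suffix and the ID range are made up for
illustration, and it assumes the md5 is taken over the bare file name
encoded as UTF-8.

```python
import hashlib
import os
from collections import Counter


def hashed_dir(file_name: str) -> str:
    """Return the relative directory h/h/h for a file name.

    Each h is one of the first three hex characters of md5(file_name).
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return os.path.join(digest[0], digest[1], digest[2])


def hashed_path(root: str, file_name: str) -> str:
    """Return the full path of file_name under root in the hashed layout."""
    return os.path.join(root, hashed_dir(file_name), file_name)


if __name__ == "__main__":
    # Hypothetical file names built from sequential vehicle IDs.
    names = [f"{vehicle_id}.jpg" for vehicle_id in range(1_000_000, 1_100_000)]
    counts = Counter(hashed_dir(name) for name in names)
    # Number of directories used, and the smallest and largest directory.
    print(len(counts), min(counts.values()), max(counts.values()))
```

Three hex characters give at most 16 * 16 * 16 = 4096 leaf directories,
so 5.3 million files come to roughly 1,300 files per directory if the
hash spreads them evenly.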

When we actually rolled this out, a decision was made to use hard links
from the old structure to the new structure for backward compatibility.
And this turned into a disaster. Rsync or find on the new structure
takes dramatically longer: about 5 minutes for a find on the old
structure versus hours on the new structure. Using strace I tracked it
down to lstat64. On the old structure lstat64 takes on average 37
usecs/call, while on the new structure it takes over 2,400 usecs/call.
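
A rough way to reproduce the comparison without strace is to time the
lstat calls from user space. The sketch below is only an approximation:
the function names are made up, the timing includes Python overhead on
top of the system call, and the projection assumes exactly one lstat
per file.

```python
import os
import time


def average_lstat_usecs(paths):
    """Return the average wall-clock microseconds per os.lstat() call."""
    paths = list(paths)
    if not paths:
        raise ValueError("no paths given")
    start = time.perf_counter()
    for path in paths:
        os.lstat(path)
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / len(paths)


def projected_seconds(n_files, usecs_per_call):
    """Project total lstat time in seconds, assuming one call per file."""
    return n_files * usecs_per_call / 1e6


if __name__ == "__main__":
    # Figures from the strace measurements quoted above.
    print(projected_seconds(5_300_000, 37))
    print(projected_seconds(5_300_000, 2400))
```

With 5.3 million files, 37 usecs/call works out to about 196 seconds
and 2,400 usecs/call to about 12,720 seconds (roughly 3.5 hours), which
is in line with the minutes versus hours seen for find.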

EL4 does not seem to have this problem, but unfortunately I can't just
upgrade, for other reasons. Does anyone have ideas why lstat64 would be
so much slower on the new structure? Any help, hints, or suggestions
would be great.

Regards, Ulf.

--------------------------------------------------------------------- Inc, T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025

More information about the Ext3-users mailing list