ext4 and extremely slow filesystem traversal

Vincent Caron vcaron at bearstech.com
Tue Mar 12 23:56:15 UTC 2013


Hello list,

  I have trouble with the daily backup of a modest filesystem, which
tends to take more than 10 hours. I have ext4 all over the place on ~200
servers and never ran into such a problem.

  The filesystem capacity is 300 GB (19.6M inodes) with 196 GB (9.3M
inodes) used. It's mounted 'defaults,noatime'. It sits on a hardware
RAID array through plain LVM slices. The RAID array is a RAID5 running
on 5x SATA 500G disks, with a battery-backed (RAM) cache and write-back
cache policy. To be precise, it's an Areca 1231.

  The hardware RAID array uses 64kB stripes and I've configured the
filesystem with 4kB blocks and stride=16. It also has 0 reserved blocks.
In other words, the fs was created with 'mkfs -t ext4 -E stride=16 -m 0
-L volname /dev/vgX/Y'. I'm attaching the mke2fs.conf for reference too.
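
  The arithmetic behind stride=16 can be sketched as follows. This is
only an illustration: it assumes the "64kB stripes" figure is the
per-disk chunk size, and that a 5-disk RAID5 has 4 data disks per
stripe. The function names are hypothetical. The second value
corresponds to the separate mke2fs 'stripe_width' extended option, which
the command above does not set.

```python
def ext4_stride(chunk_kib: int, block_kib: int) -> int:
    """Filesystem blocks per RAID chunk (the mke2fs 'stride' value)."""
    if chunk_kib % block_kib:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_kib // block_kib


def ext4_stripe_width(stride: int, data_disks: int) -> int:
    """Filesystem blocks per full stripe across all data disks."""
    return stride * data_disks


# Assumption: 64 kB is the per-disk chunk size, blocks are 4 kB.
stride = ext4_stride(chunk_kib=64, block_kib=4)

# Assumption: RAID5 on 5 disks leaves 4 data disks per stripe.
stripe_width = ext4_stripe_width(stride, data_disks=5 - 1)

print(f"stride={stride} stripe_width={stripe_width}")
```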

  Everything is running Debian Squeeze with its 2.6.32 kernel (amd64
flavour), on a server with 4 cores and 4 GB of RAM.

  I ran a tiobench tonight on an idle instance (I have two identical
systems - hw, sw, data - with exactly the same problem). I've attached
the results as plain text to protect them from line wrapping. They look
fine to me.

  When I try to back up the problematic filesystem with tar, rsync or
any other tool that traverses the whole filesystem, things are awful. I
know that this filesystem has *lots* of directories, most with few or no
files in them. Tonight I ran a simple 'find /path/to/vol -type d |pv
-bl' (counts directories as they are found). I stopped it more than 2
hours later: it was not done, and had already counted more than 2M
directories. IO stats showed 1000 read calls/sec with avq=1 and avio=5
ms. CPU is at 2%, so it is totally I/O bound. This looks like the worst
random read case to me.
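
  For a sense of scale, the figures above can be turned into rough
rates. This is a back-of-the-envelope sketch only: both the directory
count and the elapsed time are stated as "more than", so the results are
approximate, and the variable names are just illustrative.

```python
# Figures quoted from the find/pv run and the I/O stats (approximate).
dirs_found = 2_000_000   # "more than 2M directories"
elapsed_s = 2 * 3600     # "more than 2 hours"
reads_per_s = 1000       # read calls per second

# Directories discovered per second of wall-clock time.
dirs_per_s = dirs_found / elapsed_s

# Read calls issued per directory discovered.
reads_per_dir = reads_per_s / dirs_per_s

print(f"{dirs_per_s:.0f} dirs/s, {reads_per_dir:.1f} reads per directory")
```

  Under these assumptions the traversal finds a few hundred directories
per second and needs several read calls for each one.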

  I even tried a hack which sorts directories while traversing the
filesystem, to no avail.
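
  The hack itself is not shown here. One common form of this idea is to
visit subdirectories in inode-number order rather than in the order
readdir returns them. The sketch below is a hypothetical illustration of
that technique, not the code that was actually tried. The names
walk_dirs_by_inode and count_dirs are made up for the example.

```python
import os


def walk_dirs_by_inode(root):
    """Yield root and every directory below it, lowest inode first."""
    stack = [root]
    while stack:
        current = stack.pop()
        yield current
        try:
            with os.scandir(current) as entries:
                # inode() and is_dir() normally come from the readdir
                # data itself, so no extra stat call per entry is needed.
                subdirs = [
                    (entry.inode(), entry.path)
                    for entry in entries
                    if entry.is_dir(follow_symlinks=False)
                ]
        except OSError:
            continue  # unreadable or vanished directory: skip it
        # Push in descending inode order so that pop() returns the
        # lowest inode number first.
        subdirs.sort(reverse=True)
        stack.extend(path for _, path in subdirs)


def count_dirs(root):
    """Count directories under root (root included), like find -type d."""
    return sum(1 for _ in walk_dirs_by_inode(root))
```

  Calling count_dirs('/path/to/vol') would give the same kind of count
as the find command above. Whether inode ordering helps depends on how
the directories are laid out on disk.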

  Right now I don't even know how to analyze my filesystem further.
Sorry for not being able to describe it more accurately. I'm looking
for any advice or direction to improve this situation, while still
using ext4 of course :).

  PS: I did ask the developers not to abuse the filesystem that way,
and told them that in 2013 it's okay to have 10k+ files per
directory... No success, so I guess I'll have to work around it.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: tiobench.txt
URL: <http://listman.redhat.com/archives/ext3-users/attachments/20130313/96ad8c7d/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mke2fs.conf
URL: <http://listman.redhat.com/archives/ext3-users/attachments/20130313/96ad8c7d/attachment.conf>

