Extremely long FSCK. (>24 hours)
jhahn at rbmtechnologies.com
Tue Apr 8 21:15:58 UTC 2008
I recently encountered a problem that I thought I should bring to the
ext3 devs. I've seen some evidence of similar issues in the past, but
it wasn't clear that anyone had experienced it at quite this scale.
The short summary is that I let 'e2fsck -C 0 -y -f' run for more than
24 hours on a 4.25TB filesystem before having to kill it. It had been
stuck at "70.1%" in Pass 2 (checking directory structure) for about 10
hours. e2fsck was using about 4.4GB of RAM and was maxing out one CPU
core (out of 8).
This filesystem is used for disk-to-disk backups with dirvish. The
volume is 4.25TB in size and about 90% full. I was doing an fsck prior
to running resize2fs, as that tool requires. (I ended up switching
to ext2online, which worked fine.)
I suspect the large number of hard links and the large filesystem size
are what did me in. Fortunately, my filesystem is clean for now. What I'm
worried about is the day when it actually needs a proper fsck to
correct problems. I have no idea how long the fsck would have taken
had I not cancelled it. I fear it would have been more than 48 hours.
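One way to put a number on the hard-link suspicion would be to count how
many directory entries point at multiply-linked inodes. The following is
only a rough sketch under my own assumptions, not anything e2fsck itself
runs: the function name count_hard_links and the returned dictionary keys
are made up for illustration, and a full walk of a volume this size will
itself take a long time.

```python
import os
import stat


def count_hard_links(root):
    """Walk `root` and summarise how heavily regular files are hard-linked.

    Returns a dict with:
      entries              - regular-file directory entries seen
      multi_linked_entries - entries whose inode has a link count > 1
      unique_inodes        - distinct regular-file inodes behind those entries
    """
    entries = 0
    multi_linked_entries = 0
    single_linked = 0
    multi_inodes = set()  # (device, inode) pairs with st_nlink > 1

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, sockets, etc.
            entries += 1
            if st.st_nlink > 1:
                multi_linked_entries += 1
                multi_inodes.add((st.st_dev, st.st_ino))
            else:
                single_linked += 1

    return {
        "entries": entries,
        "multi_linked_entries": multi_linked_entries,
        "unique_inodes": single_linked + len(multi_inodes),
    }
```

Calling count_hard_links() on the backup mount point (whatever path that
is on a given system) gives the ratio of directory entries to unique
inodes, which is a crude measure of how much of the tree consists of
hard-linked copies. Only multiply-linked inodes are kept in the set, to
limit memory use on trees with many ordinary files.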
Any suggestions (including undocumented command line options) I can
try to accelerate this in the future would be welcome. As this system
is for backups and is idle for about 12-16 hours a day, I can unmount
the volume and perform some (non-destructive!!) tests if there is
interest. Unfortunately, I cannot provide remote access to the system
for security reasons as this is our backup archive.
I'm using CentOS 4.5 as my distro.
'uname -a' reports:
Linux backups-00.dc-00.rbm.local 2.6.9-55.0.12.ELsmp #1 SMP Fri Nov 2
12:38:56 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
The underlying hardware is a Dell PE 2950 with a PERC 5i RAID
controller, six 1TB SATA drives, and 8GB of RAM. I/O performance has
been fine for my purposes, but I have not benchmarked, tuned, or
tweaked it in any way.
Dirvish is an rsync/hardlink-based set of Perl scripts; see
http://www.dirvish.org/ for more details.