Extremely long FSCK. (>24 hours)

Balu manyam balu.manyam at gmail.com
Sat Apr 12 01:42:08 UTC 2008


Justin, you may wish to refer to the thread with the subject
"forced fsck (again?)" in the archives.

HTH

Manyam



On Wed, Apr 9, 2008 at 2:45 AM, Justin Hahn <jhahn at rbmtechnologies.com>
wrote:

> Hello all,
>
> I recently encountered a problem that I thought I should bring to the ext3
> devs. I've seen some evidence of similar issues in the past, but it wasn't
> clear that anyone had experienced it at quite this scale.
>
> The short summary is that I let 'e2fsck -C 0 -y -f' run for more than 24
> hours on a 4.25TB filesystem before having to kill it. It had been stuck at
> "70.1%" in Pass 2 (checking directory structure) for about 10 hours. e2fsck
> was using about 4.4GB of RAM and was maxing out 1 CPU core (out of 8).
>
> This filesystem is used for disk-to-disk backups with dirvish[1]. The
> volume was 4.25TB in size, and about 90% full. I was doing an fsck prior to
> running resize2fs, as required by said tool. (I ended up switching to
> ext2online, which worked fine.)
>
> I suspect the large number of hard links and the large filesystem size are
> what did me in. Fortunately, my filesystem is clean for now. What I'm
> worried about is the day when it actually needs a proper fsck to correct
> problems. I have no idea how long the fsck would have taken had I not
> cancelled it. I fear it would have been more than 48 hours.
>
> Any suggestions (including undocumented command line options) I can try to
> accelerate this in the future would be welcome. As this system is for
> backups and is idle for about 12-16 hours a day, I can un-mount the volume
> and perform some (non-destructive!!) tests if there is interest.
> Unfortunately, I cannot provide remote access to the system for security
> reasons as this is our backup archive.
>
> I'm using CentOS 4.5 as my distro.
>
> 'uname -a' reports:
> Linux backups-00.dc-00.rbm.local 2.6.9-55.0.12.ELsmp #1 SMP Fri Nov 2
> 12:38:56 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> The underlying hardware is a Dell PE 2950 with a PERC 5i RAID controller,
> six 1TB SATA drives, and 8GB of RAM. I/O performance has been fine for my
> purposes, but I have not benchmarked, tuned or tweaked it in any way.
>
> Thanks!
>
> --jeh
>
> [1] Dirvish is an rsync/hardlink based set of perl scripts -- see
> http://www.dirvish.org/ for more details.
>
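
For anyone wanting to quantify the hard-link load Justin suspects, here is
a minimal sketch in Python (it is not from the original thread). It walks a
directory tree and compares the number of file entries with the number of
distinct inodes. The function name hardlink_stats is hypothetical, and
whether this ratio actually predicts e2fsck run time is an assumption, not
a measured result.

```python
import os

def hardlink_stats(root):
    """Return (entries, distinct_inodes, multiply_linked_inodes) under root.

    Hypothetical helper for illustration only; it reads metadata and
    never modifies the filesystem.
    """
    entries = 0
    seen = set()    # every (device, inode) pair encountered
    linked = set()  # the subset whose link count is greater than 1
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            entries += 1
            key = (st.st_dev, st.st_ino)
            seen.add(key)
            if st.st_nlink > 1:
                linked.add(key)
    return entries, len(seen), len(linked)
```

Calling print(hardlink_stats("/path/to/one/vault")) (path hypothetical) on
a single dirvish vault first is probably wise, since on a multi-terabyte
volume the in-memory sets can themselves grow large.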