[Linux-cluster] gfs_fsck fails on large filesystem

Tue Aug 1 17:53:02 UTC 2006

Stephen Willey wrote:
> The fsck is now running after we added the 137Gb swap drive.  It appears
> to consistently chew about 4Gb of RAM (sometimes higher) but it is
> working (for now).
>
> Any ballpark idea of how long it'll take to fsck a 45Tb FS?  I know
> that's a "how long is a piece of string" question, but are we talking
> hours/days/weeks?
>
> Stephen
>
Hi Stephen,

I don't know how long it will take to fsck a 45TB fs, but it wouldn't
surprise me if it took several days.  It also varies with the hardware,
and of course if gfs_fsck has to swap, that will likely slow it down
further.  Any way you look at it, 45TB is a lot of data to go through
with a fine-tooth comb like gfs_fsck does.

The RHEL4 U3 version (and later), as well as recent STABLE and HEAD
versions in CVS, now print a percent-complete number every second
during the lengthier passes, such as pass5.
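
If you note the percent-complete number at two points in time, you can
extrapolate a rough time remaining for the current pass.  The sketch
below is only an illustration and is not part of gfs_fsck: the function
name and the sample readings are made up, and it assumes progress within
a pass is roughly linear, which may not hold on a real file system.

```python
def estimate_remaining_seconds(samples):
    """Linearly extrapolate the time left in the current pass.

    samples: list of (elapsed_seconds, percent_complete) tuples, in time
    order, noted by hand from the progress output (hypothetical input).
    Returns None if no progress has been observed yet.
    """
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    if p1 <= p0 or t1 <= t0:
        return None  # nothing to extrapolate from
    # Remaining percent multiplied by the observed seconds per percent.
    return (100.0 - p1) * (t1 - t0) / (p1 - p0)


if __name__ == "__main__":
    # Made-up readings: 0% at the start, 2% after one hour.
    remaining = estimate_remaining_seconds([(0, 0.0), (3600, 2.0)])
    print("about %.1f days left in this pass" % (remaining / 86400.0))
```

The estimate only covers the pass that is currently running, so treat it
as a lower bound on the total run time.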

When it finishes, can you post something on the list to let us know?

We've tried to kick around ideas on how to improve the speed, such as
(1) adding an option to only focus on areas where the journals are dirty,
(2) introducing multiple threads to process the different RGs, and even
(3) trying to get multiple nodes in the cluster to team up and do different
areas of the file system.  None of these have been implemented yet
because of higher priorities.  Since this is an open-source project, anyone
could step in and do these.  Volunteers?
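
To give a feel for idea (2), here is a minimal sketch of handing resource
groups (RGs) to a pool of worker threads.  It is not gfs_fsck code:
check_resource_group() is a hypothetical placeholder, and the sketch
assumes each RG can be checked independently, which a real
implementation would have to verify.

```python
from concurrent.futures import ThreadPoolExecutor


def check_resource_group(rg_index):
    """Hypothetical placeholder for checking one RG.

    A real checker would read and validate the RG's on-disk structures.
    Here we just report zero errors for every RG.
    """
    return rg_index, 0


def check_all(rg_count, workers=4):
    """Check RGs 0..rg_count-1 in parallel; return {rg_index: error_count}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(check_resource_group, range(rg_count))
        return dict(results)


if __name__ == "__main__":
    errors = check_all(rg_count=8, workers=2)
    print(sum(errors.values()), "errors found in", len(errors), "RGs")
```

This only shows how the work could be divided.  Any checks that span
more than one RG would still need a separate, coordinated step.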

Regards,

Bob Peterson
Red Hat Cluster Suite