forced fsck (again?)

Andreas Dilger adilger at sun.com
Mon Jan 28 17:52:16 UTC 2008


On Jan 25, 2008  23:33 -0500, Theodore Tso wrote:
> > Hmm, I'm not sure I understand what it is you want to do?  The fsck should
> > be run as 'e2fsck -fn "$dev"' (since we already know this is ext2|ext3).
> > Using "-C 0" isn't useful because we don't want progress in the output log,
> 
> This was my fault.  It means that when you run this from a tty, you
> get to see the progress bar.  The -s flag to logsave will strip out
> the progress information.  (I added logsave -s precisely for this
> purpose.  :-)

OK, that is fine too; I wasn't sure whether it would fill the log with "===".

> > and "-p" without "-f" will just check the superblock.  
> 
> That's needed because e2fsck -p will clean up the orphaned inode list, so
> that the subsequent e2fsck -fy will return 0 if the filesystem is clean.
> Without the e2fsck -p, e2fsck -fy will return 1 (because it
> modified the filesystem) which we can't distinguish from the case
> where the filesystem had errors.

Hmm, shouldn't that be cleaned up when making a snapshot?  If not, then
we are stuck with the problem that you have to have writable snapshots,
and that is less desirable than read-only snapshots, but not fatal I guess.
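
For reference, the exit-status reasoning in the quoted text could be sketched roughly as below. This is only an illustration, not part of the script under discussion: the function name classify_e2fsck_status and its verdict strings are made up here, and the bit meanings are the ones listed in the e2fsck man page (1 and 2 = errors corrected, 4 = errors left uncorrected, 8 and above = operational or usage problems). The point is that status 1 alone cannot separate "only orphan cleanup happened" from "real errors were fixed", which is why the preliminary "-p" pass is wanted.

```shell
# Hypothetical helper (not from the original script): turn an e2fsck exit
# status into a one-word verdict, using the bit values from e2fsck(8).
classify_e2fsck_status() {
    local status="$1"
    if [ "$status" -eq 0 ]; then
        echo clean          # nothing found, nothing changed
    elif [ $((status & 4)) -ne 0 ]; then
        echo uncorrected    # errors were found and left in place
    elif [ $((status & ~3)) -ne 0 ]; then
        echo failed         # operational error, usage error, cancelled, ...
    else
        echo modified       # bits 1 and/or 2 only: e2fsck changed the fs
    fi
}

# Intended use, assuming $dev is a writable snapshot:
#   e2fsck -p "$dev"        # first pass, so orphan cleanup does not count
#   e2fsck -fy "$dev"
#   classify_e2fsck_status $?
```

With the "-p" pass run first, a "modified" verdict from the second pass can be treated as a sign of real errors rather than routine orphan cleanup.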

> > We don't want to be fixing anything (since this should be a read-only
> > snapshot) so "-fy" is also not so great.
> 
> This is a tradeoff.  e2fsck -fy requires that the snapshot have more
> space (although if you run out, it's not that horrible; the snapshot
> will just go invalid).

Well, in my one experiment this caused the lvcheck to be unkillable, and
also marked the parent offline...  Maybe it was just that one time (I
haven't tested extensively).

> The advantage of "-fy" is that you get more
> information about any errors in the filesystem, whereas "-fn" may not
> report as useful information.

True.

> > > # do everything needed to check and reset dates and counters on /dev/$1/$2.
> > > function check_fs() {
> > > 	local tmpfile=`mktemp -t e2fsck.log.XXXXXXXXXX`
> > > 	trap "rm $tmpfile ; trap - RETURN" RETURN
> > 
> > For the log file it probably makes sense to keep this around with a
> > timestamp if there is a failure.  That means it is fine to generate a
> > random filename temporarily, but it should be renamed to something
> > meaningful (e.g. /var/log/lvfsck.$dev.$(date +%Y%m%d) or similar).
> 
> The idea is if there is a failure we'll e-mail to the administrator;
> after that, there's no real need to keep it around.

Unless email is broken, for whatever reason.  I suppose it might make
sense to keep a single log for each device (put the timestamp inside the
log) so that the space usage doesn't increase dramatically.  Having
logrotate do cleanup isn't so great.
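
A minimal sketch of the "single log per device, timestamp inside" idea, assuming the check output has already been captured in a temporary file. The function name save_fsck_log, the "lvfsck.<device>.log" naming, and the "checked:" header line are all hypothetical, chosen only for illustration; the log directory defaults to /var/log but can be overridden.

```shell
# Hypothetical helper (names are illustrative, not from the original script):
# keep exactly one log per device, overwriting the previous run, with the
# timestamp written as the first line of the file instead of in its name.
save_fsck_log() {
    local tmpfile="$1" dev="$2" logdir="${3:-/var/log}"
    local name dest
    # "vg0/home" -> "vg0_home", so the device maps to a single flat filename
    name=$(printf '%s' "$dev" | tr '/' '_')
    dest="$logdir/lvfsck.$name.log"
    { date '+checked: %Y-%m-%d %H:%M:%S'; cat "$tmpfile"; } > "$dest" || return 1
    rm -f "$tmpfile"
    printf '%s\n' "$dest"
}

# Intended use after a failed check:
#   save_fsck_log "$tmpfile" "$1/$2"
```

Because each run overwrites the previous file for that device, disk usage stays bounded without relying on logrotate, and the log survives even if the e-mail never arrives.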

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



