[Linux-cluster] GFS failure

Robert Peterson rpeterso at redhat.com
Thu Jun 15 18:18:47 UTC 2006


Anthony wrote:
> Hello,
>
> yesterday we had a full GFS system failure:
> all partitions were inaccessible from all 32 nodes,
> and now the whole cluster is inaccessible.
> Has anyone seen this problem before?
>
>
> GFS: Trying to join cluster "lock_gulm", "gen:ir"
> GFS: fsid=gen:ir.32: Joined cluster. Now mounting FS...
> GFS: fsid=gen:ir.32: jid=32: Trying to acquire journal lock...
> GFS: fsid=gen:ir.32: jid=32: Looking at journal...
> GFS: fsid=gen:ir.32: jid=32: Done
>
> NETDEV WATCHDOG: jnet0: transmit timed out
> ipmi_kcs_sm: kcs hosed: Not in read state for error2
> NETDEV WATCHDOG: jnet0: transmit timed out
> ipmi_kcs_sm: kcs hosed: Not in read state for error2
>
> GFS: fsid=gen:ir.32: fatal: filesystem consistency error
> GFS: fsid=gen:ir.32:   function = trans_go_xmote_bh
> GFS: fsid=gen:ir.32:   file = /usr/src/build/626614-x86_64/BUILD/gfs-kernel-2.6.9-42/smp/src/gfs/glops.c, line = 542
> GFS: fsid=gen:ir.32:   time = 1150223491
> GFS: fsid=gen:ir.32: about to withdraw from the cluster
> GFS: fsid=gen:ir.32: waiting for outstanding I/O
> GFS: fsid=gen:ir.32: telling LM to withdraw
Hi Anthony,

This problem could be caused by a couple of things.  Basically, it
indicates that a filesystem consistency error occurred.  In this
particular case, it means that a write was done to the file system
under a transaction lock, but after the write transaction completed,
the journal for the written data was found to be still in use.  That
means one of two things:

Either (1) some process was writing to the GFS journal when it
shouldn't have been (i.e. without the necessary lock), or else (2) the
journal data that was written was somehow corrupted on disk.

In the past, we've often tracked such problems down to hardware
failures; in other words, even without the GFS file system in the
loop, if you use a command like 'dd' to send data to the raw hard disk
device and then use dd to retrieve it, the data comes back from the
hardware different from what was written out.  That particular
scenario is documented as bugzilla bug 175589.
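
As a rough illustration, that kind of raw-device test needs nothing
more than dd and a comparison.  A minimal sketch, assuming /dev/sdX is
a placeholder for your shared storage device (this overwrites the
start of that device, so use scratch space or do it only after you
have a full copy):

  # 1. Generate a known test pattern (1 GB here; adjust to taste).
  dd if=/dev/urandom of=/tmp/pattern bs=1M count=1024

  # 2. Write it to the raw shared-storage device (destroys data there).
  dd if=/tmp/pattern of=/dev/sdX bs=1M count=1024

  # 3. Read the same region back, ideally from another node or after a
  #    reboot, so the block cache can't mask a hardware problem.
  dd if=/dev/sdX of=/tmp/readback bs=1M count=1024

  # 4. The two copies must be identical; any difference points at the
  #    hardware path (HBA, cabling, array controller), not at GFS.
  cmp /tmp/pattern /tmp/readback && echo OK || echo MISMATCH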

I'm not saying that is your problem, but I'm saying that's what we've
seen in the past.

My recommendation is to read the bugzilla, back up your entire file
system or copy it to a different set of drives, and then perhaps do
some hardware tests as described in the bugzilla to see whether your
hardware can consistently write data, read it back, and get a match
between what was written and what was read back.  Do this test without
GFS in the picture at all, and preferably with only one node accessing
that storage at a time.
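
If you want to make that backup copy at the block level, dd works for
that too.  A minimal sketch, assuming hypothetical device names and
that the file system is unmounted on every node first:

  # Copy the whole GFS volume, block for block, to spare storage.
  # /dev/vg_cluster/lv_gfs and /dev/vg_backup/lv_copy are placeholders
  # for your actual source and destination devices.
  dd if=/dev/vg_cluster/lv_gfs of=/dev/vg_backup/lv_copy bs=1M

  # Or dump it to an image file on local storage with enough space:
  dd if=/dev/vg_cluster/lv_gfs of=/backup/gfs_volume.img bs=1M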

You will probably also want to run gfs_fsck before mounting again to
check the consistency of the file system, just in case some rogue
process on one of the nodes was doing something destructive.
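
With the file system unmounted on all nodes, that would be something
along these lines (again, the device path is a placeholder for your
actual GFS logical volume):

  # Read-only pass first: report problems without changing anything.
  gfs_fsck -n /dev/vg_cluster/lv_gfs

  # Once you have a backup, let it actually repair the metadata.
  gfs_fsck -y /dev/vg_cluster/lv_gfs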

WARNING: Overwriting your GFS file system will of course damage what
was there, so be careful not to destroy your data; make a copy before
doing any of this.

If the hardware checks out 100% and you can recreate the failure, open
a bugzilla against GFS and we'll go from there.  In other words, we
don't know of any problems with GFS that can cause this, beyond
hardware problems.

I hope this helps.

Regards,

Bob Peterson
Red Hat Cluster Suite



