[Linux-cluster] GFS assertion failure

Ben Yarwood ben.yarwood at juno.co.uk
Mon Jul 21 09:29:07 UTC 2008


I have a three-node cluster running the latest 4.6 code with 14 GFS file systems mounted.  One of them is a three-month-old, heavily
used GFS file system that has never had any problems: no shared storage power outages or anything else I can think of that could
have damaged the fs.  On that file system I got the following assertion failure and a withdraw:

Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: fatal: assertion "FALSE" failed
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   function = xmote_bh
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/glock.c, line = 1093
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   time = 1216415126
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: about to withdraw from the cluster
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: waiting for outstanding I/O
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: telling LM to withdraw
Jul 18 22:05:27 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: withdrawn
Jul 18 22:05:27 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: ret = 0x00000002
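
For anyone who wants to pull the key fields out of a withdraw report like this one, here is a minimal Python sketch.  It is not part
of GFS or any cluster tool: the function name parse_withdraw and the regular expressions are my own assumptions, based only on the
log lines shown here, and it assumes each entry occupies a single line in the raw syslog.

```python
import re
from datetime import datetime, timezone

# Hypothetical patterns, derived only from the log excerpt in this post.
ASSERT_RE = re.compile(r'GFS: fsid=(\S+): fatal: assertion "([^"]*)" failed')
FIELD_RE = re.compile(r'GFS: fsid=\S+:\s+(function|file|time|ret) = (.+)$')
FILE_LINE_RE = re.compile(r'^(.*), line = (\d+)$')


def parse_withdraw(lines):
    """Collect the assertion details from GFS withdraw log lines into a dict."""
    info = {}
    for raw in lines:
        line = raw.strip()
        m = ASSERT_RE.search(line)
        if m:
            info["fsid"], info["assertion"] = m.group(1), m.group(2)
            continue
        m = FIELD_RE.search(line)
        if not m:
            continue  # status lines such as "withdrawn" carry no key/value pair
        key, value = m.group(1), m.group(2).strip()
        if key == "file":
            # The source file and line number share one log entry.
            fm = FILE_LINE_RE.match(value)
            if fm:
                info["file"], info["line"] = fm.group(1), int(fm.group(2))
            else:
                info["file"] = value
        elif key == "time":
            # "time" is seconds since the epoch; report it in UTC.
            stamp = datetime.fromtimestamp(int(value), tz=timezone.utc)
            info["time_utc"] = stamp.isoformat()
        else:
            info[key] = value
    return info


SAMPLE = """\
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: fatal: assertion "FALSE" failed
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   function = xmote_bh
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/glock.c, line = 1093
Jul 18 22:05:26 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2:   time = 1216415126
Jul 18 22:05:27 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav-4.2: ret = 0x00000002
"""

if __name__ == "__main__":
    for key, value in parse_withdraw(SAMPLE.splitlines()).items():
        print(f"{key}: {value}")
```

The time field decodes to 2008-07-18 21:05:26 UTC, which lines up with the syslog timestamp if the node's clock is on UK summer time.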

Unfortunately the file system would not unmount after this, and the only way to get the node up and running again was to fence it.
I checked Bugzilla and can't find anything still open relating to this.

Can anyone:

1.  Suggest a good strategy for trying to get the fs unmounted so that a fence is not required and a normal reboot can be done?
2.  Suggest what information I should have captured to help with debugging in the future?  I think this would make a good FAQ entry
and be helpful to all.

Finally, the FAQ says that after a GFS withdraw the node should be rebooted before remounting.  Is this correct, and is it
related to replaying journals?  What would happen if you didn't reboot?


Cheers
Ben





