[Linux-cluster] filesystem consistency error upon umount

Axel Thimm Axel.Thimm at ATrpms.net
Thu Aug 11 19:14:45 UTC 2005


Hi,

This is from an FC4/x86_64 node that forms a cluster with three
RHEL4/x86_64 nodes. All of them are running the latest errata kernels
and (vendor-packaged) cluster/GFS bits.

a) To start with: is it OK to mix FC4 and RHEL4, or did I do something
   forbidden?

b) The cluster wasn't doing anything with the GFS filesystem at that
   time, i.e. it was just mounted on all 4 nodes and no data was being
   moved in any direction.

c) The other nodes correctly replayed the journal. This node was
   removed from the cluster w/o fencing and w/o any traces in the logs
   other than GFS's "about to withdraw from the cluster". I expected
   cman to report this, too. The other nodes' logs only contained
   information about the journal acquisition and replay.

d) There is a 10 min. delay between the mysterious filesystem
   consistency error and a series of Glock messages.

e) And most importantly: why did GFS issue a filesystem consistency
   error upon a simple umount? Is this an FC4 vs. RHEL4 issue?

Thanks!

Aug 11 19:11:48 zs01 rgmanager: [25900]: <notice> Shutting down Cluster Service Manager... 
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Shutting down 
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Stopping service homes-cifs 
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Stopping service backup 
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Service homes-cifs is stopped 
Aug 11 19:11:48 zs01 clurgmgrd[3660]: <notice> Service backup is stopped 
Aug 11 19:11:52 zs01 clurgmgrd[3660]: <notice> Shutdown complete, exiting 
Aug 11 19:11:53 zs01 rgmanager: [25900]: <notice> Cluster Service Manager is stopped. 
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: fatal: filesystem consistency error
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2:   function = trans_go_xmote_bh
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2:   file = /usr/src/build/588747-x86_64/BUILD/smp/src/gfs/glops.c, line = 542
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2:   time = 1123780394
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: about to withdraw from the cluster
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: waiting for outstanding I/O
Aug 11 19:13:14 zs01 kernel: GFS: fsid=physik:data.2: telling LM to withdraw
Aug 11 19:13:27 zs01 kernel: lock_dlm: withdraw abandoned memory
Aug 11 19:13:27 zs01 kernel: GFS: fsid=physik:data.2: withdrawn
Aug 11 19:23:27 zs01 kernel: ror = 0
Aug 11 19:23:27 zs01 kernel:     gh_iflags = 2 4 5 
Aug 11 19:23:27 zs01 kernel: Glock (5, 8676146)
Aug 11 19:23:27 zs01 kernel:   gl_flags = 1 
Aug 11 19:23:27 zs01 kernel:   gl_count = 3
Aug 11 19:23:27 zs01 kernel:   gl_state = 3
Aug 11 19:23:27 zs01 kernel:   req_gh = yes
Aug 11 19:23:27 zs01 kernel:   req_bh = yes
Aug 11 19:23:27 zs01 kernel:   lvb_count = 0
Aug 11 19:23:27 zs01 kernel:   object = no
Aug 11 19:23:27 zs01 kernel:   new_le = no
Aug 11 19:23:27 zs01 kernel:   incore_le = no
Aug 11 19:23:27 zs01 kernel:   reclaim = no
Aug 11 19:23:27 zs01 kernel:   aspace = no
Aug 11 19:23:27 zs01 kernel:   ail_bufs = no
Aug 11 19:23:27 zs01 kernel:   Request
Aug 11 19:23:27 zs01 kernel:     owner = -1
Aug 11 19:23:27 zs01 kernel:     gh_state = 0
Aug 11 19:23:27 zs01 kernel:     gh_flags = 0 
Aug 11 19:23:27 zs01 kernel:     error = 0
Aug 11 19:23:27 zs01 kernel:     gh_iflags = 2 4 5 
Aug 11 19:23:27 zs01 kernel:   Waiter2
Aug 11 19:23:27 zs01 kernel:     owner = -1
Aug 11 19:23:27 zs01 kernel:     gh_state = 0
Aug 11 19:23:27 zs01 kernel:     gh_flags = 0 
Aug 11 19:23:27 zs01 kernel:     error = 0
Aug 11 19:23:27 zs01 kernel:     gh_iflags = 2 4 5 
Aug 11 19:23:27 zs01 kernel: Glock (5, 7146196)
Aug 11 19:23:27 zs01 kernel:   gl_flags = 1 
Aug 11 19:23:27 zs01 kernel:   gl_count = 3
Aug 11 19:23:27 zs01 kernel:   gl_state = 3
Aug 11 19:23:27 zs01 kernel:   req_gh = yes
Aug 11 19:23:27 zs01 kernel:   req_bh = yes
Aug 11 19:23:27 zs01 kernel:   lvb_count = 0
Aug 11 19:23:27 zs01 kernel:   object = no
Aug 11 19:23:27 zs01 kernel:   new_le = no
Aug 11 19:23:27 zs01 kernel:   incore_le = no
Aug 11 19:23:27 zs01 kernel:   reclaim = no
Aug 11 19:23:27 zs01 kernel:   aspace = no
Aug 11 19:23:27 zs01 kernel:   ail_bufs = no
Aug 11 19:23:27 zs01 kernel:   Request
Aug 11 19:23:27 zs01 kernel:     owner = -1
Aug 11 19:23:27 zs01 kernel:     gh_state = 0
Aug 11 19:23:27 zs01 kernel:     gh_flags = 0 
Aug 11 19:23:27 zs01 kernel:     error = 0
Aug 11 19:23:27 zs01 kernel:     gh_iflags = 2 4 5 
Aug 11 19:23:27 zs01 kernel:   Waiter2
Aug 11 19:23:27 zs01 kernel:     owner = -1
Aug 11 19:23:27 zs01 kernel:     gh_state = 0
Aug 11 19:23:27 zs01 kernel:     gh_flags = 0 
Aug 11 19:23:27 zs01 kernel:     error = 0
Aug 11 19:23:27 zs01 kernel:     gh_iflags = 2 4 5 
Aug 11 19:23:27 zs01 kernel: Glock (5, 190905665)
[...]
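
The timing in the log above can be checked with a minimal Python sketch.
It is only an illustration: the three timestamps and the "time =" value
are copied from the log, while the year 2005 is an assumption taken from
the mail date (syslog does not record it), and the helper name parse()
is hypothetical. It assumes an English/C locale for the month name.

```python
from datetime import datetime, timezone

FMT = "%Y %b %d %H:%M:%S"
YEAR = "2005"  # assumption: taken from the mail date, syslog omits the year


def parse(stamp):
    """Parse a syslog-style 'Aug 11 19:13:14' stamp as a naive local datetime."""
    return datetime.strptime(f"{YEAR} {stamp}", FMT)


error_at = parse("Aug 11 19:13:14")       # "fatal: filesystem consistency error"
withdrawn_at = parse("Aug 11 19:13:27")   # "withdrawn"
glock_dump_at = parse("Aug 11 19:23:27")  # first Glock dump line

# Seconds from the error to the completed withdraw, and from there to the dump.
withdraw_secs = (withdrawn_at - error_at).total_seconds()
dump_delay_secs = (glock_dump_at - withdrawn_at).total_seconds()

# The "time = 1123780394" field read as a Unix epoch value, in UTC.
error_utc = datetime.fromtimestamp(1123780394, tz=timezone.utc)

# Offset between the node's local syslog clock and UTC, in hours.
utc_offset_hours = (error_at - error_utc.replace(tzinfo=None)).total_seconds() / 3600

print(withdraw_secs, dump_delay_secs, error_utc.isoformat(), utc_offset_hours)
# prints: 13.0 600.0 2005-08-11T17:13:14+00:00 2.0
```

Read this way, the withdraw took 13 seconds, the Glock dump started
exactly 600 seconds after "withdrawn", and the "time =" field matches
the syslog stamp if the node's clock runs two hours ahead of UTC.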
-- 
Axel.Thimm at ATrpms.net