[Linux-cluster] Any thoughts on losing mount?

Wendy Cheng wcheng at redhat.com
Tue Nov 27 17:07:22 UTC 2007


isplist at logicore.net wrote:
> I'm pulling my hair out here :).
> One node in my cluster has decided that it doesn't want to mount a storage 
> partition which other nodes are not having a problem with. The console 
> messages say that there is an inconsistency in the filesystem yet none of the 
> other nodes are complaining. 
>
> I cannot figure this one out so am hoping someone on the list can give me some 
> leads on what else to look for as I do not want to cause any new problems.
>
>   

The error message indicates that a resource group (RG) may have been 
corrupted. Have you tried running fsck, and if so, did it fix anything?
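
To make the relevant details easy to pass along (RG number, function, 
source file and line), the fields of the consistency error can be pulled 
out of syslog with a few lines of script. This is only a minimal sketch: 
it assumes the message layout seen in the quoted log below, with each 
entry on a single line (mail wrapping undone). The names FIELD_RE and 
parse_gfs_error are made up for illustration and are not part of any GFS 
tool.

```python
import re

# Matches "key = value" pairs as they appear in the GFS consistency
# error block, e.g. "RG = 31104599" or "file = /path/bits.c, line = 71".
# Assumption: values contain no commas or whitespace.
FIELD_RE = re.compile(r"\b(RG|function|file|line|time) = ([^,\s]+)")


def parse_gfs_error(lines):
    """Collect the error fields from GFS kernel log lines into a dict."""
    fields = {}
    for line in lines:
        # Only look at GFS messages that carry a filesystem id.
        if "GFS: fsid=" not in line:
            continue
        for key, value in FIELD_RE.findall(line):
            fields[key] = value
    return fields
```

Feeding it the lines of /var/log/messages from the failing node should 
give back a small dict with the RG number and code location, which is 
the part worth quoting when reporting the problem.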

Different nodes could be accessing different RGs, so the other nodes may 
not see the corruption until one of them accesses this particular RG 
some time later. Note that GFS normally tries to keep a node and/or 
process on the same RG it previously used whenever possible. This avoids 
a cluster-wide bottleneck (different nodes work on different RGs) while 
still keeping locality (reuse the previously accessed RG) for 
performance reasons.
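
The effect described above can be illustrated with a toy model. This is 
not GFS code and does not reflect the actual allocator; ToyCluster, 
ConsistencyError and the rg_full flag are invented purely for 
illustration. Each node sticks to the RG it used last and only moves to 
the next one when its current RG is full, so a damaged RG is noticed 
only by the node that happens to touch it.

```python
class ConsistencyError(Exception):
    """Raised when a node touches an RG marked as corrupted (toy model)."""


class ToyCluster:
    """Toy model of per-node RG locality. Hypothetical, not GFS internals."""

    def __init__(self, num_rgs, corrupt_rgs):
        self.num_rgs = num_rgs
        self.corrupt_rgs = set(corrupt_rgs)
        self.last_rg = {}  # node name -> index of the RG it used last

    def join(self, node, start_rg):
        # Assumption: each node starts out on its own RG.
        self.last_rg[node] = start_rg % self.num_rgs

    def allocate(self, node, rg_full=False):
        rg = self.last_rg[node]
        if rg_full:
            # The current RG is exhausted, so move on to the next one.
            rg = (rg + 1) % self.num_rgs
            self.last_rg[node] = rg
        if rg in self.corrupt_rgs:
            raise ConsistencyError(f"{node}: inconsistent bitmap in RG {rg}")
        return rg
```

With RG 3 marked as corrupted, a node that starts on RG 3 fails on its 
first allocation, while a node on RG 2 keeps working and only hits the 
error once RG 2 fills up and it moves over to RG 3.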

Also, do you remember any abnormal event (unclean shutdown, panic, 
power loss, etc.) *before* this issue appeared?

-- Wendy
 
>
> Nov 27 10:29:26 compdev kernel: GFS: Trying to join cluster "lock_dlm", "vgcomp:web"
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: Joined cluster. Now mounting FS...
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: jid=3: Trying to acquire journal lock...
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: jid=3: Looking at journal...
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: jid=3: Done
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: Scanning for log elements...
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: Found 1 unlinked inodes
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: Found quota changes for 0 IDs
> Nov 27 10:29:28 compdev kernel: GFS: fsid=vgcomp:web.3: Done
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3: fatal: filesystem consistency error
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3:   RG = 31104599
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3:   function = gfs_setbit
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3:   file = /home/xos/gen/updates-2007-11/xlrpm29472/rpm/BUILD/gfs-kernel-2.6.9-72/up/src/gfs/bits.c, line = 71
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3:   time = 1196180975
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3: about to withdraw from the cluster
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3: waiting for outstanding I/O
> Nov 27 10:29:35 compdev kernel: GFS: fsid=vgcomp:web.3: telling LM to withdraw
> Nov 27 10:29:37 compdev kernel: lock_dlm: withdraw abandoned memory
> Nov 27 10:29:37 compdev kernel: GFS: fsid=vgcomp:web.3: withdrawn



