[Linux-cluster] fc6 two-node cluster with gfs2 not working

Greg Swift gsml at netops.gvtc.com
Thu Nov 2 22:48:24 UTC 2006


David Teigland wrote:
> On Thu, Nov 02, 2006 at 03:58:41PM -0600, Greg Swift wrote:
>   
>>>> Nov 1 22:49:07 box2 gfs_controld[3639]: mount: failed -17
>>>>         
>
>   
>> [root@goumang ~]# mount -v /mnt/data
>> /sbin/mount.gfs2: mount /dev/pri_outMail/pri_outMail_lv0 /mnt/data
>> /sbin/mount.gfs2: parse_opts: opts = "rw"
>> /sbin/mount.gfs2:   clear flag 1 for "rw", flags = 0
>> /sbin/mount.gfs2: parse_opts: flags = 0
>> /sbin/mount.gfs2: parse_opts: extra = ""
>> /sbin/mount.gfs2: parse_opts: hostdata = ""
>> /sbin/mount.gfs2: parse_opts: lockproto = ""
>> /sbin/mount.gfs2: parse_opts: locktable = ""
>> /sbin/mount.gfs2: message to gfs_controld: asking to join mountgroup:
>> /sbin/mount.gfs2: write "join /mnt/data gfs2 lock_dlm outMail:data rw"
>> /sbin/mount.gfs2: setup_mount_error_fd 4 5
>> /sbin/mount.gfs2: message from gfs_controld: response to join request:
>> /sbin/mount.gfs2: lock_dlm_join: read "0"
>> /sbin/mount.gfs2: message from gfs_controld: mount options:
>> /sbin/mount.gfs2: lock_dlm_join: read "hostdata=jid=1:id=65538:first=0"
>> /sbin/mount.gfs2: lock_dlm_join: hostdata: "hostdata=jid=1:id=65538:first=0"
>> /sbin/mount.gfs2: lock_dlm_join: extra_plus: "hostdata=jid=1:id=65538:first=0"
>>     
>
> All the cluster infrastructure appears to be working ok, and I'm assuming
> there are no more gfs_controld errors in the syslog.  So, gfs on the
> second node is either stuck doing i/o or it's stuck trying to get a dlm
> lock.  A "ps ax -o pid,stat,cmd,wchan" might show what it's blocked on.
> You might also try the same thing with gfs1 (would eliminate the dlm as
> the problem).  It could also very well be a gfs2 or dlm bug that's been
> fixed since the fc6 kernel froze -- we need to get some updates pushed
> out.
>
> Dave
>
>   
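For anyone following along in the archive, the "ps" check suggested above can be narrowed to just the processes that are stuck. This is only a rough sketch: it assumes a procps-style ps, it moves the cmd column to the end (unlike the command quoted above) so that commands containing spaces parse cleanly, and the helper names blocked_processes and run_ps are made up for illustration.

```python
import subprocess


def blocked_processes(ps_output):
    """Return (pid, stat, wchan, cmd) for processes in uninterruptible sleep."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)  # pid, stat, wchan, then the full command
        if len(parts) < 4:
            continue
        pid, stat, wchan, cmd = parts
        # A STAT value starting with "D" means uninterruptible sleep,
        # which is typical of a task waiting on I/O or on a lock.
        if stat.startswith("D"):
            rows.append((int(pid), stat, wchan, cmd.rstrip()))
    return rows


def run_ps():
    # Assumption: procps ps; "wchan:32" only widens the column so the
    # kernel symbol name is less likely to be truncated.
    result = subprocess.run(
        ["ps", "ax", "-o", "pid,stat,wchan:32,cmd"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for pid, stat, wchan, cmd in blocked_processes(run_ps()):
        print(pid, stat, wchan, cmd)
```

If the hung mount shows up here, the wchan column names the kernel function it is sleeping in, which should help tell an i/o wait from a dlm lock wait.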
Here is the log file output from the same time as the mount output I
included before.

Nov 2 15:51:33 goumang kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "outMail:data"
Nov 2 15:51:33 goumang kernel: dlm: data: recover 1
Nov 2 15:51:33 goumang kernel: GFS2: fsid=outMail:data.1: Joined cluster. Now mounting FS...
Nov 2 15:51:33 goumang kernel: dlm: data: add member 2
Nov 2 15:51:33 goumang kernel: dlm: Initiating association with node 2
Nov 2 15:51:33 goumang kernel: dlm: data: add member 1
Nov 2 15:51:33 goumang kernel: dlm: Error sending to node 2 -32
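A note for reading the numbers in these logs: the "-17" from gfs_controld and the "-32" from the dlm look like negated Linux errno values. That is the usual kernel convention, but nothing in this thread confirms it, so treat the decoding below as an assumption. The helper name describe_errno is made up for illustration.

```python
import errno
import os


def describe_errno(code):
    """Map a negative kernel-style return code to (symbolic name, message)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# The two codes that appear in this thread.
for code in (-17, -32):
    name, message = describe_errno(code)
    print(code, name, message)
```

On Linux, 17 is EEXIST and 32 is EPIPE. If the assumption holds, the dlm send to node 2 failed on a broken connection, which would point at the inter-node link rather than at the filesystem itself.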

(Sorry, I pulled this off-list for a minute by not hitting reply-all. I
re-attached the output for archival purposes.)

-- 
http://www.gvtc.com

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: goumang.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20061102/e9e760b1/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rushou.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20061102/e9e760b1/attachment-0001.txt>

