[Linux-cluster] fc6 two-node cluster with gfs2 not working

Greg Swift gsml at netops.gvtc.com
Thu Nov 2 20:56:44 UTC 2006


So I've got two Dell blades set up with multipath. They even cluster 
together, but once one node is up and has the GFS2 filesystem mounted, the 
other can't start the gfs2 service. I'm basing my setup on how I was setting 
up GFS with RHEL4. I realize this newer way has some more niceties to it and 
I must be doing something wrong, but I'm not seeing much documentation on 
the differences, so I'm just trying to pull it off this way.


Basic rundown of setup:

Decently minimal non-X install.
Local drives are not LVM'd (which, by the way, makes clvm a pain if you have 
two boxes set up differently, one with LVM and one without).


yum update
yum install screen ntp cman lvm2-cluster gfs2-utils

Put on a good firewall config (or turn it off; both behave the same).

SELinux turned down to permissive.

See attached multipath.conf and cluster.conf.
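
For anyone who can't grab the attachments: the cluster.conf is roughly this 
shape. The node names match the logs below; the fence_manual entries are 
only an illustration, the real config is in the attached file.

<?xml version="1.0"?>
<cluster name="outMail" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="box1" votes="1">
      <fence>
        <method name="single">
          <device name="manual" nodename="box1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="box2" votes="1">
      <fence>
        <method name="single">
          <device name="manual" nodename="box2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>
</cluster>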

After updating multipath.conf I do this:
mkinitrd -f /boot/initrd-`uname -r` `uname -r`
init 6; exit

modprobe dm-multipath
modprobe dm-round-robin
service multipathd start
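
To double-check multipath I just look at the path listing (the device names 
you see will depend on the aliases in your multipath.conf):

multipath -ll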

That part looks just fine.


Then, after updating cluster.conf on each node, I run 'ccs_tool addnodeids' 
(it told me to do this when I tried to start cman the first time).
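
All that seems to do is stamp a nodeid attribute onto each <clusternode> 
entry, so the lines end up looking something like:

<clusternode name="box1" votes="1" nodeid="1">
<clusternode name="box2" votes="1" nodeid="2">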

Then: service cman start

Everything looks fine: pvcreate, vgcreate, lvcreate, mkfs.gfs2, and voila, 
we have a GFS2-formatted volume visible on both systems.
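
For reference, the sequence is roughly this (device, VG and LV names are 
just examples; the -t value is clustername:fsname and -j 2 gives one journal 
per node):

pvcreate /dev/mapper/mpath0
vgcreate dataVG /dev/mapper/mpath0
lvcreate -l 100%FREE -n dataLV dataVG
mkfs.gfs2 -p lock_dlm -t outMail:data -j 2 /dev/dataVG/dataLV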

I add the /etc/fstab entry and create the mount point.
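
The fstab line is nothing fancy (mount point is just an example):

/dev/dataVG/dataLV   /mnt/data   gfs2   defaults   0 0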

Next I start clvmd, then gfs2.
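
i.e. on each node:

service clvmd start
service gfs2 start

(Both are also chkconfig'd on, which is why they come up at boot later on.)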

The first box starts gfs2 just fine; the second won't. It hangs at this 
(from /var/log/messages):

Nov 1 22:41:07 box2 kernel: audit(1162442467.427:150): avc: denied { 
connectto } for pid=3724 comm="mount.gfs2" 
path=006766735F636F6E74726F6C645F736F6$
Nov 1 22:41:07 box2 kernel: GFS2: fsid=: Trying to join cluster 
"lock_dlm", "outMail:data"
Nov 1 22:41:07 box2 kernel: audit(1162442467.451:151): avc: denied { 
search } for pid=3724 comm="mount.gfs2" name="dlm" dev=debugfs ino=13186 
scontext$
Nov 1 22:41:07 box2 kernel: dlm: data: recover 1
Nov 1 22:41:07 box2 kernel: GFS2: fsid=outMail:data.1: Joined cluster. 
Now mounting FS...
Nov 1 22:41:07 box2 kernel: dlm: data: add member 1
Nov 1 22:41:07 box2 kernel: dlm: data: add member 2
Nov 1 22:49:07 box2 gfs_controld[3639]: mount: failed -17


Remember, SELinux is set to permissive.

So I shut down the box that came up fine on its own, manually brought the 
services up on box2 (the box that wasn't coming up), and it works fine. Then 
I turned box1 back on, and at boot it hangs at the same place box2 did.

I also realize that a two-node cluster is not preferred, but it's what I'm 
setting up and what I have access to at the moment, and honestly I'm not 
sure I believe a third box would help (though it might).

Any suggestions?

-greg

-- 
http://www.gvtc.com
--
“While it is possible to change without improving, it is impossible to improve without changing.” -anonymous

“only he who attempts the absurd can achieve the impossible.” -anonymous

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cluster.conf.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20061102/377e72c2/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: multipath.conf.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20061102/377e72c2/attachment-0001.txt>

