[Linux-cluster] GFS/cman related problems?

Dan B. Phung phung at cs.columbia.edu
Mon May 2 05:59:56 UTC 2005


Hello, 

I hope I'm sending to the correct list.  I'm having problems starting up
GFS, and hopefully it's just something wrong with my configuration.

For the sources, I checked out the latest of:
  device-mapper LVM2 cluster
from
  cvs -d :pserver:cvs@sources.redhat.com:/cvs/cluster

For the kernel sources, I'm using a vanilla 2.6.8.1 kernel, so I had to
update the cluster/*kernel code from old versions (this may be my problem).
The kernel built and installed fine, and the kernel modules load fine:

root # modprobe dm-mod
root # device-mapper/scripts/devmap_mknod.sh
root # modprobe gfs
root # modprobe lock_dlm

  Lock_Harness <CVS> (built May  1 2005 15:31:12) installed
  GFS <CVS> (built May  1 2005 15:30:54) 
  CMAN <CVS> (built May  1 2005 14:54:13) 
  NET: Registered protocol family 30
  DLM <CVS> (built May  1 2005 14:54:32) 
  udev[5023]: creating device node '/dev/dlm-control'
  Lock_DLM (built May  1 2005 15:31:03) 

Then I try to start up the daemons:

root # route add -net 224.0.0.0 netmask 255.0.0.0 dev eth0

root # ccsd -V
ccsd DEVEL.1114967270 (built May  1 2005 13:07:54)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.

root # ccsd 
Starting ccsd DEVEL.1114967270: 
May  1 16:02:38 localhost ccsd[5035]:  Built: May  1 2005 13:07:54 
May  1 16:02:38 localhost ccsd[5035]:  Copyright (C) Red Hat, Inc.  2004 All rights reserved.

root # cman_tool -V
cman_tool DEVEL.1114967270 (built May  1 2005 13:07:59)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.

root # cman_tool -d join
selected nodename blade1
multicast address 224.0.0.0
if eth0 for mcast address 224.0.0.0
setup up interface for address: blade1
cman: CMAN DEVEL.1114967270 (built May  1 2005 20:17:15) started
root # cman: Waiting to join or form a Linux-cluster
cman: forming a new cluster
cman: quorum regained, resuming activity

root # clvmd -V
Cluster LVM daemon version: 2.01.10-cvs (2005-04-04)
Protocol version:           0.2.1

root # clvmd   
clvmd could not connect to cluster manager
Consult syslog for more information

root # syslog | tail -1
Unable to connect to cluster infrastructure after 60 seconds
(there are many of these)

From here I tried to start up ccsd and cman on my other blade.
'cman_tool join' on the other blade never joins and only prints this message:

  cman: sending membership request

I'm following the instructions from:
http://gfs.wikidev.net/Installation#Build_and_install

Here is my configuration:

<?xml version="1.0"?>
        <cluster name="blade_cluster" config_version="2">
        <cman two_node="1" expected_votes="1">
        <multicast addr="224.0.0.0"/>
        </cman>

        <clusternodes>
          <clusternode name="blade03" nodeid="1" votes="1">
          <multicast addr="224.0.0.0" interface="eth0"/>
          <fence>
                <method name="human">
                <device name="eth0" ipaddr="blade1"/>
                </method>
          </fence>
          </clusternode>

          <clusternode name="blade04" nodeid="2" votes="1">
          <multicast addr="224.0.0.0" interface="eth0"/>
          <fence>
                <method name="human">
                <device name="eth0" ipaddr="blade2"/>
                </method>
          </fence>
          </clusternode>
        </clusternodes>

        <fencedevices>
          <fencedevice name="human" agent="fence_manual"/>
        </fencedevices>

        <fence_daemon clean_start="0">
        </fence_daemon>
</cluster>
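
One thing that can be checked mechanically is whether the node name that
cman_tool selected ("blade1" in the output above) appears among the
clusternode names in this configuration, and whether the configured
multicast address falls in the IPv4 multicast range.  The following is only
a minimal sketch, assuming Python 3 is available; the helper name
check_config is hypothetical, and the embedded XML is a trimmed copy of the
configuration above with the fence sections left out.  It does not say
whether a mismatch is the actual cause of the join problem.

```python
import ipaddress
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration above (fence sections omitted).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster name="blade_cluster" config_version="2">
  <cman two_node="1" expected_votes="1">
    <multicast addr="224.0.0.0"/>
  </cman>
  <clusternodes>
    <clusternode name="blade03" nodeid="1" votes="1">
      <multicast addr="224.0.0.0" interface="eth0"/>
    </clusternode>
    <clusternode name="blade04" nodeid="2" votes="1">
      <multicast addr="224.0.0.0" interface="eth0"/>
    </clusternode>
  </clusternodes>
</cluster>
"""


def check_config(xml_text, local_nodename):
    """Hypothetical helper: summarize node names and multicast addresses."""
    root = ET.fromstring(xml_text)
    # Names of all <clusternode> elements, in document order.
    node_names = [n.get("name") for n in root.iter("clusternode")]
    # Distinct multicast addresses used anywhere in the file.
    mcast_addrs = sorted({m.get("addr") for m in root.iter("multicast")})
    return {
        "node_names": node_names,
        "local_node_listed": local_nodename in node_names,
        "multicast": {
            a: ipaddress.ip_address(a).is_multicast for a in mcast_addrs
        },
    }


# "blade1" is the name reported by 'cman_tool -d join' above.
report = check_config(CLUSTER_CONF, "blade1")
print(report)
```

On the real system the same function could be given the contents of the
cluster configuration file and the node name printed by 'cman_tool -d join'.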

Any help is much appreciated.

regards,
Dan



