[Linux-cluster] GFS on CentOS - cman unable to start

Wes Modes wmodes at ucsc.edu
Fri Jan 6 20:38:43 UTC 2012


These servers are currently on the same host, but may not be in the
future.  They are in a vm cluster (though honestly, I'm not sure what
this means yet).

SELinux is enabled, but in permissive mode (not enforcing).
Firewalling through iptables is turned off via system-config-securitylevel.

There is no line currently in the cluster.conf that deals with multicasting.
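
For anyone who wants to check the same two things against their own cluster.conf, here is a minimal shell sketch. It assumes the file is laid out like the one quoted below (one element per line). The function names are made up for illustration; they are not part of cman or any other cluster tool. It does plain text matching with sed and grep, not real XML parsing.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for inspecting a cluster.conf.
# Each function takes the path to the file as its first argument.

# Print the name attribute of every <clusternode> element, one per line.
cluster_node_names() {
    sed -n 's/.*<clusternode[^>]* name="\([^"]*\)".*/\1/p' "$1"
}

# Succeed if name $2 appears exactly as a clusternode name in file $1.
node_name_listed() {
    cluster_node_names "$1" | grep -qxF -e "$2"
}

# Succeed if file $1 contains a <multicast addr="..."/> element.
has_multicast() {
    grep -q '<multicast[[:space:]][^>]*addr="' "$1"
}
```

With the cluster.conf quoted below, `node_name_listed /etc/cluster/cluster.conf test01.gdao.ucsc.edu` would fail, because only the short names test01 and test02 are listed. That at least lines up with the openais log message. Whether cman itself requires an exact match is not something this sketch can tell you.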

Any other suggestions?

Wes

On 1/6/2012 12:05 PM, Luiz Gustavo Tonello wrote:
> Hi,
>
> Are these servers on VMware? On the same host?
> Is SELinux disabled? Does iptables have any rules loaded?
>
> In my environment I had a problem starting GFS2 with servers on
> different hosts.
> To cluster the servers, I needed to migrate one server to the same host as
> the other and restart it.
>
> I think one of the problems was caused by the virtual switches.
> To solve it, I changed the multicast IP to 225.0.0.13 in cluster.conf:
>   <multicast addr="225.0.0.13"/>
> and added a static route on both servers to use the default gateway.
>
> I don't know if it's correct, but this solved my problem.
>
> I hope that helps you.
>
> Regards.
>
> On Fri, Jan 6, 2012 at 5:01 PM, Wes Modes <wmodes at ucsc.edu> wrote:
>
>     Hi, Steven.
>
>     I've tried just about every possible combination of hostname and
>     cluster.conf.
>
>     ping to test01 resolves to 128.114.31.112
>     ping to test01.gdao.ucsc.edu resolves to 128.114.31.112
>
>     It feels like the right thing is being returned.  This feels like it
>     might be a quirk (or bug possibly) of cman or openais.
>
>     There are some old bug reports around this, for example
>     https://bugzilla.redhat.com/show_bug.cgi?id=488565.  It sounds like the
>     way that cman reports this error is anything but straightforward.
>
>     Is there anyone who has encountered this error and found a solution?
>
>     Wes
>
>
>     On 1/6/2012 2:00 AM, Steven Whitehouse wrote:
>     > Hi,
>     >
>     > On Thu, 2012-01-05 at 13:54 -0800, Wes Modes wrote:
>     >> Howdy, y'all. I'm trying to set up GFS in a cluster on CentOS systems
>     >> running on VMware. The GFS FS is on a Dell EqualLogic SAN.
>     >>
>     >> I keep running into the same problem despite many differently-flavored
>     >> attempts to set up GFS. The problem comes when I try to start cman, the
>     >> cluster management software.
>     >>
>     >>     [root at test01]# service cman start
>     >>     Starting cluster:
>     >>        Loading modules... done
>     >>        Mounting configfs... done
>     >>        Starting ccsd... done
>     >>        Starting cman... failed
>     >>     cman not started: Can't find local node name in cluster.conf
>     >>     /usr/sbin/cman_tool: aisexec daemon didn't start
>     >>                                                     [FAILED]
>     >>
>     > This looks like what it says... whatever the node name is in
>     > cluster.conf, it doesn't exist when the name is looked up, or possibly
>     > it does exist, but is mapped to the loopback address (it needs to map to
>     > an address which is valid cluster-wide).
>     >
>     > Since your config files look correct, the next thing to check is what
>     > the resolver is actually returning. Try (for example) a ping to test01
>     > (you need to specify exactly the same form of the name as is used in
>     > cluster.conf) from test02 and see whether it uses the correct IP
>     > address, just in case the wrong thing is being returned.
>     >
>     > Steve.
>     >
>     >>     [root at test01]# tail /var/log/messages
>     >>     Jan  5 13:39:40 testbench06 ccsd[13194]: Unable to connect to cluster infrastructure after 1193640 seconds.
>     >>     Jan  5 13:40:10 testbench06 ccsd[13194]: Unable to connect to cluster infrastructure after 1193670 seconds.
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive Service: started and ready to provide service.
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] local node name "test01.gdao.ucsc.edu" not found in cluster.conf
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading CCS info, cannot start
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading config from CCS
>     >>     Jan  5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive exiting (reason: could not read the main configuration file).
>     >>
>     >> Here are details of my configuration:
>     >>
>     >>     [root at test01]# rpm -qa | grep cman
>     >>     cman-2.0.115-85.el5_7.2
>     >>
>     >>     [root at test01]# echo $HOSTNAME
>     >>     test01.gdao.ucsc.edu
>     >>
>     >>     [root at test01]# hostname
>     >>     test01.gdao.ucsc.edu
>     >>
>     >>     [root at test01]# cat /etc/hosts
>     >>     # Do not remove the following line, or various programs
>     >>     # that require network functionality will fail.
>     >>     128.114.31.112      test01 test01.gdao test01.gdao.ucsc.edu
>     >>     128.114.31.113      test02 test02.gdao test02.gdao.ucsc.edu
>     >>     127.0.0.1               localhost.localdomain localhost
>     >>     ::1             localhost6.localdomain6 localhost6
>     >>
>     >>     [root at test01]# sestatus
>     >>     SELinux status:                 enabled
>     >>     SELinuxfs mount:                /selinux
>     >>     Current mode:                   permissive
>     >>     Mode from config file:          permissive
>     >>     Policy version:                 21
>     >>     Policy from config file:        targeted
>     >>
>     >>     [root at test01]# cat /etc/cluster/cluster.conf
>     >>     <?xml version="1.0"?>
>     >>     <cluster config_version="25" name="gdao_cluster">
>     >>         <fence_daemon post_fail_delay="0" post_join_delay="120"/>
>     >>         <clusternodes>
>     >>             <clusternode name="test01" nodeid="1" votes="1">
>     >>                 <fence>
>     >>                     <method name="single">
>     >>                         <device name="gfs_vmware"/>
>     >>                     </method>
>     >>                 </fence>
>     >>             </clusternode>
>     >>             <clusternode name="test02" nodeid="2" votes="1">
>     >>                 <fence>
>     >>                     <method name="single">
>     >>                         <device name="gfs_vmware"/>
>     >>                     </method>
>     >>                 </fence>
>     >>             </clusternode>
>     >>         </clusternodes>
>     >>         <cman/>
>     >>         <fencedevices>
>     >>             <fencedevice agent="fence_manual" name="gfs1_ipmi"/>
>     >>             <fencedevice agent="fence_vmware" name="gfs_vmware"
>     >>                 ipaddr="gdvcenter.ucsc.edu" login="root" passwd="1hateAmazon.com"
>     >>                 vmlogin="root" vmpasswd="esxpass"
>     >>                 port="/vmfs/volumes/49086551-c64fd83c-0401-001e0bcd6848/eagle1/gfs1.vmx"/>
>     >>         </fencedevices>
>     >>         <rm>
>     >>         <failoverdomains/>
>     >>         </rm>
>     >>     </cluster>
>     >>
>     >> I've seen much discussion of this problem, but no definitive solutions.
>     >> Any help you can provide will be welcome.
>     >>
>     >> Wes Modes
>     >>
>     >> --
>     >> Linux-cluster mailing list
>     >> Linux-cluster at redhat.com
>     >> https://www.redhat.com/mailman/listinfo/linux-cluster
>     >
> -- 
> Luiz Gustavo P Tonello.
>