[Linux-cluster] local node name "app-acc-006.app.rws.local" not found in cluster.conf

Kit Gerrits kitgerrits at gmail.com
Wed Feb 3 11:24:05 UTC 2010


Hello,



I am trying to move my cluster heartbeat to a different (interconnect)
VLAN without changing the hostname.

- Machines 1-5 have set their hostname to app-acc-00[1-5]-ic to
enforce the setting.

- Machine 6 has its regular hostname app-acc-006

- The alternate hostname app-acc-006-ic is registered to the bond1
interface through DNS and /etc/hosts.

- O/S: Red Hat Enterprise Linux Server release 5.3 (Tikanga)

- Cman: 2.0.98



The nodes with the changed hostname work fine; however, our
applications expect the regular hostname.
So while this workaround gets the cluster working, it breaks
our applications.

What else do we need to do to keep the hostname as app-acc-00*, while
using the IP address attached to app-acc-00*-ic as the heartbeat channel?
On machine 6, which still has its regular hostname, I currently get:
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] local node name "app-acc-006.app.rws.local" not found in cluster.conf
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Error reading CCS info, cannot start
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Error reading config from CCS
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] AIS Executive exiting (reason: could not read the main configuration file).

Details of the configuration are attached.


-- 
He attacked everything in life with a mix of extraordinary genius and
naive incompetence, and it was often difficult to tell which was
which.
- Douglas Adams
-------------- next part --------------

Feb  2 18:10:24 app-acc-006 ccsd[5168]: Starting ccsd 2.0.98:
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Built: Mar  5 2009 16:15:59
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Feb  2 18:10:24 app-acc-006 ccsd[5168]: cluster.conf (cluster name = appacc, version = 32) found.
Feb  2 18:10:24 app-acc-006 ccsd[5168]: Remote copy of cluster.conf is from quorate node.
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Local version # : 32
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Remote version #: 32
Feb  2 18:10:24 app-acc-006 ccsd[5168]: Remote copy of cluster.conf is from quorate node.
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Local version # : 32
Feb  2 18:10:24 app-acc-006 ccsd[5168]:  Remote version #: 32
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] AIS Executive Service RELEASE 'subrev 1358 version 0.80.3'
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] AIS Executive Service: started and ready to provide service.
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] local node name "app-acc-006.app.rws.local" not found in cluster.conf
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Error reading CCS info, cannot start
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] Error reading config from CCS
Feb  2 18:10:24 app-acc-006 openais[5174]: [MAIN ] AIS Executive exiting (reason: could not read the main configuration file).


My cluster.conf:
<?xml version="1.0"?>
<cluster alias="appacc" config_version="32" name="appacc">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="app-acc-001-ic" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-001"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app-acc-002-ic" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-002"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app-acc-003-ic" nodeid="3" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-003"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app-acc-004-ic" nodeid="4" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-004"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app-acc-005-ic" nodeid="5" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-005"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="app-acc-006-ic" nodeid="6" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_app-acc-006"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <!-- snipped -->
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="fd_iv" ordered="0" restricted="1">
                                <failoverdomainnode name="app-acc-001-ic" priority="1"/>
                                <failoverdomainnode name="app-acc-002-ic" priority="1"/>
                        </failoverdomain>
                        <failoverdomain name="fd_vd" ordered="0" restricted="1">
                                <failoverdomainnode name="app-acc-003-ic" priority="1"/>
                                <failoverdomainnode name="app-acc-004-ic" priority="1"/>
                        </failoverdomain>
                        <failoverdomain name="fd_fa" ordered="0" restricted="1">
                                <failoverdomainnode name="app-acc-005-ic" priority="1"/>
                                <failoverdomainnode name="app-acc-006-ic" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <!-- snipped -->
                </resources>
                <service autostart="1" domain="fd_iv" name="service_iv" recovery="relocate">
                        <!-- snipped -->
                </service>
        </rm>
</cluster>
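
The mismatch behind the "not found in cluster.conf" message can be reproduced with a short Python sketch. It is only a simplified model: it assumes the lookup compares the local node name, and shorter forms with domain parts stripped, against the clusternode name attributes. The real cman/openais lookup may try further steps. The helper names candidate_names and find_clusternode are made up for illustration, and CONF is a trimmed copy of the configuration above.

```python
import xml.etree.ElementTree as ET


def candidate_names(local_name):
    """Return the local name followed by shorter forms with domain parts stripped."""
    parts = local_name.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def find_clusternode(conf_xml, local_name):
    """Return the first candidate name listed as a clusternode, or None."""
    root = ET.fromstring(conf_xml)
    configured = [node.get("name") for node in root.iter("clusternode")]
    for candidate in candidate_names(local_name):
        if candidate in configured:
            return candidate
    return None


# Trimmed copy of the cluster.conf above (fencing and rm sections left out).
CONF = """<cluster alias="appacc" config_version="32" name="appacc">
  <clusternodes>
    <clusternode name="app-acc-005-ic" nodeid="5" votes="1"/>
    <clusternode name="app-acc-006-ic" nodeid="6" votes="1"/>
  </clusternodes>
</cluster>"""

# The regular hostname has no matching clusternode entry.
print(find_clusternode(CONF, "app-acc-006.app.rws.local"))     # None
# The interconnect name matches once the domain is stripped.
print(find_clusternode(CONF, "app-acc-006-ic.app.rws.local"))  # app-acc-006-ic
```

Under this simplified model, neither "app-acc-006.app.rws.local" nor its short form "app-acc-006" appears as a clusternode name, which is consistent with the log above.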


The hosts are registered in DNS:
app-acc-001-ic.app.rws.local has address 10.47.125.97
app-acc-002-ic.app.rws.local has address 10.47.125.98
app-acc-003-ic.app.rws.local has address 10.47.125.100
app-acc-004-ic.app.rws.local has address 10.47.125.101
app-acc-005-ic.app.rws.local has address 10.47.125.103
app-acc-006-ic.app.rws.local has address 10.47.125.104

The hosts are also registered with their regular names in DNS:
app-acc-001.app.rws.local has address 10.47.125.10
app-acc-002.app.rws.local has address 10.47.125.11
app-acc-003.app.rws.local has address 10.47.125.13
app-acc-004.app.rws.local has address 10.47.125.14
app-acc-005.app.rws.local has address 10.47.125.16
app-acc-006.app.rws.local has address 10.47.125.17

Note: 10.47.125.0/27 and 10.47.125.96/27 are different subnets.
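
The subnet note can be checked with Python's standard ipaddress module. This is a small sketch using the two addresses of machine 6 from this message; the variable names eth1 and bond1 simply mirror the interface names.

```python
import ipaddress

# Addresses of machine 6: regular VLAN (eth1) and interconnect VLAN (bond1).
eth1 = ipaddress.ip_interface("10.47.125.17/27")
bond1 = ipaddress.ip_interface("10.47.125.104/27")

print(eth1.network)                           # 10.47.125.0/27
print(bond1.network)                          # 10.47.125.96/27
print(eth1.network.overlaps(bond1.network))   # False
print(bond1.network.broadcast_address)        # 10.47.125.127
```

The two /27 networks do not overlap, so heartbeat traffic bound to the interconnect address stays off the regular VLAN.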


The IP is locally registered:
[root@app-acc-006 ~]# ip a
<!-- partly snipped for brevity -->
10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:23:7d:55:46:ac brd ff:ff:ff:ff:ff:ff
    inet 10.47.125.17/27 brd 10.47.125.31 scope global eth1
    inet6 fe80::223:7dff:fe55:46ac/64 scope link
       valid_lft forever preferred_lft forever
14: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:24:81:7d:b2:80 brd ff:ff:ff:ff:ff:ff
    inet 10.47.125.104/27 brd 10.47.125.127 scope global bond1
    inet6 fe80::224:81ff:fe7d:b280/64 scope link
       valid_lft forever preferred_lft forever


