[Linux-cluster] Unable to start Apache cluster service (Error : Failed - Invalid Name Of Service )

PARAM KRISH mkparam at gmail.com
Mon Aug 27 09:31:38 UTC 2012


Hi

I think I am almost there. I have started using RHEL6, hoping it would not
give me any nightmares this time while setting up a 2-node cluster for an
Apache cluster service, and I think I have done pretty much everything.

In short,

1. Two nodes with private IPs configured on eth0: 192.168.18.10 and
192.168.18.11.
2. The nodes are named node1.localdomain and node2.localdomain; /etc/hosts
is taken care of.
3. I created the cluster, added the two nodes, and added the service WEB
(with the child resources :IP and :apache).
4. The cluster is quorate and detects the other node going offline fantastically.
5. I tested start/stop of the WEB service using "rg_test"; it worked
just fine on both nodes.
6. But for some reason it does not start, or fail over to the other node,
when I test manually (using clusvcadm -e WEB), do a reboot, or whatever.

7. Please let me know how to verify cluster startup and failover
manually to make sure everything works.
8. What am I missing that makes this not work now? Please assist.
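
For item 7, here is a rough sketch of the kind of manual check I have in mind. The helper name service_state is made up for illustration (it is not part of rgmanager), and the sample text is copied from the clustat output further down. The clusvcadm lines are only comments and were not run as part of this sketch; as far as I understand, a service in the "failed" state has to be disabled before it can be enabled again.

```shell
#!/bin/sh
# Hypothetical helper: print the State column for one service,
# reading clustat-style output on stdin.
service_state() {
    awk -v svc="service:$1" '$1 == svc { print $NF }'
}

# Sample lines copied from the clustat output in this message.
sample=' Service Name      Owner (Last)          State
 ------- ----      ----- ------          -----
 service:WEB       (node2.localdomain)   failed'

state=$(printf '%s\n' "$sample" | service_state WEB)
echo "WEB is $state"

# On a live cluster the manual test would be roughly (assumed sequence):
#   clusvcadm -d WEB                        # disable to clear the failed state
#   clusvcadm -e WEB -m node1.localdomain   # enable on a chosen member
#   clusvcadm -r WEB -m node2.localdomain   # relocate to exercise failover
#   clustat | service_state WEB             # check the state after each step
```

Checking the state after every step should show at which point the service goes to "failed".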

Please go through the output of all the commands attached herewith.

Let me know if anything else is required.

Param
-------------- next part --------------
<?xml version="1.0"?>
<cluster config_version="14" name="httpdCluster">
        <logging debug="on"/>
        <cman expected_votes="1" two_node="1"/>
        <clusternodes>
                <clusternode name="node1.localdomain" nodeid="1" votes="1">
                        <fence>
                                <method name="single"/>
                        </fence>
                </clusternode>
                <clusternode name="node2.localdomain" nodeid="2" votes="1">
                        <fence>
                                <method name="single"/>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices/>
        <rm>
                <failoverdomains>
                        <failoverdomain name="myFailOver" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="node1.localdomain" priority="1"/>
                                <failoverdomainnode name="node2.localdomain" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <apache config_file="conf/httpd.conf" name="apache" server_root="/etc/httpd" shutdown_wait="0"/>
                </resources>
                <service autostart="1" domain="myFailOver" exclusive="1" name="WEB" recovery="relocate">
                        <ip address="192.168.18.50" monitor_link="1" sleeptime="10">
                                <apache config_file="conf/httpd.conf" name="WEB" server_root="/etc/httpd" shutdown_wait="0"/>
                        </ip>
                </service>
        </rm>
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
</cluster>

================================================

[root@node2 apache]# clustat
Cluster Status for httpdCluster @ Mon Aug 27 20:13:24 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node1.localdomain                                                   1 Online, rgmanager
 node2.localdomain                                                   2 Online, Local, rgmanager

 Service Name                                                     Owner (Last)                                                     State         
 ------- ----                                                     ----- ------                                                     -----         
 service:WEB                                                      (node2.localdomain)                                              failed        

[root@node2 apache]# ps -eaf | grep httpd
root     17219  3171  0 20:15 pts/0    00:00:00 grep httpd

[root@node2 apache]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.18.11/24 brd 192.168.18.255 scope global eth0
    inet6 fe80::20c:29ff:fe1a:5bcf/64 scope link 
       valid_lft forever preferred_lft forever

[root@node2 apache]# /usr/share/cluster/apache.sh start service:WEB
<debug>  Verifying Configuration Of default
Verifying Configuration Of default
<error>  Verifying Configuration Of default > Failed - Invalid Name Of Service
Verifying Configuration Of default > Failed - Invalid Name Of Service

[root@node2 apache]# rg_test test /etc/cluster/cluster.conf start service WEB
Running in test mode.
Loading resource rule from /usr/share/cluster/openldap.sh
Loading resource rule from /usr/share/cluster/apache.sh
Loading resource rule from /usr/share/cluster/named.sh
Loading resource rule from /usr/share/cluster/lvm_by_lv.sh
Loading resource rule from /usr/share/cluster/SAPDatabase
Loading resource rule from /usr/share/cluster/postgres-8.sh
Loading resource rule from /usr/share/cluster/clusterfs.sh
Loading resource rule from /usr/share/cluster/ip.sh
Loading resource rule from /usr/share/cluster/service.sh
Loading resource rule from /usr/share/cluster/script.sh
Loading resource rule from /usr/share/cluster/nfsserver.sh
Loading resource rule from /usr/share/cluster/nfsexport.sh
Loading resource rule from /usr/share/cluster/tomcat-6.sh
Loading resource rule from /usr/share/cluster/lvm.sh
Loading resource rule from /usr/share/cluster/lvm_by_vg.sh
Loading resource rule from /usr/share/cluster/SAPInstance
Loading resource rule from /usr/share/cluster/vm.sh
Loading resource rule from /usr/share/cluster/ASEHAagent.sh
Loading resource rule from /usr/share/cluster/samba.sh
Loading resource rule from /usr/share/cluster/netfs.sh
Loading resource rule from /usr/share/cluster/fs.sh
Loading resource rule from /usr/share/cluster/mysql.sh
Loading resource rule from /usr/share/cluster/nfsclient.sh
Loading resource rule from /usr/share/cluster/oracledb.sh
Loading resource rule from /usr/share/cluster/ocf-shellfuncs
Loading resource rule from /usr/share/cluster/svclib_nfslock
Starting WEB...
<debug>  Link for eth0: Detected
Link for eth0: Detected
<info>   Adding IPv4 address 192.168.18.50/24 to eth0
Adding IPv4 address 192.168.18.50/24 to eth0
<debug>  Pinging addr 192.168.18.50 from dev eth0
Pinging addr 192.168.18.50 from dev eth0
<debug>  Sending gratuitous ARP: 192.168.18.50 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
Sending gratuitous ARP: 192.168.18.50 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
rdisc: no process killed
<debug>  Verifying Configuration Of apache:WEB
Verifying Configuration Of apache:WEB
<debug>  Checking Syntax Of The File /etc/httpd/conf/httpd.conf
Checking Syntax Of The File /etc/httpd/conf/httpd.conf
<debug>  Checking Syntax Of The File /etc/httpd/conf/httpd.conf > Succeed
Checking Syntax Of The File /etc/httpd/conf/httpd.conf > Succeed
<info>   Starting Service apache:WEB
Starting Service apache:WEB
<debug>  Looking For IP Addresses
Looking For IP Addresses
Query failed: Invalid argument (/cluster/rm/service[@name="WEB"]/ip[2]/@address)
<debug>  Looking For IP Addresses > Succeed -  IP Addresses Found
Looking For IP Addresses > Succeed -  IP Addresses Found
<debug>  Checking: SHA1 checksum of config file /etc/cluster/apache/apache:WEB/httpd.conf
Checking: SHA1 checksum of config file /etc/cluster/apache/apache:WEB/httpd.conf
<debug>  Checking: SHA1 checksum > succeed
Checking: SHA1 checksum > succeed
<debug>  Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf
Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf
<debug>  Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf > Succeed
Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf > Succeed
<debug>  Starting Service apache:WEB > Succeed
Starting Service apache:WEB > Succeed
Start of WEB complete

[root@node2 apache]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.18.11/24 brd 192.168.18.255 scope global eth0
    inet 192.168.18.50/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe1a:5bcf/64 scope link 
       valid_lft forever preferred_lft forever

[root@node2 apache]# ps -eaf | grep httpd | wc -l
10

[root@node2 apache]# wget http://192.168.18.50
--2012-08-27 20:17:15--  http://192.168.18.50/
Connecting to 192.168.18.50:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22 [text/html]
Saving to: `index.html'

100%[===============================================================================================================>] 22          --.-K/s   in 0s      

2012-08-27 20:17:15 (3.98 MB/s) - `index.html' saved [22/22]


/var/log/messages
Aug 27 19:24:44 node2 rgmanager[9388]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:25:03 node2 rgmanager[9523]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:25:39 node2 rgmanager[10429]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:23 node2 rgmanager[10585]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:50 node2 rgmanager[10730]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:58 node2 rgmanager[10807]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:27:10 node2 rgmanager[10865]: (null)
Aug 27 19:27:31 node2 rgmanager[10973]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:28:28 node2 rgmanager[11148]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:28:33 node2 rgmanager[11226]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:30:58 node2 rgmanager[11587]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:31:03 node2 rgmanager[11665]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:31:06 node2 rgmanager[11733]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:36:58 node2 rgmanager[12495]:  is not configured
Aug 27 19:38:43 node2 rgmanager[12884]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 20:13:35 node2 rgmanager[16956]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 20:16:11 node2 rgmanager[17717]: Adding IPv4 address 192.168.18.50/24 to eth0
Aug 27 20:16:14 node2 in.rdiscd[17784]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
Aug 27 20:16:14 node2 in.rdiscd[17784]: Failed joining addresses
Aug 27 20:16:15 node2 rgmanager[17876]: Starting Service apache:WEB
Aug 27 20:16:16 node2 rgmanager[17940]: Query failed: Invalid argument (/cluster/rm/service[@name="WEB"]/ip[2]/@address)
Aug 27 20:17:31 node2 rgmanager[18737]: Stopping Service apache:WEB
Aug 27 20:17:33 node2 rgmanager[18771]: Stopping Service apache:WEB > Failed - Application Is Still Running
Aug 27 20:17:33 node2 rgmanager[18791]: Stopping Service apache:WEB > Failed
Aug 27 20:17:33 node2 rgmanager[18840]: Removing IPv4 address 192.168.18.50/24 from eth0

[root@node2 cluster]# clusvcadm -e WEB -m node2.localdomain
Member node2.localdomain trying to enable service:WEB...Aborted; service failed

[root@node2 cluster]# tail /var/log/messages
..
Aug 27 20:21:06 node2 rgmanager[1771]: #43: Service service:WEB has failed; can not start.
Aug 27 20:21:06 node2 rgmanager[1771]: #13: Service service:WEB failed to stop cleanly

