[Linux-cluster] Unable to start Apache cluster service (Error : Failed - Invalid Name Of Service )
PARAM KRISH
mkparam at gmail.com
Mon Aug 27 09:31:38 UTC 2012
Hi,
I think I am almost there. I have started using RHEL 6, hoping it would not
give me any nightmares this time while setting up a two-node cluster for an
Apache cluster service, and I think I have done pretty much everything.
In short:
1. The two nodes have private IPs configured on eth0: 192.168.18.10 and
192.168.18.11.
2. The nodes are named node1.localdomain and node2.localdomain; /etc/hosts
is taken care of.
3. I created the cluster, added the two nodes, and added the service WEB
(with the child resources IP and apache added to it).
4. The cluster is quorate and reliably detects the other node going offline.
5. I tested start/stop of the service WEB using "rg_test"; it worked
just fine on both nodes.
6. But for some reason it does not start, or fail over to the other node,
when I test manually (using clusvcadm -e WEB), reboot a node, or anything else.
7. Please let me know how to verify cluster startup and failover
manually to make sure everything works.
8. What am I missing that keeps this from working? Please assist.
Please go through the attached output of all the commands.
Let me know if anything else is required.
Param
-------------- next part --------------
<?xml version="1.0"?>
<cluster config_version="14" name="httpdCluster">
<logging debug="on"/>
<cman expected_votes="1" two_node="1"/>
<clusternodes>
<clusternode name="node1.localdomain" nodeid="1" votes="1">
<fence>
<method name="single"/>
</fence>
</clusternode>
<clusternode name="node2.localdomain" nodeid="2" votes="1">
<fence>
<method name="single"/>
</fence>
</clusternode>
</clusternodes>
<fencedevices/>
<rm>
<failoverdomains>
<failoverdomain name="myFailOver" nofailback="0" ordered="1" restricted="0">
<failoverdomainnode name="node1.localdomain" priority="1"/>
<failoverdomainnode name="node2.localdomain" priority="2"/>
</failoverdomain>
</failoverdomains>
<resources>
<apache config_file="conf/httpd.conf" name="apache" server_root="/etc/httpd" shutdown_wait="0"/>
</resources>
<service autostart="1" domain="myFailOver" exclusive="1" name="WEB" recovery="relocate">
<ip address="192.168.18.50" monitor_link="1" sleeptime="10">
<apache config_file="conf/httpd.conf" name="WEB" server_root="/etc/httpd" shutdown_wait="0"/>
</ip>
</service>
</rm>
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
</cluster>
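For anyone comparing their own configuration against the one above, the resource tree can be summarized with a short script. This is only a sketch: the names `walk` and `summarize` and the trimmed inline sample are illustrative, and it relies on nothing beyond Python's standard `xml.etree.ElementTree`. It lists each service with its nested resources and reports cluster nodes whose fence methods contain no `<device>` entry.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above (illustrative sample only).
SAMPLE = """<?xml version="1.0"?>
<cluster config_version="14" name="httpdCluster">
  <clusternodes>
    <clusternode name="node1.localdomain" nodeid="1" votes="1">
      <fence><method name="single"/></fence>
    </clusternode>
    <clusternode name="node2.localdomain" nodeid="2" votes="1">
      <fence><method name="single"/></fence>
    </clusternode>
  </clusternodes>
  <fencedevices/>
  <rm>
    <service autostart="1" domain="myFailOver" exclusive="1" name="WEB" recovery="relocate">
      <ip address="192.168.18.50" monitor_link="1" sleeptime="10">
        <apache config_file="conf/httpd.conf" name="WEB" server_root="/etc/httpd" shutdown_wait="0"/>
      </ip>
    </service>
  </rm>
</cluster>
"""


def walk(elem, depth=0):
    """Yield (depth, tag, label) for elem and all of its descendants."""
    # Resources are identified by "name"; ip resources only carry "address".
    label = elem.get("name") or elem.get("address") or ""
    yield depth, elem.tag, label
    for child in elem:
        yield from walk(child, depth + 1)


def summarize(xml_text):
    """Return ({service: [(depth, tag, label), ...]}, [nodes without a fence device])."""
    root = ET.fromstring(xml_text)
    services = {}
    for svc in root.findall("./rm/service"):
        services[svc.get("name")] = [
            entry for child in svc for entry in walk(child, 1)
        ]
    unfenced = [
        node.get("name")
        for node in root.findall("./clusternodes/clusternode")
        if not node.findall("./fence/method/device")
    ]
    return services, unfenced


if __name__ == "__main__":
    services, unfenced = summarize(SAMPLE)
    for name, tree in services.items():
        print("service", name)
        for depth, tag, label in tree:
            print("  " * depth + f"{tag}: {label}")
    print("nodes with no fence device:", unfenced)
```

To run it against a live node, the sample string could be replaced with the contents of /etc/cluster/cluster.conf; the script only reads the file and changes nothing.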
================================================
[root at node2 apache]# clustat
Cluster Status for httpdCluster @ Mon Aug 27 20:13:24 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
node1.localdomain 1 Online, rgmanager
node2.localdomain 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:WEB (node2.localdomain) failed
[root at node2 apache]# ps -eaf | grep httpd
root 17219 3171 0 20:15 pts/0 00:00:00 grep httpd
[root at node2 apache]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
inet 192.168.18.11/24 brd 192.168.18.255 scope global eth0
inet6 fe80::20c:29ff:fe1a:5bcf/64 scope link
valid_lft forever preferred_lft forever
[root at node2 apache]# /usr/share/cluster/apache.sh start service:WEB
<debug> Verifying Configuration Of default
Verifying Configuration Of default
<error> Verifying Configuration Of default > Failed - Invalid Name Of Service
Verifying Configuration Of default > Failed - Invalid Name Of Service
[root at node2 apache]# rg_test test /etc/cluster/cluster.conf start service WEB
Running in test mode.
Loading resource rule from /usr/share/cluster/openldap.sh
Loading resource rule from /usr/share/cluster/apache.sh
Loading resource rule from /usr/share/cluster/named.sh
Loading resource rule from /usr/share/cluster/lvm_by_lv.sh
Loading resource rule from /usr/share/cluster/SAPDatabase
Loading resource rule from /usr/share/cluster/postgres-8.sh
Loading resource rule from /usr/share/cluster/clusterfs.sh
Loading resource rule from /usr/share/cluster/ip.sh
Loading resource rule from /usr/share/cluster/service.sh
Loading resource rule from /usr/share/cluster/script.sh
Loading resource rule from /usr/share/cluster/nfsserver.sh
Loading resource rule from /usr/share/cluster/nfsexport.sh
Loading resource rule from /usr/share/cluster/tomcat-6.sh
Loading resource rule from /usr/share/cluster/lvm.sh
Loading resource rule from /usr/share/cluster/lvm_by_vg.sh
Loading resource rule from /usr/share/cluster/SAPInstance
Loading resource rule from /usr/share/cluster/vm.sh
Loading resource rule from /usr/share/cluster/ASEHAagent.sh
Loading resource rule from /usr/share/cluster/samba.sh
Loading resource rule from /usr/share/cluster/netfs.sh
Loading resource rule from /usr/share/cluster/fs.sh
Loading resource rule from /usr/share/cluster/mysql.sh
Loading resource rule from /usr/share/cluster/nfsclient.sh
Loading resource rule from /usr/share/cluster/oracledb.sh
Loading resource rule from /usr/share/cluster/ocf-shellfuncs
Loading resource rule from /usr/share/cluster/svclib_nfslock
Starting WEB...
<debug> Link for eth0: Detected
Link for eth0: Detected
<info> Adding IPv4 address 192.168.18.50/24 to eth0
Adding IPv4 address 192.168.18.50/24 to eth0
<debug> Pinging addr 192.168.18.50 from dev eth0
Pinging addr 192.168.18.50 from dev eth0
<debug> Sending gratuitous ARP: 192.168.18.50 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
Sending gratuitous ARP: 192.168.18.50 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
rdisc: no process killed
<debug> Verifying Configuration Of apache:WEB
Verifying Configuration Of apache:WEB
<debug> Checking Syntax Of The File /etc/httpd/conf/httpd.conf
Checking Syntax Of The File /etc/httpd/conf/httpd.conf
<debug> Checking Syntax Of The File /etc/httpd/conf/httpd.conf > Succeed
Checking Syntax Of The File /etc/httpd/conf/httpd.conf > Succeed
<info> Starting Service apache:WEB
Starting Service apache:WEB
<debug> Looking For IP Addresses
Looking For IP Addresses
Query failed: Invalid argument (/cluster/rm/service[@name="WEB"]/ip[2]/@address)
<debug> Looking For IP Addresses > Succeed - IP Addresses Found
Looking For IP Addresses > Succeed - IP Addresses Found
<debug> Checking: SHA1 checksum of config file /etc/cluster/apache/apache:WEB/httpd.conf
Checking: SHA1 checksum of config file /etc/cluster/apache/apache:WEB/httpd.conf
<debug> Checking: SHA1 checksum > succeed
Checking: SHA1 checksum > succeed
<debug> Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf
Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf
<debug> Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf > Succeed
Generating New Config File /etc/cluster/apache/apache:WEB/httpd.conf From /etc/httpd/conf/httpd.conf > Succeed
<debug> Starting Service apache:WEB > Succeed
Starting Service apache:WEB > Succeed
Start of WEB complete
[root at node2 apache]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:1a:5b:cf brd ff:ff:ff:ff:ff:ff
inet 192.168.18.11/24 brd 192.168.18.255 scope global eth0
inet 192.168.18.50/24 scope global secondary eth0
inet6 fe80::20c:29ff:fe1a:5bcf/64 scope link
valid_lft forever preferred_lft forever
[root at node2 apache]# ps -eaf | grep httpd | wc -l
10
[root at node2 apache]# wget http://192.168.18.50
--2012-08-27 20:17:15-- http://192.168.18.50/
Connecting to 192.168.18.50:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22 [text/html]
Saving to: `index.html'
100%[===============================================================================================================>] 22 --.-K/s in 0s
2012-08-27 20:17:15 (3.98 MB/s) - `index.html' saved [22/22]
/var/log/messages
Aug 27 19:24:44 node2 rgmanager[9388]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:25:03 node2 rgmanager[9523]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:25:39 node2 rgmanager[10429]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:23 node2 rgmanager[10585]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:50 node2 rgmanager[10730]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:26:58 node2 rgmanager[10807]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:27:10 node2 rgmanager[10865]: (null)
Aug 27 19:27:31 node2 rgmanager[10973]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:28:28 node2 rgmanager[11148]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:28:33 node2 rgmanager[11226]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:30:58 node2 rgmanager[11587]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:31:03 node2 rgmanager[11665]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:31:06 node2 rgmanager[11733]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 19:36:58 node2 rgmanager[12495]: is not configured
Aug 27 19:38:43 node2 rgmanager[12884]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 20:13:35 node2 rgmanager[16956]: Verifying Configuration Of default > Failed - Invalid Name Of Service
Aug 27 20:16:11 node2 rgmanager[17717]: Adding IPv4 address 192.168.18.50/24 to eth0
Aug 27 20:16:14 node2 in.rdiscd[17784]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
Aug 27 20:16:14 node2 in.rdiscd[17784]: Failed joining addresses
Aug 27 20:16:15 node2 rgmanager[17876]: Starting Service apache:WEB
Aug 27 20:16:16 node2 rgmanager[17940]: Query failed: Invalid argument (/cluster/rm/service[@name="WEB"]/ip[2]/@address)
Aug 27 20:17:31 node2 rgmanager[18737]: Stopping Service apache:WEB
Aug 27 20:17:33 node2 rgmanager[18771]: Stopping Service apache:WEB > Failed - Application Is Still Running
Aug 27 20:17:33 node2 rgmanager[18791]: Stopping Service apache:WEB > Failed
Aug 27 20:17:33 node2 rgmanager[18840]: Removing IPv4 address 192.168.18.50/24 from eth0
[root at node2 cluster]# clusvcadm -e WEB -m node2.localdomain
Member node2.localdomain trying to enable service:WEB...Aborted; service failed
[root at node2 cluster]# tail /var/log/messages
..
Aug 27 20:21:06 node2 rgmanager[1771]: #43: Service service:WEB has failed; can not start.
Aug 27 20:21:06 node2 rgmanager[1771]: #13: Service service:WEB failed to stop cleanly
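A note on the last two log lines: rgmanager puts a service into the "failed" state when a stop operation fails, and a failed service has to be disabled with clusvcadm -d before clusvcadm -e will accept it again. Below is a minimal sketch of that sequence, assuming any leftover httpd processes and the floating IP have already been checked by hand. The helper names `recovery_commands` and `recover` and the dry-run default are illustrative and not part of any cluster tool.

```python
import subprocess


def recovery_commands(service, member=None):
    """Return the clusvcadm calls that clear a failed service and re-enable it."""
    enable = ["clusvcadm", "-e", service]
    if member:
        enable += ["-m", member]  # ask for a specific cluster member
    # Disable first: a failed service cannot be enabled directly.
    return [["clusvcadm", "-d", service], enable]


def recover(service, member=None, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in recovery_commands(service, member):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing command


if __name__ == "__main__":
    recover("WEB", "node2.localdomain")
```

With the default dry_run=True the script only prints the two commands, so it can be read before anything is run on the cluster.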