[Linux-cluster] Problems starting only one cluster service: more info

carlopmart carlopmart at gmail.com
Wed Nov 28 08:42:58 UTC 2007


Lon Hohberger wrote:
> On Tue, 2007-11-27 at 11:26 +0100, carlopmart wrote:
>> Hi all
>>
>>   I have a very strange problem. I have configured three nodes under RHCS on 
>> RHEL 5.1 servers. Everything works, except for one service that never starts 
>> when rgmanager starts up. My cluster.conf is:
>>
>> <?xml version="1.0"?>
>> <cluster alias="RhelXenCluster" config_version="17" name="RhelXenCluster">
>>          <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>          <clusternodes>
>>                  <clusternode name="rhelclu01.hpulabs.org" nodeid="1" votes="1">
>>                          <fence>
>>                                  <method name="1">
>>                                          <device name="gnbd-fence" 
>> nodename="rhelclu01.hpulabs.org"/>
>>                                  </method>
>>                          </fence>
>>                          <multicast addr="239.192.75.55" interface="eth0"/>
>>                  </clusternode>
>>                  <clusternode name="rhelclu02.hpulabs.org" nodeid="2" votes="1">
>>                          <fence>
>>                                  <method name="1">
>>                                          <device name="gnbd-fence" 
>> nodename="rhelclu02.hpulabs.org"/>
>>                                  </method>
>>                          </fence>
>>                          <multicast addr="239.192.75.55" interface="eth0"/>
>>                  </clusternode>
>>                  <clusternode name="rhelclu03.hpulabs.org" nodeid="3" votes="1">
>>                          <fence>
>>                                  <method name="1">
>>                                          <device name="gnbd-fence" 
>> nodename="rhelclu03.hpulabs.org"/>
>>                                  </method>
>>                          </fence>
>>                          <multicast addr="239.192.75.55" interface="xenbr0"/>
>>                  </clusternode>
>>          </clusternodes>
>>          <cman expected_votes="1" two_node="0">
>>                  <multicast addr="239.192.75.55"/>
>>          </cman>
>>          <fencedevices>
>>                  <fencedevice agent="fence_gnbd" name="gnbd-fence" 
>> servers="rhelclu03.hpulabs.org"/>
>>          </fencedevices>
>>          <rm log_facility="local4" log_level="7">
>>                  <failoverdomains>
>>                          <failoverdomain name="PriCluster" ordered="1" 
>> restricted="1">
>>                                  <failoverdomainnode 
>> name="rhelclu01.hpulabs.org" priority="1"/>
>>                                  <failoverdomainnode 
>> name="rhelclu02.hpulabs.org" priority="2"/>
>>                          </failoverdomain>
>>                          <failoverdomain name="SecCluster" ordered="1" 
>> restricted="1">
>>                                  <failoverdomainnode 
>> name="rhelclu02.hpulabs.org" priority="1"/>
>>                                  <failoverdomainnode 
>> name="rhelclu01.hpulabs.org" priority="2"/>
>>                          </failoverdomain>
>>                  </failoverdomains>
>>                  <resources>
>>                          <ip address="172.25.50.10" monitor_link="1"/>
>>                          <ip address="172.25.50.11" monitor_link="1"/>
>>                          <ip address="172.25.50.12" monitor_link="1"/>
>>                          <ip address="172.25.50.13" monitor_link="1"/>
>>                          <ip address="172.25.50.14" monitor_link="1"/>
>>                          <ip address="172.25.50.15" monitor_link="1"/>
>>                          <ip address="172.25.50.16" monitor_link="1"/>
>>                          <ip address="172.25.50.17" monitor_link="1"/>
>>                          <ip address="172.25.50.18" monitor_link="1"/>
>>                          <ip address="172.25.50.19" monitor_link="1"/>
>>                          <ip address="172.25.50.20" monitor_link="1"/>
>>                  </resources>
>>                  <service autostart="1" domain="PriCluster" name="dns-svc" 
>> recovery="relocate">
>>                          <ip ref="172.25.50.10">
>>                                  <script 
>> file="/data/cfgcluster/etc/init.d/named" name="named"/>
>>                          </ip>
>>                  </service>
>>                  <service autostart="1" domain="SecCluster" name="mail-svc" 
>> recovery="relocate">
>>                          <ip ref="172.25.50.11">
>>                                  <script 
>> file="/data/cfgcluster/etc/init.d/postfix-cluster" name="postfix"/>
>>                          </ip>
>>                  </service>
>>                  <service autostart="1" domain="SecCluster" name="rsync-svc" 
>> recovery="relocate">
>>                          <ip ref="172.25.50.13">
>>                                  <script 
>> file="/data/cfgcluster/etc/init.d/rsyncd" name="rsyncd"/>
>>                          </ip>
>>                  </service>
>>                  <service autostart="1" domain="PriCluster" name="wwwsoft-svc" 
>> recovery="relocate">
>>                          <ip ref="172.25.50.14">
>>                                  <script 
>> file="/data/cfgcluster/etc/init.d/httpd-mirror" name="httpd-mirror"/>
>>                          </ip>
>>                  </service>
>>                  <service autostart="1" domain="SecCluster" name="proxy-svc" 
>> recovery="relocate">
>>                          <ip ref="172.25.50.15">
>>                                  <script 
>> file="/data/cfgcluster/etc/init.d/squid" name="squid"/>
>>                          </ip>
>>                  </service>
>>          </rm>
>> </cluster>
>>
>>   The service that returns errors and never starts when rgmanager starts up is 
>> postfix-cluster. In the maillog file I find these errors:
> 
> 
>>   Nov 26 11:27:31 rhelclu01 postfix[27959]: fatal: parameter inet_interfaces: no 
>> local interface found for 172.25.50.11
>> Nov 26 11:27:43 rhelclu01 postfix[28313]: fatal: 
>> /data/cfgcluster/etc/postfix-cluster/postfix-script: Permission denied
> 
>>   But that's not true. If I start this service manually, everything works. The 
>> Postfix configuration is fine. What can the problem be? I don't know why rgmanager 
>> doesn't configure the 172.25.50.11 address before executing the postfix-cluster 
>> service ....
> 
> When you start it manually -- how?
> * add IP manually / running the script?
> * rg_test?
> * clusvcadm -e?
> 
> -- Lon
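
For reference, a minimal sketch (Python, standard library only) of how the
configuration quoted above could be cross-checked: for each service it lists the
node that the failover domain prefers, the IP it references, and the script it
runs. The function name service_summary is hypothetical, and the sketch assumes
that a lower priority number means a more preferred node in an ordered domain.

```python
import xml.etree.ElementTree as ET

def service_summary(conf_text):
    """Return {service name: (preferred node, ip ref, script file)}.

    conf_text is the content of a cluster.conf-style document.
    Assumption: in an ordered failover domain, priority 1 is the most preferred.
    """
    root = ET.fromstring(conf_text)
    rm = root.find("rm")

    # Map each failover domain name to its nodes, most preferred first.
    domains = {}
    for dom in rm.findall("failoverdomains/failoverdomain"):
        nodes = sorted(dom.findall("failoverdomainnode"),
                       key=lambda n: int(n.get("priority", "1")))
        domains[dom.get("name")] = [n.get("name") for n in nodes]

    summary = {}
    for svc in rm.findall("service"):
        ip = svc.find("ip")              # first <ip ref="..."> child, if any
        script = svc.find(".//script")   # first <script> anywhere below the service
        members = domains.get(svc.get("domain"), [])
        summary[svc.get("name")] = (
            members[0] if members else None,
            ip.get("ref") if ip is not None else None,
            script.get("file") if script is not None else None,
        )
    return summary
```

Fed the configuration above, this should report rhelclu02.hpulabs.org as the
preferred node for mail-svc (domain SecCluster), which is worth comparing with
the node name that appears in the log lines.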

Another strange thing: this morning the service was stopped, and when I try to 
start it using clusvcadm I get these errors:

Nov 28 09:28:21 rhelclu01 clurgmgrd[1450]: <warning> #68: Failed to start 
service:mail-svc; return value: 1
Nov 28 09:28:21 rhelclu01 clurgmgrd[1450]: <notice> Stopping service 
service:mail-svc
Nov 28 09:28:22 rhelclu01 clurgmgrd: [1450]: <err> script:postfix: stop of 
/data/cfgcluster/etc/init.d/postfix-cluster failed (returned 1)
Nov 28 09:28:22 rhelclu01 clurgmgrd[1450]: <notice> stop on script "postfix" 
returned 1 (generic error)
Nov 28 09:28:22 rhelclu01 in.rdiscd[11610]: setsockopt (IP_ADD_MEMBERSHIP): 
Address already in use
Nov 28 09:28:22 rhelclu01 in.rdiscd[11610]: Failed joining addresses
Nov 28 09:28:32 rhelclu01 clurgmgrd[1450]: <notice> Service service:mail-svc is 
recovering
Nov 28 09:28:32 rhelclu01 clurgmgrd[1450]: <warning> #71: Relocating failed 
service service:mail-svc
Nov 28 09:28:32 rhelclu01 clurgmgrd[1450]: <notice> Stopping service 
service:mail-svc

  I don't understand this. IP 172.25.50.11 isn't used by anyone ....
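
One way to test the Postfix complaint ("no local interface found for
172.25.50.11") is to check whether the address is actually configured on the
node at the moment the script runs. A minimal sketch, assuming the output of
`ip -o -4 addr show` has been captured as text; the helper name
address_is_local is hypothetical:

```python
import re

def address_is_local(ip_addr_output, address):
    """Return True if `address` is listed as an IPv4 address in the given
    `ip -o -4 addr show` output (one interface address per line)."""
    # Each line carries a field such as "inet 172.25.50.11/24"; capture the
    # full dotted quad so that 172.25.50.1 does not match 172.25.50.11.
    found = re.findall(r"\binet\s+(\d+\.\d+\.\d+\.\d+)(?:/\d+)?", ip_addr_output)
    return address in found
```

Calling this from the start of the init script (or by hand just before starting
the service) would show whether the IP resource was up when Postfix was
launched.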


-- 
CL Martinez
carlopmart {at} gmail {d0t} com



