[Linux-cluster] Clusvcadm doesn't behave as it should

Agnieszka Kukałowicz qqlka at nask.pl
Thu Mar 6 14:15:03 UTC 2008


Hi, 

I have a problem with the clusvcadm command. In some cases it doesn't behave
as it should. My cluster has 3 nodes: w11.local, w12.local, w21.local.
  Member Name                        ID   Status
  ------ ----                        ---- ------
  w11.local.polska.pl                   1 Online, Local, rgmanager
  w12.local.polska.pl                   2 Online, rgmanager
  w21.local.polska.pl                   4 Online, rgmanager
  /dev/xvdd1                            0 Online, Quorum Disk

I configured 2 simple httpd services in restricted failover domains.
The /etc/cluster/cluster.conf file looks something like this:

<rm>
<failoverdomains>
    <failoverdomain name="SV_w11_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w11.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="SV_w12_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w12.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="SV_w21_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w21.local" priority="1"/>
    </failoverdomain>
</failoverdomains>
<service autostart="1" domain="SV_w11_failover" exclusive="0"
name="httpd_w11" recovery="restart">
      <script file="/etc/init.d/httpd" name="httpd_start"/>
</service>
<service autostart="1" domain="SV_w21_failover" exclusive="0"
name="httpd_w21" recovery="restart">
     <script file="/etc/init.d/httpd" name="httpd_script2"/>
</service>
</rm>
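
To make the intent of this configuration explicit, here is a minimal Python sketch (not part of rgmanager; the helper name allowed_nodes is hypothetical and used only for illustration). It parses the <rm> fragment above and reports, for each service, whether its failover domain is restricted and which nodes that domain lists. It assumes only the elements and attributes shown in the fragment.

```python
import xml.etree.ElementTree as ET

# The <rm> fragment from the post, reproduced so the sketch is self-contained.
RM_XML = """
<rm>
<failoverdomains>
    <failoverdomain name="SV_w11_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w11.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="SV_w12_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w12.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="SV_w21_failover" ordered="0" restricted="1">
         <failoverdomainnode name="w21.local" priority="1"/>
    </failoverdomain>
</failoverdomains>
<service autostart="1" domain="SV_w11_failover" exclusive="0"
name="httpd_w11" recovery="restart">
      <script file="/etc/init.d/httpd" name="httpd_start"/>
</service>
<service autostart="1" domain="SV_w21_failover" exclusive="0"
name="httpd_w21" recovery="restart">
     <script file="/etc/init.d/httpd" name="httpd_script2"/>
</service>
</rm>
"""


def allowed_nodes(rm_xml):
    """Hypothetical helper: map service name -> (restricted, [domain nodes])."""
    root = ET.fromstring(rm_xml)

    # Collect every failover domain with its restricted flag and member nodes.
    domains = {}
    for dom in root.iter("failoverdomain"):
        nodes = [n.get("name") for n in dom.iter("failoverdomainnode")]
        domains[dom.get("name")] = (dom.get("restricted") == "1", nodes)

    # Look up the domain referenced by each service; a service whose domain
    # is not defined in the fragment is reported as unrestricted with no nodes.
    return {
        svc.get("name"): domains.get(svc.get("domain"), (False, []))
        for svc in root.iter("service")
    }


if __name__ == "__main__":
    for name, (restricted, nodes) in allowed_nodes(RM_XML).items():
        kind = "restricted to" if restricted else "prefers"
        print(f"{name}: {kind} {', '.join(nodes)}")
```

With the fragment above, each service is tied to exactly one node, so httpd_w11 is only expected to run on w11.local, whichever node the enable request is issued from.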

During testing I found that:
1. clusvcadm -e httpd_w11 run on w12.local or w21.local (not w11.local)
returns the error "Local machine trying to enable
service:http_w11...Failure"
2. clusvcadm -e httpd_w11 -m w11.local run on w12.local or w21.local shows
the same error
3. clusvcadm -d httpd_w11 run on any of w11, w12, w21 disables the service

Only on w11.local do the following commands work:
clusvcadm -e httpd_w11
clusvcadm -e httpd_w11 -m w11.local

The same behaviour can be seen in luci. Trying to enable httpd_w11 with the
"enable service" command returns "A ricci error occurred on
w12.local:11111: clusvcadm failed to start httpd_w11..."

But the biggest problem is with the "-F" option.

Running clusvcadm -e httpd_w11 -F on w12.local or w21.local causes this
error on all nodes:
Mar  6 13:36:59 w11 clurgmgrd[2176]: <crit> Watchdog: Daemon died,
rebooting...
After that, all hosts (w11.local, w12.local, w21.local) are rebooted.

I have rgmanager-2.0.31-1.el5 installed, and the cluster nodes run as Xen
guests.

Best regards
Agnieszka Kukalowicz



