[Linux-cluster] Clustat shows wrong service status

Agnieszka Kukałowicz qqlka at nask.pl
Thu Feb 28 11:32:38 UTC 2008


> 
> And I don't have situation that cman_tool nodes says:
> " Last fenced   2008-02-27 15:24:16 by override"
> 

I did more tests to find the cause of the problem.
I found that clustat has a problem with "restricted" failover domains.
I tested two variants of my configuration:

1. failover domain is "restricted"

<rm>
  <failoverdomains>
    <failoverdomain name="VM_w1_failover" ordered="0" restricted="1">
        <failoverdomainnode name="w1.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="VM_w2_failover" ordered="0" restricted="1">
        <failoverdomainnode name="w2.local" priority="1"/>
    </failoverdomain>
  </failoverdomains>
  <resources/>
  <vm autostart="1" domain="VM_w1_failover" exclusive="0"
name="VM_Work11_RHEL51" path="/virts/w11" recovery="restart"/>
  <vm autostart="1" domain="VM_w1_failover" exclusive="0"
name="VM_Work12_RHEL51" path="/virts/w12" recovery="restart"/>
  <vm autostart="0" domain="VM_w1_failover" exclusive="0"
name="VM_Work13_RHEL51" path="/virts/w13" recovery="disable"/>
  <vm autostart="1" domain="VM_w2_failover" exclusive="0"
name="VM_Work21_RHEL51" path="/virts/w21" recovery="restart"/>
  <vm autostart="0" domain="VM_w2_failover" exclusive="0"
name="VM_Work22_RHEL51" path="/virts/w22" recovery="disable"/>
  <vm autostart="0" domain="VM_w2_failover" exclusive="0"
name="VM_Work23_RHEL51" path="/virts/w23" recovery="disable"/>
</rm>

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local                              1 Online, Local, rgmanager
  w1.local                              2 Online, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local                       started
  vm:VM_Work12_RHEL51  w1.local                       started
  vm:VM_Work13_RHEL51  (none)                         disabled
  vm:VM_Work21_RHEL51  w2.local                       started
  vm:VM_Work22_RHEL51  (none)                         disabled
  vm:VM_Work23_RHEL51  (none)                         disabled

After powering off node w2.local and having w1.local fence it,
clustat still shows the service vm:VM_Work21_RHEL51 as started on
w2.local:

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local.polska.pl                    1 Offline
  w1.local.polska.pl                    2 Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local.polska.pl             started
  vm:VM_Work12_RHEL51  w1.local.polska.pl             started
  vm:VM_Work13_RHEL51  (none)                         disabled
  vm:VM_Work21_RHEL51  w2.local.polska.pl             started
  vm:VM_Work22_RHEL51  (none)                         disabled
  vm:VM_Work23_RHEL51  (none)                         disabled

2. failover domain is not restricted

<failoverdomains>
  <failoverdomain name="VM_w1_failover" ordered="0" restricted="0">
    <failoverdomainnode name="w1.local" priority="1"/>
  </failoverdomain>
  <failoverdomain name="VM_w2_failover" ordered="0" restricted="0">
    <failoverdomainnode name="w2.local" priority="1"/>
  </failoverdomain>
</failoverdomains>
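
The only difference between the two configurations is the restricted attribute. As a rough illustration, the sketch below reads an <rm> fragment and lists the nodes each vm service is allowed to run on. It assumes the usual rgmanager meaning of a restricted domain (the service may run only on members of its domain). The function name allowed_nodes and the trimmed RM_FRAGMENT are made up for this example and are not part of any cluster tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the restricted configuration from this message
# (only two of the six vm services are kept, for brevity).
RM_FRAGMENT = """\
<rm>
  <failoverdomains>
    <failoverdomain name="VM_w1_failover" ordered="0" restricted="1">
      <failoverdomainnode name="w1.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="VM_w2_failover" ordered="0" restricted="1">
      <failoverdomainnode name="w2.local" priority="1"/>
    </failoverdomain>
  </failoverdomains>
  <vm autostart="1" domain="VM_w1_failover" name="VM_Work11_RHEL51"/>
  <vm autostart="1" domain="VM_w2_failover" name="VM_Work21_RHEL51"/>
</rm>
"""


def allowed_nodes(rm_xml):
    """Map each vm name to the list of nodes it may run on.

    None means "any cluster node" (the domain is not restricted).
    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(rm_xml)

    # Collect each domain's restricted flag and member nodes.
    domains = {}
    for dom in root.iter("failoverdomain"):
        nodes = [n.get("name") for n in dom.findall("failoverdomainnode")]
        domains[dom.get("name")] = (dom.get("restricted") == "1", nodes)

    # Resolve every vm service through its failover domain.
    result = {}
    for vm in root.findall("vm"):
        restricted, nodes = domains[vm.get("domain")]
        result[vm.get("name")] = nodes if restricted else None
    return result


if __name__ == "__main__":
    print(allowed_nodes(RM_FRAGMENT))
    print(allowed_nodes(RM_FRAGMENT.replace('restricted="1"', 'restricted="0"')))
```

Under that assumption, VM_Work21_RHEL51 has nowhere to go in the restricted case once w2.local is down, so the open question is only why clustat keeps reporting it as started.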

Clustat before shutting down w2.local:

[root@w1 ~]# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local.polska.pl                    1 Online
  w1.local.polska.pl                    2 Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local.polska.pl             started
  vm:VM_Work12_RHEL51  w1.local.polska.pl             started
  vm:VM_Work13_RHEL51  (none)                         disabled
  vm:VM_Work21_RHEL51  w2.local.polska.pl             started
  vm:VM_Work22_RHEL51  (none)                         disabled
  vm:VM_Work23_RHEL51  (none)                         disabled

After shutting down w2.local, rgmanager migrates the service
vm:VM_Work21_RHEL51 to w1.local and clustat shows the correct states.

[root@w1 ~]# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local.polska.pl                    1 Offline
  w1.local.polska.pl                    2 Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local.polska.pl             started
  vm:VM_Work12_RHEL51  w1.local.polska.pl             started
  vm:VM_Work13_RHEL51  (none)                         disabled
  vm:VM_Work21_RHEL51  w1.local.polska.pl             started
  vm:VM_Work22_RHEL51  (none)                         disabled
  vm:VM_Work23_RHEL51  (none)                         disabled

I'm not sure, but I think there is something wrong with rgmanager/clustat
when services are configured with a restricted failover domain.
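
The inconsistency can also be spotted mechanically. The following is a minimal sketch that parses plain clustat text in the layout pasted in this message and reports every service shown as started on a member that is not Online. The names parse_clustat and find_stale_services are hypothetical helpers, not part of rgmanager, and the parser assumes exactly this column layout.

```python
# Captured clustat output from the restricted case, after w2.local was fenced.
SAMPLE = """\
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local.polska.pl                    1 Offline
  w1.local.polska.pl                    2 Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local.polska.pl             started
  vm:VM_Work13_RHEL51  (none)                         disabled
  vm:VM_Work21_RHEL51  w2.local.polska.pl             started
"""


def parse_clustat(text):
    """Return (members, services) parsed from clustat text.

    members:  dict of member name -> set of status flags
    services: list of (service, owner, state) tuples
    """
    members, services = {}, []
    section = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["Member", "Name"]:
            section = "members"
            continue
        if fields[:2] == ["Service", "Name"]:
            section = "services"
            continue
        if set(line.strip()) <= {"-", " "}:
            continue  # separator row made of dashes
        if section == "members" and len(fields) >= 3:
            # fields: name, id, then comma-separated status flags
            members[fields[0]] = {f.strip(",") for f in fields[2:]}
        elif section == "services" and len(fields) >= 3:
            services.append((fields[0], fields[1], fields[2]))
    return members, services


def find_stale_services(text):
    """List (service, owner) pairs that are 'started' on a non-Online member."""
    members, services = parse_clustat(text)
    return [
        (name, owner)
        for name, owner, state in services
        if state == "started" and "Online" not in members.get(owner, set())
    ]


if __name__ == "__main__":
    for name, owner in find_stale_services(SAMPLE):
        print(f"{name} is reported as started on offline member {owner}")
```

Run against the output from the restricted case, this flags only vm:VM_Work21_RHEL51 on w2.local.polska.pl, which is exactly the entry I believe is wrong.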

Cheers 
Agnieszka Kukalowicz



