From kelvin.edmison at alcatel-lucent.com  Thu Dec  3 19:19:37 2015
From: kelvin.edmison at alcatel-lucent.com (Kelvin Edmison)
Date: Thu, 3 Dec 2015 14:19:37 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
Message-ID: <566095C9.4050306@alcatel-lucent.com>


I am hoping that someone can help me understand the problems I'm having
with Linux clustering for VMs.

I am clustering 2 VMs on two separate VM hosts, trying to ensure that a 
service is always available.  The hosts and guests are both RHEL 6.7. 
The goal is to have only one of the two VMs running at a time.

The configuration works when we test/simulate VM deaths and graceful VM 
host shutdowns, and administrative switchovers (i.e. clusvcadm -r ).

However, when we simulate the sudden isolation of host A (e.g. ifdown
eth0), two things happen:
1) the VM on host B does not start, and repeated fence_xvm errors appear 
in the logs on host B
2) when the 'failed' node is returned to service, the cman service on 
host B dies.

This is my cluster.conf file (some elisions re: hostnames)

<?xml version="1.0"?>
<cluster config_version="14" name="clustername">
     <fence_daemon/>
     <clusternodes>
         <clusternode name="hostA.fqdn" nodeid="1">
             <fence>
                 <method name="VmFence">
                     <device name="virtfence1" port="jobhistory"/>
                 </method>
             </fence>
         </clusternode>
         <clusternode name="hostB.fqdn" nodeid="2">
             <fence>
                 <method name="VmFence">
                     <device name="virtfence2" port="jobhistory"/>
                 </method>
             </fence>
         </clusternode>
     </clusternodes>
     <cman expected_votes="1" two_node="1"/>
     <fencedevices>
         <fencedevice agent="fence_xvm"
                      key_file="/etc/cluster/fence_xvm_hostA.key"
                      multicast_address="239.255.1.10" name="virtfence1"/>
         <fencedevice agent="fence_xvm"
                      key_file="/etc/cluster/fence_xvm_hostB.key"
                      multicast_address="239.255.2.10" name="virtfence2"/>
     </fencedevices>
     <rm>
         <failoverdomains/>
         <resources/>
         <vm autostart="1" name="jobhistory" recovery="restart" use_virsh="1"/>
     </rm>
     <logging/>
</cluster>


Thanks for any help you can offer,
   Kelvin Edmison



From lists at alteeve.ca  Thu Dec  3 23:14:49 2015
From: lists at alteeve.ca (Digimer)
Date: Thu, 3 Dec 2015 18:14:49 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <566095C9.4050306@alcatel-lucent.com>
References: <566095C9.4050306@alcatel-lucent.com>
Message-ID: <5660CCE9.50305@alteeve.ca>

On 03/12/15 02:19 PM, Kelvin Edmison wrote:
> 
> I am hoping that someone can help me understand the problems I'm having
> with linux clustering for VMs.
> 
> I am clustering 2 VMs on two separate VM hosts, trying to ensure that a
> service is always available.  The hosts and guests are both RHEL 6.7.
> The goal is to have only one of the two VMs running at a time.
> 
> The configuration works when we test/simulate VM deaths and graceful VM
> host shutdowns, and administrative switchovers (i.e. clusvcadm -r ).
> 
> However, when we simulate the sudden isolation of host A (e.g. ifdown
> eth0), two things happen
> 1) the VM on host B does not start, and repeated fence_xvm errors appear
> in the logs on host B
> 2) when the 'failed' node is returned to service, the cman service on
> host B dies.

If the node's host is dead, then there is no way for the survivor to
determine the state of the lost VM node. The cluster is not allowed to
take "no answer" as confirmation of fence success.

If your hosts have IPMI, then you could add fence_ipmilan as a backup
method where, if fence_xvm fails, it moves on and reboots the host itself.
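
As a rough sketch only, a backup method in cluster.conf could look
something like the fragment below. The method name HostIpmi, the device
name ipmi_hostA, and the address and credentials are placeholders made up
for illustration, not values from this thread; check the fence_ipmilan man
page for the options your BMC actually needs. fenced tries the methods in
the order they are listed, so the IPMI method is only reached if VmFence
fails.

```xml
<!-- Inside <clusternodes>: hostA with a second, fallback fence method -->
<clusternode name="hostA.fqdn" nodeid="1">
    <fence>
        <!-- Tried first: fence the VM through fence_xvm, as in the posted config -->
        <method name="VmFence">
            <device name="virtfence1" port="jobhistory"/>
        </method>
        <!-- Tried only if VmFence fails: power-cycle the physical host via its BMC -->
        <method name="HostIpmi">
            <device name="ipmi_hostA"/>
        </method>
    </fence>
</clusternode>

<!-- Inside <fencedevices>: the matching device definition (all values hypothetical) -->
<fencedevice agent="fence_ipmilan" name="ipmi_hostA"
             ipaddr="192.0.2.11" login="admin" passwd="secret" lanplus="1"/>
```

hostB would need an equivalent pair pointing at its own BMC, and
config_version has to be incremented when the file changes. Keep in mind
that this method power-cycles the whole host, including any other guests
running on it.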



-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



From kelvin.edmison at alcatel-lucent.com  Fri Dec  4 01:39:34 2015
From: kelvin.edmison at alcatel-lucent.com (Kelvin Edmison)
Date: Thu, 3 Dec 2015 20:39:34 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5660CCE9.50305@alteeve.ca>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
Message-ID: <5660EED6.5020908@alcatel-lucent.com>



On 12/03/2015 06:14 PM, Digimer wrote:
> If the node's host is dead, then there is no way for the survivor to
> determine the state of the lost VM node. The cluster is not allowed to
> take "no answer" as confirmation of fence success.
>
> If your hosts have IPMI, then you could add fence_ipmilan as a backup
> method where, if fence_xvm fails, it moves on and reboots the host itself.

Thank you for the suggestion.  The hosts do have IPMI.  I'll explore it,
but I'm a little concerned about what it means for the other
non-clustered VM workloads that exist on these two servers.

Do you have any thoughts as to why host B's cman process is dying when 
'host A' returns?

Thanks,
   Kelvin



From lists at alteeve.ca  Fri Dec  4 02:31:25 2015
From: lists at alteeve.ca (Digimer)
Date: Thu, 3 Dec 2015 21:31:25 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5660EED6.5020908@alcatel-lucent.com>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com>
Message-ID: <5660FAFD.8090504@alteeve.ca>

On 03/12/15 08:39 PM, Kelvin Edmison wrote:
> Thank you for the suggestion.  The hosts do have ipmi.  I'll explore it
> but I'm a little concerned about what it means for the other
> non-clustered VM workloads that exist on these two servers.
> 
> Do you have any thoughts as to why host B's cman process is dying when
> 'host A' returns?
> 
> Thanks,
>   Kelvin

It's not dying, it's blocking. When a node is lost, dlm blocks until
fenced tells it that the fence was successful. If fenced can't contact
the lost node's fence method(s), then it doesn't succeed and dlm stays
blocked. To anything that uses DLM, like rgmanager, it looks like the
host is hung, but that is by design. The logic is that, as bad as it is
to hang, it's better than risking a split-brain.

As for what will happen to non-cluster services, well, if I can be
blunt, you shouldn't mix the two. If something is important enough to
make HA, then it is important enough for dedicated hardware in my opinion.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



From kelvin.edmison at alcatel-lucent.com  Fri Dec  4 14:14:23 2015
From: kelvin.edmison at alcatel-lucent.com (Kelvin Edmison)
Date: Fri, 4 Dec 2015 09:14:23 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5660FAFD.8090504@alteeve.ca>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com> <5660FAFD.8090504@alteeve.ca>
Message-ID: <56619FBF.5040107@alcatel-lucent.com>



On 12/03/2015 09:31 PM, Digimer wrote:
> It's not dying, it's blocking. When a node is lost, dlm blocks until
> fenced tells it that the fence was successful. If fenced can't contact
> the lost node's fence method(s), then it doesn't succeed and dlm stays
> blocked. To anything that uses DLM, like rgmanager, it looks like the
> host is hung, but that is by design. The logic is that, as bad as it is
> to hang, it's better than risking a split-brain.
When I said the cman service is dying, I should have qualified that
further. I mean that the corosync process is no longer running (ps -ef |
grep corosync does not show it), and after recovering the failed host A,
manual intervention (service cman start) was required on host B to
recover full cluster services.

[root@host2 ~]# for SERVICE in ricci fence_virtd cman rgmanager; do
printf "%-12s   " $SERVICE; service $SERVICE status; done
ricci          ricci (pid  5469) is running...
fence_virtd    fence_virtd (pid  4862) is running...
cman           Found stale pid file
rgmanager      rgmanager (pid  5366) is running...


Thanks,
   Kelvin




From lists at alteeve.ca  Fri Dec  4 17:49:05 2015
From: lists at alteeve.ca (Digimer)
Date: Fri, 4 Dec 2015 12:49:05 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <56619FBF.5040107@alcatel-lucent.com>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com> <5660FAFD.8090504@alteeve.ca>
	<56619FBF.5040107@alcatel-lucent.com>
Message-ID: <5661D211.1040902@alteeve.ca>

On 04/12/15 09:14 AM, Kelvin Edmison wrote:
> when I said the cman service is dying, I should have further qualified
> it. I mean that the corosync process is no longer running (ps -ef | grep
> corosync does not show it)  and after recovering the failed host A,
> manual intervention (service cman start) was required on host B to
> recover full cluster services.
> 
> [root at host2 ~]# for SERVICE in ricci fence_virtd cman rgmanager; do
> printf "%-12s   " $SERVICE; service $SERVICE status; done
> ricci          ricci (pid  5469) is running...
> fence_virtd    fence_virtd (pid  4862) is running...
> cman           Found stale pid file
> rgmanager      rgmanager (pid  5366) is running...
> 
> 
> Thanks,
>   Kelvin

Oh now that is interesting...

You'll want input from Fabio, Chrissie or one of the other core devs, I
suspect.

If this is RHEL proper, can you open a rhbz ticket? If it's CentOS, and
if you can reproduce it reliably, can you create a new thread with the
reproducer?

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



From kelvin.edmison at alcatel-lucent.com  Fri Dec  4 18:52:06 2015
From: kelvin.edmison at alcatel-lucent.com (Kelvin Edmison)
Date: Fri, 4 Dec 2015 13:52:06 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5661D211.1040902@alteeve.ca>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com> <5660FAFD.8090504@alteeve.ca>
	<56619FBF.5040107@alcatel-lucent.com> <5661D211.1040902@alteeve.ca>
Message-ID: <5661E0D6.7000504@alcatel-lucent.com>



On 12/04/2015 12:49 PM, Digimer wrote:
> Oh now that is interesting...
>
> You'll want input from Fabio, Chrissie or one of the other core devs, I
> suspect.
>
> If this is RHEL proper, can you open a rhbz ticket? If it's CentOS, and
> if you can reproduce it reliably, can you create a new thread with the
> reproducer?
It's RHEL proper in both host and guest, and we can reproduce it reliably.



From lists at alteeve.ca  Fri Dec  4 19:00:04 2015
From: lists at alteeve.ca (Digimer)
Date: Fri, 4 Dec 2015 14:00:04 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5661E0D6.7000504@alcatel-lucent.com>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com> <5660FAFD.8090504@alteeve.ca>
	<56619FBF.5040107@alcatel-lucent.com> <5661D211.1040902@alteeve.ca>
	<5661E0D6.7000504@alcatel-lucent.com>
Message-ID: <5661E2B4.2080405@alteeve.ca>

On 04/12/15 01:52 PM, Kelvin Edmison wrote:
> It's RHEL proper in both host and guest, and we can reproduce it reliably.

Excellent!

Please reply here with the rhbz#. I'm keen to see what comes of it.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



From kelvin.edmison at alcatel-lucent.com  Mon Dec  7 16:47:09 2015
From: kelvin.edmison at alcatel-lucent.com (Kelvin Edmison)
Date: Mon, 7 Dec 2015 11:47:09 -0500
Subject: [Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies
In-Reply-To: <5661E2B4.2080405@alteeve.ca>
References: <566095C9.4050306@alcatel-lucent.com> <5660CCE9.50305@alteeve.ca>
	<5660EED6.5020908@alcatel-lucent.com> <5660FAFD.8090504@alteeve.ca>
	<56619FBF.5040107@alcatel-lucent.com> <5661D211.1040902@alteeve.ca>
	<5661E0D6.7000504@alcatel-lucent.com> <5661E2B4.2080405@alteeve.ca>
Message-ID: <5665B80D.3020207@alcatel-lucent.com>



On 12/04/2015 02:00 PM, Digimer wrote:
> Excellent!
>
> Please reply here with the rhbz#. I'm keen to see what comes of it.
>
Here it is.  https://bugzilla.redhat.com/show_bug.cgi?id=1289209

I was wrong about being able to restart the corosync process; it takes
a physical node reboot before I can get host B back into the
cluster.  I wonder if this situation doesn't come up often because of the
use of iLO or other power-based backup fences.