[Linux-cluster] Fencing problem w/ 2-node VM when a VM host dies

Kelvin Edmison kelvin.edmison at alcatel-lucent.com
Mon Dec 7 16:47:09 UTC 2015



On 12/04/2015 02:00 PM, Digimer wrote:
> On 04/12/15 01:52 PM, Kelvin Edmison wrote:
>>
>> On 12/04/2015 12:49 PM, Digimer wrote:
>>> On 04/12/15 09:14 AM, Kelvin Edmison wrote:
>>>> On 12/03/2015 09:31 PM, Digimer wrote:
>>>>> On 03/12/15 08:39 PM, Kelvin Edmison wrote:
>>>>>> On 12/03/2015 06:14 PM, Digimer wrote:
>>>>>>> On 03/12/15 02:19 PM, Kelvin Edmison wrote:
>>>>>>>> I am hoping that someone can help me understand the problems I'm
>>>>>>>> having with Linux clustering for VMs.
>>>>>>>>
>>>>>>>> I am clustering 2 VMs on two separate VM hosts, trying to ensure
>>>>>>>> that a
>>>>>>>> service is always available.  The hosts and guests are both RHEL
>>>>>>>> 6.7.
>>>>>>>> The goal is to have only one of the two VMs running at a time.
>>>>>>>>
>>>>>>>> The configuration works when we test/simulate VM deaths, graceful
>>>>>>>> VM host shutdowns, and administrative switchovers (i.e. clusvcadm -r).
>>>>>>>>
>>>>>>>> However, when we simulate the sudden isolation of host A (e.g.
>>>>>>>> ifdown eth0), two things happen:
>>>>>>>> 1) the VM on host B does not start, and repeated fence_xvm errors
>>>>>>>> appear in the logs on host B; and
>>>>>>>> 2) when the 'failed' node is returned to service, the cman service
>>>>>>>> on host B dies.
>>>>>>> If the node's host is dead, then there is no way for the survivor to
>>>>>>> determine the state of the lost VM node. The cluster is not
>>>>>>> allowed to
>>>>>>> take "no answer" as confirmation of fence success.
>>>>>>>
>>>>>>> If your hosts have IPMI, then you could add fence_ipmilan as a backup
>>>>>>> method where, if fence_xvm fails, it moves on and reboots the host
>>>>>>> itself.
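(If you do add fence_ipmilan, the agent can be exercised standalone before it
is wired into cluster.conf; the address and credentials below are placeholders
for your own BMC settings:

  fence_ipmilan -a <ipmi-address-of-hostA> -l <user> -p <password> -o status

If that returns the power status of the host, a reboot/off action issued by
the cluster should work the same way.)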
>>>>>> Thank you for the suggestion.  The hosts do have IPMI.  I'll explore
>>>>>> it, but I'm a little concerned about what it means for the other
>>>>>> non-clustered VM workloads that exist on these two servers.
>>>>>>
>>>>>> Do you have any thoughts as to why host B's cman process is dying when
>>>>>> 'host A' returns?
>>>>>>
>>>>>> Thanks,
>>>>>>      Kelvin
>>>>> It's not dying, it's blocking. When a node is lost, DLM blocks until
>>>>> fenced reports that the fence was successful. If fenced can't reach
>>>>> the lost node's fence method(s), the fence never succeeds and DLM stays
>>>>> blocked. To anything that uses DLM, like rgmanager, the node appears
>>>>> hung, but that is by design. The logic is that, as bad as it is to
>>>>> hang, it's better than risking a split-brain.
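(A quick way to tell "blocked on a pending fence" from "actually dead" on the
surviving node is to query the fence and DLM state directly. These are the
standard RHEL 6 cluster-suite tools; the exact output varies:

  fence_tool ls    # fence domain state, including any victims still waiting to be fenced
  dlm_tool ls      # DLM lockspaces; they sit in recovery until fencing completes
  group_tool       # overall fence/dlm/gfs group state

While the fence is pending, rgmanager operations will appear to hang.)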
>>>> When I said the cman service is dying, I should have qualified that
>>>> further. I mean that the corosync process is no longer running (ps -ef |
>>>> grep corosync does not show it), and after recovering the failed host A,
>>>> manual intervention (service cman start) was required on host B to
>>>> restore full cluster services.
>>>>
>>>> [root at host2 ~]# for SERVICE in ricci fence_virtd cman rgmanager; do
>>>> printf "%-12s   " $SERVICE; service $SERVICE status; done
>>>> ricci          ricci (pid  5469) is running...
>>>> fence_virtd    fence_virtd (pid  4862) is running...
>>>> cman           Found stale pid file
>>>> rgmanager      rgmanager (pid  5366) is running...
>>>>
>>>>
>>>> Thanks,
>>>>     Kelvin
>>> Oh now that is interesting...
>>>
>>> You'll want input from Fabio, Chrissie or one of the other core devs, I
>>> suspect.
>>>
>>> If this is RHEL proper, can you open a rhbz ticket? If it's CentOS, and
>>> if you can reproduce it reliably, can you create a new thread with the
>>> reproducer?
>> It's RHEL proper in both host and guest, and we can reproduce it reliably.
> Excellent!
>
> Please reply here with the rhbz#. I'm keen to see what comes of it.
>
Here it is.  https://bugzilla.redhat.com/show_bug.cgi?id=1289209

I was wrong about being able to restart the corosync process; it takes
a physical node reboot before I can get host B back into the cluster.
I wonder whether the reason this situation doesn't come up more often is
that most deployments use iLO or other power-based backup fencing.
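
For reference, the layered approach suggested earlier in the thread would look
roughly like this in cluster.conf, with fence_xvm tried first and fence_ipmilan
only if it fails. This is only a sketch: the node and device names, IPMI
address and credentials are placeholders, and the fence_xvm attribute naming
the guest (domain vs. port) depends on the agent version.

  <clusternode name="vm-node1" nodeid="1">
    <fence>
      <!-- first attempt: ask fence_virtd on the hosts to kill the guest -->
      <method name="virt">
        <device name="xvm" domain="vm-node1"/>
      </method>
      <!-- fallback: power-cycle the physical host over IPMI -->
      <method name="ipmi">
        <device name="ipmi-hostA" action="reboot"/>
      </method>
    </fence>
  </clusternode>

  <fencedevices>
    <fencedevice name="xvm" agent="fence_xvm"/>
    <fencedevice name="ipmi-hostA" agent="fence_ipmilan"
                 ipaddr="10.0.0.1" login="admin" passwd="..."/>
  </fencedevices>

The tradeoff, as noted above, is that the IPMI fallback reboots the whole
host, taking any non-clustered guests on it down as well.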




