[Linux-cluster] Clusterbehaviour if one node is not reachable & fenceable any longer?
lists at alteeve.ca
Wed Jan 29 17:46:44 UTC 2014
On 29/01/14 12:42 PM, Nicolas Kukolja wrote:
> Digimer <lists <at> alteeve.ca> writes:
>> 99% of the time, I agree totally. Logs and configs are super helpful. In
>> this case though, I am pretty sure I know exactly what's happening. :)
> Thanks for the explanation, digimer. You got exactly what I mean and what
> happens. Unfortunately, that was what I was afraid of...
> The three nodes in my scenario are located about 200km from each other.
> If one of the nodes with all infrastructure around it (PDUs, Switches,
> IPMI...) is not reachable any longer because of a power outage or a full
> network outage at this location, switching a PDU is not possible either...
> That would mean that in this (very probable) case, the cluster will not
> help me?
> Do you have any suggestions for what I can do to work around this case?
> Kind regards,
And this is the fundamental problem of stretch/geo-clusters.
I am loath to recommend this, because it's soooo easy to screw it up in
the heat of the moment, so please only ever do this after you are 100%
sure the other node is dead:
If you log into the 2 remaining nodes that are blocked (because of the
inability to fence), you can type 'fence_ack_manual'. That will tell the
cluster that you have manually confirmed the lost node is powered off.
Again, USE THIS VERY CAREFULLY!
It's tempting to make assumptions when you've got users and managers
yelling at you to get services back up. So much so that Red Hat dropped
'fence_manual' entirely in RHEL 6 because it was too easy to blow things
up. I cannot stress enough just how critical it is that you confirm
that the remote location is truly off before doing this. If it's still
on and you clear the fence action, then really bad things could happen
when the link returns.
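
To make the "confirm first" rule concrete, here is a minimal Python sketch
of a pre-flight check an operator might walk through before typing
'fence_ack_manual'. Everything in it is an assumption for illustration:
the Confirmation class, the safe_to_ack function, the source labels and
the two-source threshold are hypothetical and not part of any cluster
tool. It never runs 'fence_ack_manual' itself; it only answers whether
there is enough independent evidence that the lost node is really off.

```python
from dataclasses import dataclass


@dataclass
class Confirmation:
    """One piece of evidence about the lost node (hypothetical helper)."""
    source: str        # who or what reported it, e.g. "on-site staff phone call"
    node_is_off: bool  # True if this source says the node is powered off


def safe_to_ack(confirmations, required=2):
    """Return True only if it looks safe to acknowledge the fence manually.

    Assumed policy (not from any cluster tool): at least `required`
    distinct sources must say the node is off, and no source may say
    it is still on.
    """
    # Any single report that the node is still up vetoes the whole thing.
    if any(not c.node_is_off for c in confirmations):
        return False
    # Count distinct sources so the same report is not counted twice.
    sources = {c.source for c in confirmations}
    return len(sources) >= required


if __name__ == "__main__":
    checks = [
        Confirmation("on-site staff phone call", True),
        Confirmation("utility outage report", True),
    ]
    if safe_to_ack(checks):
        print("Evidence agrees: node is off. Manual acknowledgement may proceed.")
    else:
        print("Not confirmed. Do NOT clear the fence action.")
```

The veto on any "still on" report is deliberate: one contradicting
source should stop you, no matter how many others say the site is dark.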
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?