[Linux-cluster] qdisk WITHOUT fencing
Gordan Bobic
gordan at bobich.net
Mon Jun 21 09:20:34 UTC 2010
On 06/21/2010 08:52 AM, Kaloyan Kovachev wrote:
> On Fri, 18 Jun 2010 18:15:09 +0200, brem belguebli
> <brem.belguebli at gmail.com> wrote:
>> How do you deal with fencing when the intersite interconnects (SAN and
>> LAN) are the cause of the failure ?
>>
>
> GPRS or the good old modem over a phone line?
That isn't going to work if the whole site is down for whatever reason
(unlikely as it may be).
To protect yourself from the 100% outage of a remote site, the only sane
way of approaching it that I can think of is to do something like the
following:
1) Make each node fence itself off from the failed node using iptables
or some other firewalling method. The SAN should also be prevented from
allowing the booted out node back onto it.
2) Fail over the IP address or DNS name of the service. Since it's
across different sites, you are likely to have to use something like RIP
to re-route the IPs, so DNS on short refresh may well be an easier and
possibly safer option. It'll mean some downtime, but probably less than
any manual intervention in an unplanned case.
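As a rough sketch of step 1, something like the following Python could build the iptables rules a surviving node would apply to cut itself off from the failed node. The function name and the example address (192.0.2.10) are made up for illustration, and the commands are only printed here, not executed. Blocking the node on the SAN side is not covered, since that depends entirely on the storage in use.

```python
import ipaddress


def self_fence_commands(failed_node_ip):
    """Return iptables commands that drop all traffic to/from failed_node_ip.

    Hypothetical helper, for illustration only.
    """
    # Validate the address first; raises ValueError on malformed input.
    ip = str(ipaddress.ip_address(failed_node_ip))
    return [
        # Drop anything arriving from the failed node.
        ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"],
        # Drop anything this node would send to the failed node.
        ["iptables", "-I", "OUTPUT", "-d", ip, "-j", "DROP"],
    ]


if __name__ == "__main__":
    # Example address from the documentation range, not a real node.
    for cmd in self_fence_commands("192.0.2.10"):
        print(" ".join(cmd))
```

To actually apply the rules, each list could be handed to subprocess.run() as root. Inserting with -I rather than appending with -A puts the DROP rules ahead of any existing ACCEPT rules.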
It's not entirely ideal, but it's about as good as it is likely to get.
And you can write a fencing agent to do something like this easily enough.
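A minimal skeleton of such an agent might look like the following, assuming the agent is handed its options as name=value lines on standard input, which is the usual convention for cluster fence agents. The option name "ipaddr" and the function names are assumptions for illustration, and the actual isolation step is left as a placeholder.

```python
import sys


def parse_agent_options(lines):
    """Parse name=value lines, skipping blank lines and '#' comments."""
    options = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            options[name.strip()] = value.strip()
    return options


def main(stream=sys.stdin):
    """Return 0 if fencing is considered successful, 1 otherwise."""
    options = parse_agent_options(stream)
    target = options.get("ipaddr")  # assumed option name
    if not target:
        sys.stderr.write("no target address given\n")
        return 1
    # Placeholder: apply the firewall rules and the DNS/IP failover here.
    print("would isolate", target)
    return 0
```

The script's entry point would then call sys.exit(main()), so that the cluster sees a zero exit status only when the isolation steps have actually succeeded.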
Gordan