[Linux-cluster] Arbitrary heuristics
Lon Hohberger
lhh at redhat.com
Mon Nov 26 18:44:53 UTC 2007
On Tue, 2007-11-20 at 14:06 -0800, Scott Becker wrote:
> I've been pondering what I'm actually looking for.
>
> Each of my nodes has a public and a private NIC. Public is for serving
> web pages, private is for fencing. I was desperately trying to get
> fencing to work over the public network but I was faced with
> reimplementing a complicated fence agent in C in order to use ssh
> (supported OK by my power switches but difficult to add to the Python
> fence agent).
>
> My remaining issue is that if I lose one of my public NICs, I must
> ensure that the ensuing fencing race is won by a good node and not by
> the bad node, which thinks it's good. Quorum doesn't solve this, because
> the cluster must also keep working down to the 'last man standing'
> (starting with 3 nodes).
>
> So pondering, I realized that I don't really need to monitor the ability
> to reach the gateway. What I need is for a public comm error to create
> an event, hence I use the public NIC for cluster comms. Then I need to
> do something so that the bad node doesn't fence the good nodes.
>
> So assuming only one real failure at a time, I'm thinking of making the
> first step in the fencing method a check for pinging the gateway. That
> way when a node wants to fence, it will only be able to if its public
> NIC is working, even though it's using the private NIC for the rest of
> the fencing.
That's a pretty good + simple idea.
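A rough sketch of what that first step could look like, written as a tiny
agent that does nothing except the gateway check. Assumptions, none of them
from Scott's setup: the option name "gateway", the function names, and the
ping count/timeout are made up for illustration; the options are read as
name=value lines on stdin, the way fenced hands them to agents; and it
relies on Linux ping returning 0 when at least one reply comes back. It
also assumes this agent is listed ahead of the power switch device in the
same fence method, so that a failure here stops the method before the
power switch is touched.

```python
#!/usr/bin/env python3
"""Hypothetical 'gateway check' fence step: succeeds only if the gateway answers ping."""
import subprocess
import sys


def parse_options(lines):
    """Parse name=value lines (the stdin format used for fence agent options)."""
    opts = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        opts[name.strip()] = value.strip()
    return opts


def gateway_reachable(gateway, count=3, timeout=2, runner=subprocess.call):
    """Return True if ping reports at least one reply from the gateway.

    'runner' is injectable only so the check can be exercised without a network.
    """
    cmd = ["ping", "-c", str(count), "-W", str(timeout), gateway]
    rc = runner(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return rc == 0


def main(stdin=sys.stdin):
    """Exit status 0 lets the fence method continue; non-zero makes this step fail."""
    opts = parse_options(stdin)
    gateway = opts.get("gateway")  # hypothetical option name
    if not gateway:
        return 1  # no gateway configured: refuse rather than allow fencing blindly
    return 0 if gateway_reachable(gateway) else 1
```

To run it as an agent, the file would end with sys.exit(main()). A node
whose public NIC is dead then fails this step and never gets as far as the
power switch, while a healthy node passes it and fences over the private
NIC as before. If the gateway could also be reached over the private
network, ping's -I option could pin the check to the public interface.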
-- Lon