[Linux-cluster] qdisk WITHOUT fencing

Brem Belguebli brem.belguebli at gmail.com
Fri Jun 18 06:12:36 UTC 2010


If I may make this comparison:
- All the other known cluster stacks (linux/unix/win....) have the
Japanese (Harakiri) sense of honor, i.e. if a node goes wrong and commits
suicide, all the remaining nodes blindly trust that the node
committed suicide
- RHCS has the Italian sense of honor (Mafioso): when a node goes
wrong, even if some cluster process makes this node commit suicide
(qdisk for instance), the remaining nodes do not trust it until some
node of the cluster "shoots the sick node in the head"

It's clear that geo-clustering with RHCS is normally impossible because
of this constraint, though some scripting logic could make it possible to
bypass fencing completely and still guarantee the integrity of the cluster.
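
A minimal sketch of what such scripting logic might look like, assuming a
custom fence agent that reports success only while the local node can still
read the quorum disk (on the theory that the other node has already been
killed by qdisk). Fence agents receive their options as name=value lines on
stdin and signal success with exit code 0; everything else here is an
assumption for illustration: the "device" option, the function names, and
the plain read used as a health check. A real check would have to bypass the
page cache, and this is not a substitute for real fencing.

```python
def parse_agent_options(text):
    """Parse the name=value lines that a fence agent receives on stdin."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def can_read_quorum_disk(device):
    """Hypothetical check: can this node still open and read the qdisk device?"""
    try:
        with open(device, "rb") as handle:
            handle.read(512)
        return True
    except OSError:
        return False


def fence_exit_code(options, disk_readable):
    """Return 0 ("fenced") only if a victim is named and we still see the qdisk."""
    if not options.get("nodename"):
        return 1  # no victim named: refuse to claim success
    return 0 if disk_readable else 1


# Example: the input an agent might receive (values are made up).
sample = "nodename=node2\ndevice=/dev/mapper/qdisk\n"
opts = parse_agent_options(sample)
print(fence_exit_code(opts, can_read_quorum_disk(opts["device"])))
```

A wrapper script would read stdin, call these functions, and exit with the
returned code, so the surviving side stops waiting on a power switch it
cannot reach.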

Brem

On Thu, 2010-06-17 at 23:31 +0000, Jankowski, Chris wrote:
> Jim,
> 
> You have hit an architectural limitation of Linux Cluster, one that is specific to its design and that other clusters tend not to have.
> 
> Linux Cluster assumes that you will *always* be able to execute fencing of *all* other nodes.  In fact, this is a stated *prerequisite* for correct operation of the cluster.
> 
> This is all very well when you have two PCs under your desk and a power switch.
> 
> However, this model completely fails when any network more complex than a power switch is present. Your network fails and you have a partitioned cluster that cannot fence. It all gets stuck. From the practical, operational point of view of an IT department, this is a disaster worse than not having a cluster.
> 
> Having come to Linux Cluster with a TruCluster background, I always had a problem with the STONITH approach used by Linux Cluster. I deem it harmful. But I see no inclination anywhere in the Linux Cluster world to remove it.
> 
> I believe that there is a major philosophical chasm between the design stance of Linux Cluster and that of the others. Linux Cluster seems to be saying "A node is the centre of the world and can control it".  Other clusters take the opposite stance: "A node is a part of the world, cannot control it and may have a very limited visibility of the world in some circumstances."
> 
> Regards,
> 
> Chris Jankowski
> 
> 
> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of jimbob palmer
> Sent: Friday, 18 June 2010 01:59
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] qdisk WITHOUT fencing
> 
> Dear distinguished linux-cluster members!
> 
> I have two data centers linked by physical fibre. Everything goes over this physical route: everything.
> 
> I would like to set up a high availability nfs server with drbd:
> * drbd to replicate storage
> * nfsd running
> * floating ip
> 
> If the physical link between the two data centers is lost, I would like the primary data center to win.
> 
> I've set up a qdisk, and this works well: the node which can access the qdisk wins, i.e. the primary data center, which is where the SAN holding the qdisk also lives, wins.
> 
> Unfortunately for me, I get pages and pages of errors about being unable to fence the secondary node.
> 
> The docs tell me that I absolutely must use power fencing, but in this case fencing makes no sense: it won't work when the link between the data centers is severed. The network and the qdisk decide who "wins".
> 
> So what should I do?
> 
> Many thanks in advance.
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster