[Linux-cluster] Understanding Fencing

Sat Sep 1 16:28:18 UTC 2012

Side note, then I will answer in-line. When possible, please start a new 
email to a mailing list instead of hitting "reply" on an existing 
message and then deleting the content. Threading breaks in a lot of 
people's email clients when an email isn't new.

On 09/01/2012 11:57 AM, joshi dhaval wrote:
> Hello,
>
> I tried to read some documents on fencing, still bit confused with
> technology. ( i dont want to buy any extra hardware just for fencing ).

Was this one of the things you read?

https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing

> we are using HP DL 380 G6, G7 servers at out environment, only way i can
> see fencing possible in my environment is HP ILO.

Yes, you can use fence_ilo with that. I have done so myself and cover 
how to set it up here:

https://alteeve.ca/w/Configuring_HP_iLO_2_on_EL6

and how to use it as a fence device here:

https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Example_.3Cfencedevice....3E_Tag_For_HP_iLO

> what is PDU ? do i need to purchase separate device to enable fencing
> using PDU ?

A PDU (Power Distribution Unit) is, by itself, just another name for a 
power bar, though it generally refers to rack-mounted power bars. In 
fencing though, we use a version called a "Switched PDU". These are 
power bars with a network connection. They allow you to connect remotely 
and turn each outlet on and off independently of the other outlets. They 
also offer power monitoring and so on, but that's outside fencing.

So in fencing, if for example the node's power supply failed, the server 
would power down and take the IPMI or iLO interface with it (see below). 
Without any power at all, the IPMI will not reply as it will also have 
no power. We know in this case that the node is gone, but the other 
nodes don't. All they know is that they can't talk to the node or its 
IPMI/iLO interface, which could just as well be a network outage that 
leaves the node alive.

In this case, the cluster can call the Switched PDUs and ask them to 
turn off the outlet(s) feeding the server. When the PDUs say "ok, 
they're off", *then* the cluster can safely say "ok, now I know it has 
to be off" and can begin recovery.
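
As a rough sketch of that "only trust it once the PDUs confirm" logic 
(illustrative Python, not how the cluster software is actually written; 
the function name, the outlet labels and the fake PDU call are all made 
up for this example):

```python
from typing import Callable, Iterable


def fence_via_pdu(outlets: Iterable[str],
                  switch_off: Callable[[str], bool]) -> bool:
    """Return True only if every outlet feeding the node confirms it is off."""
    # Ask every outlet, even if an earlier one failed, so we cut as much
    # power as we can.
    results = [switch_off(outlet) for outlet in outlets]
    # An empty outlet list proves nothing, so treat it as a failure.
    return bool(results) and all(results)


# Hypothetical node with two power supplies, each fed by a different PDU.
outlets = ["pdu1:3", "pdu2:3"]


# Stand-in for the real PDU call; here "pdu2" pretends to be unreachable.
def fake_switch_off(outlet: str) -> bool:
    return not outlet.startswith("pdu2")


confirmed = fence_via_pdu(outlets, fake_switch_off)
print("safe to recover" if confirmed else "node state unknown, keep blocking")
```

The point of the sketch is the last line of the function: if a server 
has two power supplies, one outlet saying "off" is not enough. Every 
outlet has to confirm before the node can be considered dead.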

> is that IPMI is same as HP ILO ?

No, but they are similar. I have a short write-up of it here:

https://alteeve.ca/w/IPMI

IPMI is a generic way for a server to offer "Out of Band" management. 
That is just a fancy way of saying "You can check on the state of the 
server even when the server is powered off".

The piece of hardware inside your server that provides IPMI is called a 
"BMC" (Baseboard Management Controller). Think of it like a little, 
separate computer sitting on your server's motherboard. It draws its 
power from the host, it can read the host's sensors (power state, fans, 
temperatures, etc.) but it is still a totally separate device.

In fencing, if one node stops responding (say because the OS crashed), 
another node in the cluster will call the victim's IPMI interface and 
say "please power off the host". The BMC then, effectively, "pushes and 
holds the power button" until the host shuts down. Then the IPMI device 
tells the caller that the power off was successful. The cluster then 
knows the state of the victim (it is powered off now) so it can safely 
recover.
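
The same sequence as a minimal sketch (illustrative Python with a fake 
BMC standing in for real hardware; the class and function names are 
assumptions for this example, and in a real cluster the fence agent does 
the talking to the BMC for you):

```python
import time


class FakeBMC:
    """Stand-in for a real BMC, which would be reached over the network."""

    def __init__(self, polls_until_off: int = 2):
        self.powered_on = True
        self._polls_left = polls_until_off
        self._off_requested = False

    def request_power_off(self) -> None:
        # "Push and hold the power button."
        self._off_requested = True

    def power_is_on(self) -> bool:
        # Pretend the host takes a couple of status polls to actually die.
        if self._off_requested and self.powered_on:
            self._polls_left -= 1
            if self._polls_left <= 0:
                self.powered_on = False
        return self.powered_on


def fence_via_ipmi(bmc, attempts: int = 5, delay: float = 0.0) -> bool:
    """Ask for power off, then report success only once the BMC says 'off'."""
    bmc.request_power_off()
    for _ in range(attempts):
        if not bmc.power_is_on():
            return True   # state is known, recovery can begin
        time.sleep(delay)
    return False          # never confirmed, so the fence has failed


print(fence_via_ipmi(FakeBMC()))
```

Note that asking for the power off is not what counts as success. The 
fence only succeeds when the BMC reports that the host is actually off.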

As for the difference between IPMI and iLO:

Most major hardware vendors took IPMI and added a bunch of features on 
top of it. Then they renamed it to something they wanted. So HP called 
theirs "iLO", IBM called theirs "RSA", Dell called theirs "DRAC" and so 
on. These are all very similar to IPMI (some are similar enough that 
stock IPMI tools work with them).

> for above hardware what you suggest are the most reliable fencing
> techniques i should use ?

I would use 'fence_ilo'.

> is that cross cable connection is possible just to check hearbeats like
> VCS has gab and llt ?

I don't know VCS or llt so I can't comment. In RHCS, we use "corosync" 
for cluster membership. By default, it uses a multicast group for 
passing messages around the cluster and for detecting a node's death. 
It's similar to what I think you mean by "heartbeat". It is advised that 
you use a proper switch, though I do not believe it is required.

> i am panning to configure 2 nodes cluster first once i will have
> confidence i will move it to 4 or 5 node cluster.

Then definitely use a proper switch, not back to back.

> Regards,
> Dhaval

A final comment:

In clustering, a failed fence action will leave the cluster in a state 
where it does not know the condition of a member. Given the dangers of 
making an assumption, the cluster would rather block (hang) than proceed 
in a way that could cause damage. This is why fencing is so critical; it 
restores the cluster to a known state after a fault.

If you use only iLO for fencing (and many people do only use IPMI, iLO, 
etc), then you will be fine most of the time. For me personally, this is 
not good enough. If for any reason the other node(s) can't reach the 
IPMI or iLO interface, the fence action will fail and the cluster will 
hang. With a switched PDU, you have a backup fence device that would 
protect you against this by providing an alternate method of confirming 
the node's state. Thus, by adding a switched PDU to your cluster, you 
remove another single point of failure.
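
To illustrate the idea of a backup fence device, here is a small sketch 
(illustrative Python only; the function names and the fake iLO and PDU 
calls are hypothetical, not the real cluster code). Methods are tried in 
order, and if none of them confirms the node is off, the only safe 
answer is to keep blocking:

```python
from typing import Callable, List, Optional, Tuple

# A fence method is a name plus a callable that returns True on
# confirmed power off.
FenceMethod = Tuple[str, Callable[[], bool]]


def fence_node(methods: List[FenceMethod]) -> Optional[str]:
    """Return the name of the first method that confirms the fence, or
    None if every method failed (the cluster must then stay blocked)."""
    for name, attempt in methods:
        try:
            if attempt():
                return name
        except OSError:
            # The fence device could not be reached; try the next method.
            continue
    return None


# Stand-ins for real fence calls. The iLO is unreachable (say the node
# lost all power), but the switched PDU still answers.
def ilo_unreachable() -> bool:
    raise ConnectionError("iLO did not answer")


def pdu_ok() -> bool:
    return True


used = fence_node([("ilo", ilo_unreachable), ("pdu", pdu_ok)])
print(used)

only_ilo = fence_node([("ilo", ilo_unreachable)])
print(only_ilo)
```

With only the iLO method configured, the result is None and the cluster 
hangs. With the PDU listed as a second method, the fence succeeds and 
recovery can proceed.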

digimer

-- 
Digimer
Papers and Projects: https://alteeve.ca