[Linux-cluster] Fencing Device Question
Brandon Young
bkyoung at gmail.com
Tue Jun 3 17:55:57 UTC 2008
In my GFS cluster, I use DRAC cards as the fencing device for each node.
Yesterday, the DRAC card on one node failed: it would not allow remote
logins, etc., but it still answered pings. I don't know how long the card
had been dead. I only noticed because I tried to fence the node manually
and fencing failed, which made recovering the cluster a lot more "fun"
afterwards. So, I have uncovered a pretty scary failure scenario for my
cluster configuration.
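
Since the card kept answering pings while refusing logins, a ping check
alone would not have caught this. As a rough illustration of the kind of
external check I have in mind, here is a minimal Python sketch that tests
whether each fencing device still accepts a TCP connection on its
management port. The function names and port 22 are my own assumptions
for illustration; nothing here is part of RHCS/GFS. An open port is also
weaker evidence than a successful login, so this is only a first-line
probe.

```python
import socket


def port_accepts_connections(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_fence_devices(devices, port=22, timeout=5.0):
    """Probe each device; return a dict mapping node name to True/False.

    `devices` maps a node name to the address of its fencing device.
    Port 22 is an assumption; use whichever port the card's remote
    login service listens on.
    """
    return {
        name: port_accepts_connections(addr, port, timeout)
        for name, addr in devices.items()
    }


# Example use (hypothetical node names and addresses):
#   results = check_fence_devices({"node1": "192.0.2.11", "node2": "192.0.2.12"})
#   for name, ok in sorted(results.items()):
#       print(name, "OK" if ok else "FENCE DEVICE UNREACHABLE")
```

Run from cron on a machine outside the cluster, something like this would
at least flag a card that has stopped listening, though it says nothing
about whether a fence operation would actually succeed.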
My question is what (if anything) can RHCS/GFS do to determine the
health/presence/operation of fencing devices? If it can do something to
monitor the fencing devices, and discovers a bad fencing device, what will
it do? For example, if I unplug the network cable for the heartbeat, the
node will get fenced immediately. I never tested whether the same would
happen if I unplugged a fencing device. I haven't delved into the
documentation in a while, but I don't remember anything about a way to have
redundant fencing devices, like a DRAC and a network power switch. Is there
a way?
Thoughts, opinions, insight, documentation, etc. would be greatly
appreciated.
--
Brandon