[Linux-cluster] Diskless Quorum Disk
Lon Hohberger
lhh at redhat.com
Fri Jun 22 20:36:04 UTC 2007
On Wed, Jun 20, 2007 at 05:57:05PM -0500, Chris Harms wrote:
> My nodes were set to "quorum=1 two_node=1" and fenced by DRAC cards
> using telnet over their NICs. The same NICs used in my bonded config on
> the OS so I assumed it was on the same network path. Perhaps I assume
> incorrectly.
That sounds mostly right. The point is that a node disconnected from
the cluster must not be able to fence a node which is supposedly still
connected.
That is: 'A' must not be able to fence 'B' if 'A' becomes disconnected
from the cluster. However, 'A' must be able to be fenced if 'A' becomes
disconnected.
Why was DRAC unreachable; was it unplugged too? (Is DRAC like IPMI - in
that it shares a NIC with the host machine?)
> Desired effect would be survivor claims service(s) running on
> unreachable node and attempts to fence unreachable node or bring it back
> online without fencing should it establish contact. Actual result was
> survivor spun its wheels trying to fence unreachable node and did not
> assume services.
Yes, this is an unfortunate limitation of using (most) integrated power
management systems. Basically, some BMCs share a NIC with the host
(IPMI), and some run off of the machine's power supply (IPMI, iLO,
DRAC). When the fence device becomes unreachable, we don't know whether
it's a total network outage or a "power disconnected" state.
* If the power to a node has been disconnected, it's safe to recover.
* If the node just lost all of its network connectivity, it's *NOT* safe
to recover.
* In both cases, we cannot confirm that the node is dead... which is
why we don't recover.
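The three cases above boil down to one rule, which can be sketched as a tiny decision function (hypothetical names only, not actual fenced/rgmanager code):

```python
# Hypothetical sketch of the recovery rule described above -- not
# actual fenced/rgmanager code.
def may_recover_services(fence_succeeded: bool) -> bool:
    if fence_succeeded:
        # Node confirmed powered off: safe to take over its services.
        return True
    # Fence device unreachable: this could mean "power pulled" (safe)
    # or "network outage, node still running" (unsafe).  We cannot
    # tell the difference, so we must not recover -- keep retrying
    # the fence instead.
    return False
```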
> Restoring network connectivity induced the previously
> unreachable node to reboot and the surviving node experienced some kind
> of weird power off and then powered back on (???).
That doesn't sound right; the surviving node should have stayed put (not
rebooted).
> Ergo I figured I must need quorum disk so I can use something like a
> ping node. My present plan is to use a loop device for the quorum disk
> device and then setup ping heuristics. Will this even work, i.e. do the
> nodes both need to see the same qdisk or can I fool the service with a
> loop device?
I don't believe the effects of tricking qdiskd in this way have been
explored; I don't see why it wouldn't work in theory.  But qdiskd, with
or without a real disk, won't fix the behavior you experienced (uncertain
state due to failure to fence -> retry / wait for the node to come back).
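For reference, a qdiskd ping heuristic is configured in cluster.conf
roughly like this (a sketch only -- the label, timings, and ping target
10.0.0.254 are placeholder values, and this has not been tested against
a loop device):

```xml
<!-- Sketch only: label, timings, and target are placeholders. -->
<quorumd interval="1" tko="10" votes="1" label="myqdisk">
    <!-- The node keeps its qdisk vote only while the ping succeeds. -->
    <heuristic program="ping -c1 -w1 10.0.0.254"
               score="1" interval="2" tko="3"/>
</quorumd>
```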
> I am not deploying GFS or GNDB and I have no SAN. My only
> option would be to add another DRBD partition for this purpose which may
> or may not work.
> What is the proper setup option, two_node=1 or qdisk?
In your case, I'd say two_node="1".
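For a two-node cluster without qdisk, that's the following line in
cluster.conf (standard two-node settings; your file may differ):

```xml
<!-- Two-node mode: the cluster stays quorate with a single vote. -->
<cman two_node="1" expected_votes="1"/>
```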
--
Lon Hohberger - Software Engineer - Red Hat, Inc.