[Linux-cluster] Diskless Quorum Disk

Chris Harms chris at cmiware.com
Fri Jun 22 23:43:53 UTC 2007


Lon, thank you for the response.  It appears that what I thought was a
fence duel was actually the cluster fencing the proper node and DRBD
halting the surviving node after a split-brain scenario.  (I obviously
have some work to do on my drbd.conf.)  After the fenced node revived,
it saw that the other node was unresponsive (it had been halted) and
fenced it, in this case inducing it to power back on.

Our DRAC shares the NICs with the host.  We will probably hack on the
DRAC fence script a little to take advantage of some other features it
offers beyond a simple poweroff/poweron cycle.
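
For reference, the fencing side of our cluster.conf currently looks
roughly like this; node names, addresses, and credentials are made up,
and the modified agent would be dropped in where fence_drac is now:

  <clusternode name="node1" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="drac-node1"/>
      </method>
    </fence>
  </clusternode>

  <fencedevices>
    <!-- fence_drac telnets to the DRAC and power-cycles the node -->
    <fencedevice agent="fence_drac" name="drac-node1"
                 ipaddr="10.0.0.11" login="root" passwd="secret"/>
    <fencedevice agent="fence_drac" name="drac-node2"
                 ipaddr="10.0.0.12" login="root" passwd="secret"/>
  </fencedevices>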

Using two_node=1 may be an option again, but the FAQ indicates a quorum
disk might still be beneficial.  Using a loop device didn't seem to go
so well, though that could be due to a configuration error on my part.
Having one node unable to see the qdisk is probably an automatic test
failure.
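
In cluster.conf terms, the two options I'm weighing look something like
this; the quorumd timings and the heuristic target are guesses pending
testing, and the device path would end up being the loop device or a
small dedicated DRBD partition:

  <!-- Option 1: plain two-node mode, no quorum disk -->
  <cman two_node="1" expected_votes="1"/>

  <!-- Option 2: qdiskd with a ping heuristic (two_node set back to 0) -->
  <cman expected_votes="3"/>
  <quorumd interval="1" tko="10" votes="1" min_score="1"
           device="/dev/loop0">
    <heuristic program="ping -c1 -w1 192.168.1.254"
               score="1" interval="2" tko="3"/>
  </quorumd>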

Thanks again,
Chris


Lon Hohberger wrote:
> On Wed, Jun 20, 2007 at 05:57:05PM -0500, Chris Harms wrote:
>   
>> My nodes were set to "quorum=1 two_node=1" and fenced by DRAC cards 
>> using telnet over their NICs.  The same NICs are used in my bonded
>> config on the OS, so I assumed it was on the same network path.
>> Perhaps I assumed incorrectly.
>>     
>
> That sounds mostly right.  The point is that a node disconnected from
> the cluster must not be able to fence a node which is supposedly still
> connected.
>
> That is: 'A' must not be able to fence 'B' if 'A' becomes disconnected
> from the cluster.  However, 'A' must be able to be fenced if 'A' becomes
> disconnected.
>
> Why was DRAC unreachable; was it unplugged too? (Is DRAC like IPMI - in
> that it shares a NIC with the host machine?)
>
>   
>> The desired effect would be for the survivor to claim the service(s)
>> running on the unreachable node and attempt to fence it, or to bring
>> it back online without fencing should it re-establish contact.  The
>> actual result was that the survivor spun its wheels trying to fence
>> the unreachable node and did not take over the services.
>>     
>
> Yes, this is an unfortunate limitation of using (most) integrated power
> management systems.  Basically, some BMCs share a NIC with the host
> (IPMI), and some run off of the machine's power supply (IPMI, iLO,
> DRAC).  When the fence device becomes unreachable, we don't know whether
> it's a total network outage or a "power disconnected" state.
>
> * If the power to a node has been disconnected, it's safe to recover.
>
> * If the node just lost all of its network connectivity, it's *NOT* safe
> to recover.
>
> * In both cases, we can not confirm the node is dead... which is why we
> don't recover.
>
>   
>> Restoring network connectivity induced the previously unreachable
>> node to reboot, and the surviving node experienced some kind of weird
>> power-off and then powered back on (???).
>>     
>
> That doesn't sound right; the surviving node should have stayed put (not
> rebooted).
>
>   
>> Ergo I figured I must need a quorum disk so I can use something like
>> a ping node.  My present plan is to use a loop device for the quorum
>> disk device and then set up ping heuristics.  Will this even work,
>> i.e. do both nodes need to see the same qdisk, or can I fool the
>> service with a loop device?
>>     
>
> I don't believe the effect of tricking qdiskd in this way has been
> explored; I don't see why it wouldn't work in theory, but... qdiskd with
> or without a disk won't fix the behavior you experienced (uncertain
> state due to failure to fence -> retry / wait for node to come back).
>
>   
>> I am not deploying GFS or GNBD and I have no SAN.  My only option
>> would be to add another DRBD partition for this purpose, which may or
>> may not work.
>>     
>
>   
>> What is the proper setup option, two_node=1 or qdisk?
>>     
>
> In your case, I'd say two_node="1".
>
>   



