[Linux-cluster] Cluster stability with missing qdisk

Lon Hohberger lhh at redhat.com
Wed Feb 15 20:51:19 UTC 2012


On 02/10/2012 08:48 AM, Jan Huijsmans wrote:
> Hello,
>
> In our clusters we use a qdisk to determine which node has quorum in case of a split-brain situation.
>
> This works great... until the qdisk itself is hit by problems with the SAN. Is there a way to have a stable cluster,
> with qdisks, where the absence of one qdisk won't kill the cluster altogether? At the moment, with a single-qdisk setup,
> the cluster depends entirely on the availability of the qdisk, while, IMHO, it should be expendable.

What kind of problems are you trying to avoid?

1) I/O errors -> disk died:

solution: set max_error_cycles to something nonzero (1? 2?); qdiskd 
will then exit on the host where the problems are occurring once it 
has received I/O errors for that many cycles
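
As a rough sketch, max_error_cycles is an attribute of the quorumd 
element in cluster.conf. The interval, tko, votes and label values 
below are placeholders chosen for illustration, not recommendations 
and not taken from this thread; adjust them to your own setup:

    <!-- hypothetical values; only max_error_cycles is the point here -->
    <quorumd interval="1" tko="10" votes="1" label="myqdisk"
             max_error_cycles="2"/>

With the default of 0 there is no maximum, so qdiskd keeps running 
through I/O errors instead of exiting.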

2) Long I/O hangs (e.g. path failover)

solution: current 3.1.x / 3.2.x differentiates between I/O hangs and I/O 
errors, so hangs (e.g. due to path failover) no longer cause reboots.

-- Lon



