[Linux-cluster] Cluster stability with missing qdisk

Jan Huijsmans Jan.Huijsmans at interaccess.nl
Fri Feb 10 18:04:50 UTC 2012


The timeout is now 150 seconds for qdiskd (so it can survive two path failures and still have 30 seconds left to try the third path in a dual-fabric, dual-path-per-fabric setup) and 300 seconds for cman.
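
For reference, roughly how that would look in cluster.conf; the interval/tko split is my assumption (qdiskd's effective timeout is interval * tko), as is reading the 300 s on the cman side as the totem token timeout:

    <!-- qdiskd declares the device dead after interval * tko = 3 * 50 = 150 s -->
    <quorumd interval="3" tko="50" votes="1" label="qdisk"/>
    <!-- totem token timeout is given in milliseconds: 300 s -->
    <totem token="300000"/>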

I would like to add 2 more qdisks, just to make sure the cluster isn't rebooting merely because one location is unreachable.
It's not the application that's having problems; it's the cluster software that's causing them.

With regards,
Inter Access BV

ing. J.C.M. (Jan) Huijsmans
Designer / Technical Consultant UNIX & Linux
Infrastructure Professional Services UNIX

E-mail:  jan.huijsmans at interaccess.nl

Tel:     035 688 8266
Mob.:    06 4938 8145

Hoofdkantoor:
Colosseum 9, 1213 NN Hilversum
Postbus 840, 1200 AV Hilversum
K.v.K. Hilversum 32032877
________________________________________
From: linux-cluster-bounces at redhat.com [linux-cluster-bounces at redhat.com] On Behalf Of emmanuel segura [emi2fast at gmail.com]
Sent: 10 February 2012 18:00
To: linux clustering
Subject: Re: [Linux-cluster] Cluster stability with missing qdisk

I understand what you're saying, and that's right.

One solution could be to play with the qdisk cluster timeouts.

See man qdisk for more info.
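
(The relevant knobs are the interval and tko attributes on the <quorumd> element in cluster.conf; qdiskd declares the device dead after interval * tko seconds, so for example interval=3, tko=50 gives 150 seconds.)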

2012/2/10 Jan Huijsmans <Jan.Huijsmans at interaccess.nl>
We're already using it on multipath; it's a failure to write to the device that's killing the cluster.
All other devices work, so the cluster could function as it should, were it not for the reboot triggered by the cluster software.

I would like to prevent the cluster from rebooting nodes just because the qdisk isn't responding (due to slow storage, a failure at the quorum location such as power loss, or other errors unrelated to the application).

When both nodes are up and the application is able to run, there should be no reboot in my opinion.
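
If I read qdisk(5) correctly, the quorumd element has attributes aimed at exactly this; a sketch only, to be verified against the man page of your release:

    <!-- reboot="0": don't reboot on a negative score transition;
         allow_kill="0": don't ask cman to kill nodes qdiskd considers dead -->
    <quorumd interval="3" tko="50" votes="1" label="qdisk" reboot="0" allow_kill="0"/>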

With regards,
Inter Access BV

ing. J.C.M. (Jan) Huijsmans
Designer / Technical Consultant UNIX & Linux
Infrastructure Professional Services UNIX

E-mail:  jan.huijsmans at interaccess.nl

Tel:     035 688 8266
Mob.:    06 4938 8145

Hoofdkantoor:
Colosseum 9, 1213 NN Hilversum
Postbus 840, 1200 AV Hilversum
K.v.K. Hilversum 32032877
________________________________________
From: linux-cluster-bounces at redhat.com [linux-cluster-bounces at redhat.com] On Behalf Of emmanuel segura [emi2fast at gmail.com]
Sent: 10 February 2012 15:16
To: linux clustering
Subject: Re: [Linux-cluster] Cluster stability with missing qdisk

Why don't you use the qdisk on multipath? That could resolve your problem.
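
Roughly like this; the multipath device name below is made up, and referencing the qdisk by label in cluster.conf avoids tying it to a specific /dev path:

    # label the quorum disk on the multipath device (hypothetical name)
    mkqdisk -c /dev/mapper/mpath0 -l qdisk

    # then reference it by label in cluster.conf:
    <quorumd label="qdisk" interval="3" tko="50" votes="1"/>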

2012/2/10 Jan Huijsmans <Jan.Huijsmans at interaccess.nl>
Hello,

In the clusters we have, we use a qdisk to determine which node has quorum in case of a split-brain situation.

This works great... until the qdisk itself is hit by problems with the SAN. Is there a way to have a stable cluster,
with qdisks, where the absence of one qdisk won't kill the cluster altogether? At the moment, with the single-qdisk setup,
the cluster is totally dependent on the availability of the qdisk, while, IMHO, it should be expendable.

We now have a triangle setup, with 2 data centers and 1 extra 'quorum' location for the IBM SAN. In the SAN setup there
are 3 quorum devices, 1 in each data center and the 3rd at the quorum location. When 1 location fails (one of the data
centers or the quorum location), the SAN is still up and running.

Is it possible to copy this setup and use 3 qdisks, so that when 1 qdisk fails the cluster stays alive? I would set the vote
value of all components (systems and qdisks) to 1, so the cluster would keep running with 2 systems and 1 qdisk,
or 1 system and 2 qdisks. (It would be dead with only 3 qdisks, as the software dies with both systems. ;) )
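
Spelling out the vote arithmetic, with 1 vote per component:

    total expected votes = 2 nodes + 3 qdisks    = 5
    quorum               = floor(5/2) + 1        = 3
    2 nodes + 1 qdisk    = 3 votes  -> quorate
    1 node  + 2 qdisks   = 3 votes  -> quorate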

I've heard of setups with 3 systems, where the 3rd was there just for quorum, so this one can die, but in this
case it won't help us, as there are no systems at the 3rd location. (And it's not supported by Red Hat, if I'm
correctly informed.)

With regards,
Inter Access BV

ing. J.C.M. (Jan) Huijsmans
Designer / Technical Consultant UNIX & Linux
Infrastructure Professional Services UNIX

E-mail:  jan.huijsmans at interaccess.nl

Tel:     035 688 8266
Mob.:    06 4938 8145

Hoofdkantoor:
Colosseum 9, 1213 NN Hilversum
Postbus 840, 1200 AV Hilversum
K.v.K. Hilversum 32032877




--
this is my life and I live it for as long as God wills




--
this is my life and I live it for as long as God wills



