[Linux-cluster] qdisk max_error_cycles setting

brem belguebli brem.belguebli at gmail.com
Wed Dec 30 15:38:42 UTC 2009


It looks like the quorumd max_error_cycles parameter is not taken into account.

Here's the test I'm doing:

A 3-node cluster (RHEL 5.4) with an iSCSI qdisk LUN from a RHEL 5.4
target server.

All 3 cluster nodes have the following qdisk configuration:

<quorumd device="/dev/iscsi/storage.quorum" interval="1"
log_facility="local5" log_level="7" tko="10" votes="1"

When I block access from the 3 nodes to the target server (an iptables
rule that drops all IP traffic from the 3 nodes to the target
server), I see the quorum disk go offline, but qdisk is never stopped:
it keeps retrying the qdisk device even though I instructed it to
abort after 10 cycles (max_error_cycles=10).
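
To make my expectation concrete, here is a minimal sketch of how I read
max_error_cycles. It is only an illustration of my understanding, not
qdiskd's actual code: the function name and the list of per-cycle I/O
results are made up for the example, and treating 0 as "never exit" is
my assumption about the default.

```python
def cycle_of_abort(io_results, max_error_cycles):
    """Return the 1-based cycle at which the daemon is expected to exit,
    or None if it never exits.

    io_results: iterable of booleans, True when the cycle's disk I/O
    succeeded (hypothetical input, one entry per interval).
    max_error_cycles: consecutive failing cycles tolerated before exit;
    0 is assumed to mean "never exit".
    """
    consecutive_errors = 0
    for cycle, ok in enumerate(io_results, start=1):
        if ok:
            # A successful cycle resets the error streak.
            consecutive_errors = 0
            continue
        consecutive_errors += 1
        if max_error_cycles > 0 and consecutive_errors >= max_error_cycles:
            return cycle
    return None


# 3 good cycles, then the target becomes unreachable for good.
print(cycle_of_abort([True] * 3 + [False] * 20, max_error_cycles=10))
```

With interval="1" and max_error_cycles=10, I would therefore expect
qdisk to exit roughly 10 seconds after the target becomes unreachable.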

Am I misunderstanding the max_error_cycles definition in the qdisk man page?


PS: As a consequence of not being killed after max_error_cycles,
qdisk keeps growing (virtual memory size), and if the situation
lasts too long the OOM killer gets involved.

