[Linux-cluster] Rebooting qdisk master causes quorum to dissolve.
Peter Tiggerdine
peter.tiggerdine at uq.edu.au
Mon Dec 21 03:26:38 UTC 2009
Hi,
I have a five-node cluster with a shared quorum disk without heuristics.
Because of a hardware problem I needed to move the services off the
host in question and replace some RAM. The services moved without a
hitch, but as soon as I rebooted the node the cluster came down.
The relevant configuration is:
<cluster alias="Services" config_version="150" name="Services">
  <quorumd interval="5" tko="12" device="/dev/emcpowere" votes="3"
           log_level="9" log_facility="local4" status_file="/qdisk_status"/>
  <fence_daemon clean_start="1" post_fail_delay="15" post_join_delay="30"/>
  <cman deadnode_timeout="90" expected_nodes="4"/>
The relevant logs from an adjacent node are below:
Dec 21 11:40:15 io2 clurgmgrd[7271]: <notice> Member 1 shutting down
Dec 21 11:40:40 io2 qdiskd[6820]: <info> Node 1 shutdown
Dec 21 11:40:47 io2 openais[6801]: [CMAN ] lost contact with quorum device
Dec 21 11:40:47 io2 openais[6801]: [CMAN ] quorum lost, blocking activity
Dec 21 11:40:47 io2 clurgmgrd[7271]: <emerg> #1: Quorum Dissolved
Dec 21 11:40:47 io2 kernel: dlm: closing connection to node 1
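
For reference, the vote arithmetic implied by the configuration and logs
above can be sketched roughly as follows. This is only an illustration
under stated assumptions, not output from any cluster tool: it assumes
each node carries a single vote (the excerpt does not show per-node
votes), that the total is the sum of all node votes plus the quorum disk
votes, and that quorum is a simple majority of that total. It does not
model the expected_nodes attribute. The function names are hypothetical.

```python
# Hypothetical sanity check of the vote arithmetic; names are illustrative.
NODE_COUNT = 5        # "five-node cluster" from the post
VOTES_PER_NODE = 1    # assumption: per-node votes are not shown in the excerpt
QDISK_VOTES = 3       # votes="3" on <quorumd>


def quorum_threshold(total_votes: int) -> int:
    """Simple majority: strictly more than half of the total votes."""
    return total_votes // 2 + 1


def is_quorate(nodes_up: int, qdisk_available: bool) -> bool:
    """Compare the votes currently present against the majority threshold."""
    total = NODE_COUNT * VOTES_PER_NODE + QDISK_VOTES
    present = nodes_up * VOTES_PER_NODE
    if qdisk_available:
        present += QDISK_VOTES
    return present >= quorum_threshold(total)


if __name__ == "__main__":
    print(is_quorate(4, True))   # one node down, quorum disk still reachable
    print(is_quorate(4, False))  # one node down, quorum disk votes lost too
```

Under these assumptions the total is 8 votes and the threshold is 5, so
four surviving nodes without the quorum disk hold only 4 votes. That
would be consistent with "quorum lost" following "lost contact with
quorum device" in the log, though it does not explain why contact with
the quorum device was lost.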
Have I configured this incorrectly, or is this a known problem with
rebooting the qdisk master? It has just occurred to me that I did lock the
resource groups to prevent the moved services from returning to the
node.
Thanks in advance; I look forward to your replies,
Peter Tiggerdine
HPC & eResearch Specialist
High Performance Computing Group
Information Technology Services
University of Queensland