[Cluster-devel] unfence during startup

David Teigland teigland at redhat.com
Fri Nov 6 17:27:57 UTC 2009


The current init.d/cman startup sequence is:

start_cman
unfence_self
start_qdiskd
wait_for_quorum
start_fenced
start_dlm_controld
start_gfs_controld
join_fence_domain

I believe the reason we put unfence between cman and qdisk was in case the
qdisk was on a fenced device.  But, I'd forgotten about the more critical
case where someone runs 'service cman start' on a node after it has been
kicked out of the cluster and has been fenced (via fence_scsi).  It is
not uncommon for someone to try this -- they think they can just restart
the cluster on the node without first rebooting.  We go to a lot of
trouble in fenced and other daemons to recognize when someone does that
and shut things down again before getting far enough to corrupt storage.

Obviously, unfencing right at the beginning undercuts all those checks and
precautions, and could easily lead to corrupt storage.  So, we need to
move unfence_self to just before the join_fence_domain step.  Requiring
qdisk to use a disk that is not subject to fencing shouldn't be too
onerous, should it?
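
A rough sketch of the reordered sequence, to make the proposal concrete.
This is not the actual init.d/cman code: the step names are the ones listed
above, but every function body here is a hypothetical stub that only records
its own name, and the step() helper and ORDER variable are invented for the
illustration.

```shell
#!/bin/sh
# Hypothetical sketch: each step is a stub that only records its name.
# The real init.d/cman functions do the actual work; only the order
# of the calls is the point here.
ORDER=""
step() { ORDER="${ORDER:+$ORDER }$1"; }

start_cman()         { step start_cman; }
start_qdiskd()       { step start_qdiskd; }
wait_for_quorum()    { step wait_for_quorum; }
start_fenced()       { step start_fenced; }
start_dlm_controld() { step start_dlm_controld; }
start_gfs_controld() { step start_gfs_controld; }
unfence_self()       { step unfence_self; }
join_fence_domain()  { step join_fence_domain; }

start() {
    # unfence_self now runs last before joining the fence domain,
    # instead of immediately after start_cman.
    start_cman &&
    start_qdiskd &&
    wait_for_quorum &&
    start_fenced &&
    start_dlm_controld &&
    start_gfs_controld &&
    unfence_self &&
    join_fence_domain
}

start
echo "$ORDER"
```

Because the stubs are chained with &&, a failure in any earlier step means
unfence_self is never reached.  That is the property we want: the node only
unfences itself once everything before it has come up cleanly.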

Dave



