[Linux-cluster] Use qdisk heuristics w/o a quorum device/partition

Gerhard Spiegl gspiegl at gmx.at
Tue Sep 9 19:59:38 UTC 2008


Kevin Anderson wrote:
> You can avoid the withdraw and force a panic by using the debug mount
> option for your GFS filesystems.  With debug set, GFS will, on getting
> an I/O error, panic the system, effectively self-fencing the node.  The
> reason behind withdraw was to give the operator a chance to gracefully
> remove the node from the cluster after filesystem failure.  This is
> useful when multiple filesystems are mounted with multiple storage
> devices.  A withdraw always requires rebooting the node to recover.
> However, in your case, panic action is probably what you want.  We
> recently opened a new bugzilla for a new feature to give you better
> control of the options in this case.  
> https://bugzilla.redhat.com/show_bug.cgi?id=461065
> 
> Anyway, the debug mount option should avoid the situation you are
> describing.

If it does, it is exactly what we were looking for. In fact, GFS reported an I/O
error in syslog (and on "ls", "df", ...), but only the "nice" withdraw happened.
The only thing we found out was that passing the -w option to gfs_controld
(init.d/cman) would avoid withdrawing GFS. We expected the kernel to panic, but
the only result was a puny syslog message :)
Tomorrow I will try adding "debug" to the mount options in fstab.

> With split sites and an even number of nodes, you could end up in the
> situation that if an entire site goes down, you no longer have cluster
> quorum.  Having an extra, and therefore odd, number of nodes in the cluster
> would enable the cluster to continue to operate at the remaining site.
Will keep this in mind; it may come in handy someday.
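
To make the quorum arithmetic concrete, here is a minimal sketch. It assumes
one vote per node, no quorum disk, and a simple majority rule (quorum is
floor(expected_votes / 2) + 1). The function names are made up for
illustration and are not part of cman or any cluster tool.

```python
from typing import List


def quorum(expected_votes: int) -> int:
    """Votes needed for a strict majority (assumed rule: floor(n/2) + 1)."""
    return expected_votes // 2 + 1


def is_quorate(surviving_votes: int, expected_votes: int) -> bool:
    """True if the surviving votes still reach the majority threshold."""
    return surviving_votes >= quorum(expected_votes)


def survives_site_loss(sites: List[int]) -> List[bool]:
    """For each site, report whether the rest stays quorate if that site fails.

    `sites` holds the number of nodes (one vote each, by assumption) per site.
    """
    total = sum(sites)
    return [is_quorate(total - lost, total) for lost in sites]


if __name__ == "__main__":
    print(survives_site_loss([2, 2]))     # even split across two sites
    print(survives_site_loss([3, 2]))     # extra node at the first site
    print(survives_site_loss([2, 2, 1]))  # extra node at a third location
```

Under these assumptions, an even 2+2 split loses quorum whichever site fails.
With 3+2, the cluster stays quorate only if the 3-node site is the survivor.
With the extra node at a third location (2+2+1), it survives the loss of any
single site.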

> 
> Thanks
> Kevin

Thank you!
Gerhard




