[Cluster-devel] cluster/cman man/Makefile qdisk/README man/mkq ...

lhh at sourceware.org
Fri Jul 21 17:56:16 UTC 2006


CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U4
Changes by:	lhh at sourceware.org	2006-07-21 17:56:15

Modified files:
	cman/man       : Makefile 
	cman/qdisk     : README 
Added files:
	cman/man       : mkqdisk.8 qdisk.5 qdiskd.8 

Log message:
	Add man pages for qdisk

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/mkqdisk.8.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/qdisk.5.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/qdiskd.8.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/Makefile.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1&r2=1.1.14.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/qdisk/README.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.1.2.1&r2=1.1.2.1.2.2

/cvs/cluster/cluster/cman/man/mkqdisk.8,v  -->  standard output
revision 1.2.2.1
--- cluster/cman/man/mkqdisk.8
+++ -	2006-07-21 17:56:15.747890000 +0000
@@ -0,0 +1,23 @@
+.TH "mkqdisk" "8" "July 2006" "" "Quorum Disk Management"
+.SH "NAME"
+mkqdisk \- Cluster Quorum Disk Utility
+.SH "WARNING"
+Use of this command can cause the cluster to malfunction.
+.SH "SYNOPSIS"
+\fBmkqdisk [\-?|\-h] | [\-L] | [\-f \fPlabel\fB] | [\-c \fPdevice\fB \-l \fPlabel\fB]\fP
+.SH "DESCRIPTION"
+.PP 
+The \fBmkqdisk\fP command is used to create a new quorum disk or display
+existing quorum disks accessible from a given cluster node.
+.SH "OPTIONS"
+.IP "\-c device \-l label"
+Initialize a new cluster quorum disk.  This will destroy all data on the given
+device.  If a cluster is currently using that device as a quorum disk, the
+entire cluster will malfunction.  Do not run this command against a device
+that is in use as a quorum disk by a running cluster.
+.IP "\-f label"
+Find the cluster quorum disk with the given label and display information
+about it.
+.IP "\-L"
+Display information on all accessible cluster quorum disks.
+
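+.SH "EXAMPLES"
+A typical sequence might look like the following; the device path and label
+are placeholders and should be replaced with values appropriate for your
+cluster:
+.nf
+
+   # mkqdisk \-c /dev/sdX1 \-l myqdisk
+   # mkqdisk \-L
+   # mkqdisk \-f myqdisk
+
+.fi
+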
+.SH "SEE ALSO"
+qdisk(5) qdiskd(8)
/cvs/cluster/cluster/cman/man/qdisk.5,v  -->  standard output
revision 1.2.2.1
--- cluster/cman/man/qdisk.5
+++ -	2006-07-21 17:56:15.834011000 +0000
@@ -0,0 +1,309 @@
+.TH "QDisk" "8" "July 2006" "" "Cluster Quorum Disk"
+.SH "NAME"
+QDisk 1.0 \- a disk-based quorum daemon for CMAN / Linux-Cluster
+.SH "1. Overview"
+.SH "1.1 Problem"
+In some situations, it may be necessary or desirable to sustain
+a majority node failure of a cluster without introducing the need for
+asymmetric cluster configurations (e.g. client-server, or heavily-weighted
+voting nodes).
+
+.SH "1.2. Design Requirements"
+* Ability to sustain 1..(n-1)/n simultaneous node failures, without the
+danger of a simple network partition causing a split brain.  That is, we
+need to be able to ensure that the majority failure case is not merely
+the result of a network partition.
+
+* Ability to use external reasons for deciding which partition is the
+quorate partition in a partitioned cluster.  For example, a user may
+have a service running on one node, and that node must always be the master
+in the event of a network partition.  Or, a node might lose all network
+connectivity except the cluster communication path - in which case, a
+user may wish that node to be evicted from the cluster.
+
+* Integration with CMAN.  CMAN must be able to run both with and without
+us.  Linux-Cluster does not require a quorum disk normally -
+introducing new requirements on the base of how Linux-Cluster operates
+is not allowed.
+
+* Data integrity.  In order to recover from a majority failure, fencing
+is required.  The fencing subsystem is already provided by Linux-Cluster.
+
+* Non-reliance on hardware- or protocol-specific methods (e.g. SCSI
+reservations).  This ensures the quorum disk algorithm can be used on the
+widest range of hardware configurations possible.
+
+* Little or no memory allocation after initialization.  In critical paths
+during failover, we do not want to have to worry about being killed during
+a memory-pressure situation because we incur a page fault and the Linux
+OOM killer responds.
+
+.SH "1.3. Hardware Considerations and Requirements"
+.SH "1.3.1. Concurrent, Synchronous, Read/Write Access"
+This quorum daemon requires a shared block device with concurrent read/write
+access from all nodes in the cluster.  The shared block device can be
+a multi-port SCSI RAID array, a Fibre Channel RAID SAN, a RAIDed iSCSI
+target, or even GNBD.  The quorum daemon uses O_DIRECT to write to the
+device.
+
+.SH "1.3.2. Bargain-basement JBODs need not apply"
+There is a minimum performance requirement inherent when using disk-based
+cluster quorum algorithms, so design your cluster accordingly.  Using a
+cheap JBOD with old SCSI2 disks on a multi-initiator bus will cause 
+problems at the first load spike.  Plan your loads accordingly; a node's
+inability to write to the quorum disk in a timely manner will cause the
+cluster to evict the node.  Using host-RAID or multi-initiator parallel
+SCSI configurations with the qdisk daemon is unlikely to work, and will
+probably cause administrators a lot of frustration.  That having been
+said, because the timeouts are configurable, most hardware should work
+if the timeouts are set high enough.
+
+.SH "1.3.3. Fencing is Required"
+In order to maintain data integrity under all failure scenarios, use of
+this quorum daemon requires adequate fencing, preferably power-based
+fencing.  Watchdog timers and software-based solutions to reboot the node
+internally, while possibly sufficient, are not considered 'fencing' for 
+the purposes of using the quorum disk.
+
+.SH "1.4. Limitations"
+* At this time, this daemon supports a maximum of 16 nodes.  This is
+primarily a scalability issue: As we increase the node count, we increase
+the amount of synchronous I/O contention on the shared quorum disk.
+
+* Cluster node IDs must be statically configured in cluster.conf and
+must be numbered from 1..16 (there can be gaps, of course).
+
+* Cluster node votes should be more or less equal.
+
+* CMAN must be running before the qdisk program can start.
+
+* CMAN's eviction timeout should be at least 2x the quorum daemon's
+to give the quorum daemon adequate time to converge on a master during a
+failure + load spike situation.
+
+* The total number of votes assigned to the quorum device should be
+equal to or greater than the total number of node-votes in the cluster.
+While it is possible to assign only one (or a few) votes to the quorum
+device, the effects of doing so have not been explored.
+
+* Currently, the quorum disk daemon is difficult to use with CLVM if
+the quorum disk resides on a CLVM logical volume.  CLVM requires a
+quorate cluster to correctly operate, which introduces a chicken-and-egg
+problem for starting the cluster: CLVM needs quorum, but the quorum daemon
+needs CLVM (if and only if the quorum device lies on CLVM-managed storage).
+One way to work around this is to *not* set the cluster's expected votes
+to include the quorum daemon's votes.  Bring all nodes online, and start
+the quorum daemon *after* the whole cluster is running.  This will allow
+the expected votes to increase naturally.
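+
+As a sketch of that workaround (the vote counts are hypothetical: three
+nodes with one vote each plus a three-vote quorum device), the cluster's
+expected votes would initially be left at the node total:
+
+.ti 12
+<\fBcman \fP\fIexpected_votes\fP\fB="\fP3\fB"/>\fP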
+
+.SH "2. Algorithms"
+.SH "2.1. Heartbeating & Liveliness Determination"
+Nodes update individual status blocks on the quorum disk at a
+user-defined rate.  Each write of a status block alters the timestamp, which
+is what other nodes use to decide whether a node has hung or not.  After
+a user-defined number of 'misses' (that is, failures to update the
+timestamp), a node is declared offline.  After a certain number of 'hits'
+(changed timestamp + "I am alive" state), the node is declared online.
+
+The status block contains additional information, such as a bitmask of
+the nodes that the node believes are online.  Some of this information is
+used by the master, while some is just for performance recording, and
+may be used at a later time.  The most important pieces of information
+a node writes to its status block are:
+
+.in 12
+- Timestamp
+.br
+- Internal state (available / not available)
+.br
+- Score
+.br
+- Known max score (may be used in the future to detect invalid configurations)
+.br
+- Vote/bid messages
+.br
+- Other nodes it thinks are online
+.in 0
+
+.SH "2.2. Scoring & Heuristics"
+The administrator can configure up to 10 purely arbitrary heuristics, and
+must exercise caution in doing so.  At least one administrator-defined
+heuristic is required for operation, but it is generally a good
+idea to have more than one heuristic.  By default, only nodes scoring over
+1/2 of the total maximum score will claim they are available via the
+quorum disk, and a node (master or otherwise) whose score drops too low
+will remove itself (usually, by rebooting).
+
+The heuristics themselves can be any command executable by 'sh -c'.  For
+example, in early testing the following was used:
+
+.ti 12
+<\fBheuristic \fP\fIprogram\fP\fB="\fP[ -f /quorum ]\fB" \fP\fIscore\fP\fB="\fP10\fB" \fP\fIinterval\fP\fB="\fP2\fB"/>\fP
+
+This is a literal sh-ism which tests for the existence of a file called
+"/quorum".  Without that file, the node would claim it was unavailable.
+This is an awful example, and should never, ever be used in production,
+but is provided to illustrate what one could do.
+
+Typically, the heuristics should be snippets of shell code or commands which
+help determine a node's usefulness to the cluster or clients.  Ideally, you
+want to add traces for all of your network paths (e.g. check links, or
+ping routers), and methods to detect availability of shared storage.
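+
+As a sketch of the latter (the device path is only a placeholder), a
+storage-availability heuristic could simply test that the shared block
+device is present:
+
+.ti 12
+<\fBheuristic \fP\fIprogram\fP\fB="\fP[ -b /dev/sdX1 ]\fB" \fP\fIscore\fP\fB="\fP1\fB" \fP\fIinterval\fP\fB="\fP2\fB"/>\fP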
+
+.SH "2.3. Master Election"
+Only one master is present at any one time in the cluster, regardless of
+how many partitions exist within the cluster itself.  The master is
+elected by a simple voting scheme in which the lowest-numbered node that
+believes it is capable of running (i.e. scores high enough) bids for
+master status.
+If the other nodes agree, it becomes the master.  This algorithm is 
+run whenever no master is present.
+
+If another node comes online with a lower node ID while a node is still
+bidding for master status, it will rescind its bid and vote for the lower
+node ID.  If a master dies or a bidding node dies, the voting algorithm
+is started over.  The voting algorithm typically takes two passes to
+complete.
+
+Master deaths take marginally longer to recover from than non-master
+deaths, because a new master must be elected before the old master can
+be evicted & fenced.
+
+.SH "2.4. Master Duties"
+The master node decides who is or is not in the master partition and
+handles eviction of dead nodes (both via the quorum disk and via the
+linux-cluster fencing system, using the cman_kill_node() API).
+
+.SH "2.5. How it All Ties Together"
+When a master is present, and if the master believes a node to be online,
+that node will advertise to CMAN that the quorum disk is available.  The
+master will only grant a node membership if:
+
+.in 12
+(a) CMAN believes the node to be online, and
+.br
+(b) that node has made enough consecutive, timely writes
+.in 16
+to the quorum disk, and
+.in 12
+(c) the node has a high enough score to consider itself online.
+.in 0
+
+.SH "3. Configuration"
+.SH "3.1. The <quorumd> tag"
+This tag is a child of the top-level <cluster> tag.
+
+.in 8
+\fB<quorumd\fP
+.in 9
+\fIinterval\fP\fB="\fP1\fB"\fP
+.in 12 
+This is the frequency of read/write cycles.
+
+.in 9
+\fItko\fP\fB="\fP10\fB"\fP
+.in 12
+This is the number of cycles a node must miss in order to be declared dead.
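+For example, with \fIinterval\fP="1" and \fItko\fP="10", a node that fails to
+update its status block for ten consecutive cycles is declared dead; as noted
+in section 1.4, CMAN's own eviction timeout should be at least twice that
+window.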
+
+.in 9
+\fIvotes\fP\fB="\fP3\fB"\fP
+.in 12
+This is the number of votes the quorum daemon advertises to CMAN when it
+has a high enough score.
+
+.in 9
+\fIlog_level\fP\fB="\fP4\fB"\fP
+.in 12
+This controls the verbosity of the quorum daemon in the system logs.
+0 = emergencies; 7 = debug.
+
+.in 9
+\fIlog_facility\fP\fB="\fPlocal4\fB"\fP
+.in 12
+This controls the syslog facility used by the quorum daemon when logging.
+For a complete list of available facilities, see \fBsyslog.conf(5)\fP.
+
+.in 9
+\fIstatus_file\fP\fB="\fP/foo\fB"\fP
+.in 12
+Write internal states out to this file periodically ("-" = use stdout).
+This is primarily used for debugging.
+
+.in 9
+\fImin_score\fP\fB="\fP3\fB"\fP
+.in 12
+Absolute minimum score required to consider oneself "alive".  If omitted,
+or set to 0, the default function "floor((n+1)/2)" is used, where \fIn\fP
+is the sum of all defined heuristics' \fIscore\fP attributes.
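+For example, with three heuristics scoring 1, 1, and 3, \fIn\fP = 5 and the
+default minimum score is floor((5+1)/2) = 3.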
+
+.in 9
+\fIdevice\fP\fB="\fP/dev/sda1\fB"\fP
+.in 12
+This is the device the quorum daemon will use.  This device must be the
+same on all nodes.
+
+.in 9
+\fIlabel\fP\fB="\fPmylabel\fB"/>\fP
+.in 12
+This overrides the device field if present.  If specified, the quorum
+daemon will read /proc/partitions and check for qdisk signatures
+on every block device found, comparing the label against the specified
+label.  This is useful in configurations where the block device name
+differs on a per-node basis.
+.in 0
+
+.SH "3.2.  The <heuristic> tag"
+This tag is a child of the <quorumd> tag.
+
+.in 8
+\fB<heuristic\fP
+.in 9
+\fIprogram\fP\fB="\fP/test.sh\fB"\fP
+.in 12
+This is the program used to determine if this heuristic is alive.  This
+can be anything which may be executed by \fI/bin/sh -c\fP.  A return
+value of zero indicates success; anything else indicates failure.
+
+.in 9
+\fIscore\fP\fB="\fP1\fB"\fP
+.in 12
+This is the weight of this heuristic.  Be careful when determining scores
+for heuristics.
+
+.in 9
+\fIinterval\fP\fB="\fP2\fB"/>\fP
+.in 12
+This is the frequency at which we poll the heuristic.
+.in 0
+
+.SH "3.3. Example"
+.in 8
+<quorumd interval="1" tko="10" votes="3" label="testing">
+.in 12
+<heuristic program="ping A -c1 -t1" score="1" interval="2"/>
+.br
+<heuristic program="ping B -c1 -t1" score="1" interval="2"/>
+.br
+<heuristic program="ping C -c1 -t1" score="1" interval="2"/>
+.br
+.in 8
+</quorumd>
+.in 0
+
+.SH "3.4. Heuristic score considerations"
+* Heuristic timeouts should be set high enough to allow the previous run
+of a given heuristic to complete.
+
+* Heuristic scripts returning anything except 0 as their return code 
+are considered failed.
+
+* The worst-case for improperly configured quorum heuristics is a race
+to fence where two partitions simultaneously try to kill each other.
+
+.SH "3.5. Creating a quorum disk partition"
+The mkqdisk utility can create and list currently configured quorum disks
+visible to the local node; see
+.B mkqdisk(8)
+for more details.
+
+.SH "SEE ALSO"
+mkqdisk(8), qdiskd(8), cman(5), syslog.conf(5)
/cvs/cluster/cluster/cman/man/qdiskd.8,v  -->  standard output
revision 1.2.2.1
--- cluster/cman/man/qdiskd.8
+++ -	2006-07-21 17:56:15.915107000 +0000
@@ -0,0 +1,20 @@
+.TH "qdiskd" "8" "July 2006" "" "Quorum Disk Management"
+.SH "NAME"
+qdiskd \- Cluster Quorum Disk Daemon
+.SH "SYNOPSIS"
+\fBqdiskd [\-f] [\-d]\fP
+.SH "DESCRIPTION"
+.PP 
+The \fBqdiskd\fP daemon talks to CMAN and provides a mechanism for determining
+node-fitness in a cluster environment.  See
+.B
+qdisk(5)
+for configuration information.
+.SH "OPTIONS"
+.IP "\-f"
+Run in the foreground (do not fork / daemonize).
+.IP "\-d"
+Enable debug output.
+
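+.SH "EXAMPLES"
+To run the daemon in the foreground with debugging output enabled (for
+example, while verifying a new configuration):
+.nf
+
+   # qdiskd \-f \-d
+
+.fi
+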
+.SH "SEE ALSO"
+mkqdisk(8), qdisk(5), cman(5)
--- cluster/cman/man/Makefile	2004/08/13 06:38:22	1.1
+++ cluster/cman/man/Makefile	2006/07/21 17:56:15	1.1.14.1
@@ -18,10 +18,10 @@
 install:
 	install -d ${mandir}/man5
 	install -d ${mandir}/man8
-	install cman.5 ${mandir}/man5
-	install cman_tool.8 ${mandir}/man8
+	install cman.5 qdisk.5 ${mandir}/man5
+	install cman_tool.8 qdiskd.8 mkqdisk.8 ${mandir}/man8
 
 uninstall:
-	${UNINSTALL} cman.5 ${mandir}/man5
-	${UNINSTALL} cman_tool.8 ${mandir}/man8
+	${UNINSTALL} cman.5 qdisk.5 ${mandir}/man5
+	${UNINSTALL} cman_tool.8 qdiskd.8 mkqdisk.8 ${mandir}/man8
 
--- cluster/cman/qdisk/README	2006/06/23 16:02:01	1.1.2.1.2.1
+++ cluster/cman/qdisk/README	2006/07/21 17:56:15	1.1.2.1.2.2
@@ -1,274 +1 @@
-qdisk 1.0 - a disk-based quorum algorithm for Linux-Cluster
-
-(C) 2006 Red Hat, Inc.
-
-1. Overview
-
-1.1. Problem
-
-In some situations, it may be necessary or desirable to sustain
-a majority node failure of a cluster without introducing the need for
-asymmetric (client-server, or heavy-weighted voting nodes).
-
-1.2. Design Requirements
-
-* Ability to sustain 1..(n-1)/n simultaneous node failures, without the
-danger of a simple network partition causing a split brain.  That is, we
-need to be able to ensure that the majority failure case is not merely
-the result of a network partition.
-
-* Ability to use external reasons for deciding which partition is the 
-the quorate partition in a partitioned cluster.  For example, a user may
-have a service running on one node, and that node must always be the master
-in the event of a network partition.  Or, a node might lose all network
-connectivity except the cluster communication path - in which case, a
-user may wish that node to be evicted from the cluster.
-
-* Integration with CMAN.  We must not require CMAN to run with us (or
-without us).  Linux-Cluster does not require a quorum disk normally -
-introducing new requirements on the base of how Linux-Cluster operates
-is not allowed.
-
-* Data integrity.  In order to recover from a majority failure, fencing
-is required.  The fencing subsystem is already provided by Linux-Cluster.
-
-* Non-reliance on hardware or protocol specific methods (i.e. SCSI
-reservations).  This ensures the quorum disk algorithm can be used on the
-widest range of hardware configurations possible.
-
-* Little or no memory allocation after initialization.  In critical paths
-during failover, we do not want to have to worry about being killed during
-a memory pressure situation because we request a page fault, and the Linux
-OOM killer responds...
-
-
-1.3. Hardware Configuration Considerations
-
-1.3.1. Concurrent, Synchronous, Read/Write Access
-
-This daemon requires a shared block device with concurrent read/write
-access from all nodes in the cluster.  The shared block device can be
-a multi-port SCSI RAID array, a Fiber-Channel RAID SAN, a RAIDed iSCSI
-target, or even GNBD.  The quorum daemon uses O_DIRECT to write to the
-device.
-
-1.3.2. Bargain-basement JBODs need not apply
-
-There is a minimum performance requirement inherent when using disk-based
-cluster quorum algorithms, so design your cluster accordingly.  Using a
-cheap JBOD with old SCSI2 disks on a multi-initiator bus will cause 
-problems at the first load spike.  Plan your loads accordingly; a node's
-inability to write to the quorum disk in a timely manner will cause the
-cluster to evict the node.  Using host-RAID or multi-initiator parallel
-SCSI configurations with the qdisk daemon is unlikely to work, and will
-probably cause administrators a lot of frustration.  That having been
-said, because the timeouts are configurable, most hardware should work
-if the timeouts are set high enough.
-
-1.3.3. Fencing is Required
-
-In order to maintain data integrity under all failure scenarios, use of
-this quorum daemon requires adequate fencing, preferrably power-based
-fencing.
-
-
-1.4. Limitations
-
-* At this time, this daemon only supports a maximum of 16 nodes.
-
-* Cluster node IDs must be statically configured in cluster.conf and
-must be numbered from 1..16 (there can be gaps, of course).
-
-* Cluster node votes should be more or less equal.
-
-* CMAN must be running before the qdisk program can start.  This
-limitation will be removed before a production release.
-
-* CMAN's eviction timeout should be at least 2x the quorum daemon's
-to give the quorum daemon adequate time to converge on a master during a
-failure + load spike situation.
-
-* The total number of votes assigned to the quorum device should be
-equal to or greater than the total number of node-votes in the cluster.
-While it is possible to assign only one (or a few) votes to the quorum
-device, the effects of doing so have not been explored.
-
-* Currently, the quorum disk daemon is difficult to use with CLVM if
-the quorum disk resides on a CLVM logical volume.  CLVM requires a
-quorate cluster to correctly operate, which introduces a chicken-and-egg
-problem for starting the cluster: CLVM needs quorum, but the quorum daemon
-needs CLVM (if and only if the quorum device lies on CLVM-managed storage).
-One way to work around this is to *not* set the cluster's expected votes
-to include theh quorum daemon's votes.  Bring all nodes online, and start
-the quorum daemon *after* the whole cluster is running.  This will allow
-the expected votes to increase naturally.
-
-
-2. Algorithms
-
-2.1. Heartbeating & Liveliness Determination
-
-Nodes update individual status blocks on the quorum disk at a user-
-defined rate.  Each write of a status block alters the timestamp, which
-is what other nodes use to decide whether a node has hung or not.  If,
-after a user-defined number of 'misses' (that is, failure to update a
-timestamp), a node is declared offline.  After a certain number of 'hits'
-(changed timestamp + "i am alive" state), the node is declared online.
-
-The status block contains additional information, such as a bitmask of
-the nodes that node believes are online.  Some of this information is
-used by the master - while some is just for performace recording, and
-may be used at a later time.  The most important pieces of information
-a node writes to its status block are:
-
-  - timestamp
-  - internal state (available / not available)
-  - score
-  - max score
-  - vote/bid messages
-  - other nodes it thinks are online
-
-
-2.2. Scoring & Heuristics
-
-The administrator can configure up to 10 purely arbitrary heuristics, and
-must exercise caution in doing so.  By default, only nodes scoring over
-1/2 of the total maximum score will claim they are available via the
-quorum disk, and a node (master or otherwise) whose score drops too low
-will remove itself (usually, by rebooting).
-
-The heuristics themselves can be any command executable by 'sh -c'.  For
-example, in early testing, I used this:
-
-    <heuristic program="[ -f /quorum ]" score="10" interval="2"/>
-
-This is a literal sh-ism which tests for the existence of a file called
-"/quorum".  Without that file, the node would claim it was unavailable.
-This is an awful example, and should never, ever be used in production,
-but is provided as an example as to what one could do...
-
-Typically, the heuristics should be snippets of shell code or commands which
-help determine a node's usefulness to the cluster or clients.  Ideally, you
-want to add traces for all of your network paths (e.g. check links, or
-ping routers), and methods to detect availability of shared storage.
-
-
-2.3. Master Election
-
-Only one master is present at any one time in the cluster, regardless of
-how many partitions exist within the cluster itself.  The master is
-elected by a simple voting scheme in which the lowest node which believes
-it is capable of running (i.e. scores high enough) bids for master status.
-If the other nodes agree, it becomes the master.  This algorithm is 
-run whenever no master is present.
-
-If another node comes online with a lower node ID while a node is still
-bidding for master status, it will rescind its bid and vote for the lower
-node ID.  If a master dies or a bidding node dies, the voting algorithm
-is started over.  The voting algorithm typically takes two passes to
-complete.
-
-Master deaths take marginally longer to recover from than non-master
-deaths, because a new master must be elected before the old master can
-be evicted & fenced.
-
-
-2.4. Master Duties
-
-The master node decides who is or is not in the master partition, as
-well as handles eviction of dead nodes (both via the quorum disk and via
-the linux-cluster fencing system by using the cman_kill_node() API).
-
-
-2.5. How it All Ties Together
-
-When a master is present, and if the master believes a node to be online,
-that node will advertise to CMAN that the quorum disk is avilable.  The
-master will only grant a node membership if:
-
-   (a) CMAN believes the node to be online, and
-   (b) that node has made enough consecutive, timely writes to the quorum
-       disk.
-
-
-3. Configuration
-
-3.1. The <quorumd> tag
-
-This tag is a child of the top-level <cluster> tag.
-
-   <quorumd
-    interval="1"          This is the frequency of read/write cycles
-    tko="10"              This is the number of cycles a node must miss
-                          in order to be declared dead.
-    votes="3"             This is the number of votes the quorum daemon
-                          advertises to CMAN when it has a high enough
-                          score.
-    log_level="4"         This controls the verbosity of the quorum daemon
-                          in the system logs. 0 = emergencies; 7 = debug
-    log_facility="local4" This controls the syslog facility used by the
-			  quorum daemon when logging.
-    status_file="/foo"    Write internal states out to this file
-			  periodically ("-" = use stdout).
-    min_score="3"	  Absolute minimum score to be consider one's
-			  self "alive".  If omitted, or set to 0, the
-			  default function "floor((n+1)/2)" is used.
-    device="/dev/sda1"    This is the device the quorum daemon will use.
-			  This device must be the same on all nodes.
-    label="mylabel"/>     This overrides the device field if present.
-			  If specified, the quorum daemon will read
-			  /proc/partitions and check for qdisk signatures
-			  on every block device found, comparing the label
-			  against the specified label.  This is useful in
-			  configurations where the block device name
-			  differs on a per-node basis.
-
-
-3.2.  The <heuristic> tag
-
-This tag is a child of the <quorumd> tag.
-
-   <heuristic
-    program="/test.sh"    This is the program used to determine if this
-                          heuristic is alive.  This can be anything which
-                          may be executed by "/bin/sh -c".  A return value
-                          of zero indicates success.
-    score="1"             This is the weight of this heuristic.  Be careful
-                          when determining scores for heuristics.
-    interval="2"/>        This is the frequency at which we poll the
-                          heuristic.
-
-3.3. Example
-
-  <quorumd interval="1" tko="10" votes="3" device="/dev/gnbd/qdisk">
-    <heuristic program="ping routerA -c1 -t1" score="1" interval="2"/>
-    <heuristic program="ping routerB -c1 -t1" score="1" interval="2"/>
-    <heuristic program="ping routerC -c1 -t1" score="1" interval="2"/>
-  </quorumd>
-
-3.4. Heuristic score considerations
-
-* Heuristic timeouts should be set high enough to allow the previous run
-of a given heuristic to complete.
-
-* Heuristic scripts returning anything except 0 as their return code 
-are considered failed.
-
-* The worst-case for improperly configured quorum heuristics is a race
-to fence where two partitions simultaneously try to kill each other.
-
-3.5. Creating a quorum disk partition
-
-3.5.1. The mkqdisk utility.
-
-The mkqdisk utility can create and list currently configured quorum disks
-visible to the local node.
-
-  mkqdisk -L		List available quorum disks.
-
-  mkqdisk -f <label>	Find a quorum device by the given label.
-
-  mkqdisk -c <device> -l <label>
-			Initialize <device> and name it <label>.  This
-			will destroy all data on the device, so be careful
-			when running this command.
+See qdisk(5) for setup and other information.



