[Cluster-devel] cluster/cman man/Makefile qdisk/README man/mkq ...
lhh at sourceware.org
Fri Jul 21 17:56:16 UTC 2006
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4U4
Changes by: lhh at sourceware.org 2006-07-21 17:56:15
Modified files:
cman/man : Makefile
cman/qdisk : README
Added files:
cman/man : mkqdisk.8 qdisk.5 qdiskd.8
Log message:
Add man pages for qdisk
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/mkqdisk.8.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/qdisk.5.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/qdiskd.8.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=NONE&r2=1.2.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/Makefile.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1&r2=1.1.14.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/qdisk/README.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.1.2.1&r2=1.1.2.1.2.2
/cvs/cluster/cluster/cman/man/mkqdisk.8,v --> standard output
revision 1.2.2.1
--- cluster/cman/man/mkqdisk.8
+++ - 2006-07-21 17:56:15.747890000 +0000
@@ -0,0 +1,23 @@
+.TH "mkqdisk" "8" "July 2006" "" "Quorum Disk Management"
+.SH "NAME"
+mkqdisk \- Cluster Quorum Disk Utility
+.SH "WARNING"
+Use of this command can cause the cluster to malfunction.
+.SH "SYNOPSIS"
+\fBmkqdisk [\-?|\-h] | [\-L] | [\-f \fPlabel\fB] [\-c \fPdevice \fB -l \fPlabel\fB]
+.SH "DESCRIPTION"
+.PP
+The \fBmkqdisk\fP command is used to create a new quorum disk or display
+existing quorum disks accessible from a given cluster node.
+.SH "OPTIONS"
+.IP "\-c device \-l label"
+Initialize a new cluster quorum disk. This will destroy all data on the given
+device. If a cluster is currently using that device as a quorum disk, the
+entire cluster will malfunction. Do not run this on an in-use quorum disk.
+.IP "\-f label"
+Find the cluster quorum disk with the given label and display information about it.
+.IP "\-L"
+Display information on all accessible cluster quorum disks.
+
+.SH "SEE ALSO"
+qdisk(5) qdiskd(8)
/cvs/cluster/cluster/cman/man/qdisk.5,v --> standard output
revision 1.2.2.1
--- cluster/cman/man/qdisk.5
+++ - 2006-07-21 17:56:15.834011000 +0000
@@ -0,0 +1,309 @@
+.TH "QDisk" "5" "July 2006" "" "Cluster Quorum Disk"
+.SH "NAME"
+QDisk 1.0 \- a disk-based quorum daemon for CMAN / Linux-Cluster
+.SH "1. Overview"
+.SH "1.1. Problem"
+In some situations, it may be necessary or desirable to sustain
+a majority node failure of a cluster without introducing the need for
+asymmetric cluster configurations (e.g. client-server, or heavily-weighted
+voting nodes).
+
+.SH "1.2. Design Requirements"
+* Ability to sustain 1..(n-1)/n simultaneous node failures, without the
+danger of a simple network partition causing a split brain. That is, we
+need to be able to ensure that the majority failure case is not merely
+the result of a network partition.
+
+* Ability to use external reasons for deciding which partition is
+the quorate partition in a partitioned cluster. For example, a user may
+have a service running on one node, and that node must always be the master
+in the event of a network partition. Or, a node might lose all network
+connectivity except the cluster communication path - in which case, a
+user may wish that node to be evicted from the cluster.
+
+* Integration with CMAN. We must not require CMAN to run with us (or
+without us). Linux-Cluster does not require a quorum disk normally -
+introducing new requirements on the base of how Linux-Cluster operates
+is not allowed.
+
+* Data integrity. In order to recover from a majority failure, fencing
+is required. The fencing subsystem is already provided by Linux-Cluster.
+
+* Non-reliance on hardware- or protocol-specific methods (e.g. SCSI
+reservations). This ensures the quorum disk algorithm can be used on the
+widest range of hardware configurations possible.
+
+* Little or no memory allocation after initialization. In critical paths
+during failover, we do not want to have to worry about being killed during
+a memory pressure situation because we request a page fault, and the Linux
+OOM killer responds...
+
+.SH "1.3. Hardware Considerations and Requirements"
+.SH "1.3.1. Concurrent, Synchronous, Read/Write Access"
+This quorum daemon requires a shared block device with concurrent read/write
+access from all nodes in the cluster. The shared block device can be
+a multi-port SCSI RAID array, a Fiber-Channel RAID SAN, a RAIDed iSCSI
+target, or even GNBD. The quorum daemon uses O_DIRECT to write to the
+device.
+
+.SH "1.3.2. Bargain-basement JBODs need not apply"
+There is a minimum performance requirement inherent when using disk-based
+cluster quorum algorithms, so design your cluster accordingly. Using a
+cheap JBOD with old SCSI2 disks on a multi-initiator bus will cause
+problems at the first load spike. Plan your loads accordingly; a node's
+inability to write to the quorum disk in a timely manner will cause the
+cluster to evict the node. Using host-RAID or multi-initiator parallel
+SCSI configurations with the qdisk daemon is unlikely to work, and will
+probably cause administrators a lot of frustration. That having been
+said, because the timeouts are configurable, most hardware should work
+if the timeouts are set high enough.
+
+.SH "1.3.3. Fencing is Required"
+In order to maintain data integrity under all failure scenarios, use of
+this quorum daemon requires adequate fencing, preferably power-based
+fencing. Watchdog timers and software-based solutions to reboot the node
+internally, while possibly sufficient, are not considered 'fencing' for
+the purposes of using the quorum disk.
+
+.SH "1.4. Limitations"
+* At this time, this daemon supports a maximum of 16 nodes. This is
+primarily a scalability issue: As we increase the node count, we increase
+the amount of synchronous I/O contention on the shared quorum disk.
+
+* Cluster node IDs must be statically configured in cluster.conf and
+must be numbered from 1..16 (there can be gaps, of course).
+
+* Cluster node votes should be more or less equal.
+
+* CMAN must be running before the qdisk program can start.
+
+* CMAN's eviction timeout should be at least 2x the quorum daemon's
+to give the quorum daemon adequate time to converge on a master during a
+failure + load spike situation.
+
+* The total number of votes assigned to the quorum device should be
+equal to or greater than the total number of node-votes in the cluster.
+While it is possible to assign only one (or a few) votes to the quorum
+device, the effects of doing so have not been explored.
+
+* Currently, the quorum disk daemon is difficult to use with CLVM if
+the quorum disk resides on a CLVM logical volume. CLVM requires a
+quorate cluster to correctly operate, which introduces a chicken-and-egg
+problem for starting the cluster: CLVM needs quorum, but the quorum daemon
+needs CLVM (if and only if the quorum device lies on CLVM-managed storage).
+One way to work around this is to *not* set the cluster's expected votes
+to include the quorum daemon's votes. Bring all nodes online, and start
+the quorum daemon *after* the whole cluster is running. This will allow
+the expected votes to increase naturally.
+
+.SH "2. Algorithms"
+.SH "2.1. Heartbeating & Liveliness Determination"
+Nodes update individual status blocks on the quorum disk at a
+user-defined rate. Each write of a status block alters the timestamp,
+which is what other nodes use to decide whether a node has hung. After
+a user-defined number of 'misses' (that is, failures to update the
+timestamp), a node is declared offline. After a certain number of 'hits'
+(changed timestamp + "i am alive" state), the node is declared online.
+
+The status block contains additional information, such as a bitmask of
+the nodes that node believes are online. Some of this information is
+used by the master - while some is just for performance recording, and
+may be used at a later time. The most important pieces of information
+a node writes to its status block are:
+
+.in 12
+- Timestamp
+.br
+- Internal state (available / not available)
+.br
+- Score
+.br
+- Known max score (may be used in the future to detect invalid configurations)
+.br
+- Vote/bid messages
+.br
+- Other nodes it thinks are online
+.in 0
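The heartbeat logic above can be sketched roughly as follows. This is a
simplified, hypothetical model for illustration (function and variable names
are invented, not taken from the qdiskd sources): a node is declared dead
after \fItko\fP consecutive read cycles in which its on-disk timestamp did
not change.

```python
# Hypothetical sketch of TKO-style liveness tracking, not qdiskd's
# actual implementation: a node is declared offline once `tko`
# consecutive cycles pass without its timestamp advancing.

def update_liveness(last_ts, new_ts, misses, tko):
    """Return (new_miss_count, online) after one read cycle."""
    if new_ts == last_ts:
        misses += 1   # timestamp unchanged: one more 'miss'
    else:
        misses = 0    # timestamp advanced: the node wrote its block
    return misses, misses < tko
```

With the documented defaults (interval="1", tko="10"), a node that stops
writing would be declared offline roughly ten seconds after its last write.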
+
+.SH "2.2. Scoring & Heuristics"
+The administrator can configure up to 10 purely arbitrary heuristics,
+and must exercise caution in doing so. At least one
+administrator-defined heuristic is required for operation, but it is
+generally a good idea to have more than one. By default, only nodes scoring over
+1/2 of the total maximum score will claim they are available via the
+quorum disk, and a node (master or otherwise) whose score drops too low
+will remove itself (usually, by rebooting).
+
+The heuristics themselves can be any command executable by 'sh -c'. For
+example, in early testing the following was used:
+
+.ti 12
+<\fBheuristic \fP\fIprogram\fP\fB="\fP[ -f /quorum ]\fB" \fP\fIscore\fP\fB="\fP10\fB" \fP\fIinterval\fP\fB="\fP2\fB"/>\fP
+
+This is a literal sh-ism which tests for the existence of a file called
+"/quorum". Without that file, the node would claim it was unavailable.
+This is an awful example, and should never, ever be used in production,
+but it illustrates the kind of thing one could do.
+
+Typically, the heuristics should be snippets of shell code or commands which
+help determine a node's usefulness to the cluster or clients. Ideally, you
+want to add traces for all of your network paths (e.g. check links, or
+ping routers), and methods to detect availability of shared storage.
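Since heuristics are plain commands executed by 'sh -c', the scoring rule
described above ("available" only when the earned score exceeds half the
maximum) can be sketched as follows. This is a hypothetical illustration
with invented names, not qdiskd source:

```python
import subprocess

def run_heuristic(program):
    """A heuristic passes iff its shell command exits 0."""
    return subprocess.run(["sh", "-c", program]).returncode == 0

def node_available(heuristics):
    """heuristics: list of (program, score) pairs.

    The node claims availability only when the earned score
    exceeds half of the maximum possible score."""
    max_score = sum(score for _, score in heuristics)
    earned = sum(score for prog, score in heuristics
                 if run_heuristic(prog))
    return earned > max_score / 2
```

For example, with three heuristics scored 1 each, two passing checks earn
2 points against a maximum of 3, so 2 > 1.5 and the node claims availability.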
+
+.SH "2.3. Master Election"
+Only one master is present at any one time in the cluster, regardless of
+how many partitions exist within the cluster itself. The master is
+elected by a simple voting scheme in which the lowest node which believes
+it is capable of running (i.e. scores high enough) bids for master status.
+If the other nodes agree, it becomes the master. This algorithm is
+run whenever no master is present.
+
+If another node comes online with a lower node ID while a node is still
+bidding for master status, it will rescind its bid and vote for the lower
+node ID. If a master dies or a bidding node dies, the voting algorithm
+is started over. The voting algorithm typically takes two passes to
+complete.
+
+Master deaths take marginally longer to recover from than non-master
+deaths, because a new master must be elected before the old master can
+be evicted & fenced.
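The election rule described above reduces to "the lowest node ID among the
nodes that consider themselves fit wins the bid". A minimal, hypothetical
sketch (ignoring the two-pass voting protocol and bid rescinding):

```python
# Hypothetical simplification of master election: among the nodes
# that score high enough to consider themselves fit, the lowest
# node ID wins the bid for master status.

def elect_master(nodes):
    """nodes: dict mapping node_id -> fit (True if its score is
    high enough). Returns the winner's ID, or None if nobody is fit."""
    fit = [nid for nid, ok in nodes.items() if ok]
    return min(fit) if fit else None
```

For example, if node 1 scores too low, node 2 wins the bid even though
node 1 has the lower ID.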
+
+.SH "2.4. Master Duties"
+The master node decides who is or is not in the master partition, as
+well as handles eviction of dead nodes (both via the quorum disk and via
+the linux-cluster fencing system by using the cman_kill_node() API).
+
+.SH "2.5. How it All Ties Together"
+When a master is present, and if the master believes a node to be online,
+that node will advertise to CMAN that the quorum disk is available. The
+master will only grant a node membership if:
+
+.in 12
+(a) CMAN believes the node to be online, and
+.br
+(b) that node has made enough consecutive, timely writes
+.in 16
+to the quorum disk, and
+.in 12
+(c) the node has a high enough score to consider itself online.
+.in 0
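The three membership conditions combine conjunctively; a trivial predicate
makes this explicit (hypothetical sketch, invented names):

```python
# Hypothetical sketch: the master grants quorum-disk membership only
# when all three documented conditions hold at once.

def grant_membership(cman_online, timely_writes, score_ok):
    """(a) CMAN sees the node online, (b) the node has made enough
    consecutive timely writes, (c) its score is high enough."""
    return cman_online and timely_writes and score_ok
```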
+
+.SH "3. Configuration"
+.SH "3.1. The <quorumd> tag"
+This tag is a child of the top-level <cluster> tag.
+
+.in 8
+\fB<quorumd\fP
+.in 9
+\fIinterval\fP\fB="\fP1\fB"\fP
+.in 12
+This is the frequency of read/write cycles
+
+.in 9
+\fItko\fP\fB="\fP10\fB"\fP
+.in 12
+This is the number of cycles a node must miss in order to be declared dead.
+
+.in 9
+\fIvotes\fP\fB="\fP3\fB"\fP
+.in 12
+This is the number of votes the quorum daemon advertises to CMAN when it
+has a high enough score.
+
+.in 9
+\fIlog_level\fP\fB="\fP4\fB"\fP
+.in 12
+This controls the verbosity of the quorum daemon in the system logs.
+0 = emergencies; 7 = debug.
+
+.in 9
+\fIlog_facility\fP\fB="\fPlocal4\fB"\fP
+.in 12
+This controls the syslog facility used by the quorum daemon when logging.
+For a complete list of available facilities, see \fBsyslog.conf(5)\fP.
+
+.in 9
+\fIstatus_file\fP\fB="\fP/foo\fB"\fP
+.in 12
+Write internal states out to this file periodically ("-" = use stdout).
+This is primarily used for debugging.
+
+.in 9
+\fImin_score\fP\fB="\fP3\fB"\fP
+.in 12
+Absolute minimum score for a node to consider itself "alive". If omitted,
+or set to 0, the default function "floor((n+1)/2)" is used, where \fIn\fP
+is the sum of all defined heuristics' \fIscore\fP attributes.
+
+.in 9
+\fIdevice\fP\fB="\fP/dev/sda1\fB"\fP
+.in 12
+This is the device the quorum daemon will use. This device must be the
+same on all nodes.
+
+.in 9
+\fIlabel\fP\fB="\fPmylabel\fB"/>\fP
+.in 12
+This overrides the device field if present. If specified, the quorum
+daemon will read /proc/partitions and check for qdisk signatures
+on every block device found, comparing the label against the specified
+label. This is useful in configurations where the block device name
+differs on a per-node basis.
+.in 0
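The default \fImin_score\fP function above is a one-liner; this sketch of the
documented formula (not qdiskd source) shows how the threshold falls out:

```python
import math

def default_min_score(total_heuristic_score):
    """Default min_score when omitted or 0: floor((n+1)/2), where n
    is the sum of all heuristics' score attributes."""
    n = total_heuristic_score
    return math.floor((n + 1) / 2)

# e.g. three heuristics scored 1 each: n = 3, so a node must earn
# at least 2 to consider itself alive.
```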
+
+.SH "3.2. The <heuristic> tag"
+This tag is a child of the <quorumd> tag.
+
+.in 8
+\fB<heuristic\fP
+.in 9
+\fIprogram\fP\fB="\fP/test.sh\fB"\fP
+.in 12
+This is the program used to determine if this heuristic is alive. This
+can be anything which may be executed by \fI/bin/sh -c\fP. A return
+value of zero indicates success; anything else indicates failure.
+
+.in 9
+\fIscore\fP\fB="\fP1\fB"\fP
+.in 12
+This is the weight of this heuristic. Be careful when determining scores
+for heuristics.
+
+.in 9
+\fIinterval\fP\fB="\fP2\fB"/>\fP
+.in 12
+This is the frequency at which we poll the heuristic.
+.in 0
+
+.SH "3.3. Example"
+.in 8
+<quorumd interval="1" tko="10" votes="3" label="testing">
+.in 12
+<heuristic program="ping A -c1 -t1" score="1" interval="2"/>
+.br
+<heuristic program="ping B -c1 -t1" score="1" interval="2"/>
+.br
+<heuristic program="ping C -c1 -t1" score="1" interval="2"/>
+.br
+.in 8
+</quorumd>
+.in 0
+
+.SH "3.4. Heuristic score considerations"
+* Heuristic timeouts should be set high enough to allow the previous run
+of a given heuristic to complete.
+
+* Heuristic scripts returning anything except 0 as their return code
+are considered failed.
+
+* The worst-case for improperly configured quorum heuristics is a race
+to fence where two partitions simultaneously try to kill each other.
+
+.SH "3.5. Creating a quorum disk partition"
+The mkqdisk utility can create and list currently configured quorum disks
+visible to the local node; see
+.B mkqdisk(8)
+for more details.
+
+.SH "SEE ALSO"
+mkqdisk(8), qdiskd(8), cman(5), syslog.conf(5)
/cvs/cluster/cluster/cman/man/qdiskd.8,v --> standard output
revision 1.2.2.1
--- cluster/cman/man/qdiskd.8
+++ - 2006-07-21 17:56:15.915107000 +0000
@@ -0,0 +1,20 @@
+.TH "qdiskd" "8" "July 2006" "" "Quorum Disk Management"
+.SH "NAME"
+qdiskd \- Cluster Quorum Disk Daemon
+.SH "SYNOPSIS"
+\fBqdiskd [\-f] [\-d]
+.SH "DESCRIPTION"
+.PP
+The \fBqdiskd\fP daemon talks to CMAN and provides a mechanism for determining
+node-fitness in a cluster environment. See
+.B
+qdisk(5)
+for configuration information.
+.SH "OPTIONS"
+.IP "\-f"
+Run in the foreground (do not fork / daemonize).
+.IP "\-d"
+Enable debug output.
+
+.SH "SEE ALSO"
+mkqdisk(8), qdisk(5), cman(5)
--- cluster/cman/man/Makefile 2004/08/13 06:38:22 1.1
+++ cluster/cman/man/Makefile 2006/07/21 17:56:15 1.1.14.1
@@ -18,10 +18,10 @@
install:
install -d ${mandir}/man5
install -d ${mandir}/man8
- install cman.5 ${mandir}/man5
- install cman_tool.8 ${mandir}/man8
+ install cman.5 qdisk.5 ${mandir}/man5
+ install cman_tool.8 qdiskd.8 mkqdisk.8 ${mandir}/man8
uninstall:
- ${UNINSTALL} cman.5 ${mandir}/man5
- ${UNINSTALL} cman_tool.8 ${mandir}/man8
+ ${UNINSTALL} cman.5 qdisk.5 ${mandir}/man5
+ ${UNINSTALL} cman_tool.8 qdiskd.8 mkqdisk.8 ${mandir}/man8
--- cluster/cman/qdisk/README 2006/06/23 16:02:01 1.1.2.1.2.1
+++ cluster/cman/qdisk/README 2006/07/21 17:56:15 1.1.2.1.2.2
@@ -1,274 +1 @@
-qdisk 1.0 - a disk-based quorum algorithm for Linux-Cluster
-
-(C) 2006 Red Hat, Inc.
-
-1. Overview
-
-1.1. Problem
-
-In some situations, it may be necessary or desirable to sustain
-a majority node failure of a cluster without introducing the need for
-asymmetric (client-server, or heavy-weighted voting nodes).
-
-1.2. Design Requirements
-
-* Ability to sustain 1..(n-1)/n simultaneous node failures, without the
-danger of a simple network partition causing a split brain. That is, we
-need to be able to ensure that the majority failure case is not merely
-the result of a network partition.
-
-* Ability to use external reasons for deciding which partition is the
-the quorate partition in a partitioned cluster. For example, a user may
-have a service running on one node, and that node must always be the master
-in the event of a network partition. Or, a node might lose all network
-connectivity except the cluster communication path - in which case, a
-user may wish that node to be evicted from the cluster.
-
-* Integration with CMAN. We must not require CMAN to run with us (or
-without us). Linux-Cluster does not require a quorum disk normally -
-introducing new requirements on the base of how Linux-Cluster operates
-is not allowed.
-
-* Data integrity. In order to recover from a majority failure, fencing
-is required. The fencing subsystem is already provided by Linux-Cluster.
-
-* Non-reliance on hardware or protocol specific methods (i.e. SCSI
-reservations). This ensures the quorum disk algorithm can be used on the
-widest range of hardware configurations possible.
-
-* Little or no memory allocation after initialization. In critical paths
-during failover, we do not want to have to worry about being killed during
-a memory pressure situation because we request a page fault, and the Linux
-OOM killer responds...
-
-
-1.3. Hardware Configuration Considerations
-
-1.3.1. Concurrent, Synchronous, Read/Write Access
-
-This daemon requires a shared block device with concurrent read/write
-access from all nodes in the cluster. The shared block device can be
-a multi-port SCSI RAID array, a Fiber-Channel RAID SAN, a RAIDed iSCSI
-target, or even GNBD. The quorum daemon uses O_DIRECT to write to the
-device.
-
-1.3.2. Bargain-basement JBODs need not apply
-
-There is a minimum performance requirement inherent when using disk-based
-cluster quorum algorithms, so design your cluster accordingly. Using a
-cheap JBOD with old SCSI2 disks on a multi-initiator bus will cause
-problems at the first load spike. Plan your loads accordingly; a node's
-inability to write to the quorum disk in a timely manner will cause the
-cluster to evict the node. Using host-RAID or multi-initiator parallel
-SCSI configurations with the qdisk daemon is unlikely to work, and will
-probably cause administrators a lot of frustration. That having been
-said, because the timeouts are configurable, most hardware should work
-if the timeouts are set high enough.
-
-1.3.3. Fencing is Required
-
-In order to maintain data integrity under all failure scenarios, use of
-this quorum daemon requires adequate fencing, preferrably power-based
-fencing.
-
-
-1.4. Limitations
-
-* At this time, this daemon only supports a maximum of 16 nodes.
-
-* Cluster node IDs must be statically configured in cluster.conf and
-must be numbered from 1..16 (there can be gaps, of course).
-
-* Cluster node votes should be more or less equal.
-
-* CMAN must be running before the qdisk program can start. This
-limitation will be removed before a production release.
-
-* CMAN's eviction timeout should be at least 2x the quorum daemon's
-to give the quorum daemon adequate time to converge on a master during a
-failure + load spike situation.
-
-* The total number of votes assigned to the quorum device should be
-equal to or greater than the total number of node-votes in the cluster.
-While it is possible to assign only one (or a few) votes to the quorum
-device, the effects of doing so have not been explored.
-
-* Currently, the quorum disk daemon is difficult to use with CLVM if
-the quorum disk resides on a CLVM logical volume. CLVM requires a
-quorate cluster to correctly operate, which introduces a chicken-and-egg
-problem for starting the cluster: CLVM needs quorum, but the quorum daemon
-needs CLVM (if and only if the quorum device lies on CLVM-managed storage).
-One way to work around this is to *not* set the cluster's expected votes
-to include theh quorum daemon's votes. Bring all nodes online, and start
-the quorum daemon *after* the whole cluster is running. This will allow
-the expected votes to increase naturally.
-
-
-2. Algorithms
-
-2.1. Heartbeating & Liveliness Determination
-
-Nodes update individual status blocks on the quorum disk at a user-
-defined rate. Each write of a status block alters the timestamp, which
-is what other nodes use to decide whether a node has hung or not. If,
-after a user-defined number of 'misses' (that is, failure to update a
-timestamp), a node is declared offline. After a certain number of 'hits'
-(changed timestamp + "i am alive" state), the node is declared online.
-
-The status block contains additional information, such as a bitmask of
-the nodes that node believes are online. Some of this information is
-used by the master - while some is just for performace recording, and
-may be used at a later time. The most important pieces of information
-a node writes to its status block are:
-
- - timestamp
- - internal state (available / not available)
- - score
- - max score
- - vote/bid messages
- - other nodes it thinks are online
-
-
-2.2. Scoring & Heuristics
-
-The administrator can configure up to 10 purely arbitrary heuristics, and
-must exercise caution in doing so. By default, only nodes scoring over
-1/2 of the total maximum score will claim they are available via the
-quorum disk, and a node (master or otherwise) whose score drops too low
-will remove itself (usually, by rebooting).
-
-The heuristics themselves can be any command executable by 'sh -c'. For
-example, in early testing, I used this:
-
- <heuristic program="[ -f /quorum ]" score="10" interval="2"/>
-
-This is a literal sh-ism which tests for the existence of a file called
-"/quorum". Without that file, the node would claim it was unavailable.
-This is an awful example, and should never, ever be used in production,
-but is provided as an example as to what one could do...
-
-Typically, the heuristics should be snippets of shell code or commands which
-help determine a node's usefulness to the cluster or clients. Ideally, you
-want to add traces for all of your network paths (e.g. check links, or
-ping routers), and methods to detect availability of shared storage.
-
-
-2.3. Master Election
-
-Only one master is present at any one time in the cluster, regardless of
-how many partitions exist within the cluster itself. The master is
-elected by a simple voting scheme in which the lowest node which believes
-it is capable of running (i.e. scores high enough) bids for master status.
-If the other nodes agree, it becomes the master. This algorithm is
-run whenever no master is present.
-
-If another node comes online with a lower node ID while a node is still
-bidding for master status, it will rescind its bid and vote for the lower
-node ID. If a master dies or a bidding node dies, the voting algorithm
-is started over. The voting algorithm typically takes two passes to
-complete.
-
-Master deaths take marginally longer to recover from than non-master
-deaths, because a new master must be elected before the old master can
-be evicted & fenced.
-
-
-2.4. Master Duties
-
-The master node decides who is or is not in the master partition, as
-well as handles eviction of dead nodes (both via the quorum disk and via
-the linux-cluster fencing system by using the cman_kill_node() API).
-
-
-2.5. How it All Ties Together
-
-When a master is present, and if the master believes a node to be online,
-that node will advertise to CMAN that the quorum disk is avilable. The
-master will only grant a node membership if:
-
- (a) CMAN believes the node to be online, and
- (b) that node has made enough consecutive, timely writes to the quorum
- disk.
-
-
-3. Configuration
-
-3.1. The <quorumd> tag
-
-This tag is a child of the top-level <cluster> tag.
-
- <quorumd
- interval="1" This is the frequency of read/write cycles
- tko="10" This is the number of cycles a node must miss
- in order to be declared dead.
- votes="3" This is the number of votes the quorum daemon
- advertises to CMAN when it has a high enough
- score.
- log_level="4" This controls the verbosity of the quorum daemon
- in the system logs. 0 = emergencies; 7 = debug
- log_facility="local4" This controls the syslog facility used by the
- quorum daemon when logging.
- status_file="/foo" Write internal states out to this file
- periodically ("-" = use stdout).
- min_score="3" Absolute minimum score to be consider one's
- self "alive". If omitted, or set to 0, the
- default function "floor((n+1)/2)" is used.
- device="/dev/sda1" This is the device the quorum daemon will use.
- This device must be the same on all nodes.
- label="mylabel"/> This overrides the device field if present.
- If specified, the quorum daemon will read
- /proc/partitions and check for qdisk signatures
- on every block device found, comparing the label
- against the specified label. This is useful in
- configurations where the block device name
- differs on a per-node basis.
-
-
-3.2. The <heuristic> tag
-
-This tag is a child of the <quorumd> tag.
-
- <heuristic
- program="/test.sh" This is the program used to determine if this
- heuristic is alive. This can be anything which
- may be executed by "/bin/sh -c". A return value
- of zero indicates success.
- score="1" This is the weight of this heuristic. Be careful
- when determining scores for heuristics.
- interval="2"/> This is the frequency at which we poll the
- heuristic.
-
-3.3. Example
-
- <quorumd interval="1" tko="10" votes="3" device="/dev/gnbd/qdisk">
- <heuristic program="ping routerA -c1 -t1" score="1" interval="2"/>
- <heuristic program="ping routerB -c1 -t1" score="1" interval="2"/>
- <heuristic program="ping routerC -c1 -t1" score="1" interval="2"/>
- </quorumd>
-
-3.4. Heuristic score considerations
-
-* Heuristic timeouts should be set high enough to allow the previous run
-of a given heuristic to complete.
-
-* Heuristic scripts returning anything except 0 as their return code
-are considered failed.
-
-* The worst-case for improperly configured quorum heuristics is a race
-to fence where two partitions simultaneously try to kill each other.
-
-3.5. Creating a quorum disk partition
-
-3.5.1. The mkqdisk utility.
-
-The mkqdisk utility can create and list currently configured quorum disks
-visible to the local node.
-
- mkqdisk -L List available quorum disks.
-
- mkqdisk -f <label> Find a quorum device by the given label.
-
- mkqdisk -c <device> -l <label>
- Initialize <device> and name it <label>. This
- will destroy all data on the device, so be careful
- when running this command.
+See qdisk(5) for setup and other information.