[Linux-cluster] GFS problem
Abraham Alawi
a.alawi at auckland.ac.nz
Tue Feb 2 02:54:04 UTC 2010
-The cluster.conf lists more than 12 nodes (15 clusternode entries). If some of them are redundant, you may need to clean up cluster.conf, just in case.
-Why expected_votes="8"? expected_votes should be the total number of votes in a fully functioning cluster, so in your case it should be "12". The quorum is then calculated automatically by the basic formula (expected_votes / 2 + 1, with the division rounded down). With 12 votes (1 vote per node) the quorum is 7; in other words, the cluster keeps running as long as at least 7 nodes are up.
-I'd change post_fail_delay="0" to "5" (seconds).
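
The vote arithmetic above can be checked with a short script. This is only a sketch: the function names and the trimmed three-node sample are made up for illustration, and it assumes a clusternode without a votes attribute counts as 1 vote.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for the posted cluster.conf (three nodes only).
SAMPLE_CONF = """
<cluster config_version="14" name="gfs-filmweb">
  <clusternodes>
    <clusternode name="www1" nodeid="1" votes="1"/>
    <clusternode name="www2" nodeid="2" votes="1"/>
    <clusternode name="app1" nodeid="65" votes="1"/>
  </clusternodes>
  <cman expected_votes="8" two_node="0"/>
</cluster>
"""


def quorum(expected_votes):
    """Basic formula from the advice above: half the votes (rounded down) plus one."""
    return expected_votes // 2 + 1


def check_votes(conf_xml):
    """Sum the per-node votes and compare them with cman's expected_votes."""
    root = ET.fromstring(conf_xml.strip())
    # Assumption: a node without an explicit votes attribute has 1 vote.
    total = sum(int(node.get("votes", "1")) for node in root.iter("clusternode"))
    cman = root.find("cman")
    configured = None
    if cman is not None and cman.get("expected_votes") is not None:
        configured = int(cman.get("expected_votes"))
    return {
        "total_votes": total,
        "expected_votes": configured,
        "quorum": quorum(total),
        "consistent": configured == total,
    }


if __name__ == "__main__":
    print(quorum(12))  # 12 votes, 1 vote per node
    print(check_votes(SAMPLE_CONF))
```

Run against the real file (read its contents and pass them to check_votes), a "consistent" value of False means expected_votes does not match the sum of the node votes.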
If you still have no luck, try adding this line to your cluster.conf file:
<logging debug="on" logfile="/var/log/rhcs.log" to_file="yes"/>
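
If you would rather script these edits than make them by hand, something along these lines could work. It is a rough sketch, not a tested procedure: apply_suggestions and the trimmed sample are hypothetical, placing the logging element directly under cluster is an assumption, and so is the config_version bump (added on the assumption that the cluster expects a higher version number whenever the file changes).

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for the posted cluster.conf.
SAMPLE = """
<cluster config_version="14" name="gfs-filmweb">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <cman expected_votes="8" two_node="0"/>
</cluster>
"""


def apply_suggestions(conf_xml):
    """Return a copy of the config with the suggested changes applied."""
    root = ET.fromstring(conf_xml.strip())

    # Suggested change: post_fail_delay from 0 to 5 seconds.
    fence_daemon = root.find("fence_daemon")
    if fence_daemon is not None:
        fence_daemon.set("post_fail_delay", "5")

    # Suggested debug logging line; its position under <cluster> is an assumption.
    if root.find("logging") is None:
        ET.SubElement(
            root,
            "logging",
            {"debug": "on", "logfile": "/var/log/rhcs.log", "to_file": "yes"},
        )

    # Assumption: bump config_version so the change is seen as a newer config.
    version = int(root.get("config_version", "0"))
    root.set("config_version", str(version + 1))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(apply_suggestions(SAMPLE))
```

Since the file is marked as puppet managed, the change would really belong in the puppet source; the script only shows which attributes and elements are touched.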
Good luck,
-- Abraham
On 27/01/2010, at 11:22 PM, Alex Urbanowicz wrote:
> From: Jorge Palma <jpalmae at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] GFS problem
> Message-ID:
> <5b65f1b11001251343p659d3b96gf07dd2165adf521e at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Please send your fence configuration and cluster.conf
>
> cluster.conf:
>
> <?xml version="1.0"?>
> <!--
> ** puppet managed file $Revision: 2889 $
> -->
> <cluster config_version="14" name="gfs-filmweb">
> <fence_daemon post_fail_delay="0" post_join_delay="3"/>
> <clusternodes>
> <clusternode name="www1" nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="www1" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <!-- if method 1 happen to fail - use method 2 -->
> <method name="2">
> <device name="manual" nodename="www1"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="www2" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="www2" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="www2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app1" nodeid="65" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app1" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app1"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app2" nodeid="66" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app2" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app3" nodeid="67" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app3" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app3"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app4" nodeid="68" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app4" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app4"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app5" nodeid="69" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app5" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app5"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app6" nodeid="70" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app6" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app6"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="app7" nodeid="71" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="app7" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="app7"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade403" nodeid="72" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade403" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade403"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade404" nodeid="73" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade404" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade404"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade405" nodeid="74" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade405" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade405"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade406" nodeid="75" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade406" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade406"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade407" nodeid="76" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade407" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade407"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade408" nodeid="77" votes="1">
> <fence>
> <method name="1">
> <device name="rsysrq" nodename="blade408" password="fencepassword" port="9" operation="1bbbb"/>
> </method>
> <method name="2">
> <device name="manual" nodename="blade408"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <cman expected_votes="8" two_node="0"/>
> <fencedevices>
> <fencedevice agent="fence_rsysrq" name="rsysrq"/>
> <fencedevice agent="fence_manual" name="manual"/>
> </fencedevices>
> <rm>
> <failoverdomains/>
> <resources/>
> </rm>
> </cluster>
>
> fencing is done using fence_rsysrq so there is no configuration to speak of except the iptables/modprobe part:
>
> options ipt_SYSRQ passwd="fencepassword" tolerance=3720
>
> -A INPUT -i bond0.108 -s 10.100.108.0/24 -d <hostip> -p udp -m udp --dport 9 -j SYSRQ
>
> Alex.
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Abraham Alawi
Unix/Linux Systems Administrator
Science IT
University of Auckland
e: a.alawi at auckland.ac.nz
p: +64-9-373 7599, ext#: 87572
''''''''''''''''''''''''''''''''''''''''''''''''''''''