[Linux-cluster] GFS problem

Abraham Alawi a.alawi at auckland.ac.nz
Tue Feb 2 02:54:04 UTC 2010


-The cluster.conf lists more than 12 nodes (I count 15 clusternode entries). If some of them are redundant, you may need to clean up cluster.conf, just in case.
-Why expected_votes="8"? expected_votes should be the total number of votes in a fully functioning cluster; in your case it should be "12". The quorum is then calculated automatically by the basic formula (expected_votes / 2) + 1, so with 12 votes (1 vote per node) the quorum would be 7. In other words, the cluster keeps running as long as at least 7 nodes are up.
-I'd change post_fail_delay="0" to "5" (seconds).
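
As a rough sketch of the two changes above, assuming the cluster really has 12 nodes with one vote each (if all 15 listed nodes are real, expected_votes would be 15 instead, giving a quorum of 8), the affected lines would look something like this. Everything else in the file stays as posted:

```xml
<!-- was post_fail_delay="0": wait 5 seconds after a node fails before fencing it -->
<fence_daemon post_fail_delay="5" post_join_delay="3"/>

<!-- was expected_votes="8": 12 nodes x 1 vote each, so quorum = 12/2 + 1 = 7 -->
<cman expected_votes="12" two_node="0"/>
```

Remember that config_version on the <cluster> element normally has to be increased whenever the file changes, otherwise the nodes may not pick up the new version.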

If you still have no luck, try adding this line to your cluster.conf file:
<logging debug="on" logfile="/var/log/rhcs.log" to_file="yes"/>

Good luck,

   -- Abraham

On 27/01/2010, at 11:22 PM, Alex Urbanowicz wrote:

> From: Jorge Palma <jpalmae at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] GFS problem
> 
> Please send your fence configuration and cluster.conf
> 
> cluster.conf:
> 
> <?xml version="1.0"?>
> <!--
> ** puppet managed file $Revision: 2889 $
> -->
> <cluster config_version="14" name="gfs-filmweb">
>         <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>         <clusternodes>
>                 <clusternode name="www1" nodeid="1" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="www1" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <!-- if method 1 happen to fail - use method 2 -->
>                                 <method name="2">
>                                         <device name="manual" nodename="www1"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="www2" nodeid="2" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="www2" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="www2"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app1" nodeid="65" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app1" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app1"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app2" nodeid="66" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app2" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app2"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app3" nodeid="67" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app3" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app3"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app4" nodeid="68" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app4" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app4"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app5" nodeid="69" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app5" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app5"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app6" nodeid="70" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app6" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app6"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="app7" nodeid="71" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="app7" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="app7"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade403" nodeid="72" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade403" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade403"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade404" nodeid="73" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade404" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade404"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade405" nodeid="74" votes="1">
>                         <fence> 
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade405" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade405"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade406" nodeid="75" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade406" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade406"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade407" nodeid="76" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade407" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade407"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="blade408" nodeid="77" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="rsysrq" nodename="blade408" password="fencepassword" port="9" operation="1bbbb"/>
>                                 </method>
>                                 <method name="2">
>                                         <device name="manual" nodename="blade408"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>         </clusternodes>
>         <cman expected_votes="8" two_node="0"/>
>         <fencedevices>
>                 <fencedevice agent="fence_rsysrq" name="rsysrq"/>
>                 <fencedevice agent="fence_manual" name="manual"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains/>
>                 <resources/>
>         </rm>
> </cluster>
> 
> fencing is done using fence_rsysrq so there is no configuration to speak of except the iptables/modprobe part:
> 
> options ipt_SYSRQ passwd="fencepassword" tolerance=3720
> 
> -A INPUT  -i bond0.108 -s 10.100.108.0/24 -d <hostip> -p udp -m udp --dport 9 -j SYSRQ
> 
> Alex.

''''''''''''''''''''''''''''''''''''''''''''''''''''''
Abraham Alawi

Unix/Linux Systems Administrator
Science IT
University of Auckland
e: a.alawi at auckland.ac.nz
p: +64-9-373 7599, ext#: 87572

''''''''''''''''''''''''''''''''''''''''''''''''''''''
