<div dir="ltr">Hi, thanks for your reply.<br><br>This is a Cisco UCS machine. Yesterday the Cisco guys created a separate vSwitch for this heartbeat.<br><br>Regards,<br>Ben<br><br><div class="gmail_quote">On Tue, Sep 18, 2012 at 6:25 AM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">You have two problems:<br>
<br>
1. The nodes can't talk to each other (via multicast) *or* you are taking too long to start each node. Given that you are using luci, I am guessing the former. Log into your switch and see if the multicast group shown in 'cman_tool status' exists.<br>


<br>
2. Your fencing isn't working. Read the man page for fence_cisco_ucs to try and debug it.<br>
<br>
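A possible cause of the "agent none result: error no method" lines quoted further down: in the posted cluster.conf, each method element is empty, so it never references one of the fencedevice entries. Below is a minimal sketch of how a per-node fence block is usually wired up. The port value (the UCS service profile name) is a hypothetical placeholder, not something taken from this thread, so check it against the fence_cisco_ucs man page.<br>

```xml
<!-- Sketch only: each method needs a device child whose name matches a fencedevice. -->
<!-- "node1-service-profile" is a hypothetical placeholder for the UCS service profile. -->
<clusternode name="cgceccprd1.combinedgroup.net" nodeid="1">
        <fence>
                <method name="ucs-node1">
                        <device name="ucs-node1" port="node1-service-profile"/>
                </method>
        </fence>
</clusternode>
```

The second node would need the same change with its own device name, and config_version has to be incremented whenever the file changes.<br>
<br>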
digimer<br>
<br>
PS - Please don't reply directly to me. Keep the conversation public.<br>
PPS - Filter out your passwords. ;)<div class="im"><br>
<br>
On 09/17/2012 11:17 PM, Ben .T.George wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
Hi, thanks for your reply.<br>
<br>
Below is my cluster.conf file:<br>
<br>
<?xml version="1.0"?><br>
<cluster config_version="7" name="eccprd"><br>
         <clusternodes><br>
                 <clusternode name="cgceccprd1.combinedgroup.net" nodeid="1"></div><div class="im"><br>
                         <fence><br>
                                 <method name="ucs-node1"/><br>
                         </fence><br>
                 </clusternode><br>
                 <clusternode name="cgceccprd2.combinedgroup.net" nodeid="2"></div><div class="im"><br>
                         <fence><br>
                                 <method name="ucs-node2"/><br>
                         </fence><br>
                 </clusternode><br>
         </clusternodes><br>
         <cman expected_votes="1" two_node="1"/><br>
         <rm><br>
                 <resources><br>
                         <ip address="172.22.10.230" sleeptime="10"/><br>
                 </resources><br>
                 <service exclusive="1" name="eccsapmnt" recovery="relocate"><br>
                         <ip ref="172.22.10.230"/><br>
                 </service><br>
         </rm><br>
         <fencedevices><br>
                 <fencedevice agent="fence_cisco_ucs" ipaddr="172.22.90.61" login="admin" name="ucs-node1" passwd="..."/><br></div>
                 <fencedevice agent="fence_cisco_ucs" ipaddr="172.22.90.59" login="admin" name="ucs-node2" passwd="..."/><div class="im"><br>
         </fencedevices><br>
</cluster><br>
<br>
When I try to start the cluster on node1, I get these messages in /var/log/messages:<br>
<br>
  tail -f -n 0 /var/log/messages<br>
Sep 18 06:06:02 cgceccprd1 modcluster: Starting service: eccsapmnt on node<br>
Sep 18 06:06:08 cgceccprd1 modcluster: Starting service: eccsapmnt on node cgceccprd1.combinedgroup.net<br></div><div class="im">

<br>
<br>
<br>
But the service is not starting. Luci shows both nodes as online, but clustat shows something different.<br>
<br>
The main error in the messages log is:<br>
<br>
Sep 18 03:35:48 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying<br></div><div class="im">
Sep 18 04:06:16 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying<br>
Sep 18 04:36:45 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying<br>
Sep 18 05:07:14 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying<br>
Sep 18 05:37:42 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying<br>
<br>
These messages are from node1. I am getting the same message on node2, saying:<br>
<br>
cgceccprd2 fenced[8424]: fencing node cgceccprd1.combinedgroup.net still retrying</div><div class="im"><br>
<br>
I don't know what the problem is here.<br>
<br>
Please help me solve this.<br>
Regards,<br>
Ben<br>
<br>
On Tue, Sep 18, 2012 at 4:42 AM, Digimer <<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>> wrote:<br></div><div class="im">
<br>
    On 09/17/2012 06:07 PM, Ben .T.George wrote:<br>
<br>
        Hi<br>
<br>
        My cluster is failing to start.<br>
<br>
        If I check clustat on node1, it shows node1 online and node2<br>
        offline. If I check clustat on node2, it shows node2 online and<br>
        node1 offline.<br>
<br>
        I checked the logs. fenced is throwing errors. How can I rectify this?<br>
<br>
        Sep 17 23:24:54 fenced fencing node cgceccprd1.combinedgroup.net still retrying<br></div><div class="im">
        Sep 17 23:55:06 fenced fencing node cgceccprd1.combinedgroup.net still retrying<br>
        Sep 18 00:25:19 fenced fencing node cgceccprd1.combinedgroup.net still retrying<br>
        Sep 18 00:55:03 fenced fenced 3.0.12.1 started<br>
        Sep 18 00:55:03 fenced failed to get dbus connection<br>
        Sep 18 00:55:55 fenced fencing node cgceccprd1.combinedgroup.net<br>
        Sep 18 00:55:55 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method<br>
        Sep 18 00:55:55 fenced fence cgceccprd1.combinedgroup.net failed<br>
        Sep 18 00:55:58 fenced fencing node cgceccprd1.combinedgroup.net<br>
        Sep 18 00:55:58 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method<br>
        Sep 18 00:55:58 fenced fence cgceccprd1.combinedgroup.net failed<br>
        Sep 18 00:56:01 fenced fencing node cgceccprd1.combinedgroup.net<br>
        Sep 18 00:56:01 fenced fence cgceccprd1.combinedgroup.net dev 0.0 agent none result: error no method<br>
        Sep 18 00:56:01 fenced fence cgceccprd1.combinedgroup.net failed<br>
<br>
<br>
<br>
        Please help me solve this issue.<br>
<br>
        Regards,<br>
        Ben<br>
<br>
<br>
    What is your cluster.conf?<br>
<br>
    Likely you either have no fencing configured, or your fencing is not<br>
    working. Either way, failing to fence is a critical problem and the<br>
    cluster will hang, just as you're seeing here. This is by design.<br>
    Better to hang a cluster than to corrupt it.<br>
<br>
    digimer<br>
<br>
    --<br>
    Digimer<br>
    Papers and Projects: <a href="https://alteeve.ca" target="_blank">https://alteeve.ca</a><br>
<br>
<br>
<br>
</div></blockquote><div class="HOEnZb"><div class="h5">
<br>
<br>
-- <br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca" target="_blank">https://alteeve.ca</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div dir="ltr"><div><span></span><span style="font-size:10.0pt;color:black">Yours Sincerely<br></span><br><span style="border-collapse:collapse"><font face="'courier new', monospace" size="1"><font color="#6600cc"><b>#!/usr/bin/env python<br>


#Mysignature.py :)</b></font><br><br><font color="#ff6666">Signature </font><font color="#202020">= " </font><font color="#006600">" " Ben.T.George \n<br>
                  Linux System Administrator \n<br>                  Diyar United Company \n<br>                  kuwait \n<br>                  Phone : <a value="+96565122700">+965 - 50629829</a> \n " "</font><font color="#202020"> "</font><br>


<br><font color="#6600cc">Print </font><font color="#ff9900">Signature</font></font></span><span></span></div></div><br>
</div>