Rolling back to the previous openais package allowed me to restart cman. I went from openais-0.80.3-22.el5 back to openais-0.80.3-15.el5.<br><br><br><div class="gmail_quote">2009/1/28 Dave Costakos <span dir="ltr"><<a href="mailto:david.costakos@gmail.com">david.costakos@gmail.com</a>></span><br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><span style="font-family: arial,helvetica,sans-serif;">Like you, I've run into this same issue.  I have 2 clusters that I'm trying to update in our lab.  On one, I only updated the cman and rgmanager packages: this update was successful.  On the other, I did a full update to 5.3 and ran into what appears to be this same problem.  I've noticed that manually attempting to start cman via 'cman_tool -d join' prints out this message right before cman fails.</span><br style="font-family: arial,helvetica,sans-serif;">

<br style="font-family: arial,helvetica,sans-serif;"><pre style="font-family: arial,helvetica,sans-serif;">aisexec: ckpt.c:3961: message_handler_req_exec_ckpt_sync_checkpoint_refcount:Assertion `checkpoint != ((void *)0)' failed<br>

<br><br>I suspect an openais issue; would someone be able to confirm that?<br><br>Also, I'm going to try downgrading openais back to the version from RHEL 5.2 to see if that fixes it (though I won't get to that until the end of today).  If that works, I'll report back.<br>

</pre><br style="font-family: arial,helvetica,sans-serif;"><br><div class="gmail_quote">2009/1/27 Alan A <span dir="ltr"><<a href="mailto:alan.zg@gmail.com" target="_blank">alan.zg@gmail.com</a>></span><div><div></div>
<div class="Wj3C7c"><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I just opened RHEL case number 1890184 regarding the same issue. First, the kernel would not start because of an HP iLO driver conflict; at the same time cman broke, and fencing now fails. I rolled the cman RPM back to the previous version, but the problem persists, so something else must have changed that keeps cman from starting.<br>


<br><div class="gmail_quote">2009/1/27 Gunther Schlegel <span dir="ltr"><<a href="mailto:schlegel@riege.com" target="_blank">schlegel@riege.com</a>></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div><div></div><div>
Hello,<br>
<br>
I updated one node from 5.2 to 5.3 using yum update and now cman does not start up anymore -- looks like ccsd has some problems:<br>
<br>
[root@motel6 /]# /sbin/ccsd -4 -n<br>
Starting ccsd 2.0.98:<br>
 Built: Dec  3 2008 16:32:30<br>
 Copyright (C) Red Hat, Inc.  2004  All rights reserved.<br>
  IP Protocol:: IPv4 only<br>
  No Daemon:: SET<br>
<br>
Cluster is not quorate.  Refusing connection.<br>
Error while processing connect: Connection refused<br>
Cluster is not quorate.  Refusing connection.<br>
Error while processing connect: Connection refused<br>
Unable to connect to cluster infrastructure after 30 seconds.<br>
Unable to connect to cluster infrastructure after 60 seconds.<br>
<br>
<br>
When starting ccsd using /etc/init.d/cman it reports all three nodes to be on cluster.conf version 78, so I guess it is not a network connectivity problem.<br>
<br>
The other two nodes (still on 5.2z) of the cluster are up and running with quorum. Openais is talking to those 2 other nodes and it looks fine to me:<br>
<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] Members Joined:<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.22)<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.23)<br>
Jan 27 21:05:26 motel6 openais[1278]: [SYNC ] This node is within the primary component and will provide service.<br>
Jan 27 21:05:26 motel6 openais[1278]: [TOTEM] entering OPERATIONAL state.<br>
Jan 27 21:05:26 motel6 openais[1278]: [CMAN ] quorum regained, resuming activity<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.21<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.22<br>
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.23<br>
<br>
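For anyone comparing nodes, here is a minimal Python sketch (not part of openais or cman) that pulls the nodejoin addresses and the quorum message out of log lines like the ones above. The function name summarize and the regular expression are my own assumptions about the log format, based only on this excerpt.<br>

```python
import re

# Sample lines copied from the log excerpt above (syslog writes the tab as #011).
LOG = """\
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] Members Joined:
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.22)
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.23)
Jan 27 21:05:26 motel6 openais[1278]: [CMAN ] quorum regained, resuming activity
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.21
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.22
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.23
"""

# Assumed pattern: matches only the "got nodejoin message <IPv4>" lines seen above.
NODEJOIN = re.compile(r"\[CLM\s*\] got nodejoin message (\d+\.\d+\.\d+\.\d+)")


def summarize(log_text):
    """Return (sorted unique nodejoin IPs, True if a 'quorum regained' line is present)."""
    joined = sorted(set(NODEJOIN.findall(log_text)))
    quorate = "quorum regained" in log_text
    return joined, quorate


joined, quorate = summarize(LOG)
print("nodejoin from:", joined)
print("quorum regained seen:", quorate)
```

If all three addresses show up and the quorum line is present, openais membership itself is probably fine and the problem lies further up the stack.<br>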
<br>
I am a bit lost...<br>
<br>
cluster.conf:<br>
[root@motel6 init.d]# cat /etc/cluster/cluster.conf<br>
<?xml version="1.0"?><br>
<cluster alias="RSIXENCluster2" config_version="87" name="RSIXENCluster2"><br>
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/><br>
        <clusternodes><br>
                <clusternode name="<a href="http://concorde.riege.de" target="_blank">concorde.riege.de</a>" nodeid="1" votes="1"><br>
                        <fence><br>
                                <method name="1"><br>
                                        <device name="Concorde_IPMI"/><br>
                                </method><br>
                        </fence><br>
                </clusternode><br>
                <clusternode name="<a href="http://motel6.riege.de" target="_blank">motel6.riege.de</a>" nodeid="2" votes="1"><br>
                        <fence><br>
                                <method name="1"><br>
                                        <device name="Motel6_IPMI"/><br>
                                </method><br>
                        </fence><br>
                </clusternode><br>
                <clusternode name="<a href="http://mercure.riege.de" target="_blank">mercure.riege.de</a>" nodeid="3" votes="1"><br>
                        <fence><br>
                                <method name="1"><br>
                                        <device name="Mercure_IPMI"/><br>
                                </method><br>
                        </fence><br>
                </clusternode><br>
        </clusternodes><br>
        <fencedevices><br>
                <fencedevice agent="fence_ipmilan" ipaddr="10.11.5.132" login="root" name="Concorde_IPMI" passwd="XXX"/><br>
                <fencedevice agent="fence_ipmilan" ipaddr="10.11.5.131" login="root" name="Motel6_IPMI" passwd="xxx"/><br>
                <fencedevice agent="fence_ipmilan" ipaddr="10.11.5.133" login="root" name="Mercure_IPMI" passwd="XXX"/><br>
        </fencedevices><br>
        <rm><br>
                <failoverdomains><br>
                        <failoverdomain name="Earth" nofailback="1" ordered="1" restricted="1"><br>
                                <failoverdomainnode name="<a href="http://concorde.riege.de" target="_blank">concorde.riege.de</a>" priority="1"/><br>
                                <failoverdomainnode name="<a href="http://motel6.riege.de" target="_blank">motel6.riege.de</a>" priority="1"/><br>
                                <failoverdomainnode name="<a href="http://mercure.riege.de" target="_blank">mercure.riege.de</a>" priority="1"/><br>
                        </failoverdomain><br>
                        <failoverdomain name="Europe" nofailback="0" ordered="1" restricted="0"><br>
                                <failoverdomainnode name="<a href="http://concorde.riege.de" target="_blank">concorde.riege.de</a>" priority="2"/><br>
                        </failoverdomain><br>
                        <failoverdomain name="North America" nofailback="0" ordered="1" restricted="0"><br>
                                <failoverdomainnode name="<a href="http://motel6.riege.de" target="_blank">motel6.riege.de</a>" priority="2"/><br>
                        </failoverdomain><br>
                        <failoverdomain name="Africa" nofailback="0" ordered="1" restricted="0"><br>
                                <failoverdomainnode name="<a href="http://mercure.riege.de" target="_blank">mercure.riege.de</a>" priority="1"/><br>
                        </failoverdomain><br>
                </failoverdomains><br>
                <resources/><br>
                <vm autostart="1" domain="Africa" exclusive="0" migrate="live" name="vm64.test.riege.de_64" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="North America" exclusive="0" migrate="pause" name="rt.test.riege.de_32" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="poincare.riege.de_32" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="North America" exclusive="0" migrate="live" name="jboss.dev.riege.de_64" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Africa" exclusive="0" migrate="live" name="master.cc3.dev.riege.de_64" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="test.alphatrans.scope.riege.com_32" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="North America" exclusive="0" migrate="live" name="slave.cc3.dev.riege.de_64" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="North America" exclusive="0" migrate="live" name="webmail.riege.com_64" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Europe" exclusive="0" migrate="live" name="live.rsi.scope.riege.com_64" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="qa-16.rsi.scope.riege.com_32" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="qa-18.rsi.scope.riege.com_32" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="vm32.test.riege.de_32" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="qa-head.rsi.scope.riege.com_32" path="/etc/xen" recovery="restart"/><br>



                <vm autostart="1" domain="North America" exclusive="0" migrate="live" name="mq.dev.riege.de_64" path="/etc/xen" recovery="relocate"/><br>



                <vm autostart="1" domain="Europe" exclusive="0" migrate="live" name="archive.dev.riege.de_64" path="/etc/xen" recovery="restart"/><br>



        </rm><br>
        <cman quorum_dev_poll="50000"/><br>
        <totem consensus="4800" join="60" token="60000" token_retransmits_before_loss_const="20"/><br>
        <quorumd device="/dev/mapper/Quorum_Partition" interval="3" min_score="1" tko="10" votes="2"/><br>
</cluster><br>
<br>
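As a sanity check on the file above, this is a rough Python sketch that reads config_version and adds up the votes. It assumes the usual simple-majority rule (quorum = expected votes // 2 + 1), a default of 1 vote per node, and 0 votes for a missing quorumd entry; SAMPLE is a trimmed stand-in for the real file, and summarize_votes is a made-up helper, not a cman tool.<br>

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the cluster.conf above (fencing and <rm> sections omitted).
SAMPLE = """<?xml version="1.0"?>
<cluster alias="RSIXENCluster2" config_version="87" name="RSIXENCluster2">
  <clusternodes>
    <clusternode name="concorde.riege.de" nodeid="1" votes="1"/>
    <clusternode name="motel6.riege.de" nodeid="2" votes="1"/>
    <clusternode name="mercure.riege.de" nodeid="3" votes="1"/>
  </clusternodes>
  <quorumd device="/dev/mapper/Quorum_Partition" interval="3" min_score="1" tko="10" votes="2"/>
</cluster>
"""


def summarize_votes(xml_text):
    """Return config_version, expected votes and the assumed majority quorum."""
    root = ET.fromstring(xml_text)
    # Assumption: a node without a votes attribute counts as 1 vote.
    node_votes = sum(int(n.get("votes", "1")) for n in root.iter("clusternode"))
    qdisk = root.find("quorumd")
    # Assumption: no quorumd element (or no votes attribute) contributes 0 votes.
    qdisk_votes = int(qdisk.get("votes", "0")) if qdisk is not None else 0
    expected = node_votes + qdisk_votes
    return {
        "config_version": int(root.get("config_version")),
        "expected_votes": expected,
        "quorum": expected // 2 + 1,  # assumed simple-majority rule
    }


print(summarize_votes(SAMPLE))
```

Running the same function against /etc/cluster/cluster.conf on each node would also show whether every node really has the same config_version.<br>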
best regards, Gunther<br>
<br>
-- <br>
.............................................................<br>
Riege Software International GmbH  Fon: +49 (2159) 9148 0<br>
Mollsfeld 10                       Fax: +49 (2159) 9148 11<br>
40670 Meerbusch                    Web: <a href="http://www.riege.com" target="_blank">www.riege.com</a><br>
Germany                            E-Mail: <a href="mailto:schlegel@riege.com" target="_blank">schlegel@riege.com</a><br>
---                                ---<br>
Handelsregister:                   Managing Directors:<br>
Amtsgericht Neuss HRB-NR 4207      Christian Riege<br>
USt-ID-Nr.: DE120585842            Gabriele  Riege<br>
                                  Johannes  Riege<br>
.............................................................<br>
          YOU CARE FOR FREIGHT, WE CARE FOR YOU          <br>
<br>
<br>
<br></div></div>--<br>
Linux-cluster mailing list<br>
<a href="mailto:Linux-cluster@redhat.com" target="_blank">Linux-cluster@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/linux-cluster" target="_blank">https://www.redhat.com/mailman/listinfo/linux-cluster</a><br></blockquote></div><br><br clear="all"><br>-- <br><font color="#888888">Alan A.<br>


</font><br></blockquote></div></div></div><font color="#888888"><br><br clear="all"><br>-- <br>
Dave Costakos<br>mailto:<a href="mailto:david.costakos@gmail.com" target="_blank">david.costakos@gmail.com</a><br>

</font><br></blockquote></div><br><br clear="all"><br>-- <br>Alan A.<br>