[Linux-cluster] openais question

Pedro Bandim Faustino pedro.faustino at fccn.pt
Fri Dec 7 16:55:54 UTC 2007


The problem was the iptables firewall!

I configured it as described in the FAQ:
http://sources.redhat.com/cluster/faq.html#iptables
Node 2 iptables.conf:
#       rgmanager/clurgmgrd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 41966:41969 -j ACCEPT
#       ccsd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50006 -j ACCEPT
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 50007 -j ACCEPT
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50008:50009 -j ACCEPT
#       dlm
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 21064 -j ACCEPT
#       openais
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 5405 -j ACCEPT
-A SERVICOS -j RETURN

But with these rules enabled, the problems described in my previous 
email (quoted below) occur.

Does anyone know what I am missing in my iptables.conf?
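
A quick way to cross-check rules like these against a port list is sketched below in Python. The EXPECTED table is an assumption, not something taken from the FAQ: it repeats the ports from the rules above and adds UDP 5404, because openais/totem is commonly documented as using the port just below its multicast port as well as 5405. The function names are hypothetical, and the check only looks at --dport values, not at chains, sources, or multicast addresses, so treat the result as a hint to verify rather than a diagnosis.

```python
import re

# Assumed port list: the ports from the rules above, plus UDP 5404
# (an assumption to verify for your openais/cman version).
EXPECTED = {
    "rgmanager": [("tcp", p) for p in range(41966, 41970)],
    "ccsd": [("tcp", 50006), ("udp", 50007), ("tcp", 50008), ("tcp", 50009)],
    "dlm": [("tcp", 21064)],
    "openais": [("udp", 5404), ("udp", 5405)],
}

# Matches "-p <proto> ... --dport <port>[:<port>] ... -j ACCEPT".
RULE_RE = re.compile(r"-p\s+(\w+)\b.*?--dport\s+(\d+)(?::(\d+))?.*?-j\s+ACCEPT")

# The rules exactly as posted for node 2.
RULES = """\
#       rgmanager/clurgmgrd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 41966:41969 -j ACCEPT
#       ccsd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50006 -j ACCEPT
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 50007 -j ACCEPT
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50008:50009 -j ACCEPT
#       dlm
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 21064 -j ACCEPT
#       openais
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 5405 -j ACCEPT
-A SERVICOS -j RETURN
"""


def allowed_ports(rules_text):
    """Return the set of (protocol, port) pairs accepted by the rule lines."""
    allowed = set()
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        match = RULE_RE.search(line)
        if not match:
            continue  # e.g. the final "-j RETURN" rule
        proto = match.group(1)
        low = int(match.group(2))
        high = int(match.group(3) or match.group(2))
        allowed.update((proto, port) for port in range(low, high + 1))
    return allowed


def missing_ports(rules_text, expected=EXPECTED):
    """Map each service to the expected (protocol, port) pairs not accepted."""
    allowed = allowed_ports(rules_text)
    result = {}
    for service, ports in expected.items():
        missing = [p for p in ports if p not in allowed]
        if missing:
            result[service] = missing
    return result


if __name__ == "__main__":
    print(missing_ports(RULES))
```

With the assumed table, the only entry reported for the rules above is UDP 5404 under openais, so that port may be worth opening as a test.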

Thanks,

Pedro Bandim Faustino
email/sip: pedro.faustino at fccn.pt



Pedro Bandim Faustino wrote:
> Another way to observe the same thing (maybe with a different cause?):
>
> After booting both nodes, when I run service cman start on both 
> nodes a few seconds apart, the two nodes join the cluster and 
> become quorate.
> When I run service cman stop on both nodes (also a few seconds 
> apart), one of the nodes leaves the cluster successfully, but the 
> other prints this:
>
> [root@m07 ~]# service cman stop
> Stopping cluster:
>   Stopping fencing... done
>   Stopping cman... failed
> Timed-out waiting for cluster
>                                                           [FAILED]
>
> while the messages in the log are
> Dec  7 15:27:17 m07 openais[5436]: [TOTEM] The consensus timeout expired.
> Dec  7 15:27:17 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
> Dec  7 15:27:32 m07 openais[5436]: [TOTEM] The consensus timeout expired.
> Dec  7 15:27:32 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
> Dec  7 15:27:47 m07 openais[5436]: [TOTEM] The consensus timeout expired.
> Dec  7 15:27:47 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
>
> Do you know what the problem is?
>
> output of ps fax:
> ....
> 5430 ?        Ssl    0:00 /sbin/ccsd
> 5436 ?        SLl    0:03 aisexec
> 5450 ?        Ss     0:00 /sbin/groupd
> 5458 ?        Ss     0:00 /sbin/fenced
> 5464 ?        Ss     0:00 /sbin/dlm_controld
> 5470 ?        Ss     0:00 /sbin/gfs_controld
> ...
>
> cluster.conf
> <?xml version="1.0"?>
> <cluster name="VoIP_RCTS" config_version="8">
>
> <!-- The quorum disk solves the imbalance caused by this two-node cluster -->
> <cman two_node="0" expected_votes="3">
> </cman>
>
> <!-- Change logging from /var/log/messages to /log/cluster/cluster.log -->
> <rm log_level="6" log_facility="local4">
> </rm>
>
> <fence_daemon post_join_delay="10"/>
>
> <clusternodes>
>        <clusternode name="m07.<whatever>" votes="1" nodeid="1">
>                <fence>
>                        <method name="1">
>                                <device name="fence_bladecenter-VoIP_RCTS-cluster" blade="7"/>
>                        </method>
>                </fence>
>        </clusternode>
>        <clusternode name="m08.<whatever>" votes="1" nodeid="2">
>                <fence>
>                        <method name="1">
>                                <device name="fence_bladecenter-VoIP_RCTS-cluster" blade="8"/>
>                        </method>
>                </fence>
>        </clusternode>
> </clusternodes>
>
> <fencedevices>
>        <fencedevice name="fence_bladecenter-VoIP_RCTS-cluster" agent="fence_bladecenter" ipaddr="192.168.0.1" login="<login>" password="<password>"/>
> </fencedevices>
>
> <!-- Specify here the shared quorum disk -->
> <quorumd label="QUORUM-VoIP" votes="1"/>
> </cluster>
>
>
>
>
> Pedro Bandim Faustino
> email/sip: pedro.faustino at fccn.pt
>
> FCCN - Fundação para a Computação Científica Nacional
> Av. do Brasil, n.º 101
> 1700-066 Lisboa
> Tel: +351 21 844 0100
> Fax: +351 21 847 2167
> www.fccn.pt
>
> Pedro Bandim Faustino wrote:
>> Hi All,
>>
>> I have a running cluster (v2.01.00) with two Fedora 7 nodes. While 
>> testing, I disabled all the NICs on one node and started observing 
>> these messages on the other node:
>>
>> Dec  7 13:05:55 m07 openais[4233]: [TOTEM] The consensus timeout expired.
>> Dec  7 13:05:55 m07 openais[4233]: [TOTEM] entering GATHER state from 3.
>> Dec  7 13:06:10 m07 openais[4233]: [TOTEM] The consensus timeout expired.
>> Dec  7 13:06:10 m07 openais[4233]: [TOTEM] entering GATHER state from 3.
>> Dec  7 13:06:25 m07 openais[4233]: [TOTEM] The consensus timeout expired.
>> Dec  7 13:06:25 m07 openais[4233]: [TOTEM] entering GATHER state from 3.
>>
>> When I re-enabled the NICs and the network was restored, the same 
>> messages kept appearing, now on both nodes.
>> I've searched but couldn't find an explanation.
>>
>> Thanks for your help,
>>
>> ------------------------------------------------------------------------
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
