[Linux-cluster] reasons for sporadic token loss?

GouNiNi gounini.geekarea at gmail.com
Wed Aug 1 14:32:17 UTC 2012


Hello,

My answers are just gut feelings; I have too little experience with RHCS 6.
A. Your loss is of the token itself; the consensus timeout only comes into play after the token has been lost.
B. Maybe your problem is in the network, not in the cluster tuning.
C. No idea
D. I think it doesn't. Totem multicast uses the network whose address matches the resolved node name (see the quick check below).
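
For D you can verify it yourself; something along these lines should show which address totem is bound to and how the node names from your cluster.conf resolve:

    corosync-cfgtool -s                            # ring status and the address totem is bound to
    corosync-objctl -a | grep totem.interface      # bindnetaddr / mcastaddr in use
    getent hosts df1-clusterlink df2-clusterlink   # name resolution cman relies on

If the names resolve to the 172.16.42.x addresses from your objctl dump, the cluster traffic stays on the cross connected link and not on the switches.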

For testing, I think you should run your interconnect link over one single interface, without bonding. If your problem disappears, your bond mode 5 is the culprit.
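
Before removing the bond you can also check what the bonding driver and the NICs report; a quick sketch, assuming bond1 is the em1/em2 interconnect:

    cat /proc/net/bonding/bond1   # bonding mode, active slaves, link failure counters
    ethtool em1                   # link state, speed, duplex of each slave
    ethtool em2
    ip -s link show bond1         # RX/TX errors and drops on the bond

It is also worth keeping an eye on runtime.totem.pg.mrp.srp.mcast_retx in the corosync-objctl output: it is 0 in your dump, but if it starts climbing, packets are getting lost on the interconnect.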

Regards,

-- 
  .`'`.   GouNiNi
 :  ': :  
 `. ` .`  GNU/Linux
   `'`    http://www.geekarea.fr


----- Mail original -----
> De: "Heiko Nardmann" <heiko.nardmann at itechnical.de>
> À: linux-cluster at redhat.com
> Envoyé: Mardi 31 Juillet 2012 15:57:33
> Objet: [Linux-cluster] reasons for sporadic token loss?
> 
> Hi together!
> 
> I am experiencing sporadic problems with my cluster setup. Maybe
> someone
> has an idea? But first some facts:
> 
> Type: RHEL 6.1 two node cluster (corosync 1.2.3-36) on two Dell R610
> each with a quad port NIC
> 
> NICs:
> - interfaces em1/em2 are bonded using mode 5; these interfaces are
> cross
> connected (intended to be used for the cluster housekeeping
> communication) - no network element in between
> - interfaces em3/em4 are bonded using mode 1; these interfaces are
> connected to two switches
> 
> Cluster configuration:
> 
> <?xml version="1.0"?>
> <cluster config_version="51" name="my-cluster">
>      <cman expected_votes="1" two_node="1"/>
>      <clusternodes>
>          <clusternode name="df1-clusterlink" nodeid="1">
>              <fence>
>                  <method name="VBoxManage-DF-1">
>                      <device name="VBoxManage-DF-1" />
>                  </method>
>              </fence>
>              <unfence>
>              </unfence>
>          </clusternode>
>          <clusternode name="df2-clusterlink" nodeid="2">
>              <fence>
>                  <method name="VBoxManage-DF-2">
>                      <device name="VBoxManage-DF-2" />
>                  </method>
> 
>              </fence>
>              <unfence>
>              </unfence>
>          </clusternode>
>      </clusternodes>
>      <fencedevices>
>          <fencedevice name="VBoxManage-DF-1" agent="fence_vbox"
> vboxhost="vboxhost.private" login="test" vmname="RHEL 6.1 x86_64
> DF-System Server 1" />
>          <fencedevice name="VBoxManage-DF-2" agent="fence_vbox"
> vboxhost="vboxhost.private" login="test" vmname="RHEL 6.1 x86_64
> DF-System Server 2" />
>      </fencedevices>
>      <rm>
>          <resources>
>              <ip address="10.200.104.15/27" monitor_link="on"
> sleeptime="10"/>
>              <script file="/usr/share/cluster/app.sh" name="myapp"/>
>          </resources>
>          <failoverdomains>
>              <failoverdomain name="fod-myapp" nofailback="0"
>              ordered="1"
> restricted="0">
>                  <failoverdomainnode name="df1-clusterlink"
>                  priority="1"/>
>                  <failoverdomainnode name="df2-clusterlink"
>                  priority="2"/>
>              </failoverdomain>
>          </failoverdomains>
>          <service domain="fod-myapp" exclusive="1" max_restarts="3"
> name="rg-myapp" recovery="restart" restart_expire_time="1">
>              <script ref="myapp"/>
>              <ip ref="10.200.104.15/27"/>
>          </service>
>      </rm>
>      <logging debug="on"/>
>      <gfs_controld enable_plock="0" plock_rate_limit="0"/>
>      <dlm enable_plock="0" plock_ownership="1" plock_rate_limit="0"/>
> </cluster>
> 
> 
> --------------------------------------------------------------------------------
> 
> Problem:
> Sometimes the second node "detects" that the token has been lost
> (corosync.log):
> 
> [no TOTEM messages before that]
> Jul 28 13:00:10 corosync [TOTEM ] The token was lost in the
> OPERATIONAL
> state.
> Jul 28 13:00:10 corosync [TOTEM ] A processor failed, forming new
> configuration.
> Jul 28 13:00:10 corosync [TOTEM ] Receive multicast socket recv
> buffer
> size (262142 bytes).
> Jul 28 13:00:10 corosync [TOTEM ] Transmit multicast socket send
> buffer
> size (262142 bytes).
> 
> This happens, let's say, once a week. This leads to fencing of the
> first node. What I see from 'corosync-objctl -a' is that this is
> maybe due to a consensus timeout (an excerpt from the command's
> output follows); I have marked the lines which I so far consider
> important:
> 
> totem.transport=udp
> totem.version=2
> totem.nodeid=2
> totem.vsftype=none
> totem.token=10000
> totem.join=60
> totem.fail_recv_const=2500
> totem.consensus=2000
> totem.rrp_mode=none
> totem.secauth=1
> totem.key=my-cluster
> totem.interface.ringnumber=0
> totem.interface.bindnetaddr=172.16.42.2
> totem.interface.mcastaddr=239.192.187.168
> totem.interface.mcastport=5405
> runtime.totem.pg.mrp.srp.orf_token_tx=3
> runtime.totem.pg.mrp.srp.orf_token_rx=1103226
> runtime.totem.pg.mrp.srp.memb_merge_detect_tx=395
> runtime.totem.pg.mrp.srp.memb_merge_detect_rx=1098359
> runtime.totem.pg.mrp.srp.memb_join_tx=38
> runtime.totem.pg.mrp.srp.memb_join_rx=50
> runtime.totem.pg.mrp.srp.mcast_tx=218
> runtime.totem.pg.mrp.srp.mcast_retx=0
> runtime.totem.pg.mrp.srp.mcast_rx=541
> runtime.totem.pg.mrp.srp.memb_commit_token_tx=12
> runtime.totem.pg.mrp.srp.memb_commit_token_rx=18
> runtime.totem.pg.mrp.srp.token_hold_cancel_tx=49
> runtime.totem.pg.mrp.srp.token_hold_cancel_rx=173
> runtime.totem.pg.mrp.srp.operational_entered=6
> runtime.totem.pg.mrp.srp.operational_token_lost=1
> ^^^
> runtime.totem.pg.mrp.srp.gather_entered=7
> runtime.totem.pg.mrp.srp.gather_token_lost=0
> runtime.totem.pg.mrp.srp.commit_entered=6
> runtime.totem.pg.mrp.srp.commit_token_lost=0
> runtime.totem.pg.mrp.srp.recovery_entered=6
> runtime.totem.pg.mrp.srp.recovery_token_lost=0
> runtime.totem.pg.mrp.srp.consensus_timeouts=1
> ^^^
> runtime.totem.pg.mrp.srp.mtt_rx_token=1727
> runtime.totem.pg.mrp.srp.avg_token_workload=62244458
> runtime.totem.pg.mrp.srp.avg_backlog_calc=0
> runtime.totem.pg.mrp.srp.rx_msg_dropped=0
> runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(172.16.42.2)
> runtime.totem.pg.mrp.srp.members.2.join_count=1
> runtime.totem.pg.mrp.srp.members.2.status=joined
> runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(172.16.42.1)
> runtime.totem.pg.mrp.srp.members.1.join_count=3
> runtime.totem.pg.mrp.srp.members.1.status=joined
> runtime.blackbox.dump_flight_data=no
> runtime.blackbox.dump_state=no
> 
> Some questions at this point:
> A) why did the cluster lose the token? due to timeout? token (10000)
> or
> consensus (2000)?
> B) why has the timeout elapsed? maybe that is connected with the
> answer to A ... ?
> C) is it normal that 'token=10000' and 'consensus=2000' although the
> documentation says that the default is 'token=1000' and
> 'consensus=1.2*token'?
> D) since I suspect problems concerning the switches connecting the
> other
> interfaces (em3/em4 bonded to bond0) of those machines I wonder
> whether
> any traffic goes that way and not via bond1?
> 
> As I already stated: the connection of em1/em2 is a direct one,
> without any network element.
> 
> So far I want to add the following line to cluster.conf and see
> whether
> the situation improves:
> 
>      <totem token_retransmits_before_loss_const="10"
> fail_recv_const="100" consensus="12000"/>
> 
> Any comment concerning that?
> 
> While googling for reasons I have seen that it can also be a problem
> if both nodes are not synchronized in time; but in my case the ntpd
> on both nodes uses two stratum 2 NTP servers. I also cannot detect
> anything unusual, e.g. a jump of multiple seconds inside the log
> files, although I have to admit that so far the ntpd does not run
> with debug enabled.
> 
> 
> Thanks in advance for any hint or comment!
> 
> 
> Kind regards,
> 
>      Heiko
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 



