[Linux-cluster] Nodes leaving and re-joining intermittently
Matthew Painter
matthew.painter at kusiri.com
Sat Dec 10 20:32:05 UTC 2011
Hi all,
We are trying to get to the bottom of some odd intermittent behavior on a
cluster. Nodes occasionally leave and rejoin the cluster without being
fenced. Further, the gap between leaving and rejoining is about 8 minutes.
We are monitoring the latency between boxes, and it is acceptable (<5ms).
How can nodes exhibit this behavior? There seems to be no impact on the
services running on the box, just this leaving and re-joining. The SNMP
messages are below.
All help decoding this gratefully received! :)
Thanks,
Matt
Sat Dec 10 15:22:00 GMT 2011: cluster3.localdomain
DISMAN-EVENT-MIB::sysUpTimeInstance = 3:2:52:23.35,
SNMPv2-MIB::snmpTrapOID.0 = COROSYNC-MIB::corosyncNoticesNodeStatus,
COROSYNC-MIB::corosyncObjectsNodeName.0 = "cluster1.localdomain",
COROSYNC-MIB::corosyncObjectsNodeID.0 = 1,
COROSYNC-MIB::corosyncObjectsNodeAddress.0 = "10.79.202.1",
COROSYNC-MIB::corosyncObjectsNodeStatus.0 = "left"

Sat Dec 10 15:30:25 GMT 2011: cluster3.localdomain
DISMAN-EVENT-MIB::sysUpTimeInstance = 3:3:00:48.75,
SNMPv2-MIB::snmpTrapOID.0 = COROSYNC-MIB::corosyncNoticesNodeStatus,
COROSYNC-MIB::corosyncObjectsNodeName.0 = "cluster1.localdomain",
COROSYNC-MIB::corosyncObjectsNodeID.0 = 1,
COROSYNC-MIB::corosyncObjectsNodeAddress.0 = "10.79.202.1",
COROSYNC-MIB::corosyncObjectsNodeStatus.0 = "joined"
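
In case it helps anyone decode these, here is a minimal Python sketch that parses the two traps above and works out the gap between "left" and "joined". It assumes each trap has been collapsed onto a single line, and that sysUpTimeInstance is printed as days:hours:minutes:seconds (that reading is an assumption, not something confirmed here). The names TRAPS and parse_trap are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# The two traps from this message, each joined onto one line.
TRAPS = [
    'Sat Dec 10 15:22:00 GMT 2011: cluster3.localdomain '
    'DISMAN-EVENT-MIB::sysUpTimeInstance = 3:2:52:23.35, '
    'SNMPv2-MIB::snmpTrapOID.0 = COROSYNC-MIB::corosyncNoticesNodeStatus, '
    'COROSYNC-MIB::corosyncObjectsNodeName.0 = "cluster1.localdomain", '
    'COROSYNC-MIB::corosyncObjectsNodeID.0 = 1, '
    'COROSYNC-MIB::corosyncObjectsNodeAddress.0 = "10.79.202.1", '
    'COROSYNC-MIB::corosyncObjectsNodeStatus.0 = "left"',
    'Sat Dec 10 15:30:25 GMT 2011: cluster3.localdomain '
    'DISMAN-EVENT-MIB::sysUpTimeInstance = 3:3:00:48.75, '
    'SNMPv2-MIB::snmpTrapOID.0 = COROSYNC-MIB::corosyncNoticesNodeStatus, '
    'COROSYNC-MIB::corosyncObjectsNodeName.0 = "cluster1.localdomain", '
    'COROSYNC-MIB::corosyncObjectsNodeID.0 = 1, '
    'COROSYNC-MIB::corosyncObjectsNodeAddress.0 = "10.79.202.1", '
    'COROSYNC-MIB::corosyncObjectsNodeStatus.0 = "joined"',
]


def parse_trap(line):
    """Return (wall-clock time, reporter uptime, corosync fields) for one trap."""
    # Leading timestamp and the host that sent the trap.
    stamp = re.match(r"(\w{3} \w{3} +\d+ [\d:]{8}) GMT (\d{4}): (\S+)", line)
    when = datetime.strptime(
        f"{stamp.group(1)} {stamp.group(2)}", "%a %b %d %H:%M:%S %Y"
    )
    # Assumed layout of sysUpTimeInstance: days:hours:minutes:seconds.
    d, h, m, s = re.search(
        r"sysUpTimeInstance = (\d+):(\d+):(\d+):([\d.]+)", line
    ).groups()
    uptime = timedelta(days=int(d), hours=int(h), minutes=int(m), seconds=float(s))
    # NodeName, NodeID, NodeAddress, NodeStatus, with quotes stripped.
    fields = dict(re.findall(r'corosyncObjects(\w+)\.0 = "?([^",]+)"?', line))
    fields["Reporter"] = stamp.group(3)
    return when, uptime, fields


left, joined = (parse_trap(t) for t in TRAPS)
wall_gap = joined[0] - left[0]      # gap by the trap timestamps
uptime_gap = joined[1] - left[1]    # gap by the reporting node's uptime counter

print(left[2]["NodeName"], left[2]["NodeStatus"], "->", joined[2]["NodeStatus"])
print("wall-clock gap:", wall_gap)
print("reporter uptime gap:", uptime_gap)
```

If the uptime reading is right, the two gaps agree to within half a second, which would suggest the reporting node (cluster3) stayed up for the whole interval and simply saw cluster1 drop out of membership and come back.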