[Linux-cluster] [TOTEM] The token was lost in the OPERATIONAL state: explanation?
Jos Vos
jos at xos.nl
Sat Nov 10 22:05:40 UTC 2007
Hi,
In a two-node cluster, a few times per day one of the nodes (not always
the same one) reboots because it is fenced by the other node. The log
on the fencing node starts with:
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] The token was lost in the OPERATIONAL state.
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering GATHER state from 0.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Creating commit token because I am the rep.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Saving state aru 32fc3 high seq received 32fc3
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering COMMIT state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering RECOVERY state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] position [0] member <ip-addr-of-node-2>:
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] previous ring seq 56 rep <ip-addr-of-node-1>
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] aru 32fc3 high delivered 32fc3 received flag 0
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Did not need to originate any messages in recovery.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Storing new sequence id for ring 3c
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Sending initial ORF token
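To compare incidents like the one above, it can help to measure how long
the ring took to re-form after the token loss. The following is only a
minimal sketch: the function names (totem_events, recovery_seconds) are
hypothetical, the year is an assumption because syslog timestamps omit it,
and the regular expression is fitted to the line format shown in this
excerpt, not to openais logging in general.

```python
import re
from datetime import datetime

# Matches syslog lines shaped like the excerpt above, e.g.
# "Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"openais\[\d+\]:\s+\[TOTEM\]\s+(?P<msg>.*)$"
)


def totem_events(lines, year=2007):
    """Return (timestamp, host, message) tuples for [TOTEM] lines.

    The year is an assumption supplied by the caller; syslog does not log it.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a [TOTEM] line in the expected format
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        events.append((ts, m.group("host"), m.group("msg")))
    return events


def recovery_seconds(events):
    """Seconds from the first token loss to the next initial ORF token."""
    lost = None
    for ts, _host, msg in events:
        if lost is None and "token was lost" in msg:
            lost = ts
        elif lost is not None and "Sending initial ORF token" in msg:
            return (ts - lost).total_seconds()
    return None  # no complete loss/recovery pair found


# A shortened copy of the excerpt above, used as sample input.
SAMPLE = """\
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] The token was lost in the OPERATIONAL state.
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering GATHER state from 0.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Sending initial ORF token
""".splitlines()

if __name__ == "__main__":
    print(recovery_seconds(totem_events(SAMPLE)))
```

Running the same functions over the full log of each incident would show
whether the gap between the token loss and the new ring is the same every
time.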
On the fenced node, in most cases nothing is logged before the reboot.
A few times, a "fatal: filesystem consistency error" was reported on
the fenced node just before the reboot.
Should I assume that the cases where nothing is logged are also caused by
a filesystem error, and that the log simply was not written to disk before
the node was fenced?
Thanks,
--
-- Jos Vos <jos at xos.nl>
-- X/OS Experts in Open Systems BV | Phone: +31 20 6938364
-- Amsterdam, The Netherlands | Fax: +31 20 6948204