[Linux-cluster] "Missed too many heartbeats" messages and hung cluster
Fabrizio Lippolis
Fabrizio.Lippolis at AurigaInformatica.it
Tue Jun 27 13:35:35 UTC 2006
Patrick Caulfield wrote:
>> Jun 23 23:37:17 AICLSRV02 kernel: CMAN: removing node AICLSRV01 from the
>> cluster : Missed too many heartbeats
>
>
> That message means that the heartbeat messages are getting lost somehow,
> either through an unreliable network link or something else odd happening on
> the machine to prevent the heartbeat packets reaching the network.
This is very strange since the two machines are connected by a gigabit
crossover cable and no other device is in the middle. Also, no firewall
rules are configured on any machine.
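To make the quoted explanation concrete, here is a minimal sketch of how a missed-heartbeat counter could work in general. It is not CMAN's actual code: the class name, the 5-second interval, and the threshold of 5 missed beats are assumptions chosen only for illustration.

```python
class HeartbeatMonitor:
    """Toy model of missed-heartbeat detection (hypothetical, not CMAN's logic)."""

    def __init__(self, interval=5.0, max_missed=5):
        self.interval = interval      # assumed seconds between heartbeats
        self.max_missed = max_missed  # assumed number of tolerated misses
        self.last_seen = {}           # node name -> time of last heartbeat

    def record(self, node, now):
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def missed(self, node, now):
        """Count whole heartbeat intervals elapsed since the last heartbeat."""
        return int((now - self.last_seen[node]) // self.interval)

    def dead_nodes(self, now):
        """Return nodes that have missed more heartbeats than tolerated."""
        return [n for n in self.last_seen if self.missed(n, now) > self.max_missed]


monitor = HeartbeatMonitor()
monitor.record("AICLSRV01", 0.0)
print(monitor.dead_nodes(12.0))  # two intervals missed, still a member
print(monitor.dead_nodes(31.0))  # six intervals missed, would be removed
```

In a model like this, a node is removed whenever its packets stop arriving for long enough, whatever the cause, so a good cable alone does not rule out the sender being stalled.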
By the way, I am currently using the manual fencing method, but it isn't
very helpful, and I would like to switch to a method that ensures a
reliable service. Does that mean I have to buy a device that sits
between the machines and connects to their network and power cables? I am
rather new to this, so any suggestion is welcome.
--
Fabrizio Lippolis fabrizio.lippolis at aurigainformatica.it
Auriga Informatica s.r.l. Via Don Guanella 15/B - 70124 Bari
Tel.: 080/5025414 - Fax: 080/5027448 - http://www.aurigainformatica.it/