Re: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster
Fabrizio Lippolis
Fabrizio.Lippolis at AurigaInformatica.it
Tue Jun 27 09:51:58 UTC 2006
Leandro Dardini wrote:
> If something happens between the two machines, they fence each other.
I have configured manual fencing but, as I wrote, it is not very useful
since, I think, it requires manual intervention, which may not be
possible immediately. I am therefore looking for a way to keep the
services running even when this happens. This is not the first time the
problem has arisen, apparently for no reason, though the last occurrence
was a long time ago.
> You can try to "ping" each other and check the connectivity state when the problem arises.
Sometimes the machines are completely locked up and it is not even
possible to log in; in that case a forced power-off is necessary. Other
times it looks like only the cluster service is hung: I can ping the
other machine normally even though the cluster is not working.
> Maybe a "too intelligent" switch is handling the traffic and has some sort of "traffic shaping and control".
There is nothing like that; the two machines are connected directly by a
Gigabit crossover cable, not even a very long one, supplied by HP with
the two machines.
--
Fabrizio Lippolis fabrizio.lippolis at aurigainformatica.it
Auriga Informatica s.r.l. Via Don Guanella 15/B - 70124 Bari
Tel.: 080/5025414 - Fax: 080/5027448 - http://www.aurigainformatica.it/