R: R: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster
Leandro Dardini
l.dardini at comune.prato.it
Tue Jun 27 10:04:36 UTC 2006
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On behalf of
> Fabrizio Lippolis
> Sent: Tuesday, 27 June 2006 11:52
> To: linux clustering
> Subject: Re: R: [Linux-cluster] "Missed too many heartbeats"
> messages and hung cluster
>
> Leandro Dardini wrote:
>
> > If something happens between the two machines, they fence each other.
>
> I have configured manual fencing, but as I wrote it is not very
> useful since, I think, it requires manual intervention, which
> may not be possible immediately. Therefore I am looking for
> a way to keep the services running even if such a thing
> happens. This is not the first time the problem has arisen,
> apparently without a reason, though the last time was
> a long time ago.
>
> > You can have the machines "ping" each other and check the
> > connectivity state when the problem arises.
>
> Sometimes the machines are completely locked and it's not
> even possible to log in. A forced power-off is
> necessary in this case. Sometimes it looks like only the cluster
> service is locked: I can ping the other machine normally
> even though the cluster is not working.
This is really bad. It smells like a hardware problem or a buggy kernel driver. Try stress testing each machine individually, without cluster support. I usually start with memtest from a Knoppix CD and then build a kernel to stress the CPU. Also try transferring a huge chunk of data between the machines to test the LAN.
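
For the LAN part, something along these lines could serve as a starting point. It is only a rough sketch in Python, not a tool taken from this thread: the function names (receive, send, loopback_selftest) and the 1 MiB chunk size are made up for illustration. One end streams random data over TCP, the other end hashes whatever arrives, and the two SHA-256 digests are compared, so corruption or a stalled link shows up without the cluster software being involved.

```python
import hashlib
import os
import socket
import threading
import time

CHUNK = 1 << 20  # 1 MiB per send; an arbitrary choice for this sketch


def receive(server_sock, result):
    """Accept one connection, read until EOF, record byte count and SHA-256."""
    conn, _addr = server_sock.accept()
    digest = hashlib.sha256()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # peer closed the connection
                break
            digest.update(data)
            total += len(data)
    result["bytes"] = total
    result["sha256"] = digest.hexdigest()


def send(host, port, total_bytes):
    """Send total_bytes of random data; return (sha256 hex, elapsed seconds)."""
    digest = hashlib.sha256()
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        remaining = total_bytes
        while remaining > 0:
            block = os.urandom(min(CHUNK, remaining))
            digest.update(block)
            sock.sendall(block)
            remaining -= len(block)
    return digest.hexdigest(), time.monotonic() - start


def loopback_selftest(total_bytes=8 * CHUNK):
    """Run both ends on 127.0.0.1; this checks the script, not the LAN."""
    result = {}
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=receive, args=(server, result))
        worker.start()
        sent_hash, _elapsed = send("127.0.0.1", port, total_bytes)
        worker.join()
    return result["bytes"] == total_bytes and result["sha256"] == sent_hash


if __name__ == "__main__":
    print("loopback self-test passed:", loopback_selftest())
```

For a real test, receive() would be started on one node with a listening socket bound to its address on the crossover link, and send() called from the other node with that address and a much larger byte count. That wiring is left out here. If the digests differ, or the transfer hangs, that points at the NIC, driver or cable rather than at the cluster software.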
Leandro
>
> > Maybe an overly "intelligent" switch is handling the
> > traffic and applying some sort of "traffic shaping and control".
>
> There is nothing like that: the two machines are connected by
> a 1 Gbit crossover cable, not even a long one, supplied by HP with
> the two machines.
>
> --
> Fabrizio Lippolis
> fabrizio.lippolis at aurigainformatica.it
> Auriga Informatica s.r.l. Via Don Guanella 15/B -
> 70124 Bari
> Tel.: 080/5025414 - Fax: 080/5027448 -
> http://www.aurigainformatica.it/
>