[Linux-cluster] CS4 / default parameters for heart-beat

Patrick Caulfield pcaulfie at redhat.com
Thu Jan 26 10:19:36 UTC 2006


Alain Moulle wrote:
> Hi
> 
> In the Config file defaults, we can find :
> 
> #define DEFAULT_HELLO_TIMER       5     /* Period between HELLO messages */
> #define DEFAULT_DEADNODE_TIMER   21     /* If we don't get a message from a
> #define DEFAULT_MAX_RETRIES 5           /* Number of times we resend a message */
> 
> That seems to mean that the node sends a Hello message
> on heart-beat interface every 5s, waits at max 21s before
> retry and this 5 times, and if at 5th time , it has no
> response in the 21s period , it decides to kill the other node.
> Am I right ?

Not quite. The max retries doesn't apply to heartbeat messages, only to
internal messages (such as those used during transitions or by communicating
applications). So the 21s is the total time a node is allowed to go without
sending a heartbeat (not 5x21 as you implied).
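
As a rough illustration only (this is not the actual cman source), the
dead-node check described above could be sketched as below. The helper name
node_is_dead() and the last_heard timestamp are hypothetical, and whether the
real comparison is strict or inclusive is an assumption here.

```c
#include <stdbool.h>
#include <time.h>

#define DEFAULT_HELLO_TIMER      5   /* seconds between HELLO messages */
#define DEFAULT_DEADNODE_TIMER  21   /* seconds of silence tolerated from a node */

/*
 * Hypothetical helper, not taken from the cman source.
 * Returns true once nothing has been heard from a node for longer than
 * the dead-node timer. DEFAULT_MAX_RETRIES is deliberately not involved:
 * it applies to internal messages, not to heartbeats.
 * Assumption: the comparison is strict (exactly 21s of silence is still OK).
 */
static bool node_is_dead(time_t last_heard, time_t now)
{
    return (now - last_heard) > DEFAULT_DEADNODE_TIMER;
}
```

With these defaults a node can miss roughly four consecutive HELLOs (at 5s,
10s, 15s and 20s) before the 21s limit is exceeded.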

> Besides, could you explain for me the JOINREQ, JOINACK, and
> JOINCONF notions ?
> 

They are to do with the joining protocol, obviously. A new node sends a
JOINREQ message to a node, which responds with a JOINACK (which may be a NAK).
When the cluster has completed a transition to admit the node, a JOINCONF
is sent.
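
Seen from the joining node, the sequence above can be pictured as a small
state machine. This is only a sketch under stated assumptions: the message
names come from the protocol, but the state names, the nak flag and the
functions join_start() and join_on_msg() are made up for illustration and do
not reflect the real cman code or wire format.

```c
#include <stdbool.h>

/* Message names as used in the joining protocol; enum values are arbitrary. */
enum join_msg { JOINREQ, JOINACK, JOINCONF };

/* Hypothetical states of a node that is trying to join. */
enum join_state { JS_IDLE, JS_WAIT_ACK, JS_WAIT_CONF, JS_MEMBER, JS_REJECTED };

/* The new node has just sent JOINREQ and now waits for a JOINACK. */
static enum join_state join_start(void)
{
    return JS_WAIT_ACK;
}

/*
 * Advance the joining node's state on a received message.
 * 'nak' is only meaningful for JOINACK (assumption: the ACK carries an
 * accept/refuse indication, since a JOINACK "may be a NAK").
 */
static enum join_state join_on_msg(enum join_state s, enum join_msg m, bool nak)
{
    if (s == JS_WAIT_ACK && m == JOINACK)
        return nak ? JS_REJECTED : JS_WAIT_CONF;
    if (s == JS_WAIT_CONF && m == JOINCONF)
        return JS_MEMBER;   /* transition complete, node admitted */
    return s;               /* anything unexpected is ignored in this sketch */
}
```

A successful join walks JS_WAIT_ACK -> JS_WAIT_CONF -> JS_MEMBER; a JOINACK
carrying a NAK ends in JS_REJECTED instead.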

-- 

patrick



