[Linux-cluster] totem token & post_fail_delay question

emmanuel segura emi2fast at gmail.com
Tue Aug 26 08:11:39 UTC 2014


From man fenced:

Post-fail delay is the number of seconds the daemon will wait before
fencing any victims after a domain member fails.

It is used to delay the fence action.
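
To make the timing concrete, here is a rough sketch in Python. It is only a simplified model, not how fenced or openais is implemented: it assumes a member is declared failed once the totem token timeout expires, and that fenced then waits post_fail_delay seconds before it starts fencing. It ignores the other totem timers (consensus, join, and so on). The function names and the example values are made up for illustration.

```python
def earliest_fence_time(token_ms: int, post_fail_delay_s: int) -> float:
    """Seconds from the start of an outage until fencing may begin.

    Simplified model (assumption): token timeout, then post_fail_delay.
    """
    return token_ms / 1000 + post_fail_delay_s


def state_at(t_s: float, token_ms: int, post_fail_delay_s: int) -> str:
    """Status of the silent node t_s seconds after it stopped responding."""
    if t_s < token_ms / 1000:
        # Token timeout not reached yet: no membership change.
        return "still a member"
    if t_s < earliest_fence_time(token_ms, post_fail_delay_s):
        # Declared failed, but fenced is still waiting out post_fail_delay.
        return "failed, fencing delayed"
    return "failed, fencing may start"


if __name__ == "__main__":
    # Illustrative values only: token = 10000 ms, post_fail_delay = 20 s.
    for t in (5, 15, 40):
        print(t, state_at(t, 10000, 20))
```

In this model, only the token timeout decides whether a short interruption is noticed at all; post_fail_delay only postpones the fence action for a member that has already been declared failed.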

2014-08-26 8:56 GMT+02:00 Vasil Valchev <vasil.val at gmail.com>:
> Hello,
>
> I have a cluster that sometimes has intermittent network issues on the
> heartbeat network.
> Unfortunately improving the network is not an option, so I am looking for a
> way to tolerate longer interruptions.
>
> Previously it seemed to me the post_fail_delay option is suitable, but after
> some research it might not be what I am looking for.
>
> If I am correct, when a member leaves (due to token timeout) the cluster
> will wait the post_fail_delay before fencing. If the member rejoins before
> that, it will still be fenced, because it has previous state?
> From a recent fencing on this cluster there is a strange message:
>
> Aug 24 06:20:45 node2 openais[29048]: [MAIN ] Not killing node node1cl
> despite it rejoining the cluster with existing state, it has a lower node ID
>
> What does this mean?
>
> And lastly is increasing the totem token timeout the way to go?
>
>
> Thanks,
> Vasil Valchev
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



-- 
this is my life and I live it for as long as God wills



