[Linux-cluster] STONITH
Lon Hohberger
lhh at redhat.com
Mon Oct 9 20:53:24 UTC 2006
On Fri, 2006-10-06 at 12:10 +0100, Grant Waters wrote:
> Power cycling both nodes and the array fixes the problem, but I
> want to know what's causing it in the first place. It doesn't appear
> to be related to load, although I can't rule that out - both outages
> were at approx 04:40 on a Friday.
The tg3 link mysteriously disappearing/reappearing looks like the
culprit. clumanager doesn't control those kinds of things...
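
To check whether the link drops line up with the outages, something
like the following may help. It is only a sketch: the function name is
made up for illustration, /var/log/messages is an assumed log location,
and the exact wording of the driver's link messages varies by kernel
version, so the pattern only matches loosely on "tg3" and "link".

```shell
#!/bin/sh
# Sketch: list tg3 link up/down messages from a syslog file.
# Assumptions (not from the original post): the log path and the
# loose "tg3" + "link" match; adjust both for your system.
tg3_link_events() {
    # $1: syslog file to scan (defaults to /var/log/messages)
    grep -i 'tg3' "${1:-/var/log/messages}" | grep -i 'link'
}
```

Running tg3_link_events on each node and comparing the timestamps with
the outage times (approx 04:40 in the report above) should show whether
the link actually went away before the cluster reacted.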
(a) Raise the failover interval to 30 seconds. If it's just a flaky
card/driver/cable/etc., this buys more time.
(b) cludb -p clumembd%rtp 10
Use this if you think it's a scheduling problem.
(c) cludb -p cluster%msgsvc_noarp 1
Gets rid of "SIOCGARP..." errors.
(d) cludb -p clulockd%loglevel 4
Because clulockd @ debug level is a waste of resources.
-- Lon