> And make sure the daemons are running in the real-time scheduler class.
> There is a state, commonly referred to as "split brain", where each
> node of a cluster "thinks" the other one is down when it is not.
> One cause is load so high that the heartbeat daemon is not scheduled
> in time to answer the requests (it happened to me with a commercial
> product). Then both nodes mount the filesystem. Usually the initial
> fsck (or log replay or whatever) is enough to destroy the filesystem
> beyond repair. But none of this is LVM specific, so it works.

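Or don't, and buy a power supply that you can control over a serial
line, and use it to cut the other node's power before taking over --
STONITH, it's called -- Shoot The Other Node In The Head.  Once the
other node's power is off, there is no danger of both nodes fsck'ing
the same filesystem...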

It's a rather elegant way to solve that problem, IMHO.

We're going that route.
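
For what it's worth, the ordering that makes STONITH safe can be
sketched roughly as below. This is only an illustration, not what any
particular cluster product does: take_over, power_off_peer,
mount_shared_fs and FencingFailed are made-up names, and the two
callables stand in for whatever actually talks to the serial power
switch and mounts the volume.

```python
from typing import Callable


class FencingFailed(Exception):
    """Raised when the peer could not be confirmed powered off."""


def take_over(power_off_peer: Callable[[], bool],
              mount_shared_fs: Callable[[], None]) -> None:
    # Hypothetical takeover routine: fence first, mount second.
    # power_off_peer is assumed to return True only when the power
    # switch confirms that the other node is really off.
    if not power_off_peer():
        # No confirmation means the peer may still be alive with the
        # filesystem mounted, so refuse to touch the shared storage.
        raise FencingFailed("peer not confirmed off; refusing to mount")
    # Only now is it safe to fsck / replay the log and mount.
    mount_shared_fs()
```

The point of the design is that a missed heartbeat alone never leads
to a mount; only a confirmed power-off does.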

