[Linux-cluster] Starter Cluster / GFS

Thu Nov 11 09:31:57 UTC 2010

Jankowski, Chris wrote:

> The point is that no matter what you do, your cluster cannot fix the network.
> So, fencing nodes on network failure is the last thing you want to do. You lose
> warm database caches, user sessions and incomplete transactions. Disk quorum times
> out in 10 seconds or so. A typical network meltdown due to spanning tree recalculation
> is 40 seconds.

I'd argue that if you regularly get outages of 40 seconds due to 
spanning tree rebuilds, you have bigger problems (such as too many 
machines on the same VLAN). And if you have that many nodes in a cluster 
(you do keep your cluster interfaces on a dedicated VLAN, right?), you 
are doing way better than what the claimed limits for RHCS are. :)

Gordan