[Linux-cluster] DRBD with GFS applicable for this scenario?

Thu Jan 28 05:25:23 UTC 2010

Zaeem Arshad wrote:
> Hi List,
> 
> We have 2 geographically distant sites located approximately 35km
> apart with dark fiber connectivity available between them. Mail01 and
> SAN1 is placed at site A while Mail02 and SAN2 is at site B. Our
> requirement is to have the mail servers in a cluster configuration in
> an active/active mode. To cater for the loss of connectivity or losing
> a SAN itself, I have come up with the following design.

What's the point of having a SAN if you're using DRBD? You might as well 
have DAS in each of the two mail servers. Unless you need so much 
storage space that you can't put enough disks directly into the server...

> 1) Export 1 block device from each SAN to its mail server i.e. SAN1
> exports to Mail01
> 2) Use DRBD to configure a block device comprising of the 2 SAN
> volumes and use it as a physical volume in clvm.

The CLVM bit isn't relevant per se; you don't strictly need it, but 
it won't hurt.

> 3) Create a GFS logical volume from this PV that can be used by both servers.

That's fine.

> I am wondering if this is a correct design as theoretically it looks
> to address both node and SAN failure or connectivity loss.

The problem you have is that you have no way of enacting fencing if the 
connectivity between the sites fails. If a node fails, any cluster file 
system (GFS included) will mandate a fencing action to ensure that one 
of the nodes gets taken down and stays down. If you have lost cross-site 
connectivity, the nodes won't be able to fence each other, and GFS will 
simply block until connectivity is restored and fencing succeeds. The 
chances are that when this happens, it'll also cause a fencing shoot-out 
and both nodes may well end up getting fenced.

You could use some kind of cheat-fencing, say, by setting a firewall 
rule that will prevent the nodes from re-connecting (you'd need to write 
your own fencing agent, but that's not particularly difficult), but then 
you would be pretty much guaranteeing a split-brain situation, where the 
nodes would end up operating independently without any hope of ever 
re-synchronising.
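
As a rough illustration of how small such a cheat-fencing agent could be, here is a minimal sketch in Python. It assumes the agent is handed its options as name=value lines on stdin, the way the cluster's fence daemon normally passes them, and that the peer's address arrives under a key called "ipaddr" (that key name, and the function names, are my assumptions for the example). It only builds the iptables commands rather than running them, and it does nothing about the split-brain problem described above.

```python
def parse_fence_options(lines):
    """Parse name=value option lines, skipping blank lines and # comments."""
    options = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name.strip()] = value.strip()
    return options


def build_block_commands(peer_ip):
    """Return iptables commands that drop all traffic to and from peer_ip."""
    return [
        ["iptables", "-I", "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


def plan_fence(lines):
    """Turn the agent's option lines into the firewall commands to run.

    Returns an empty list if no peer address was supplied, which the
    caller should treat as a failed fencing attempt.
    """
    options = parse_fence_options(lines)
    peer_ip = options.get("ipaddr")  # assumed option name
    if not peer_ip:
        return []
    return build_block_commands(peer_ip)
```

A real agent would call plan_fence(sys.stdin), execute each command with subprocess.run(..., check=True), and exit 0 only if every command succeeded, since the cluster treats a zero exit status as "fencing succeeded".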

The bottom line is that you need a reliable out-of-band fencing mechanism. 
If you have GSM/wireless signal in both areas you could rig up a 
separate, small fencing "server" on each site with a GSM modem, and 
write a fencing agent that sends a fencing request by SMS. When the 
fencing server receives a fencing request, you'd have to make it issue a 
local fencing action using one of the more standard fencing agents. Note 
that in this case, due to the high latency of things like SMS, you'd need to 
implement accurate time stamping and deliberately semi-randomize the 
delay between fencing requests being sent so that you could check time 
stamps and the fencing servers could sensibly decide whether to obey the 
local fencing request or the remote one.
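
The time-stamp arbitration could look something like the sketch below. This is only one way to do it, under assumptions of mine: each fencing server sees at most one local request (its own node asking for the remote node to be fenced) and one remote request (received by SMS, asking for the local node to be fenced), the clocks on both sites are synchronised well enough to compare time stamps, and the earlier request wins, with the site name as a tie-breaker so both servers reach the same verdict. All names here are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FenceRequest:
    origin_site: str   # site whose node issued the request
    timestamp: float   # seconds since the epoch when it was issued


def jittered_delay(base_seconds, jitter_seconds, rng=random):
    """Delay before sending a request: a fixed base plus random jitter,
    so that two sites are unlikely to issue requests at the same instant."""
    return base_seconds + rng.uniform(0.0, jitter_seconds)


def winning_request(local, remote):
    """Earlier time stamp wins; ties are broken by site name so that
    both fencing servers pick the same winner."""
    return min(local, remote, key=lambda r: (r.timestamp, r.origin_site))


def should_fence_local_node(local, remote):
    """Decide whether this fencing server should fence its own node.

    local  -- request issued by the local node, or None
    remote -- request received from the other site, or None
    """
    if remote is None:
        return False          # nobody asked for the local node to be fenced
    if local is None:
        return True           # only the remote side asked, so obey it
    return winning_request(local, remote) is remote
```

With a request from site A stamped at 100.0 and one from site B stamped at 100.5, the server at A keeps its node up and the server at B fences its node, so exactly one side survives. Once the decision is to fence, the server would hand over to one of the standard fencing agents for the local action.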

You have to get a little creative about it and write a few lines of code 
to glue it together. I've been meaning to implement something like this 
for a while, but I haven't gotten around to it yet.

Gordan