[Linux-cluster] Re: [RFC] DRBD + GFS cookbook (Lon Hohberger)

Mon Dec 10 21:10:10 UTC 2007

On Sat, 2007-12-08 at 10:41 +0530, Koustubha Kale wrote:

> We are using a similar setup with a three-node cluster. The third node mounts the gfs volumes through a managed NFS service. All three cluster nodes act as servers for diskless nodes (XDMCP through LVS).
> We have observed a few issues, though.
> 1) On the drbd nodes, we have the root partition on a logical volume. Also, our drbd+gfs disks are clustered LVs, so we had to manually restart clvmd after drbd in order for the gfs volumes to become active.

:o

I think that complicates things.  CLVM expects shared storage.  DRBD
isn't shared storage; it's distributed storage acting as shared storage.

> 2) The managed NFS service refuses to fail over. I am not sure whether this is because of manual fencing. Our APC MasterSwitch is expected shortly, so we will know more about NFS failover once we have proper fencing set up.
> I would be very interested in trying this fencing through DRBD.

I don't like the idea of asking a node that has been evicted from the
cluster to "stop I/O pretty-please-with-sugar-on-top", but that's just
my opinion.

A simple outdate-peer script could be done using ssh assuming
distributed keys:

  ssh <host> "drbdadm outdate all"

It would be nicer to have an equivalent to dopd/drbd-peer-outdater
written for cman/openais, although the ssh model works independently of
the underlying cluster architecture.
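
To make the ssh idea a bit more concrete, here is a minimal sketch of what
such a helper could look like. It is an illustration under assumptions, not
a tested fencing agent: the function names are hypothetical, and the handler
exit codes (4 = peer outdated, 5 = peer unreachable) are taken from my
reading of the DRBD 8 drbd.conf documentation and should be checked against
the version in use. Any other result is reported as a plain failure so the
script never claims the peer is outdated when it does not know.

```shell
#!/bin/sh
# Hypothetical ssh-based outdate-peer helper; a sketch, not a tested fencing agent.

# Map the exit status of ssh (or of the remote drbdadm) to a handler exit code.
# Assumed from the DRBD 8 drbd.conf documentation:
#   4 = peer outdated, 5 = peer unreachable.
# Everything else is reported as a generic failure (1).
map_outdate_rc() {
    case "$1" in
        0)   echo 4 ;;  # remote "drbdadm outdate" succeeded
        255) echo 5 ;;  # ssh itself failed (ssh exits 255 on its own errors)
        *)   echo 1 ;;  # unknown result: do not claim the peer is outdated
    esac
}

# Ask the peer to outdate a resource and print the resulting handler exit code.
# $1 = peer host name, $2 = DRBD resource name (or "all").
outdate_peer() {
    # BatchMode avoids password prompts; ConnectTimeout bounds the wait.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" "drbdadm outdate $2" \
        </dev/null >/dev/null 2>&1
    map_outdate_rc "$?"
}
```

A wrapper script would then end with something like
`exit "$(outdate_peer "$PEER_HOST" all)"`, where PEER_HOST is a variable you
set yourself. Note that this only works while the peer is still reachable
and willing to cooperate, which is exactly the weakness of asking an evicted
node nicely.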

> 3) The disk I/O is very slow, almost a bottleneck. I wonder whether getting rid of the LVs and making gfs directly on the drbd device might help?

I think it will help with some things (e.g. you won't have to restart
clvmd), but I don't think it will do much for the I/O bottleneck.

> Another question, possibly OT, sorry if so: is there a way to fail over the diskless nodes to another cluster server if one cluster server goes down?

I don't fully understand.  You want to start a service on a node which
doesn't have access to DRBD - but the service depends on DRBD?

This is probably not easy to do.  It sounds like you would have to
reconfigure drbd on the remaining cluster node on the fly...

-- Lon