[Linux-cluster] Re: [RFC] DRBD + GFS cookbook (Lon Hohberger)

Lon Hohberger lhh at redhat.com
Tue Dec 11 19:31:38 UTC 2007


On Tue, 2007-12-11 at 14:12 +0530, Koustubha Kale wrote:

> Does it work as expected? There seem to be two problems to me...
> 
> a) when we use something like.. (from your cookbook)
> disk {
>         fencing resource-and-stonith;
> }
> handlers {
>         outdate-peer "/sbin/obliterate"; # We'll get back to this.
> }

> when this handler gets called, both nodes will try to fence each other. Is that the intended effect?

Yes, in a network partition of a two-node cluster, both nodes will race
to fence.  One wins, the other dies. ;)
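
For illustration, a bare-bones two-node cluster.conf for this kind of setup
looks roughly like the following (node names, addresses, and the fence agent
are placeholders, not the cookbook's actual values):

    <?xml version="1.0"?>
    <cluster name="drbd-gfs" config_version="1">
        <cman two_node="1" expected_votes="1"/>
        <clusternodes>
            <clusternode name="node1.example.com" nodeid="1">
                <fence>
                    <method name="1">
                        <device name="ipmi-node1"/>
                    </method>
                </fence>
            </clusternode>
            <clusternode name="node2.example.com" nodeid="2">
                <fence>
                    <method name="1">
                        <device name="ipmi-node2"/>
                    </method>
                </fence>
            </clusternode>
        </clusternodes>
        <fencedevices>
            <fencedevice name="ipmi-node1" agent="fence_ipmilan"
                         ipaddr="192.168.1.101" login="admin" passwd="secret"/>
            <fencedevice name="ipmi-node2" agent="fence_ipmilan"
                         ipaddr="192.168.1.102" login="admin" passwd="secret"/>
        </fencedevices>
    </cluster>

The two_node="1" / expected_votes="1" pair is what lets either node stay
quorate by itself, and a working power fence on both sides is what makes the
race resolve cleanly.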


> b) If we try to do ssh <host> -c "drbdadm outdate all", GFS is still mounted
> on top of DRBD and DRBD is still primary, so the command has no effect and
> the split brain continues. I have seen this.

... but with resource-and-stonith, drbd freezes I/O until the outdate-peer
script returns a 4 (peer outdated) or 7 (peer stonithed).  If it doesn't
return, I/O stays frozen.
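
Roughly, such a handler has the shape of the sketch below.  This is just an
illustration of the idea, not the cookbook's actual /sbin/obliterate, and the
peer name is a placeholder (each node's copy would name the other node):

    #!/bin/sh
    # Called by DRBD while I/O on the resource is frozen
    # (fencing resource-and-stonith).  Exit codes DRBD understands include:
    #   7 - the peer has been stonithed, safe to resume I/O
    #   4 - the peer's data has been marked Outdated
    # Anything else (or never returning) leaves I/O frozen.

    PEER=node2.example.com      # placeholder: the other DRBD node

    # Ask the cluster's fencing subsystem to power-fence the peer.
    if fence_node "$PEER"; then
            exit 7
    fi

    # Could try "ssh $PEER drbdadm outdate all" and exit 4 here, but as you
    # noted, that does nothing while the peer is still Primary.  Keep I/O
    # frozen instead of risking a split brain.
    exit 1

The only thing that releases the frozen I/O is this script vouching that the
peer has been dealt with.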


> >I don't fully understand.  You want to start a service on a node which
> >doesn't have access to DRBD - but the service depends on DRBD?
> 
> We are using a three-server-node cluster. Two of the server nodes act as the
> shared storage in Active-Active DRBD. The third server node mounts the GFS
> volumes through a managed NFS service. All three cluster nodes act as servers
> for diskless nodes (XDMCP through LVS --> Direct Routing method). The
> diskless nodes are not part of the RHCS cluster. They are thin clients for
> students.
> What I was wondering about is whether there is a way to switch over a user's
> session in the event of a server cluster node crashing. It won't have to
> depend on DRBD, as the other server node will still be active as DRBD
> primary; the third server will also continue working, with NFS failing over
> to the remaining DRBD machine.

Fail over an xdmcp session?  I think xdm/gdm/etc. were not designed to
handle that sort of a failure case.  It sounds like a cool idea, but I
would not even know where to begin to make that work.
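
The NFS half of it is easier ground, though.  A managed NFS service like the
one you describe can be modelled in rgmanager roughly along these lines
(service name, device, mount point, addresses, and export options are
placeholders for whatever your setup actually uses):

    <rm>
        <service name="nfs-gfs" autostart="1">
            <ip address="192.168.1.200" monitor_link="1"/>
            <clusterfs name="gfsdata" device="/dev/drbd0"
                       mountpoint="/mnt/gfs" fstype="gfs"
                       force_unmount="0">
                <nfsexport name="gfs-exports">
                    <nfsclient name="thinclients"
                               target="192.168.1.0/24"
                               options="rw,no_root_squash"/>
                </nfsexport>
            </clusterfs>
        </service>
    </rm>

rgmanager will move the export and its floating IP to the surviving DRBD node
if the one currently serving it dies, which is the failover you described for
the third machine.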

-- Lon



