[Linux-cluster] Interfacing csnap to cluster stack
Daniel Phillips
phillips at redhat.com
Fri Oct 8 21:25:40 UTC 2004
On Friday 08 October 2004 15:16, Lon Hohberger wrote:
> On Thu, 2004-10-07 at 13:58 -0400, Daniel Phillips wrote:
> > Suppose that the winner of the race to get the exclusive lock is a
> > bad choice to run the server. Perhaps it has a fast connection to
> > the net but is connected to the disk over the network instead of
> > directly like the other nodes. How do you fix that, within this
> > model?
>
> Let me see if I am getting this use-case picture right...
> Are either of those close?
No, the arrangement I was describing is:
SAN                      GigE
 | <---> iSCSI/GNBD <---> |
 |                        |
 | <---> Client <-------> | Node 1
 |                        |
 | <---> Client <-------> | Node 2
                          |
         Client <-------> | Node 3
         Server <-------> | Node 3
Node 3 won the race to get the EX lock because the lock is mediated over
the GigE network. But Node 3 is a bad choice because it is two hops
away from the disk. The DLM chose Node 3 because the DLM doesn't know
anything about network topology, just who got there first to grab the
lock.
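To make the failure mode concrete, here is a toy model of the race, not DLM code. The Node type, the latency figures and the hop counts are all invented for illustration; the only point is that a first-come-first-served lock never looks at disk_hops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    lock_latency_ms: float  # hypothetical: time for the EX request to arrive over GigE
    disk_hops: int          # 1 = attached to the SAN, 2 = disk reached via iSCSI/GNBD


def race_winner(nodes):
    """First come, first served: lowest lock latency wins, topology is ignored."""
    return min(nodes, key=lambda n: n.lock_latency_ms)


def closest_to_disk(nodes):
    """What we would actually want: fewest hops to the disk (ties go to list order)."""
    return min(nodes, key=lambda n: n.disk_hops)


# Made-up numbers matching the picture above: Node 3 is fast on the
# network but two hops from the disk.
nodes = [
    Node("node1", lock_latency_ms=0.30, disk_hops=1),
    Node("node2", lock_latency_ms=0.25, disk_hops=1),
    Node("node3", lock_latency_ms=0.10, disk_hops=2),
]

if __name__ == "__main__":
    print("lock race picks:", race_winner(nodes).name)
    print("disk-aware pick:", closest_to_disk(nodes).name)
```

With these numbers the race picks node3 while a disk-aware choice picks node1, which is exactly the mismatch described.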
> (1) Don't set up your csnap server in such a way that some of the nodes
> exhibit a bottleneck on disk I/O and some do not.
But what prevents it? How do you "set up your csnap server"? Why would
you want to introduce new rules about cluster topology instead of
fixing the code?
> (2) Have the administrator make an intelligent decision as to whether
> or not to relocate the csnap master server again as [s]he tries to
> fix the problem that caused the failover. I.E. Don't worry about it
> if the csnap master server is running slowly.
The administrator is normally asleep or busy with girlfriend when
anything goes wrong.
> Your clients still work, and the csnap server is available, albeit at
> a potentially degraded state.
Well...
> (3) Don't use the cluster-lock model. It has its shortcomings. Its
> strengths are in its simplicity; not its flexibility.
Yes, that's the one. We need real resource management, even if it
initially just consists of an administrator setting up config files.
Something has to read those config files[1] and respond to server
instance requests from csnap agents accordingly.
[1] At cluster bring-up time. The resource manager has to be able to
operate without reading files during failover.
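A minimal sketch of that idea, under stated assumptions: the ResourceManager class, the JSON layout and the "csnap_server_preference" key are all hypothetical and not part of csnap or the cluster stack. It shows only the shape of the thing: the administrator's preferences are parsed once at bring-up, and a server instance request during failover is answered from memory.

```python
import json


class ResourceManager:
    """Hypothetical resource manager: config is parsed once, then served from memory."""

    def __init__(self, config_text):
        # Bring-up time: the only point at which configuration is parsed.
        config = json.loads(config_text)
        # Ordered list of node names, most preferred csnap server host first.
        self._preference = list(config["csnap_server_preference"])
        self._live = set()

    def node_joined(self, name):
        self._live.add(name)

    def node_left(self, name):
        self._live.discard(name)

    def choose_server(self):
        """Answer a server instance request without touching any file."""
        for name in self._preference:
            if name in self._live:
                return name
        return None  # no configured node is currently alive


if __name__ == "__main__":
    # Assumed config: the administrator lists SAN-attached nodes first.
    text = '{"csnap_server_preference": ["node1", "node2", "node3"]}'
    rm = ResourceManager(text)
    for n in ("node1", "node2", "node3"):
        rm.node_joined(n)
    print(rm.choose_server())
    rm.node_left("node1")  # failover: the answer comes from memory only
    print(rm.choose_server())
```

Node 3 is still eligible here, but only as a last resort, so the slow path is used only when both directly attached nodes are gone.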
Regards,
Daniel