[Linux-cluster] Interfacing csnap to cluster stack
Daniel Phillips
phillips at redhat.com
Fri Oct 8 01:42:19 UTC 2004
On Thursday 07 October 2004 18:57, Daniel McNeil wrote:
> Daniel,
>
> Maybe you should describe what kind of help you are looking for
> from the infrastructure?
Sure, there are two separate problems:
1) Resource management
- The resource to be instantiated is the csnap server.
- There must never be more than one, or the snapshot metadata will
be corrupted (this sounds like a good job for gdlm: let the
server take an exclusive lock on the snapshot store).
- Server instance requests come from csnap agents, one per node.
The reply to an instance request is always a server address and
port, whether the server had to be instantiated or was already
running.
- If the resource manager determines no server is running, then
it must instantiate one, by picking one of the cluster nodes,
finding the csnap agent on it, and requesting that the agent
start a server.
- When instantiated in a failover path, the local part of the
failover path must restrict itself to bounded memory use.
Only a limited set of syscalls may be used in the entire
failover path, and all must be known. Accessing a host
filesystem is pretty much out of the question, as is
on-demand library or plugin loading. If anything like this
is required, it must be done at initialization time, not
during failover.
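To make the instance request flow concrete, here is a rough sketch in Python. It is only an illustration under assumptions: StoreLock, Agent, ResourceManager and their methods are hypothetical names, the exclusive lock is an in-process flag standing in for a gdlm lock (it is not the gdlm API), and "pick the lowest node name" is a placeholder for whatever node selection policy the resource manager really uses.

```python
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (host, port) returned to every requester


class StoreLock:
    """Stand-in for an exclusive lock on the snapshot store.

    Hypothetical: a real cluster would use a distributed lock such as
    gdlm; this flag only models the "at most one holder" rule.
    """

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def try_acquire(self, node: str) -> bool:
        if self.holder is None:
            self.holder = node
            return True
        return False


class Agent:
    """One csnap agent per node (hypothetical interface)."""

    def __init__(self, node: str, port: int, lock: StoreLock) -> None:
        self.node = node
        self.port = port
        self.lock = lock

    def start_server(self) -> Address:
        # The server refuses to start unless it owns the snapshot store.
        if not self.lock.try_acquire(self.node):
            raise RuntimeError("snapshot store is already locked")
        return (self.node, self.port)


class ResourceManager:
    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents
        self.server: Optional[Address] = None

    def request_instance(self) -> Address:
        # The reply is always an address and port, whether or not the
        # server had to be instantiated first.
        if self.server is None:
            node = min(self.agents)  # placeholder node selection policy
            self.server = self.agents[node].start_server()
        return self.server
```

Note that the lock, not the resource manager, is what enforces the single-server rule here: a second start_server() fails even if the resource manager is bypassed.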
2) Membership
- If a snapshot client disconnects, the server needs to know if
it is coming back or has left the cluster, so that it can
decide whether to release the client's read locks.
- If a server fails over, the new incarnation needs to know
that all snapshot clients of the former incarnation have
either reconnected or left the cluster.
- There exists a snapshot client protocol variation that adds
an additional message (confirmation of read lock release)
and allows the snapshot server to ignore cluster membership
entirely. This is a way of wimping out instead of dealing
with interface issues.
- Origin clients don't present a problem: they don't hold
locks.
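The membership rules above could be sketched roughly as follows, again in Python. This is an assumption-laden illustration, not the csnap server code: SnapshotServer, is_member, membership_changed and recovered are hypothetical names, and is_member stands in for whatever membership query the cluster infrastructure ends up providing.

```python
from typing import Callable, Dict, Iterable, Set


class SnapshotServer:
    def __init__(self, is_member: Callable[[str], bool],
                 former_clients: Iterable[str] = ()) -> None:
        self.is_member = is_member  # hypothetical membership query
        self.read_locks: Dict[str, Set[int]] = {}  # client -> locked chunks
        self.disconnected: Set[str] = set()  # gone, but maybe coming back
        # After failover: clients of the former incarnation that have
        # neither reconnected nor left the cluster yet.
        self.awaiting: Set[str] = set(former_clients)

    def connect(self, client: str) -> None:
        self.disconnected.discard(client)
        self.awaiting.discard(client)
        self.read_locks.setdefault(client, set())

    def take_read_lock(self, client: str, chunk: int) -> None:
        self.read_locks[client].add(chunk)

    def disconnect(self, client: str) -> None:
        if self.is_member(client):
            # Still a cluster member, so it may reconnect: keep its locks.
            self.disconnected.add(client)
        else:
            self.read_locks.pop(client, None)

    def membership_changed(self) -> None:
        # Release read locks of disconnected clients that have now left.
        for client in list(self.disconnected):
            if not self.is_member(client):
                self.disconnected.discard(client)
                self.read_locks.pop(client, None)
        # Stop waiting for former clients that have left the cluster.
        self.awaiting = {c for c in self.awaiting if self.is_member(c)}

    def recovered(self) -> bool:
        # True once every former client has reconnected or left.
        return not self.awaiting
```

For example, a new incarnation created with former_clients={"b", "c"} reports recovered() as False until "b" reconnects and "c" either reconnects or drops out of the membership.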
Regards,
Daniel