[Linux-cluster] Interfacing csnap to cluster stack

Wed Oct 6 18:27:07 UTC 2004

On Tuesday 05 October 2004 18:47, Daniel McNeil wrote:
> > The idea is, there is a service manager out there somewhere that
> > keeps track of how many instances of a service of a given type
> > currently exist, and has some way of creating new resource
> > instances if needed, or killing off extra ones.  In our case, we
> > want exactly one instance of a csnap server.  We need not only to
> > specify that constraint somehow and get a connection to it, but we
> > need to supply a method of starting a csnap server.  So csnap-agent
> > will be a client of service manager and an agent of resource
> > manager.
>
> Why do you need a service manager for this?  As Lon suggested,
> a DLM lock can provide the 1 master and the others ready
> to take over when the lock is released.

The DLM uses the service manager.  Why lather on another layer, when 
really we just want to use the service manager too?
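
For reference, the lock-based scheme suggested in the quoted text amounts to roughly the following. This is only a toy in-process model of "one holder, the rest queued as standbys", not the DLM API; the class name ExclusiveLock and the node names are hypothetical and exist purely for illustration.

```python
from collections import deque


class ExclusiveLock:
    """Toy stand-in for a cluster-wide exclusive lock (NOT the DLM API)."""

    def __init__(self):
        self.holder = None        # node currently acting as master
        self.waiters = deque()    # standby nodes, in arrival order

    def acquire(self, node):
        """Grant the lock if it is free; otherwise queue the node as a standby."""
        if self.holder is None:
            self.holder = node
            return True
        if node != self.holder and node not in self.waiters:
            self.waiters.append(node)
        return False

    def release(self, node):
        """Release the lock and hand it to the longest-waiting standby, if any."""
        if self.holder != node:
            raise ValueError(f"{node} does not hold the lock")
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder


lock = ExclusiveLock()
for name in ("node1", "node2", "node3"):   # hypothetical node names
    lock.acquire(name)
master = lock.holder                        # first requester becomes master
successor = lock.release("node1")           # next standby takes over
```

Note that nothing in this model starts a server on the node that wins the lock, which is the part a service manager would otherwise handle.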

> > We won't talk to either service manager or resource manager
> > directly, but go through Lon's Magma library, which is supposed to
> > provide a nice stable api for us to work with, regardless of
> > whether particular services reside in kernel or user space, or are
> > local or remote.  Lon has said that he will adapt the Magma api as
> > we go, if we break anything or run into limitations.  (I suppose
> > that is why it is called Magma, it flows under pressure.)
>
> Why do we want to use Magma?  At the cluster summit I thought
> that Magma was just the way to provide backward compatibility
> for the older GFS releases.  Did we agree to make magma the
> API?  Having csnap depend on the DLM API makes more sense to me.

Have you looked at the dlm api?  Why would we want to be directly 
ioctling sockets when we could be using a library interface?  I'm not 
necessarily disagreeing with you, the question is: should we be using a 
library for this or not?  I'd think that using a library is motherhood, 
though it does force us to think about the api a little harder.

> > Magma doesn't actually know anything about what we're asking it, it
> > only knows how to pass on requests to somebody who does.  So we're
> > actually talking to service manager and resource manager through
> > Magma, and presumably they talk to each other as well, because
> > service manager must ask resource manager to create or kill off
> > resource instances on its behalf.
>
> What would need to be killed off?  Under what circumstances?

If the cluster shrinks, the resource manager might decide that the 
population of a particular sort of server is too high and some should 
be culled.  Of course, having too many servers is less of a problem 
than having too few, but I generally dislike "grow only" systems of any 
ilk.
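
A minimal sketch of the kind of bookkeeping I have in mind, assuming the manager simply compares the running instances against a desired count. The function name reconcile, the instance identifiers, and the "cull the newest first" policy are all assumptions made for illustration, not anything the service manager or resource manager actually implements.

```python
def reconcile(running, desired):
    """Return (to_start, to_kill) so the instance count becomes `desired`.

    `running` is a list of instance ids ordered oldest first (an assumption);
    surplus instances are culled from the newest end.
    """
    if desired < 0:
        raise ValueError("desired count must be non-negative")
    surplus = len(running) - desired
    if surplus > 0:
        return 0, list(running[-surplus:])   # too many: cull the newest
    return -surplus, []                      # too few (or exact): start the rest


# "Exactly one" csnap server, hypothetical instance ids:
none_running = reconcile([], 1)
too_many = reconcile(["srv-a", "srv-b", "srv-c"], 1)
just_right = reconcile(["srv-a"], 1)
```

With a desired count of one, an empty cluster yields one start request, and a cluster that shrank while holding three servers yields two kill requests.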

> > Anyway, csnap-agent is mainly going to be talking to service
> > manager through Magma, but it also needs to tell resource manager
> > about our resource, its constraints and how to set itself up as an
> > agent to create it.  I don't have a clear picture of how this works
> > at the moment, and that is the point of this email.
> >
> > For example, how do we specify the service manager constraints,
> > i.e., "exactly one" in this case: before we request the instance,
> > or as part of the request, or in a configuration file somewhere?
>
> The csnap-agent to csnap-server seems like a perfect example of why we
> need a cluster communication API.  The csnap-agent wants to send
> information to the csnap-server and could use a highly available
> communication mechanism.

A csnap agent never sends information to a csnap server, except to start 
one locally at the request of a resource agent.

There may be a good use for a virtual synchrony-based cluster communication 
api somewhere in this, but that's not it.

Regards,

Daniel