[Linux-cluster] Interfacing csnap to cluster stack

Daniel Phillips phillips at redhat.com
Thu Oct 7 00:55:27 UTC 2004


On Wednesday 06 October 2004 16:34, Daniel McNeil wrote:
> On Wed, 2004-10-06 at 11:27, Daniel Phillips wrote:
> > On Tuesday 05 October 2004 18:47, Daniel McNeil wrote:
> > > > The idea is, there is a service manager out there somewhere
> > > > that keeps track of how many instances of a service of a given
> > > > type currently exist, and has some way of creating new resource
> > > > instances if needed, or killing off extra ones.  In our case,
> > > > we want exactly one instance of a csnap server.  We need not
> > > > only to specify that constraint somehow and get a connection
> > > > to it, but we need to supply a method of starting a csnap
> > > > server.  So csnap-agent will be a client of service manager and
> > > > an agent of resource manager.
> > >
> > > Why do you need a service manager for this?  As Lon suggested,
> > > a DLM lock can provide the one master, with the others ready
> > > to take over when the lock is released.
> >
> > The DLM uses the service manager.  Why lather on another layer,
> > when really we just want to use the service manager too?
>
> The DLM is a well-known interface that has had many implementations.
> When Patrick sent out the Generic Kernel API it included membership
> and quorum interfaces, which are also things that have, or could have,
> many implementations.  The service manager is something new that I have
> not seen in other cluster implementations.  Are you planning on
> doing a generic API for service manager as well?

I really don't know; I haven't tried it yet.  In fact I'm starting to
sour on the whole concept of a generic API for genericness's sake, and
I'm now looking for a real value-add in each layer.  Before I actually
looked at Magma, I imagined it must be a sort of switch that would act
as a single point of contact for all cluster services on a given node
and interface them to the global infrastructure.  It's not, but maybe
it should be; that would be a real value-add.  Gathering together lots
of local component connections and funneling the traffic over a few
global connections would be a very good thing for Magma to be doing.
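
To make that concrete, here is a rough sketch of the funneling idea:
local services connect to one well-known unix socket and everything
goes out over a single already-open global connection.  The socket
path is invented and replies back to the clients are omitted; consider
it a doodle, not a design:

#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Sketch of the hypothetical switch: local services connect to one
 * well-known unix socket (path invented) and their traffic funnels
 * out over a single already-open global connection, global_fd. */
int funnel(int global_fd)
{
	struct sockaddr_un sun = { .sun_family = AF_UNIX };
	int listener = socket(AF_UNIX, SOCK_STREAM, 0);
	int clients[64], nclients = 0, i, n;
	char buf[4096];

	strcpy(sun.sun_path, "/var/run/cluster-switch");
	unlink(sun.sun_path);
	if (bind(listener, (struct sockaddr *)&sun, sizeof(sun)) < 0 ||
	    listen(listener, 8) < 0)
		return -1;

	for (;;) {
		fd_set fds;
		int maxfd = listener;

		FD_ZERO(&fds);
		FD_SET(listener, &fds);
		for (i = 0; i < nclients; i++) {
			FD_SET(clients[i], &fds);
			if (clients[i] > maxfd)
				maxfd = clients[i];
		}
		if (select(maxfd + 1, &fds, NULL, NULL, NULL) < 0)
			return -1;
		if (FD_ISSET(listener, &fds) && nclients < 64) {
			int fd = accept(listener, NULL, NULL);
			if (fd >= 0)
				clients[nclients++] = fd;
		}
		for (i = 0; i < nclients; i++) {
			if (!FD_ISSET(clients[i], &fds))
				continue;
			n = read(clients[i], buf, sizeof(buf));
			if (n <= 0) {			/* client went away */
				close(clients[i]);
				clients[i--] = clients[--nclients];
				continue;
			}
			write(global_fd, buf, n);	/* the funnel */
		}
	}
}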

I'm also curious what Ben, who was going to take a run at gluing this
stuff together, comes up with.

> From my previous 
> experience with other cluster implementations, the DLM was only
> dependent on membership and quorum (and cluster-wide communication).
> From my perspective the service manager is the other layer. :)
> If you make csnap depend on the service manager, then any other
> cluster implementation that wanted to use csnap would have to provide
> the service manager functionality.

Which cluster infrastructure is it that has a DLM but does not have a
service manager?

Anyway, Lon's suggestion was cute but doesn't actually do any resource
management worthy of the name.  A resource manager is supposed to take a
look at the loading of the various candidates for resource instancing
and pick a good candidate.  Or follow rules set out by an administrator.
Or do some sensible thing, not just run a haphazard random number
generator consisting entirely of cluster races.  What would prevent one
node from winning the race every time and ending up with every network
service loaded onto itself?
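
For reference, here is roughly what that lock-race election looks
like, sketched against libdlm.  I'm assuming dlm_lock, dlm_get_fd and
dlm_dispatch behave the way libdlm.h suggests, and the resource name
is invented:

#include <stdio.h>
#include <string.h>
#include <poll.h>
#include <libdlm.h>

static struct dlm_lksb lksb;
static int granted;

static void ast(void *arg)
{
	if (lksb.sb_status == 0)
		granted = 1;	/* we won the race */
}

int main(void)
{
	const char *name = "csnap-master";	/* invented resource name */
	struct pollfd pfd = { .fd = dlm_get_fd(), .events = POLLIN };

	/* Every candidate queues for the same exclusive lock; whoever
	 * holds it is the server, and the DLM hands the lock to some
	 * waiter when the holder unlocks or dies. */
	if (dlm_lock(LKM_EXMODE, &lksb, 0, name, strlen(name),
		     0, ast, NULL, NULL, NULL)) {
		perror("dlm_lock");
		return 1;
	}
	while (!granted) {
		poll(&pfd, 1, -1);
		dlm_dispatch(pfd.fd);	/* deliver the grant ast */
	}
	printf("I am the csnap server\n");
	/* ...run the server; on death the lock falls to the next racer... */
	return 0;
}

Notice that nothing in there knows or cares how loaded the winner is;
the grant order is whatever the cluster races happen to produce.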

> > > Why do we want to use Magma?  At the cluster summit I thought
> > > that Magma was just the way to provide backward compatibility
> > > for the older GFS releases.  Did we agree to make Magma the
> > > API?  Having csnap depend on the DLM API makes more sense to me.
> >
> > Have you looked at the DLM API?  Why would we want to be directly
> > ioctling sockets when we could be using a library interface?  I'm
> > not necessarily disagreeing with you; the question is: should we be
> > using a library for this or not?  I'd think that using a library is
> > motherhood, though it does force us to think about the API a little
> > harder.
>
> I've looked at libdlm.h and libdlm.so.  It looks like it is the
> library that provides dlm_lock(), dlm_unlock() and friends.
> I have not reviewed all the dlm calls, but it looks about right.
> What am I missing?  I didn't see any direct ioctls.

Sorry, I meant the "cman api"; here is how Magma talks to it:

...cluster/magma-plugins$ grep ioctl * -r
cman/cman.c:#include <sys/ioctl.h>
cman/cman.c:            x = ioctl(p->sockfd, SIOCCLUSTER_GETMEMBERS, NULL);
cman/cman.c:    } while (ioctl(p->sockfd, SIOCCLUSTER_GETMEMBERS, &cman_nl) !=
cman/cman.c:    qs = ioctl(p->sockfd, SIOCCLUSTER_ISQUORATE, NULL);
cman/cman.c:    return ioctl(p->sockfd, SIOCCLUSTER_KILLNODE, nodeid);
sm/services.c:#include <linux/ioctl.h>
sm/services.c:#include <sys/ioctl.h>
sm/services.c:          x = ioctl(sockfd, SIOCCLUSTER_GETMEMBERS, NULL);
sm/services.c:  } while (ioctl(sockfd, SIOCCLUSTER_GETMEMBERS, &cman_nl) !=
sm/sm.c:#include <sys/ioctl.h>
sm/sm.c:                x = ioctl(p->sockfd, op, NULL);
sm/sm.c:        } while (ioctl(p->sockfd, op, &sm_nl) != sm_nl.max_members);
sm/sm.c:        qs = ioctl(p->sockfd, SIOCCLUSTER_ISQUORATE, NULL);
sm/sm.c:                if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_GETEVENT,
sm/sm.c:                        ioctl(p->sockfd, SIOCCLUSTER_SERVICE_STARTDONE,
sm/sm.c:                if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_GETEVENT,
sm/sm.c:        if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_REGISTER, p->groupname) < 0) {
sm/sm.c:        if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_JOIN, p->groupname) < 0) {
sm/sm.c:                if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_LEAVE, NULL))
sm/sm.c:        ioctl(p->sockfd, SIOCCLUSTER_SERVICE_UNREGISTER, NULL);
sm/sm.c:        return ioctl(p->sockfd, SIOCCLUSTER_KILLNODE, nodeid);
sm/sm.c:        if (ioctl(p->sockfd, SIOCCLUSTER_SERVICE_GETEVENT, &ev) < 0) {
sm/sm.c:                //printf("ioctl() failed: %s\n", strerror(errno));
sm/sm.c:                ioctl(p->sockfd, SIOCCLUSTER_SERVICE_STARTDONE, ev.event_id);

Maybe csnap will end up talking to it the same way, who knows.  I just
always feel like I got something icky on myself when I call an ioctl.
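
Which is really just an argument for a thin library over those ioctls,
something like the sketch below.  The wrapper names are invented, and
the header name is my guess at where the SIOCCLUSTER_* numbers live in
the cman tree:

#include <sys/ioctl.h>
#include "cnxman-socket.h"	/* SIOCCLUSTER_* numbers; header name is
				   my guess at the cman source */

/* Invented wrappers: the caller sees a function and a return value,
 * not a socket ioctl, and the transport can change underneath. */
static inline int cluster_quorate(int sockfd)
{
	return ioctl(sockfd, SIOCCLUSTER_ISQUORATE, NULL);
}

static inline int cluster_kill_node(int sockfd, int nodeid)
{
	return ioctl(sockfd, SIOCCLUSTER_KILLNODE, nodeid);
}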

> > > > ...service manager must ask resource manager to
> > > > create or kill off resource instances on its behalf.
> > >
> > > What would need to be killed off?  Under what circumstances?
> >
> > If the cluster shrinks, the resource manager might decide that the
> > population of a particular sort of server is too high and some
> > should be culled.  Of course, having too many servers is less of a
> > problem than having too few, but I generally dislike "grow only"
> > systems of any ilk.
>
> I agree with this usage for resource managers in general, but this
> does not seem to apply to the csnap server.

How does the cluster get rid of the csnap server if it finds itself
needing zero of them?  (CLVM removes the snapshot target from the device
stack.)  What about maintaining a pool of servers so that a failover
server can be started without loading it from disk?  (i.e., a service
group that provides the service "become a csnap server".)  How would
you keep that pool balanced at something less than every node in the
cluster?
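
In service manager terms the standby pool is just another service
group.  A csnap-agent could volunteer its node with something like the
following sketch, cribbing the ioctl usage from the Magma plugin
above; the group name is invented:

#include <stdio.h>
#include <sys/ioctl.h>
#include "cnxman-socket.h"	/* SIOCCLUSTER_* numbers, as above */

/* Hypothetical csnap-agent fragment: volunteer this node for the
 * "become a csnap server" pool by joining a service group, so the
 * service manager sees pool membership as ordinary service events. */
static int join_standby_pool(int sockfd)
{
	const char *group = "csnap-standby";	/* invented group name */

	if (ioctl(sockfd, SIOCCLUSTER_SERVICE_REGISTER, group) < 0) {
		perror("service register");
		return -1;
	}
	if (ioctl(sockfd, SIOCCLUSTER_SERVICE_JOIN, group) < 0) {
		perror("service join");
		ioctl(sockfd, SIOCCLUSTER_SERVICE_UNREGISTER, NULL);
		return -1;
	}
	return 0;	/* then wait on SERVICE_GETEVENT for start/stop */
}

The culling question then becomes: who tells a surplus member to
leave, and by what policy?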

It seems to me that the notions of service manager and resource manager
are entirely appropriate for the snapshot server; what I'm trying to
sort out is how they interact.

> I just started reading through your cluster.snapshot.design.html.
> I was talking about the csnap client to csnap server communication.
> I did a quick search through the design doc and don't see what the
> csnap-agent is for.  I'll keep reading.

Sorry, the part you were looking for is the recently revised
"Integration with Cluster Infrastructure" section, checked in just now.
The old version was a generic rant about heartbeating; the revised
version provides as-built specifics of the failover design.

Regards,

Daniel



