[Linux-cluster] Where to go with cman ?

Mon Jul 18 08:10:30 UTC 2005

As I see it there are two things we can do with the userland cman that's
currently in the head of CVS:

1. Leave it as it is - a port of the kernel one. This has some benefits: it's
easy (just a few bug fixes need to go in) and it's protocol-compatible with
the kernel one. There are a small number of extra features that could go in
there (that would, annoyingly, break that compatibility) but nothing really
serious. It doesn't give us anything new, but what new is needed?

2. Migrate it to something much more sophisticated. I've mentioned Virtual
Synchrony a few times before and I've been looking into this in some detail
since. The benefits are largely internal but they do provide a reliable, robust
and well-performing messaging system that other cluster subsystems can use.
While the application programmers at the cluster summit maintained they had no
use for a cluster messaging system, I still believe that it is a useful thing to
have at a lower level - if only for our own programming needs. I know that Jon
looked into the existing cman messaging system before rejecting it as too slow
and unreliable for the needs of the cluster mirroring code.

There are two suboptions here.
  a) Write it ourselves. Quite a big job, this. Bigger than I would like. To be
honest I did make a start at this and now realise just what a huge job it is to
get something that both performs well and is reliable. REALLY reliable. Even
worse if the academics want something provably reliable.
  b) Adopt something else. The obvious candidate here is the openAIS code[1].
This looks to be quite mature now and has all the features we need of a low
level messaging system. It's very nicely abstracted out so we can pick out just
the bits we need without having the whole (rather heavyweight) system on top of it.

The one problem with the openAIS code is that it doesn't support IPv6, and much
of the code is tied to IPv4. I've had a look at it and emailed Steven Dake
about this; he reckons it's about 2 weeks' work to add.[2]

The advantages of doing this are several.
- It saves time. We get something that is known to work, even though it needs
extra features added for our own use.
- we're not inventing something new that already exists in several other places.
- we get more people who know the code. Currently only I know the internals of
cman as it stands and it's quite scary code that people don't want to get
involved with (we've had several DLM patches in the past, but no CMAN ones).
This way we get at least 2 (Steven and me) as well as anyone else who is
following openAIS. Of course there will be CMAN-specific code on top of their
comms layer, to make it quorum-based and capable of supporting GFS and DLM. That
will be Red Hat specific, but it is not going to be large.
- the APIs are all open (based on SAforum specifications) and already
implemented. Although adding saCLM to CMAN is pretty easy as I proved last week.

The disadvantages are
- Need to learn the internals of someone else's code.
- We don't have full control over the code. Although we can obviously fork it if
we feel the need, it would be preferable not to.
- non-compatibility with "old" cman, making rolling upgrades hard or even
impossible. I'm not sure what to do about this yet, but it's worth pointing out
that the DLM has a new line-protocol too.
- openAIS is BSD licensed. I don't think this is a problem, but it probably
needs checking.

In short, I'm advocating adopting the openAIS core (libtotem basically) as
CMAN's communications/membership protocol. If we're going to do a "CMAN V2" that
has anything significant over V1 then re-inventing it is going to be a huge
amount of work that someone else has already done.

Comments?

Patrick