[Linux-cluster] 32 nodes limit?

Mon Sep 15 08:10:19 UTC 2008

Marc Aurele La France wrote:
> Hi.
> 
> Just what I hope will be a quick question regarding the cluster suite.
> 
> The current lock manager FAQ states ...
> 
> "CMAN in RHEL4 has known problems when you have more than 32 nodes in
> the cluster.  We're working to resolve those issues, but until then use
> GULM if you have more than 32 nodes."
> 
> ... while the pre-Wiki version of this document referred to DLM instead
> of CMAN in RHEL4.  Which one is it?  DLM makes more sense to me.

CMAN & DLM are two different parts of the same cluster infrastructure.
CMAN is the cluster manager, and DLM is the Distributed Lock Manager. If
you're using GFS then you will need both.

> In any case, I gather that this issue has been resolved.  If so, can you
> tell me the minimum version of the cluster suite and/or upstream kernel
> that would allow for more than 32 nodes (with DLM)?  A pointer to a
> patch or patches that I could use would be ideal.

We have seen users with 32 or more nodes have trouble with CMAN that
does seem to be related to the size of the cluster. But it's not a hard
limit, and after a certain amount of (non-QE) internal testing we had no
problems with 36 nodes in a CMAN cluster.

There have been no patches committed to RHEL4 that specifically address
the problems people have seen with 32+ nodes.

> More details ...
> 
> I'm trying to move a 5TB filespace from NFS to GFS2.  I have a P4 (the
> current NFS server) and 33 Opteron nodes, all running a stock 2.6.22
> kernel, OpenAIS 0.80.3, and a 2.00.00 cluster suite.  For now, I've
> dummied out fencing and set expected_votes to 1.  I can start/stop cman
> on all nodes no problem.  With all cman's running, I've formatted,
> mounted and populated the filesystem using the P4.  Proceeding through
> the Opterons to mount the filesystem succeeds until the 32nd node, at
> which point mount.gfs2 hangs (in "D" according to `ps ax`).  Going back,
> the first 16 systems that have mounted the filesystem can still `ls` the
> top level directory, but attempts to do so on the remaining systems also
> get stuck in "D".  Any attempt to unmount the filesystem throws the
> entire setup in "D".
> 
> Due to various considerations, moving to more recent versions is not the
> preferred option at this point.  Hence my question.

CMAN/openais in RHEL5 seems to be happy up to around 48 nodes (again,
this is not a QE figure, it's something we have tested in development
only) with appropriate tuning. If you are seeing problems then it might
be helpful to adjust some of the timers used in the openais totem
protocol. "man openais.conf" will tell you something about them. Before
doing this, though, it's worth checking the output of the "group_tool"
command and syslog to see if there are any openais or other daemon
errors that might be causing your problems. If necessary, post them to
this list.
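
When collecting that information from 30-odd nodes, it can also help to
record which processes are stuck in "D" on each one. A minimal sketch,
assuming the default `ps ax` column layout (PID TTY STAT TIME COMMAND);
the function name is made up for illustration and is not part of the
cluster suite:

```python
def uninterruptible(ps_ax_output):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'.

    Assumes the default `ps ax` columns: PID TTY STAT TIME COMMAND.
    """
    stuck = []
    # The first line of `ps ax` output is the column header, so skip it.
    for line in ps_ax_output.splitlines()[1:]:
        # Split into at most 5 fields so the command keeps its arguments.
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # blank or malformed line
        pid, _tty, stat, _time, command = fields
        # STAT may carry modifiers after the state letter, e.g. "D+" or "Ds".
        if stat.startswith("D"):
            stuck.append((int(pid), command))
    return stuck
```

On a live node the input could come from something like
subprocess.run(["ps", "ax"], capture_output=True, text=True).stdout;
a hung mount.gfs2 or ls would then show up in the returned list.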

It's also worth mentioning that 2.00.00 has had a considerable number of
bugfixes applied since it was released and the current version is
2.03.07. I do strongly recommend you upgrade to this version even though
you say it is not "the preferred option at this point".

I hope this helps,

Chrissie