[Linux-cluster] 32 nodes limit?

Marc Aurele La France tsi at ualberta.ca
Fri Sep 12 16:44:32 UTC 2008


Hi.

Just what I hope will be a quick question regarding the cluster suite.

The current lock manager FAQ states ...

"CMAN in RHEL4 has known problems when you have more than 32 nodes in the 
cluster.  We're working to resolve those issues, but until then use GULM if 
you have more than 32 nodes."

... while the pre-Wiki version of this document referred to DLM instead of 
CMAN in RHEL4.  Which one is it?  DLM makes more sense to me.

In any case, I gather that this issue has been resolved.  If so, can you tell 
me the minimum version of the cluster suite and/or upstream kernel that would 
allow for more than 32 nodes (with DLM)?  A pointer to a patch or patches 
that I could use would be ideal.

More details ...

I'm trying to move a 5TB filespace from NFS to GFS2.  I have a P4 (the 
current NFS server) and 33 Opteron nodes, all running a stock 2.6.22 kernel, 
OpenAIS 0.80.3, and a 2.00.00 cluster suite.  For now, I've dummied out 
fencing and set expected_votes to 1.  I can start and stop cman on all nodes 
without problems.  With cman running on every node, I formatted, mounted and 
populated the filesystem using the P4.  Mounting the filesystem on the 
Opterons one at a time succeeds until the 32nd node, at which point 
mount.gfs2 hangs (in state "D" according to `ps ax`).  Going back, the first 
16 systems that mounted the filesystem can still `ls` the top-level 
directory, but attempts to do so on the remaining systems also get stuck in 
"D".  Any attempt to unmount the filesystem throws the entire setup into "D".
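
For anyone wanting to check for the same symptom on their own nodes, here is 
a minimal sketch of how the stuck processes can be listed without relying on 
`ps`.  It assumes a Linux /proc filesystem; the helper names `parse_stat` and 
`d_state_processes` are hypothetical and are not part of the cluster suite.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state

def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process in uninterruptible sleep ("D")."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or was unreadable) during the scan
        if state == "D":
            stuck.append((pid, comm))
    return stuck

if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Run on each node in turn, this should show which nodes have mount.gfs2 or 
`ls` blocked in "D", which is the same information `ps ax` gives, in a form 
that is easier to collect across 34 machines.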

Due to various considerations, moving to more recent versions is not the 
preferred option at this point.  Hence my question.

Any ideas?

Thanks.

Marc.

+----------------------------------+----------------------------------+
|  Marc Aurele La France           |  work:   1-780-492-9310          |
|  Academic Information and        |  fax:    1-780-492-1729          |
|    Communications Technologies   |  email:  tsi at ualberta.ca      |
|  352 General Services Building   +----------------------------------+
|  University of Alberta           |                                  |
|  Edmonton, Alberta               |    Standard disclaimers apply    |
|  T6G 2H1                         |                                  |
|  CANADA                          |                                  |
+----------------------------------+----------------------------------+



