[Linux-cluster] clvmd hangs when third node tries to connect to cluster

Patrick Caulfield pcaulfie at redhat.com
Tue Oct 30 11:02:22 UTC 2007


s.c.graham at gmail.com wrote:
> Hi there,
> 
> I have a cluster with three nodes (all clone HP DL380 G4s) attached to
> a Fibre SAN (HP MSA1000) and serving a number of GFS filesystems.  My
> OS is Ubuntu Dapper (6.06) and my kernel is 2.6.15-29-amd64-server.
> These machines have been working nicely for a long time.
> 
> On the weekend I "apt-get updated" to the latest version of the Dapper
> redhat-cluster-suite package (1.20060222-0ubuntu6.1).  Now, when the
> cluster boots the first two nodes to come up are able to see the GFS
> filesystem. However, the third node to come up hangs at the point of
> starting the clvm service.  Concomitantly, I see the following message
> in /var/log/syslog of one of the other machines in the cluster:
> 
> Oct 28 14:42:18 machinea kernel: [ 1681.325152] CMAN: node machinec rejoining
> Oct 28 14:42:20 machinea kernel: [ 1683.528299] Extra connection from node 2 attempted
> 
> It does not seem to matter which order the nodes come up in - it is
> always the third node to boot that will hang when starting clvmd.  I
> have included my cluster.conf file below for reference - I can include
> any additional diagnostics as required.
> 
> Any help would be most appreciated!

That sounds like a bug that has already been fixed. I don't have the reference
to hand as I've just returned from holiday, sorry.

Patrick
