[Linux-cluster] cluster instability
Christine Caulfield
ccaulfie at redhat.com
Tue Jun 17 07:29:00 UTC 2008
GS R wrote:
>
>
> On 6/16/08, Shawn Hood <shawnlhood at gmail.com> wrote:
>
> All,
>
> This message was sent out to my office, so the voice may seem a bit
> odd. We have a four-node cluster running RHEL4U6 on Dell PowerEdge
> 1950s. Fencing is done via DRAC.
>
> Using packages (from RHN):
>
> cman-kernel-smp-2.6.9-53.13
> cman-1.0.17-0.el4_6.5
> ccs-1.0.11-1.el4_6.1
> fence-1.32.50-2.el4_6.1
> lvm2-cluster-2.02.27-2.el4_6.2
> dlm-kernel-smp-2.6.9-52.9
> dlm-kernheaders-2.6.9-52.9
>
> Our cluster became unstable on Saturday morning. Apparently hugin
> stopped sending out heartbeats, causing it to be fenced. hugin was
> under heavy load (load average ~10) at the time:
>
> 03:30:02 AM 6 453 9.35 10.29 10.51
> 03:40:01 AM 12 465 11.02 11.00 10.75
> 03:50:02 AM 3 446 9.75 10.80 10.86
> 04:00:01 AM 5 430 9.23 9.47 10.07
> Average: 7 455 10.19 10.32 10.28
>
> 04:09:35 AM LINUX RESTART
>
> As you can see, hugin was fenced at 4:09. The other nodes then began
> logging the following:
>
> Jun 14 04:08:06 munin kernel: CMAN: Initiating transition, generation 58
> Jun 14 04:08:21 munin kernel: CMAN: Initiating transition, generation 59
> Jun 14 04:08:36 munin kernel: CMAN: Initiating transition, generation 60
> Jun 14 04:08:51 munin kernel: CMAN: Initiating transition, generation 61
> Jun 14 04:09:06 munin kernel: CMAN: too many transition restarts - will die
> Jun 14 04:09:06 munin kernel: CMAN: we are leaving the cluster. Inconsistent cluster view
>
>
> I guess this has to do with a network issue, even though network
> utilization was low when this was logged.
> The node is not able to receive messages.
>
I suspect you've hit this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=444751
There's a patch in the bugzilla, and a workaround program you can run,
which should help if you can't upgrade the kernel module (see comment #10).
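
For anyone wanting to check their own logs for the same pattern, here is a
minimal Python sketch. It is not taken from the bugzilla; the function names
parse_transitions and transition_gaps, and the default year, are assumptions
made for illustration. It extracts the "Initiating transition" lines from
syslog-style text and reports the seconds between successive generations. It
only makes the restart cadence visible and does not by itself identify the bug.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Jun 14 04:08:06 munin kernel: CMAN: Initiating transition, generation 58
TRANSITION_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+kernel:\s+"
    r"CMAN: Initiating transition, generation (?P<gen>\d+)"
)


def parse_transitions(lines, year=2008):
    """Return (timestamp, host, generation) for each CMAN transition line.

    Syslog timestamps carry no year, so one is assumed (hypothetical default).
    Month names are parsed with %b, which depends on the current locale.
    """
    found = []
    for line in lines:
        # Drop any leading mail-quote markers ("> ") before matching.
        m = TRANSITION_RE.match(line.lstrip("> "))
        if m:
            stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
            found.append((stamp, m["host"], int(m["gen"])))
    return found


def transition_gaps(transitions):
    """Seconds between consecutive transitions, in input order."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(transitions, transitions[1:])
    ]


if __name__ == "__main__":
    sample = [
        "Jun 14 04:08:06 munin kernel: CMAN: Initiating transition, generation 58",
        "Jun 14 04:08:21 munin kernel: CMAN: Initiating transition, generation 59",
        "Jun 14 04:08:36 munin kernel: CMAN: Initiating transition, generation 60",
        "Jun 14 04:08:51 munin kernel: CMAN: Initiating transition, generation 61",
    ]
    print(transition_gaps(parse_transitions(sample)))  # [15, 15, 15]
```

Run against the lines quoted above, this shows a new transition every 15
seconds until munin gives up and leaves the cluster at 04:09:06.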
--
Chrissie