[Linux-cluster] Problem with fenced on cluster with 2 BladeCenter machines: 1st machine is physically removed. The remaining one does not become Active (waiting for fenced)

James Parsons jparsons at redhat.com
Thu Jul 12 15:02:33 UTC 2007


catalin.lupescu at bull.net wrote:

>
> Hello!
>
> I have a Red Hat cluster made of 2 IBM blade nodes in a BladeCenter
> chassis.
> (fenced version 1.32.6)
>
> I ran the following test:
> I physically removed the node 1 machine (the active one).
> The second one never became the active one. The "clustat" command does
> not print any information.
> In /var/log/messages we find the following messages (repeated):
>
> Jul 11 17:46:24 cdrc1-2 fenced[4214]: fencing node "cdrc1-1"
> Jul 11 17:46:38 cdrc1-2 fenced[4214]: agent "fence_bladecenter" 
> reports: pattern match timed-out at /sbin/fence_bladecenter line 185
> Jul 11 17:46:38 cdrc1-2 fenced[4214]: fence "cdrc1-1" failed
>
> If node 1 is plugged in, node 2 becomes active (fenced OK)
>
bz#240509 changed the sleep timeout in the bladecenter agent from 5 to
10. This is on or about line 193 in /sbin/fence_bladecenter. See what
yours is set to, and try increasing it a bit. This minor change is
making its way through the distribution chain now.
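
A quick way to see what your copy of the agent uses is to list its sleep
calls with line numbers and then print the region around lines 185 and 193.
This is only a sketch: the helper names (show_sleeps, show_region) and the
AGENT variable are made up for illustration, and the exact line numbers will
differ between versions of the agent.

```shell
# Path as named in this thread; override AGENT if yours lives elsewhere.
AGENT="${AGENT:-/sbin/fence_bladecenter}"

# List every line that mentions "sleep", prefixed with its line number.
show_sleeps() {
    grep -n 'sleep' "$1"
}

# Print lines 180-200, which covers the lines mentioned above (185 and 193).
show_region() {
    sed -n '180,200p' "$1"
}

# Only inspect the agent if it is actually present and readable.
if [ -r "$AGENT" ]; then
    show_sleeps "$AGENT" || echo "no sleep calls found in $AGENT"
    show_region "$AGENT"
fi
```

Keep a backup copy of the agent before changing the value, so the packaged
version can be restored when the updated package arrives.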

-j



