[Linux-cluster] Re: Starting up two of three nodes that compose a cluster
teigland at redhat.com
Fri Sep 21 16:55:11 UTC 2007
On Fri, Sep 21, 2007 at 06:36:04PM +0200, carlopmart wrote:
> David Teigland wrote:
> >On Fri, Sep 21, 2007 at 06:15:37PM +0200, carlopmart wrote:
> >>>>[root@thranduil ~]# fence_ack_manual -n elrond.hpulabs.org
> >>>>Warning: If the node "elrond.hpulabs.org" has not been manually fenced
> >>>>(i.e. power cycled or disconnected from shared storage devices)
> >>>>the GFS file system may become corrupted and all its data
> >>>>unrecoverable! Please verify that the node shown above has
> >>>>been reset or disconnected from storage.
> >>>>Are you certain you want to continue? [yN] y
> >>>>can't open /tmp/fence_manual.fifo: No such file or directory
> >>>That looks like the old RHEL4/cluster-1.0 version of fence_ack_manual...
> >>And is there a solution?
> >You need to make sure the RHEL4/cluster-1.0 binaries are removed from the
> >nodes and the new RHEL5/cluster-2.0/openais binaries are installed. If
> >you're getting this far, it may only be some fencing binaries that are
> >incorrect, so first just remove fence_manual and fence_ack_manual and make
> >sure you have the new fence_ack_manual installed (it's now a bash script).
> >fence_manual no longer exists in RHEL5/cluster-2.0 code since
> >fence_ack_manual talks directly with fenced.
> Sorry? These three nodes are RHEL5 with the latest patches applied except
> kernel version 2.6.18-8.1.10.
> Version of cman is: cman-2.0.64-1.0.1.el5
> Version of gfs-utils:
> Version of rgmanager: rgmanager-2.0.24-1.el5
> And fence_manual exists in this cluster suite:
> [root@haldir xen]# whereis fence_manual
> fence_manual: /sbin/fence_manual /usr/share/man/man8/fence_manual.8.gz
> [root@haldir xen]# rpm -qf /sbin/fence_manual
> [root@smeagol xen]#
> And fence_ack_manual is not a bash script; it is a binary:
> [root@haldir xen]# whereis fence_ack_manual
> fence_ack_manual: /sbin/fence_ack_manual
> [root@haldir xen]# cd /sbin
> [root@haldir sbin]# file fence_ack_manual
> fence_ack_manual: ELF 32-bit LSB executable, Intel 80386, version 1
> (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for
> GNU/Linux 2.6.9, stripped
> [root@haldir sbin]#
> Do I need to install the RHEL5.1 beta to do this? If so, I have a very,
> very big problem ...
Looks like I was wrong about what got into RHEL5; it's a real pity the new
stuff didn't make it. Looking back at your cluster.conf file it seems
that you're using fence_gnbd for that node, so my next guess is that
fence_gnbd isn't found or isn't working.
I can't find a way to override a failing fence operation in the RHEL5
code, so that probably means you'll have to get fence_gnbd working.
Or, another somewhat dangerous option is to disable startup fencing
altogether by adding this to cluster.conf:
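The snippet itself appears to have been stripped from the archived copy of
this message. A minimal sketch of what such a setting looks like, assuming the
clean_start attribute of the fence_daemon element described in the fenced man
page is what was meant (check fenced(8) on your own nodes before relying on it):

```xml
<!-- Assumption: placed inside the top-level <cluster> element of cluster.conf. -->
<!-- clean_start="1" tells fenced to assume all nodes are in a clean state at -->
<!-- startup, so it skips startup fencing. -->
<fence_daemon clean_start="1"/>
```

With this set, a node that is in fact still writing to shared storage will not
be fenced when the cluster starts, which is why the option is dangerous.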