[Linux-cluster] Re: Starting up two of three nodes that compose a cluster

David Teigland teigland at redhat.com
Fri Sep 21 15:51:25 UTC 2007


On Fri, Sep 21, 2007 at 05:50:09PM +0200, carlopmart wrote:
> David Teigland wrote:
> >On Fri, Sep 21, 2007 at 05:29:22PM +0200, carlopmart wrote:
> >> [root@thranduil log]# mount -t gfs /dev/xvdc1 /data
> >>/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -22
> >>/sbin/mount.gfs: error mounting lockproto lock_dlm
> >
> >This has already been changed to report a descriptive error message,
> >  "node not a member of the default fence domain"
> >
> >as is shown in the debug log from gfs_controld below, and I suspect
> >appears in your /var/log/messages.
> >
> >>1190388485 mount: not in default fence domain
> >>1190388485 datavol01 do_mount: rv -22
> >
> >>[root@thranduil log]# group_tool -v; group_tool dump gfs
> >>type             level name     id       state node id local_done
> >>fence            0     default  00010001 JOIN_START_WAIT 1 100010001 0
> >>[1]
> >
> >This shows it's not in the fence domain yet.  The reason appears to be
> >that it's trying to fence someone.  Again, look in /var/log/messages to
> >find out more information about what needs to be fenced, or why fencing
> >isn't working.
> >
> >Dave
> >
> >
> Correct, Dave. The error is:
> 
> Sep 21 16:51:25 thranduil fenced[1081]: fencing node "elrond.hpulabs.org"
> Sep 21 16:51:25 thranduil fenced[1081]: fence "elrond.hpulabs.org" failed
> 
> That is expected: "elrond.hpulabs.org" is the node that I can't start up
> (its hardware is under maintenance until Monday). I need to start all the
> other cluster services on thranduil and haldir. Is that possible?

Two options:

1. Remove that node from cluster.conf so it's not fenced every time the
cluster starts up.

2. Manually override/ack the fencing operation every time it happens with:

     fence_ack_manual -n elrond.hpulabs.org

   This will allow things to continue.
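
For option 2, it can help to confirm which node fenced is stuck on before
acknowledging anything. The following is only a rough sketch: the helper
name failed_fence_nodes is made up for illustration, and the pattern
assumes the fenced messages look like the two lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print each node name for which fenced logged a
# failed fence operation. Reads syslog-style lines on stdin. The message
# format is assumed from the log excerpt quoted in this thread.
failed_fence_nodes() {
    sed -n 's/.*fenced\[[0-9]*]: fence "\([^"]*\)" failed.*/\1/p' | sort -u
}

# Possible use (the log path is the one mentioned earlier in the thread):
#   failed_fence_nodes < /var/log/messages
# For a node that is known to be powered off, the manual ack would then be:
#   fence_ack_manual -n <node>
```

The sketch only reports node names; it deliberately does not run
fence_ack_manual itself, since acknowledging a fence for a node that is
still running and touching shared storage could corrupt the filesystem.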

Dave



