[Linux-cluster] RE: Linux-cluster Digest, Vol 35, Issue 13

Sai Loganathan sail at serverengines.com
Tue Mar 13 02:47:45 UTC 2007


Hello,
Thanks for the info. I am now using manual fencing, but I get the following
messages whenever I do a failover.

Mar 12 17:25:50 node2 clurgmgrd[6088]: <info> State change: node1 DOWN
Mar 12 17:25:52 node2 clurgmgrd[6088]: <notice> Starting stopped service iscsi_ip
Mar 12 17:25:52 node2 clurgmgrd: [6088]: <info> Adding IPv4 address 172.40.2.119 to eth2
Mar 12 17:25:52 node2 clurgmgrd[6088]: <notice> Starting stopped service iscsi_lun
Mar 12 17:25:53 node2 clurgmgrd[6088]: <notice> Service iscsi_lun started
Mar 12 17:25:54 node2 clurgmgrd[6088]: <notice> Service iscsi_ip started
Mar 12 17:26:24 node2 kernel: CMAN: removing node node1 from the cluster : Missed too many heartbeats
Mar 12 17:26:24 node2 fenced[6040]: node1 not a cluster member after 0 sec post_fail_delay
Mar 12 17:26:24 node2 fenced[6040]: fencing node "node1"
Mar 12 17:26:24 node2 fence_manual: Node node1 needs to be reset before recovery can procede.  Waiting for node1 to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n node1)

I just powered down node1 to simulate a failover to node2. Unless I run
the command fence_ack_manual -n node1, the cluster does not move forward and
keeps waiting in fencing. How can I fix this?

During shutdown, I get the following error messages and the system waits
there indefinitely.
Starting Killall: CMAN: sendmsg failed: -101
WARNING: dlm_emergency_shutdown
SM: 00000003 sm_stop: SG stilljoined
How can I fix this error?

Thanks,
Sai Logan



-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
linux-cluster-request at redhat.com
Sent: Saturday, March 10, 2007 9:00 AM
To: linux-cluster at redhat.com
Subject: Linux-cluster Digest, Vol 35, Issue 13


Today's Topics:

   1. Re: cluster not doing failover (Jonathan E Brassow)


----------------------------------------------------------------------

Message: 1
Date: Fri, 9 Mar 2007 19:53:40 -0600
From: Jonathan E Brassow <jbrassow at redhat.com>
Subject: Re: [Linux-cluster] cluster not doing failover
To: linux clustering <linux-cluster at redhat.com>
Message-ID: <40407159e8e6506b05d46c82d921d936 at redhat.com>
Content-Type: text/plain; charset="iso-8859-1"


On Mar 9, 2007, at 5:30 PM, Sai Loganathan wrote:
>             <fencedevices>
>                         <fencedevice agent="fence_ilo" hostname="admin"
> login="admin" name="node1_fence" passwd="admin"/>
>                         <fencedevice agent="fence_ilo" hostname="admin"
> login="admin" name="node2_fence" passwd="admin"/>
>             </fencedevices>

The above lines look funny to me.  The hostname for the fence device is 
"admin"?
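
For comparison, here is a rough sketch of how these entries might look if
hostname pointed at each node's iLO management interface instead of
repeating the login name. The hostnames node1-ilo.example.com and
node2-ilo.example.com are placeholders and do not come from the original
post; the remaining attributes are copied from the configuration quoted
above.

```xml
<fencedevices>
  <!-- hostname: address of the node's iLO interface (placeholder values, assumed) -->
  <fencedevice agent="fence_ilo" hostname="node1-ilo.example.com"
               login="admin" name="node1_fence" passwd="admin"/>
  <fencedevice agent="fence_ilo" hostname="node2-ilo.example.com"
               login="admin" name="node2_fence" passwd="admin"/>
</fencedevices>
```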

> Using the cluster IP address (172.40.2.119), I was able to do an NFS
> mount of the shared LUN from a third machine. I started an infinite ls
> on that LUN.
> To simulate failover, I just powered down node1, hoping to see
> the read I/O stop and then resume via node2. But I see the following
> error messages on node2.
> Mar  9 12:14:49 node2 fenced[7422]: fence "node1" failed
> Mar  9 12:14:54 node2 fenced[7422]: fencing node "node1"
> Mar  9 12:14:54 node2 fenced[7422]: agent "fence_ilo" reports: Can't call method "configure" on an undefined value at /sbin/fence_ilo line 169, <> line 4.
> Mar  9 12:14:54 node2 fenced[7422]: fence "node1" failed
> Mar  9 12:14:59 node2 fenced[7422]: fencing node "node1"
> Mar  9 12:14:59 node2 fenced[7422]: agent "fence_ilo" reports: Can't call method "configure" on an undefined value at /sbin/fence_ilo line 169, <> line 4.
>  
> It seems I am not doing something correctly with respect to fencing.
> Can I set up the cluster without fencing to begin with?

Yes.  You can use manual fencing.  That should only be used for testing 
purposes though... it is not a supported configuration.
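
As a minimal sketch of what a manual-fencing setup might look like in
cluster.conf (for testing only, as noted above): the device name "manual"
and the method name "single" are arbitrary labels chosen here, and the
per-node attribute name (nodename below) is an assumption that may differ
between releases, so check it against the fence_manual man page for your
version.

```xml
<clusternode name="node1" votes="1">
  <fence>
    <method name="single">
      <!-- "manual" refers to the fencedevice defined below;
           the nodename attribute is an assumption, verify for your release -->
      <device name="manual" nodename="node1"/>
    </method>
  </fence>
</clusternode>

<!-- elsewhere in the same cluster.conf -->
<fencedevices>
  <fencedevice agent="fence_manual" name="manual"/>
</fencedevices>
```

With a setup along these lines, fenced waits after a node failure until
the failed node rejoins or an operator confirms the node has been reset
and runs fence_ack_manual -n node1.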


  brassow








