[Linux-cluster] fence "node1" failed if eth0 down

Mon Feb 16 08:29:48 UTC 2009

Hello All,

I have a cluster with two nodes running one service (mysql). Both nodes
use an iSCSI disk with GFS on it.
I haven't configured fencing at all.

I have tested different failure scenarios, and these are my results:

If I halt node1, the service relocates to node2 - OK
If I kill the process on node1, the service relocates to node2 - OK

but

if I unplug the network cable or run ifdown eth0 on node1, the whole
cluster fails. The service doesn't relocate.
On node2 I get these messages:

Feb 15 13:29:34 localhost fenced[3405]: fencing node "192.168.1.188"
Feb 15 13:29:34 localhost fenced[3405]: fence "192.168.1.188" failed
Feb 15 13:29:39 localhost fenced[3405]: fencing node "192.168.1.188"
Feb 15 13:29:39 localhost fenced[3405]: fence "192.168.1.188" failed

again and again. node2 never runs the service, and if I try to reboot
node1 the machine hangs while stopping the services.

In this situation all I can do is switch off the power on node1 and
reboot node2. This is not acceptable at all.

I think the problem is just with fencing, but I don't know how to apply it
to this situation (I have read the manual on the Red Hat site, but I haven't
seen how to apply it. :-( )

This is my cluster.conf file:

<cluster alias="MICLUSTER" config_version="62" name="MICLUSTER">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="node1" nodeid="1" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="node2" nodeid="2" votes="1">
                        <fence/>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices/>
        <rm>
                <failoverdomains>
                        <failoverdomain name="DOMINIOFAIL" nofailback="0" ordered="0" restricted="1">
                                <failoverdomainnode name="node1" priority="1"/>
                                <failoverdomainnode name="node2" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <service domain="DOMINIOFAIL" exclusive="0" name="BBDD" revovery="restart">
                        <mysql config_file="/etc/my.cnf" listen_address="" mysql_options="" name="mydb" shutdown_wait="3"/>
                        <ip address="192.168.1.183" monitor_link="1"/>
                </service>
        </rm>
</cluster>
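
For reference, my understanding from the documentation is that each node needs a fence method that points at an entry in fencedevices, instead of the empty <fence/> and <fencedevices/> elements above. The fragment below is only a rough sketch of that shape, not something I have running: the fence_ipmilan agent, the device names and every address and credential are placeholders I made up for illustration, and I do not know yet which fence agent would fit my hardware.

```xml
<!-- Sketch only: device names, agent choice, addresses and credentials are placeholders -->
<clusternodes>
        <clusternode name="node1" nodeid="1" votes="1">
                <fence>
                        <method name="1">
                                <!-- must match a fencedevice name below -->
                                <device name="fence_node1"/>
                        </method>
                </fence>
        </clusternode>
        <clusternode name="node2" nodeid="2" votes="1">
                <fence>
                        <method name="1">
                                <device name="fence_node2"/>
                        </method>
                </fence>
        </clusternode>
</clusternodes>
<fencedevices>
        <!-- assumes each node has a management controller reachable on its own address -->
        <fencedevice agent="fence_ipmilan" name="fence_node1" ipaddr="PLACEHOLDER" login="PLACEHOLDER" passwd="PLACEHOLDER"/>
        <fencedevice agent="fence_ipmilan" name="fence_node2" ipaddr="PLACEHOLDER" login="PLACEHOLDER" passwd="PLACEHOLDER"/>
</fencedevices>
```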

Any ideas? Any references?

Thanks in advance

Greetings

ESG