[Linux-cluster] Manual Fencing problem

Thu Nov 13 15:25:03 UTC 2008

Ok, I think that works, but now I have another problem.

In the messages log on node0 I can see the following message:

fence_manual: Node node1 needs to be reset before recovery can procede.
Waiting for node1 to rejoin the cluster or for manual acknowledgement that
it has been reset (i.e. fence_ack_manual -n node1)
So I ran fence_ack_manual -n node1

and received this message:

fence_ack_manual -n node1
Warning:  If the node "node1" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable!  Please verify that the node shown above has
been reset or disconnected from storage.
Are you certain you want to continue? [yN] y
can't open /tmp/fence_manual.fifo: No such file or directory

Thank you for your help

Best Regards

Mauro Casiraghi

On Thu, Nov 13, 2008 at 3:58 PM, John Ruemker <jruemker at redhat.com> wrote:

> Try adding the nodename attribute to each device as seen here:
>
> Mauro Casiraghi wrote:
>
>> I have two cluster nodes with the following configuration.
>>  For each node I have set up manual fencing.
>>  <?xml version="1.0"?>
>> <cluster alias="rhcs" config_version="13" name="mauro">
>>        <fence_daemon clean_start="0" post_fail_delay="0"
>> post_join_delay="3"/>
>>        <clusternodes>
>>                <clusternode name="node0" nodeid="1" votes="1">
>>                        <fence>
>>                                <method name="1">
>>
>                                        <device name="Manual-0" nodename="node0"/>
>
>>                                </method>
>>                        </fence>
>>                </clusternode>
>>                <clusternode name="node1" nodeid="2" votes="1">
>>                        <fence>
>>                                <method name="1">
>>
>                                        <device name="Manual-1" nodename="node1"/>
>
>>                                </method>
>>                        </fence>
>>                </clusternode>
>>        </clusternodes>
>>        <cman expected_votes="1" two_node="1"/>
>>        <fencedevices>
>>                <fencedevice agent="fence_manual" name="Manual-0" />
>>                <fencedevice agent="fence_manual" name="Manual-1" />
>>        </fencedevices>
>>        <rm>
>>                <failoverdomains>
>>                        <failoverdomain name="rhcs-domain" ordered="0"
>> restricted="1">
>>                                <failoverdomainnode name="node0"
>> priority="1"/>
>>                                <failoverdomainnode name="node1"
>> priority="1"/>
>>                        </failoverdomain>
>>                </failoverdomains>
>>                <resources>
>>                        <ip address="xx.xxx.xx.78" monitor_link="1"/>
>>                </resources>
>>                <service autostart="1" domain="rhcs-domain" exclusive="0"
>> name="rhcs-web" recovery="relocate">
>>                        <ip ref="xx.xxx.xx.78"/>
>>                </service>
>>        </rm>
>> </cluster>
>>  In the messages log on node0 I received this message:
>>  Nov 13 12:06:34 lxxxxxxx fenced[2002]: fencing node "node1"
>> Nov 13 12:06:34 lxxxxxxx fenced[2002]: agent "fence_manual" reports:
>> failed: fence_manual no node name
>>  How can I fix this problem?
>>
>>
>
> -John
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
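
For readability, here is a minimal sketch of just the fence sections with the nodename attribute that John suggested, free of the quoting and line wrapping above. It assumes the rest of cluster.conf stays exactly as posted; the node and device names are the ones from the thread. When editing the real file, config_version in the <cluster> element would normally also need to be increased so the change is picked up.

```xml
<!-- Sketch only: fence sections of cluster.conf with nodename added.
     Everything outside <clusternodes> is assumed unchanged. -->
<clusternodes>
        <clusternode name="node0" nodeid="1" votes="1">
                <fence>
                        <method name="1">
                                <!-- nodename tells fence_manual which node to fence -->
                                <device name="Manual-0" nodename="node0"/>
                        </method>
                </fence>
        </clusternode>
        <clusternode name="node1" nodeid="2" votes="1">
                <fence>
                        <method name="1">
                                <device name="Manual-1" nodename="node1"/>
                        </method>
                </fence>
        </clusternode>
</clusternodes>
```

With this in place the earlier "fence_manual no node name" failure should no longer appear, since each device entry now passes a node name to the agent.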