[Linux-cluster] Halt nodes in cluster with cable disconnect

emmanuel segura emi2fast at gmail.com
Fri Jan 27 18:51:20 UTC 2012


That behaviour is expected: the node that has the sleep always gets fenced,
because when it tries to use the fence device of the other node, its fence
call has to wait out the sleep first. Sorry for my bad English :-)

If you use a quorum disk, you can avoid this problem by setting
master_wins="1" in the quorumd tag of your cluster.conf. If you want more
information about the cluster parameters, see:

man qdisk ; man fence ; man cluster.conf
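
As a rough sketch of where master_wins goes, a quorumd entry in
/etc/cluster/cluster.conf could look something like the fragment below. The
label, interval, tko and votes values are placeholders I made up for
illustration, not values from this cluster, so check man qdisk for what fits
your setup. As far as I remember from man qdisk, master_wins only takes effect
when no heuristics are configured, which is why none are listed here.

```xml
<!-- Fragment of /etc/cluster/cluster.conf (sketch, not a tested config). -->
<!-- label, interval, tko and votes are hypothetical placeholder values. -->
<!-- master_wins="1" is the setting discussed above; no <heuristic> children. -->
<quorumd label="example_qdisk" interval="1" tko="10" votes="1" master_wins="1"/>
```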

I recommend using a qdisk if you are using a SAN or iSCSI, but if you are
using just DRBD, remember that DRBD has its own internal fencing.

I have experience with DRBD, and I think it works better with heartbeat+pacemaker.

Remember that every Red Hat cluster version has its own, different problems.


http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/index.html

2012/1/27 Miguel Angel Guerrero <kortux at gmail.com>

> Hi Digimer and Emmanuel
>
> I was trying some tests with my cluster configuration and, in short:
>
> 1. I think something's wrong with my configuration, because when a
> real disconnection (i.e. unplugging the cable) happens on the node which
> does not have the sleep in the script (node A), the other node (node
> B) is always stonith'ed, when obviously the node which should reboot
> is node A. This is important to me because I want to know how the
> cluster should behave when a switch port or NIC failure
> occurs.
>
> 2. @Emmanuel, could you point me to Red Hat's documentation about
> this? I tried your solution like this:
>
> <fence_daemon clean_start="0" post_fail_delay="10" post_join_delay="30"/>
>
> But it still failed. Is there another way?
>
> 3. Another solution in this thread is to add a quorum disk to the
> cluster. I started setting this up by following this guide:
>
> http://www.skau.dk/index.php?option=com_content&view=article&id=34:rhcs-cluster-using-iscsi&catid=4:cases-to-explain&Itemid=3
>
> But I need to replicate the data using only two nodes, and it seems
> that this solution requires three. Could somebody tell me whether I'm doing
> it right? Does this cause conflicts when using DRBD?
>
> On Wed, Jan 25, 2012 at 5:02 PM, Digimer <linux at alteeve.com> wrote:
> > On 01/25/2012 05:00 PM, Miguel Angel Guerrero wrote:
> >> The obliterate-peer.sh was restored, but when I disconnect the cable
> >> or simulate this with ifdown, the same node always reboots,
> >> in this case the node without the sleep in the obliterate-peer.sh
> >> script. Is this normal?
> >
> > Yup, this is expected. When the link breaks, the one with the sleep will
> > delay long enough that it will be dead before it finishes sleeping.
> > However, if the node without the sleep dies, the one with the sleep will
> > still succeed and the cluster will recover but with a short delay.
> >
> > --
> > Digimer
> > E-Mail:              digimer at alteeve.com
> > Papers and Projects: https://alteeve.com
>
>
>
> --
> Regards,
> ------------------------------------
> Miguel Angel Guerrero
> Registered GNU/Linux User #353531
> ------------------------------------
>



-- 
this is my life, and I live it for as long as God wills