[Linux-cluster] use of quorum disk 4u5 does not prevent fencing after missing too many heartbeats

Ferry Harmusial ferry.harmusial at gmail.com
Sat Dec 1 14:21:18 UTC 2007


Hello All,

When I use a quorum disk on 4u5, it does not prevent a node from being
fenced after it misses too many heartbeats, which is what I expected based on:
http://sources.redhat.com/cluster/faq.html#quorum

I set up the heuristic program (ping) so that both nodes still report
themselves "fit for duty" even with the cluster communication link
disconnected.
I have also set deadnode_timer to a value more than twice the time the
quorum daemon needs to declare a node dead.
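As a rough sketch of the timing arithmetic I am assuming here (values taken
from the cluster.conf below; the variable names are just for illustration):

```shell
# qdiskd declares a node dead after roughly interval * tko seconds
# (values from the quorumd line in cluster.conf below):
QDISK_INTERVAL=1
QDISK_TKO=10
QDISK_TIMEOUT=$((QDISK_INTERVAL * QDISK_TKO))    # 10 seconds

# deadnode_timer in the cman line is set to more than twice that:
DEADNODE_TIMER=61
echo "qdisk timeout ${QDISK_TIMEOUT}s, deadnode_timer ${DEADNODE_TIMER}s"
```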

Any pointers on what I am missing would be much appreciated.

Kind Regards,

Ferry Harmusial

[root@vm2 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
172.16.77.22            vm2.localdomain vm2
172.16.77.21            vm1.localdomain vm1

[root@vm2 ~]# ip addr
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:21:9a:75 brd ff:ff:ff:ff:ff:ff
    inet 172.16.169.128/24 brd 172.16.169.255 scope global eth0
    inet6 fe80::20c:29ff:fe21:9a75/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:21:9a:7f brd ff:ff:ff:ff:ff:ff
    inet 172.16.77.22/24 brd 172.16.77.255 scope global eth1
    inet6 fe80::20c:29ff:fe21:9a7f/64 scope link
       valid_lft forever preferred_lft forever
4: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0


<?xml version="1.0"?>
<cluster alias="cluster1" config_version="63" name="cluster1">
        <quorumd interval="1" label="cluster1_qd" min_score="1" tko="10" votes="3">
                <heuristic interval="2" program="ping 172.16.169.1 -c1 -t1" score="1"/>
        </quorumd>

        <fence_daemon post_fail_delay="0" post_join_delay="20"/>
        <clusternodes>
                <clusternode name="vm1.localdomain" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="vmware" port="/var/lib/vmware/Virtual Machines/Red Hat Linux/Red Hat Linux.vmx"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="vm2.localdomain" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="vmware" port="/var/lib/vmware/Virtual Machines/RHEL4u5-1/Red Hat Linux.vmx"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="3" two_node="0" deadnode_timer="61"/>
        <fencedevices>
                <fencedevice agent="fence_vmware" ipaddr="172.16.77.1" login="root" name="vmware" passwd="xxxxxxx"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="FO_apache" ordered="1" restricted="0">
                                <failoverdomainnode name="vm1.localdomain" priority="2"/>
                                <failoverdomainnode name="vm2.localdomain" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <service autostart="1" name="SVC_apache">
                        <fs device="/dev/sdb" force_fsck="1" force_unmount="1" fsid="44559" fstype="ext2" mountpoint="/var/www/html" name="FS_apache" options="" self_fence="0"/>
                        <ip address="172.16.169.140" monitor_link="1"/>
                        <script file="/etc/init.d/httpd" name="SCRIPT_apache"/>
                </service>
        </rm>
</cluster>
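For completeness, these are the checks I run (commands assumed available on the
RHEL4 cluster nodes) to confirm the quorum disk is registered and contributing
votes before I disconnect the cluster link:

```shell
# list quorum-disk labels visible on shared storage
mkqdisk -L
# compare expected_votes against the votes actually counted
cman_tool status
# the quorum disk should appear alongside the member nodes
clustat
```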