[Linux-cluster] Help with a two node cluster for a web server needed

Holger L. Ratzel holger.ratzel at she.net
Tue Jan 15 14:45:25 UTC 2008


Hi,

On Monday, 14 January 2008 21:03:46, Lon Hohberger wrote:
> So, what was happening was this:
[...]
>
> First, let's ping the router with the cable unplugged to see how long it
> takes for our heuristic to complete when things are "broken".  On my
> machine:
>
> [lhh at ayanami ~]$ time ping -c1 -t1 frederick
> PING frederick (12.1.2.99) 56(84) bytes of data.
>
> From ayanami (12.1.2.37) icmp_seq=1 Destination Host Unreachable
>
> --- frederick ping statistics ---
> 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
>
>
> real    0m3.006s
> ^^^^^^^^^^^^^^^^
> user    0m0.000s
> sys     0m0.000s
>

[root at testcluster-1 cluster]# time ping -c3 -t1 10.200.10.1
PING 10.200.10.1 (10.200.10.1) 56(84) bytes of data.
64 bytes from 10.200.10.1: icmp_seq=1 ttl=64 time=1.41 ms
64 bytes from 10.200.10.1: icmp_seq=2 ttl=64 time=1.29 ms
64 bytes from 10.200.10.1: icmp_seq=3 ttl=64 time=1.32 ms

--- 10.200.10.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 1.290/1.341/1.410/0.065 ms

real    0m2.007s
^^^^^^^^^^^^^^^^
user    0m0.001s
sys     0m0.002s
[root at testcluster-1 cluster]# time ping -c3 -t1 10.200.10.65
PING 10.200.10.65 (10.200.10.65) 56(84) bytes of data.
From 10.200.10.187 icmp_seq=1 Destination Host Unreachable
From 10.200.10.187 icmp_seq=2 Destination Host Unreachable
From 10.200.10.187 icmp_seq=3 Destination Host Unreachable

--- 10.200.10.65 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms
, pipe 3

real    0m3.004s
^^^^^^^^^^^^^^^^
user    0m0.001s
sys     0m0.002s
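
To repeat this kind of measurement without reading the `time` output by
hand, the heuristic command can be wrapped in a small timing helper. This
is only a sketch: the name time_command is made up for illustration and is
not part of qdisk or CMAN. It just runs a command once and reports the exit
code and the wall-clock time.

```python
import subprocess
import time


def time_command(argv, timeout=30.0):
    """Run argv once and return (exit_code, elapsed_seconds).

    Hypothetical helper for illustration; output of the command is discarded.
    """
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,  # give up if the command hangs
    )
    elapsed = time.monotonic() - start
    return proc.returncode, elapsed
```

For example, time_command(["ping", "-c3", "-t1", "10.200.10.65"]) would be
expected to report a non-zero exit code and roughly the 3 seconds seen
above for the unreachable address.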

[...]
>
> Option 1:
>
> Make 1 tko sufficient by making the heuristic do more work.  In my quick
> testing, the same 3 seconds for 1 packet was used for 3 packets.
>
[...]
>
> Option 2:
>
> Make things fit around your heuristic.  Given our 12 second "negative"
> case for our heuristic/tko, we can simply make qdisk time out in >12
> seconds.  Then, we double that and add a bit for CMAN:

First I tried each option on its own, without success. Then I applied
both together, to be on the safe side:

        ...
        <quorumd interval="1" label="Qdisk1" tko="15" votes="1">
                <heuristic interval="1" program="ping 10.200.10.1 -c3 -t1" score="1" tko="1"/>
        </quorumd>
        <totem token="40000"/>
        ...

Then I watched what happened by running 'clustat -li 1' on node 1. After that
I unplugged the cable from node 1. After about 45 seconds node 2 was reported
dead by clustat, but the quorum disk was still reported online. About
10 seconds later both nodes fenced each other...

I've attached parts of '/var/log/messages' from both nodes. Perhaps they
contain helpful information.

Many thanks and best regards,

	Holger

-- 
----------------- SHE - IT-Sicherheit von Experten ------------------
SHE Informationstechnologie AG
Holger L. Ratzel                               Fon:+49 621 5200 - 210 
Service Delivery & Support                     Fax:+49 621 5200 - 555
Donnersbergweg 3                                holger.ratzel at she.net
D-67059 Ludwigshafen                              http://www.she.net/
Registered office and registration court: Ludwigshafen, HRB 4593
Chairman of the Supervisory Board: Ulrich Engelhardt
Executive Board: Klaus Schulz
------------------------ I am root. Fear me! ------------------------

PGP-Fingerprint:
9A 73 40 22 72 64 BE D1  D8 1A 54 3C 5B 64 AF C3  CC E3 CA A8
Get my PGP public key at: http://pgp.she.net/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node1.log
Type: text/x-log
Size: 46707 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080115/13cc73fe/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node2.log
Type: text/x-log
Size: 45890 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080115/13cc73fe/attachment-0001.bin>

