[Linux-cluster] Help with a two node cluster for a web server needed
Holger L. Ratzel
holger.ratzel at she.net
Tue Jan 15 14:45:25 UTC 2008
Hi,
On Monday, 14 January 2008 21:03:46, Lon Hohberger wrote:
> So, what was happening was this:
[...]
>
> First, let's ping the router with the cable unplugged to see how long it
> takes for our heuristic to complete when things are "broken". On my
> machine:
>
> [lhh at ayanami ~]$ time ping -c1 -t1 frederick
> PING frederick (12.1.2.99) 56(84) bytes of data.
>
> From ayanami (12.1.2.37) icmp_seq=1 Destination Host Unreachable
>
> --- frederick ping statistics ---
> 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
>
>
> real 0m3.006s
> ^^^^^^^^^^^^^^^^
> user 0m0.000s
> sys 0m0.000s
>
[root at testcluster-1 cluster]# time ping -c3 -t1 10.200.10.1
PING 10.200.10.1 (10.200.10.1) 56(84) bytes of data.
64 bytes from 10.200.10.1: icmp_seq=1 ttl=64 time=1.41 ms
64 bytes from 10.200.10.1: icmp_seq=2 ttl=64 time=1.29 ms
64 bytes from 10.200.10.1: icmp_seq=3 ttl=64 time=1.32 ms
--- 10.200.10.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 1.290/1.341/1.410/0.065 ms
real 0m2.007s
^^^^^^^^^^^^^^^^
user 0m0.001s
sys 0m0.002s
[root at testcluster-1 cluster]# time ping -c3 -t1 10.200.10.65
PING 10.200.10.65 (10.200.10.65) 56(84) bytes of data.
From 10.200.10.187 icmp_seq=1 Destination Host Unreachable
From 10.200.10.187 icmp_seq=2 Destination Host Unreachable
From 10.200.10.187 icmp_seq=3 Destination Host Unreachable
--- 10.200.10.65 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms
, pipe 3
real 0m3.004s
^^^^^^^^^^^^^^^^
user 0m0.001s
sys 0m0.002s
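
The same wall-clock measurement can be scripted so that the "good" and "broken" cases of a heuristic are compared in a repeatable way. The following is only a minimal sketch: the helper name `time_command` and its `timeout_s` parameter are hypothetical and not part of any cluster tool.

```python
import subprocess
import time


def time_command(argv, timeout_s=30.0):
    """Run argv and return (exit_code, elapsed_wall_clock_seconds).

    Hypothetical helper for illustration: it mirrors what `time <cmd>`
    reports as "real", and discards the command's output.
    """
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout_s,
    )
    elapsed = time.monotonic() - start
    return proc.returncode, elapsed
```

For example, `time_command(["ping", "-c3", "-t1", "10.200.10.65"])` would be expected to return a non-zero exit code and roughly the 3 seconds shown in the transcript above when the target is unreachable.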
[...]
>
> Option 1:
>
> Make 1 tko sufficient by making the heuristic do more work. In my quick
> testing, the same 3 seconds for 1 packet was used for 3 packets.
>
[...]
>
> Option 2:
>
> Make things fit around your heuristic. Given our 12 second "negative"
> case for our heuristic/tko, we can simply make qdisk time out in >12
> seconds. Then, we double that and add a bit for CMAN:
First I tried each option on its own, without success. Then I tried
combining both, to be on the safe side:
...
<quorumd interval="1" label="Qdisk1" tko="15" votes="1">
<heuristic interval="1" program="ping 10.200.10.1 -c3 -t1"
score="1" tko="1"/>
</quorumd>
<totem token="40000"/>
...
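
The timing relationship behind this fragment can be checked with a few lines of arithmetic. The sketch below assumes that the qdisk timeout is `interval * tko` seconds and applies the rule of thumb quoted above (double the qdisk timeout and add a bit for CMAN). The `<cluster>` wrapper element, the function name `timing_summary`, and the 5-second slack are assumptions made purely for illustration; the thread does not say how large "a bit" should be.

```python
import xml.etree.ElementTree as ET

# The fragment from this message, wrapped in a <cluster> root element
# (added here only so that it parses as one XML document).
FRAGMENT = """
<cluster>
  <quorumd interval="1" label="Qdisk1" tko="15" votes="1">
    <heuristic interval="1" program="ping 10.200.10.1 -c3 -t1"
               score="1" tko="1"/>
  </quorumd>
  <totem token="40000"/>
</cluster>
"""


def timing_summary(xml_text, cman_slack_s=5.0):
    """Compare the totem token with the qdisk timeout.

    cman_slack_s is an assumed value standing in for "a bit for CMAN".
    """
    root = ET.fromstring(xml_text)
    quorumd = root.find("quorumd")
    # Assumption: qdisk declares a node dead after `tko` missed cycles
    # of `interval` seconds each.
    qdisk_timeout_s = float(quorumd.get("interval")) * float(quorumd.get("tko"))
    # The totem token is given in milliseconds.
    token_s = float(root.find("totem").get("token")) / 1000.0
    min_token_s = 2 * qdisk_timeout_s + cman_slack_s
    return {
        "qdisk_timeout_s": qdisk_timeout_s,
        "token_s": token_s,
        "min_token_s": min_token_s,
        "token_ok": token_s >= min_token_s,
    }


if __name__ == "__main__":
    print(timing_summary(FRAGMENT))
```

With the values shown, this gives a qdisk timeout of 15 seconds and a token of 40 seconds, so the token is above twice the qdisk timeout under these assumptions.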
Then I watched what happened by running 'clustat -li 1' on node 1. After that I
unplugged the cable from node 1. After about 45 seconds node 2 was reported
dead by clustat, but the quorum disk was still reported online. About
10 seconds later both nodes fenced each other...
I've attached parts of '/var/log/messages' from both nodes. Perhaps they
contain helpful information.
Many thanks and best regards,
Holger
--
----------------- SHE - IT-Sicherheit von Experten ------------------
SHE Informationstechnologie AG
Holger L. Ratzel Fon:+49 621 5200 - 210
Service Delivery & Support Fax:+49 621 5200 - 555
Donnersbergweg 3 holger.ratzel at she.net
D-67059 Ludwigshafen http://www.she.net/
Registered office and court of registration: Ludwigshafen, HRB 4593
Chairman of the Supervisory Board: Ulrich Engelhardt
Executive Board: Klaus Schulz
------------------------ I am root. Fear me! ------------------------
PGP-Fingerprint:
9A 73 40 22 72 64 BE D1 D8 1A 54 3C 5B 64 AF C3 CC E3 CA A8
Get my PGP public key at: http://pgp.she.net/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node1.log
Type: text/x-log
Size: 46707 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080115/13cc73fe/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node2.log
Type: text/x-log
Size: 45890 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080115/13cc73fe/attachment-0001.bin>