[Linux-cluster] Basic Active Active File Server

Tue Jun 22 15:47:18 UTC 2010

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Jason Fitzpatrick
> Sent: Tuesday, June 22, 2010 4:40 AM
> To: Kit Gerrits
> Cc: linux clustering
> Subject: Re: [Linux-cluster] Basic Active Active File Server
> 
> Hi Kit..
> 
> Awesome and all as I am when it comes to computers, networking is a
> serious weak point ;0)
> 
> How would I go about checking if multicast traffic is making it
> between the two nodes, ...

When I suspect multicast problems I like to do a sanity check with
multicast ping.

First, on all cluster hosts make sure replies to broadcast and multicast
pings are enabled (a value of 0 means such pings are not ignored):

    echo 0 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
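
If you would rather look before you write, here is a rough sketch of a
check. The function name mcast_ping_enabled and its optional path
argument are made up for illustration (the argument only exists so the
function can be tried against a scratch file); the /proc path is the one
from the command above.

```shell
# Report whether this host will answer broadcast/multicast pings.
# $1 (optional, hypothetical): file to read instead of the real /proc entry.
mcast_ping_enabled() {
    f="${1:-/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts}"
    if [ "$(cat "$f")" = "0" ]; then
        echo "enabled"      # 0: broadcast/multicast echo requests are answered
    else
        echo "disabled"     # non-zero: the kernel ignores them
    fi
}
```

Run it with no argument on each node. Keep in mind that a value written
through /proc lasts only until the next reboot.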

Second, find your multicast address on a cluster node.  This is shown by
"cman_tool status" or "ip maddr":

    # ip maddr
    1:      lo
            inet  224.0.0.1
    2:      eth0
            link  01:00:5e:00:00:01
            inet  224.0.0.1
    3:      eth1
            link  01:00:5e:40:f9:ce
            link  01:00:5e:00:00:01
            inet  239.192.249.206    <---- my multicast address
            inet  224.0.0.1
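
If you have many nodes to check, picking the address out by eye gets
tedious. A minimal sketch of a filter, assuming the "ip maddr" layout
shown above; the function name mcast_groups is my own invention. It
skips 224.0.0.1, the all-hosts group that every interface joins, so what
is left on the cluster interface should be the cluster's address.

```shell
# Print the IPv4 multicast groups joined on one interface, skipping the
# all-hosts group 224.0.0.1.
# $1: interface name; reads "ip maddr" style output on stdin.
mcast_groups() {
    awk -v ifname="$1" '
        $1 ~ /^[0-9]+:$/ { current = $2; next }   # header line, e.g. "3:  eth1"
        $1 == "inet" && current == ifname && $2 != "224.0.0.1" { print $2 }
    '
}
```

Usage would be something like "ip maddr | mcast_groups eth1". If the
interface has joined other groups for unrelated services they will be
listed too, so treat the result as a hint and confirm it against
"cman_tool status".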

Noting the address and interface name above, try a multicast ping with a
count of at least 2:

    # ping -I eth1 -b -L 239.192.249.206 -c 2
    PING 239.192.249.206 (239.192.249.206) from 10.65.3.166 eth1: 56(84) bytes of data.
    64 bytes from 10.65.3.102: icmp_seq=1 ttl=64 time=0.394 ms
    64 bytes from 10.65.3.86: icmp_seq=1 ttl=64 time=0.415 ms (DUP!)
    64 bytes from 10.65.3.182: icmp_seq=1 ttl=64 time=0.418 ms (DUP!)
    64 bytes from 10.65.3.134: icmp_seq=1 ttl=64 time=0.420 ms (DUP!)
    64 bytes from 10.65.3.87: icmp_seq=1 ttl=64 time=0.971 ms (DUP!)
    64 bytes from 10.65.3.183: icmp_seq=1 ttl=64 time=0.985 ms (DUP!)
    64 bytes from 10.65.3.103: icmp_seq=1 ttl=64 time=0.987 ms (DUP!)
    64 bytes from 10.65.3.167: icmp_seq=1 ttl=64 time=0.990 ms (DUP!)
    64 bytes from 10.65.3.135: icmp_seq=1 ttl=64 time=0.992 ms (DUP!)
    64 bytes from 10.65.3.134: icmp_seq=2 ttl=64 time=0.486 ms

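9 distinct hosts answered the first ping on a cluster of size 10; the
tenth is the sending node itself (10.65.3.166), which does not show up in
its own results.  Looks good.  Repeat this test on each cluster member.
You should see a consistent number of replies.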
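
Counting the replies by hand is error-prone once the DUP! lines pile up.
A small sketch that counts distinct responders, assuming the ping reply
format shown above ("64 bytes from ADDRESS: ..."); the name
count_responders is hypothetical.

```shell
# Count the distinct hosts that answered, reading ping output on stdin.
count_responders() {
    awk '
        /bytes from/ {
            sub(":", "", $4)     # field 4 is "ADDRESS:", drop the colon
            seen[$4] = 1         # remember each responder once
        }
        END {
            n = 0
            for (host in seen) n++
            print n
        }
    '
}
```

For example, "ping -I eth1 -b -L 239.192.249.206 -c 2 | count_responders"
run on each node in turn should print the same number everywhere.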

-Jeff