[Linux-cluster] people having trouble with the default multicast address 239.192.x.x

Tue Aug 28 14:17:18 UTC 2007

Hi Steve,

Thanks for offering to look. Here's the setup:

VLAN ID 99 has 172.31.99.0/24 routed into it
VLAN ID 100 has 172.31.100.0/24 routed into it

I've got two physical nodes:

aphost0 (172.31.99.50 on its eth0 in VLAN ID 99)
aphost1 (172.31.100.50 on its eth0 in VLAN ID 100)
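
Because the two nodes sit in different subnets, cluster multicast between them has to be routed between the VLANs rather than simply switched (assuming the VLANs are separate layer 2 domains, as described). A minimal Python sketch of that addressing check, using only the standard library; the helper names vlan_of and same_subnet are made up for illustration:

```python
import ipaddress

# Subnets and host addresses exactly as listed in this message.
VLANS = {
    99: ipaddress.ip_network("172.31.99.0/24"),
    100: ipaddress.ip_network("172.31.100.0/24"),
}
HOSTS = {
    "aphost0": ipaddress.ip_address("172.31.99.50"),
    "aphost1": ipaddress.ip_address("172.31.100.50"),
}


def vlan_of(addr):
    """Return the VLAN ID whose subnet contains addr, or None if no match."""
    for vlan_id, net in VLANS.items():
        if addr in net:
            return vlan_id
    return None


def same_subnet(a, b):
    """True only if both addresses fall inside the same listed subnet."""
    va, vb = vlan_of(a), vlan_of(b)
    return va is not None and va == vb


if __name__ == "__main__":
    for name, addr in HOSTS.items():
        print(name, addr, "VLAN", vlan_of(addr))
    print("same subnet:", same_subnet(HOSTS["aphost0"], HOSTS["aphost1"]))
```

If the nodes are in different subnets, the multicast TTL used by the cluster has to be greater than 1 and the layer 3 switches have to forward the group between the VLANs.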

Each of these servers is a dom0 host for Xen domU guests, which use the
dom0's bridged eth1. The domU cluster has the same problem as the dom0
cluster, but let's ignore that for the time being.

Here's what I'm certain of regarding the network equipment. Both hosts
are patched into the same Catalyst 2960G switch: aphost0 into a port
assigned to VLAN ID 99 only, and aphost1 into a port assigned to VLAN ID
100 only.

From here I'm less certain, but I think the following is true. The
Catalyst 2960G is linked, using Rapid Spanning Tree Protocol, to two
layer 3 switches: a Catalyst 6506 and a Catalyst 3750G-12S. I'm also
fairly sure that access control lists are applied on these switches.
However, I had one of the network admins do a quick check on the routers,
and he said IGMP should be working and that the multicast IP range I've
been assigned internally (239.224.72.0/24) has not been used within the
university for a long time.
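
For reference, the assigned range can be compared against the administratively scoped multicast blocks from RFC 2365. This is only a sketch with Python's standard ipaddress module; the describe helper is a hypothetical name, and 239.192.0.0/16 is taken here as a stand-in for the "239.192.x.x" default mentioned in the subject line:

```python
import ipaddress

ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")    # RFC 2365 administratively scoped
ORG_LOCAL = ipaddress.ip_network("239.192.0.0/14")    # RFC 2365 organization-local scope
ASSIGNED = ipaddress.ip_network("239.224.72.0/24")    # range assigned in this message
DEFAULT = ipaddress.ip_network("239.192.0.0/16")      # assumption: "239.192.x.x" default


def describe(net):
    """Summarise which multicast scopes a network falls into."""
    return {
        "multicast": net.is_multicast,
        "admin_scoped": net.subnet_of(ADMIN_SCOPED),
        "org_local": net.subnet_of(ORG_LOCAL),
    }


if __name__ == "__main__":
    print("assigned:", describe(ASSIGNED))
    print("default: ", describe(DEFAULT))
```

Both ranges are administratively scoped, but only the default falls inside the organization-local block, so any ACL or boundary configured specifically for 239.192.0.0/14 would not necessarily treat the assigned range the same way.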

Indeed there are some video streams running around the campus using
neighbouring ranges of multicast IPs and they seem to work well enough
on the same network infrastructure.

I've tried disabling iptables on the hosts and still see the same
results, so it's not the host firewalls.
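
One way to test multicast between the two VLANs independently of the cluster software is a small send/receive probe. The following is a minimal sketch, not a definitive tool: the group address 239.224.72.1 (one address picked from the assigned range), the UDP port 5405, the TTL of 8 and the script name mcast_probe.py are all assumptions chosen for illustration.

```python
import socket
import struct
import sys

GROUP = "239.224.72.1"  # assumption: one address from the assigned /24
PORT = 5405             # assumption: arbitrary UDP port for the test
TTL = 8                 # must be greater than 1 to cross a router


def build_mreq(group, iface_ip="0.0.0.0"):
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))


def make_receiver(group=GROUP, port=PORT, iface_ip="0.0.0.0"):
    """Bind a UDP socket and join the multicast group (triggers an IGMP join)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_ip))
    return sock


def make_sender(ttl=TTL):
    """Create a UDP socket whose multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def main(argv):
    if len(argv) == 2 and argv[1] == "recv":
        sock = make_receiver()
        while True:
            data, peer = sock.recvfrom(1500)
            print("from", peer[0], data.decode(errors="replace"))
    elif len(argv) == 2 and argv[1] == "send":
        make_sender().sendto(b"multicast test", (GROUP, PORT))
    else:
        print("usage: mcast_probe.py send|recv")


if __name__ == "__main__":
    main(sys.argv)
```

Run it with "recv" on one node and "send" on the other, then swap roles. If packets arrive between two hosts in the same VLAN but not between aphost0 and aphost1, that would point at multicast routing or ACLs on the layer 3 switches rather than at the hosts.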

Rebooting the nodes is very slow, because they sit for four or five
minutes waiting at "Starting fencing ..." before carrying on.

I'm going to send you the messages file off-list; it's just a bit too
big for the list, especially with all the spurious fencing and multipath
errors that are going into it.

Regards,

Nik Lam