[Linux-cluster] A better understanding of multicast issues
linux at alteeve.com
Sat Feb 12 16:21:44 UTC 2011
On 02/12/2011 05:51 AM, Kit Gerrits wrote:
> Did you ever get a reply from anyone?
> If what you say is true, failure of one of our HSRP(HA) switches/routers
> might break the cluster.
> (if they don't share multicast memberships)
> I would guess that multicast groups originate in the cluster, not the
> In that case, if the switch has been rebooted, the cluster needs to
> re-create the multicast groups on the switch.
> I would guess that the cluster itself needs to check if the switch is
> properly handling multicast.
> (subscribe to its own group and check if the packets are being handled
> This should provide an insight into clustering/multicast:
I did not, and thank you for replying.
So the frequent multicast breakdowns, given that it's fairly rare for
switches to reset, probably originate in the periodic checks done by the
switches. I wonder then if corosync, for whatever reason, doesn't or
isn't able to answer the requests (quickly enough). Perhaps the process
takes too much time? Corosync will, by default, declare a ring dead after
More to think about, and I appreciate that link. Thanks. :)
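
For anyone who wants to try the self-check Kit describes above (subscribe
to your own group and see whether the packets come back), here is a rough
Python sketch. It is not how corosync does anything; the function names,
the group 239.192.0.1 and the port 54321 are arbitrary values picked for
illustration. Note that with IP_MULTICAST_LOOP enabled the kernel hands
the packet back locally, so a True result on a single host only shows
that the local join works. To exercise the switch itself, the sender
would have to run on a different node than the receiver.

```python
import socket
import time


def build_mreq(group, iface="0.0.0.0"):
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def multicast_self_test(group="239.192.0.1", port=54321, timeout=2.0):
    """Join `group`, send one probe to it, and report whether it arrived.

    Returns True if the probe is received before `timeout` seconds pass,
    False on timeout or on any socket error (no multicast route, port
    already in use, and so on).
    """
    payload = b"mcast-self-test"
    rx = tx = None
    try:
        # Receiver: bind to the port and join the group. Joining is what
        # makes the kernel send an IGMP membership report to the switch.
        rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        rx.bind(("", port))
        rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                      build_mreq(group))

        # Sender: TTL 1 keeps the probe on the local segment.
        tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        tx.sendto(payload, (group, port))

        # Wait for our own probe, ignoring unrelated traffic on the port.
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            rx.settimeout(remaining)
            data, _addr = rx.recvfrom(1024)
            if data == payload:
                return True
    except OSError:
        # Covers socket.timeout as well as setup and send failures.
        return False
    finally:
        for sock in (rx, tx):
            if sock is not None:
                sock.close()


if __name__ == "__main__":
    print("multicast probe received:", multicast_self_test())
```

Running this in a loop across a switch's periodic membership query
interval, with the sender on another node, would be one way to see
whether delivery stops at some point without the switch being reset.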
E-Mail: digimer at alteeve.com
Node Assassin: http://nodeassassin.org