Multicast group memberships lost if eth0 brought down and up

Deron Meranda deron.meranda at gmail.com
Wed Oct 6 17:54:01 UTC 2004


Tom Mitchell wrote:
> In my opinion tearing down the infrastructure for a network
> link should tear down and forget state.  i.e. the kernel
> should forget about such things.   One of the 'reasons' for 
> cycling an interface down/up is to clear out such state.

Yes, from the network's perspective I agree that the group membership
should be torn down when an interface goes offline.  This includes the
IGMP announcements, etc.  The kernel is doing the correct thing there.

However, from the application's perspective, this causes a problem
because, unlike its peers out on the network, the application is
never informed that the group membership was torn down, nor does the
application have any means to determine that.  Like I said, the socket
descriptor is still a perfectly valid descriptor and packets can be
sent on it just fine (once the interface comes back up).  It's just
that no packets will be received, and the application doesn't know
whether that's because there are no peers, because the peers are being
quiet, or because the kernel silently flushed its group memberships
without telling it.

This is a bad state for an application; one which the application
didn't cause, one which it can't detect, and one which even the
sysadmin can't correct with any fancy /sbin/ip command line sequence
(short of restarting the application).

Either there has to be a way for an application to be informed that
the group membership it registered has been withdrawn (just as its
network peers are informed via IGMP), or the kernel should attempt
to re-establish group memberships automatically when an interface
comes back up.  All the information it needs to do this should
already be recorded in the various socket structures.
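
To make that last point concrete: a membership is fully described by
the group address and the interface index.  Below is a rough user-space
sketch in Python of what re-establishing one would look like.  It is
only an illustration under assumptions: the helper names and the
example group ff02::1234 are made up, and the application still has no
way of knowing when it would need to call this, which is the actual
problem.

```python
import socket
import struct


def build_mreq(group: str, ifindex: int) -> bytes:
    """Pack a struct ipv6_mreq: 16-byte group address + unsigned int ifindex."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def rejoin_group(sock: socket.socket, group: str, ifindex: int) -> None:
    """Drop any stale membership on the socket, then join the group again.

    Hypothetical helper: assumes the caller somehow knows the membership
    was flushed (e.g. after the interface was cycled down/up).
    """
    mreq = build_mreq(group, ifindex)
    try:
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_LEAVE_GROUP, mreq)
    except OSError:
        pass  # the membership may already be gone; that is fine
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
```

Usage would be something like
rejoin_group(sock, "ff02::1234", socket.if_nametoindex("eth0")),
but only blindly (on a timer, say), since nothing tells the application
that the interface was cycled.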


> Is there some sort of timeout or heartbeat in the design that can be used
> by the application to notice that the connection has been torn down.

No.  The application doesn't know whether there are any peers or it's
alone in the world.  And peers can come and go, so it's not the
application's responsibility to double-check the mechanics of the
multicast protocol.


> ...are you tunneling.  If so does the tunnel collapse (as it
> should) and routing can no longer find you.

No tunneling.  This is direct IPv6 over Ethernet.  In particular, I'm
using a "link scoped" multicast address.
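
For reference, by "link scoped" I mean an address under ff02::/16: the
first byte is 0xff (multicast) and the low four bits of the second byte
are the scope, with 2 meaning link-local.  A small Python check, where
the helper name is just something made up for illustration:

```python
import ipaddress


def is_link_scoped_multicast(addr: str) -> bool:
    """Return True if addr is an IPv6 multicast address with link-local scope."""
    packed = ipaddress.IPv6Address(addr).packed
    # Byte 0 is 0xff for multicast; low nibble of byte 1 is the scope field.
    return packed[0] == 0xFF and (packed[1] & 0x0F) == 0x02
```

Because such a group only has meaning on one link, the membership is
necessarily tied to a specific interface, eth0 in my case.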


(P.S. Please also CC: me in responses to make it easier for me to reply
to the correct thread message)
-- 
Deron



