[Linux-cluster] Add option SO_LINGER to dlm sctp socket when the other endpoint is down.

David Teigland teigland at redhat.com
Mon Dec 2 20:55:12 UTC 2013


On Thu, Nov 28, 2013 at 05:50:09PM +0100, Lars Marowsky-Bree wrote:
> > With the patch, how much more likely would it be for data from a previous
> > connection to interfere with a new connection?  (I had this problem some
> > years ago, and added some safeguards to deal with it, but I don't think
> > they are perfect.  There are cases where a very short time separates
> > connections being closed and new connections being created.)
> 
> We've not seen this during testing. We now have positive confirmation
> not just from our tests but also customers testing this on multiple
> nodes.
> 
> And I still don't see how this could happen - we close the socket once
> the other node has been fenced or stopped. Short of a false-positive
> fence, we shouldn't see what you describe, right?

It's just a sign that your experience doesn't cover the entire range of
possible dlm usage.  There are cases where fencing is not relevant, and
cases where this happens without any failure, just from quickly leaving
and rejoining a lockspace.  My recollection is that userland tests were
the ones that most easily uncovered problems.

> Setting SO_LINGER just before close really doesn't make a big
> difference, since we'd always want to set it. We close the connection
> because we don't want to talk to the other side any more, hence we might
> as well discard anything that is still in the queue.

It's not the direct effects of this that concern me as much as the
potential secondary effects.
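
For reference, the behavior under discussion (set SO_LINGER so that close
discards whatever is still queued) can be sketched in userspace C.  This is
only an illustration under stated assumptions: dlm runs in the kernel and the
patch in [1] may set the option differently, the helper name
set_linger_discard() is hypothetical, and a TCP socket stands in for the
sctp socket so the sketch runs without SCTP support.  With l_onoff = 1 and
l_linger = 0, close() aborts the connection instead of attempting to deliver
pending data.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the dlm sources): enable SO_LINGER with a
 * zero timeout on fd.  After this, close(fd) drops any unsent data and
 * aborts the connection rather than shutting it down gracefully.
 * Returns 0 on success, -1 on error (errno set by setsockopt).
 */
static int set_linger_discard(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* turn lingering on */
    lg.l_linger = 0;  /* zero timeout: discard queued data on close */

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/*
 * Hypothetical helper: set the option immediately before closing, which is
 * the ordering described in the quoted paragraph above.
 */
static int close_discarding_queue(int fd)
{
    if (set_linger_discard(fd) < 0)
        return -1;
    return close(fd);
}
```

A caller would use close_discarding_queue(fd) in place of a plain close(fd)
once it has decided it no longer wants to talk to the peer.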

> I'm quite interested in driving this discussion forward. Anything more
> we can provide?

Safest would be to enable it with a config option, but I'd like to avoid
that if possible.  Mike Christie also uses dlm sctp and has most recently
tested it, so I'd be interested to get his feedback.

Mike, do you see any potential problems with this patch [1], have any
suggestions, or the ability to try it?
Thanks,
Dave

[1] https://www.redhat.com/archives/linux-cluster/2013-November/msg00032.html



