[Linux-cluster] Add option SO_LINGER to dlm sctp socket when the other endpoint is down.

Lars Marowsky-Bree lmb at suse.de
Mon Dec 2 22:04:30 UTC 2013


On 2013-12-02T15:55:12, David Teigland <teigland at redhat.com> wrote:

> > And I still don't see how this could happen - we close the socket once
> > the other node has been fenced or stopped. Short of a false-positive
> > fence, we shouldn't see what you describe, right?
> It's just a sign that your experience doesn't cover the entire range of
> possible dlm usage.

What! You mean our testing is imperfect? Liar! ;-)

Well, that's obviously true. Hence why I closed that paragraph on a
question.

> There are cases where fencing is not relevant, there are cases where
> this happens without failures, just quick leaving and rejoining of a
> lockspace, and my recollection is that userland tests are ones that
> most easily uncovered problems.

Are you referring to libdlm/dlm/tests/usertest?

Which of those would you want us to run for how long and in what
environment?

(That's not just related to this question, of course. We always welcome
feedback on how to do better QA. If you tell us once, we'll keep running
them forever and ever again ;-)

> > because we don't want to talk to the other side any more, hence we might
> > as well discard anything that is still in the queue.
> It's not the direct effects of this that concern me as much as the
> potential secondary effects.

Right. Since I can't reproduce any in practice, I was asking for more
theoretical details (either to investigate whether we can show it's not
possible to hit, or at least to test whether we hit it in practice).
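
For anyone following along, here is a rough userspace sketch of the
socket option being discussed. It assumes a zero linger timeout, which
is what "discard anything that is still in the queue" implies but which
this mail does not state outright. The helper names are made up for
illustration and are not from the dlm sources; the real code lives in
the kernel and goes through a different interface. The option itself is
set at the SOL_SOCKET level, so the same call would apply to an SCTP
socket as well.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from dlm): ask for an abortive close.
 * With l_onoff = 1 and l_linger = 0, close() does not wait for
 * queued data to be delivered; unsent data is discarded.
 * Returns 0 on success, -1 on error (errno set by setsockopt).
 */
int set_abortive_close(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* enable SO_LINGER */
    lg.l_linger = 0;  /* zero timeout: drop pending data on close */

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/*
 * Hypothetical helper: read the option back.
 * Returns 1 if linger is enabled with a zero timeout,
 * 0 if it is not, -1 if getsockopt fails.
 */
int linger_is_zero(int fd)
{
    struct linger lg;
    socklen_t len = sizeof(lg);

    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) != 0)
        return -1;

    return lg.l_onoff != 0 && lg.l_linger == 0;
}
```

The idea would be to call set_abortive_close(fd) right before close(fd)
on a connection to a peer we already know is gone, so the close does
not sit around waiting for data nobody will ever acknowledge.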

> Safest would be to enable it with a config option, but I'd like to avoid
> that if possible.

Yes, definitely. That'd suck.

> Mike Christie also uses dlm sctp and has most recently
> tested it, so I'd be interested to get his feedback.

I'd also be quite curious if Mike can reproduce the problem we hit.


Thanks,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



