[Linux-cluster] Deadlock detection in libdlm
David Teigland
teigland at redhat.com
Tue Jan 25 22:19:48 UTC 2011
On Tue, Jan 25, 2011 at 08:01:00PM +0000, Steve Little wrote:
> I've been trying to make use of deadlock detection in libdlm, but
> without any luck so far. I'm hoping someone can tell me what I'm doing
> wrong, or how to debug this further.
The dlm detects *conversion* deadlocks on a single resource and returns
EDEADLK for them.
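To make "conversion deadlock on a single resource" concrete, here is a minimal Python sketch. It is not dlm code: the two-mode compatibility table and the names `COMPATIBLE`, `blocked_by` and `is_conversion_deadlock` are assumptions made up for illustration (the real dlm has six lock modes and its own queueing rules). It models two holders of PR on the same resource that both ask to convert to EX.

```python
# Simplified model, NOT the dlm implementation. Only PR (protected read)
# and EX (exclusive) are modelled; keys are (granted_mode, requested_mode).
COMPATIBLE = {
    ("PR", "PR"): True,
    ("PR", "EX"): False,
    ("EX", "PR"): False,
    ("EX", "EX"): False,
}


def blocked_by(requests, holder):
    """Return the other holders whose granted mode blocks holder's request."""
    wanted = requests[holder][1]
    return {
        other
        for other, (granted, _) in requests.items()
        if other != holder and not COMPATIBLE[(granted, wanted)]
    }


def is_conversion_deadlock(requests):
    """requests maps holder -> (granted_mode, requested_mode) on ONE resource.

    Returns True when every converting holder is blocked by another
    converting holder, so none of the conversions can ever be granted.
    """
    converting = {h for h, (granted, wanted) in requests.items() if granted != wanted}
    if not converting:
        return False
    return all(blocked_by(requests, h) & converting for h in converting)


# Both processes hold PR and both want EX: each waits on the other's PR.
both_convert = {"p1": ("PR", "EX"), "p2": ("PR", "EX")}
# Only p1 converts; p2 can simply unlock, so this is a wait, not a deadlock.
one_converts = {"p1": ("PR", "EX"), "p2": ("PR", "PR")}
```

In this model `is_conversion_deadlock(both_convert)` is true and `is_conversion_deadlock(one_converts)` is false. Everything needed to decide that lives on one resource, which is why this case is cheap to detect.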
> This should cause a classic deadlock: process 1 is waiting on resource
> A, which is locked by process 2. Process 2 is waiting on resource B,
> which is locked by process 1.
A "classic" multi-resource deadlock is not detected.
Google came up with this nice description of the difference:
http://books.google.com/books?id=ydKIsgCiFVsC&pg=PA143&lpg=PA143&dq=conversion+deadlock&source=bl&ots=LSJEUQU3HI&sig=eo4UhF9sR474OvQ1Nbeid2iHTOI&hl=en&ei=6kk_TZOeNImycdLyidEB&sa=X&oi=book_result&ct=result&resnum=6&ved=0CDkQ6AEwBTgK#v=onepage&q=conversion%20deadlock&f=false
The dlm also supports lock timeouts, which could be used to approximate
deadlock detection/resolution.
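As a rough illustration of the timeout idea, the following Python sketch uses `threading.Lock` as a stand-in for dlm locks. Nothing here calls libdlm, and `worker` and `run_demo` are hypothetical names. Two workers take resources A and B in opposite order, which is the scenario described above. Instead of hanging, each second request gives up after a timeout and the caller can treat that as a suspected deadlock.

```python
import threading


def worker(name, first, second, timeout, barrier, results):
    """Hold `first`, then request `second`, giving up after `timeout` seconds."""
    with first:
        barrier.wait()  # make sure both workers hold their first lock
        if second.acquire(timeout=timeout):
            second.release()
            results[name] = "granted"
        else:
            # A real application would release what it holds and retry.
            results[name] = "timed out"
        barrier.wait()  # keep `first` held until both attempts have finished


def run_demo(timeout=0.2):
    """Run the two-worker scenario and return each worker's outcome."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=worker, args=("p1", lock_b, lock_a, timeout, barrier, results)),
        threading.Thread(target=worker, args=("p2", lock_a, lock_b, timeout, barrier, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both workers end up with "timed out" here. This is only an approximation: a timeout also fires when the holder is merely slow, so it cannot tell a real deadlock from a long wait.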
I wrote a "toy" proof of concept for full deadlock detection once. The
code still exists in dlm_controld, but I'm not sure whether the necessary
flags still exist in the API to enable and play with it (that's about all
it's good for).
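For readers wondering what "full" detection involves, one textbook approach is to build a wait-for graph and look for a cycle. The Python sketch below shows that idea only. It is not the dlm_controld code, and `build_wait_graph` and `find_cycle` are names invented for this example. It also assumes one holder per resource and a complete, consistent view of every holder and waiter, which is the hard part to obtain in a cluster.

```python
def build_wait_graph(holders, waiting):
    """holders: resource -> process holding it (one holder assumed).
    waiting: process -> resource it is blocked on.
    Returns process -> set of processes it waits for."""
    return {proc: {holders[res]} for proc, res in waiting.items() if res in holders}


def find_cycle(waits_for):
    """Depth-first search; return a list of processes forming a cycle, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in sorted(waits_for.get(node, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(waits_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# The scenario from the original question: p1 waits on A (held by p2),
# p2 waits on B (held by p1).
graph = build_wait_graph({"A": "p2", "B": "p1"}, {"p1": "A", "p2": "B"})
deadlocked = find_cycle(graph)
```

With this input `deadlocked` contains both p1 and p2. Unlike the conversion case, the edges come from different resources, so no single resource has enough information to see the cycle.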
> EDEADLOCK The lock operation is causing a deadlock and has been
> cancelled. If this was a conversion then the lock is
> reverted to its previously granted state. If it was a
> new lock then it has not been granted. (NB Only
> conversion deadlocks are currently detected)"
It does note the limitation.
Dave