
From: Carlo Wood <carlo alinoe com>
To: "Liu, Bing Wei" <bing wei liu intel com>
Cc: phil-list redhat com, cgl_discussion osdl org
Subject: Re: [PATCH] Robust and Real-time Mutexes in NPTL
Date: Fri, 27 Jun 2003 16:04:50 +0200

On Fri, Jun 27, 2003 at 10:08:44AM +0800, Liu, Bing Wei wrote:
> This is a ---prototype implementation--- that extends NPTL with
> real-time and robust mutex features. The intention is to have an
> approach to locking that will benefit the applications that care
> about responsiveness and robustness of the locks without affecting
> other types of applications. Currently this is work in progress, and
> the implementation is subject to much flux, both in the user or
> kernel code.

Hi Liu,

I have some ideas for detecting possible deadlocks *before* they
actually happen. I don't think I will have time to implement this
myself, so I thought I'd run the idea past you; perhaps you are
interested in writing an implementation.

Basically, the idea is to have a debug mode for the mutexes that
detects possible deadlocks - much like the current
PTHREAD_MUTEX_ERRORCHECK.

Let me explain what I mean using the simplest case first: a simple
mutex that is either locked or not locked. Consider two such locks,
X and Y. A deadlock occurs when thread '1' locks X, thread '2' locks
Y, and then thread '1' waits for mutex Y while thread '2' waits for
mutex X. This deadlock can be detected before it actually happens by
noticing that at some moment (anywhere during execution) thread '1'
locks X and then, while holding X, locks Y - while later (or vice
versa) another thread first locks Y and then, while holding Y, locks X.

So, basically one needs to keep a list of all locks involved and
assign an 'ordering' to them. If thread '1' locks X and then locks Y,
we "define" that X < Y. When later another thread first locks Y and
then X, we notice that it is no longer possible to define an ordering
for the mutexes, and barf.

Let's make this more complex. Suppose we have several simple mutexes:
A, B, C, D, E and F. At some point during execution, thread '1' locks
first C and then D: C < D.
Furthermore we get, for example:

  C < D
  A < E
  B < F
  F < A  --> B < F < A < E
  C < A
  A < D  --> C < A < D
  E < D  --> B < F < A < E < D
  D < B  --> error

Not something you'd detect quickly yourself (assuming it never
happened during testing, but only after the space shuttle was launched
and was about to transmit the message that would have stopped the
nuclear holocaust). The order in which these pairs are locked is not
important. Note that if a thread first locks 'B', then 'F' and then
'A', this leads to two pairs: B<F and F<A. And when thread '1' first
locks A and then B (A<B), while thread '2' locks first B and then C
(B<C), this implies a pair A<C.

All of the above is very easy to implement and hardly a challenge -
but it gets more interesting when we also include read/write locks.

CHANGE OF NOTATION: Read locks are allowed in reverse order. Let's
call a read lock R1, R2 etc., the corresponding write locks W1, W2
etc., and write a comma instead of '<'. Then it is allowed to have

  R1,R2 + R2,R1

(thread 'x' takes read-lock 1 and then, while holding lock 1,
read-lock 2, while at some other moment another thread takes
read-lock 2 and then, while holding lock 2, read-lock 1). It is also
allowed to have

  R1,W2 + R2,R1

and even

  R1,R2 + R2,W1

So we might as well denote this last case, together with the first,
as:

  R1,R2 + R2,R/W1

where R/W1 means: R1 or W1.
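The bookkeeping for plain mutexes described above amounts to
maintaining a directed graph of "locked before" edges and refusing
any edge that would close a cycle. A minimal sketch in C (the names
`before` and `record_order` and the fixed lock count are invented
here, not taken from any existing implementation):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8  /* Enough for the example locks A..F (0..5). */

/* before[x][y] is true when some thread was seen to lock x and then,
   while still holding x, lock y: the ordering "x < y". */
static bool before[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: does a chain from < ... < to already exist? */
static bool reaches(int from, int to)
{
    if (from == to)
        return true;
    for (int next = 0; next < MAX_LOCKS; ++next)
        if (before[from][next] && reaches(next, to))
            return true;
    return false;
}

/* Record the ordering x < y.  Returns false when y < ... < x was
   already (transitively) recorded: a potential deadlock. */
static bool record_order(int x, int y)
{
    if (reaches(y, x))
        return false;
    before[x][y] = true;
    return true;
}
```

Because `reaches` follows chains transitively, an implied pair such
as A<C (from A<B and B<C) needs no special handling, and the final
D<B step of the A..F example is rejected.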
It is not allowed to have R1,W2 + R2,W1. If we build a complete list
we get:

  R1,R2 + R2,R1   allowed
  R1,R2 + R2,W1   allowed
  R1,R2 + W2,R1   allowed
  R1,R2 + W2,W1   not allowed
  R1,W2 + R2,R1   allowed
  R1,W2 + R2,W1   not allowed
  R1,W2 + W2,R1   allowed
  R1,W2 + W2,W1   not allowed
  W1,R2 + R2,R1   allowed
  W1,R2 + R2,W1   allowed
  W1,R2 + W2,R1   not allowed
  W1,R2 + W2,W1   not allowed
  W1,W2 + R2,R1   not allowed
  W1,W2 + R2,W1   not allowed
  W1,W2 + W2,R1   not allowed
  W1,W2 + W2,W1   not allowed

I have to admit that I already wrote an implementation of this, even
including a "high priority read-only mutex" that gets its read lock
even while another thread is already waiting to write (while normally
a write lock gets precedence over a read-only lock in my
implementation). It uses six "groups", represented by six bits in a
bit mask. Every possible pair of two mutexes (read-only, write, or
normal for each of them) corresponds to a fixed bit mask: the
"groups" that the pair belongs to. Every time a new pair is locked
(or implied by certain rules), the stored bit mask is AND-ed with the
new mask, and an error condition is raised when the resulting bit
mask is 0: there is no allowable "group" left anymore, and thus we
have a *potential* deadlock situation. The groups in the above case
are (from my libcwd project):

  uint16_t const group1 = 0x002; // Instance 1 locked first.
  uint16_t const group2 = 0x004; // Instance 2 locked first.
  uint16_t const group3 = 0x020; // Instance 1 read only and high_priority when second.
  uint16_t const group4 = 0x040; // Instance 2 read only and high_priority when second.
  uint16_t const group5 = 0x200; // (Instance 1 locked first and (either one read only and high_priority when second))
                                 // or (both read only and high_priority when second).
  uint16_t const group6 = 0x400; // (Instance 2 locked first and (either one read only and high_priority when second))
                                 // or (both read only and high_priority when second).
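The 16 rows above can be derived from a single rule: the reversed
pair can deadlock exactly when each thread can block on the lock the
other one is holding, and two accesses to the same read/write lock
block each other only when at least one of them is a write. A sketch
of that rule for plain (non-priority) read/write locks, with function
and type names chosen here for illustration:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { RD, WR } rw_mode;

/* Two accesses to the same read/write lock conflict when at least
   one of them is a write. */
static bool conflicts(rw_mode a, rw_mode b)
{
    return a == WR || b == WR;
}

/* Thread 'x' locks instance 1 in mode a1, then instance 2 in mode a2
   while holding 1; thread 'y' locks instance 2 in mode b2, then
   instance 1 in mode b1 while holding 2.  A deadlock is possible iff
   'x' can block on 'y's hold of lock 2 AND 'y' can block on 'x's
   hold of lock 1. */
static bool deadlock_possible(rw_mode a1, rw_mode a2, rw_mode b2, rw_mode b1)
{
    return conflicts(a2, b2) && conflicts(b1, a1);
}
```

For example, `deadlock_possible(RD, WR, WR, RD)` (the pair
R1,W2 + W2,R1) is false: the second thread's read request on lock 1
does not conflict with the first thread's read hold, so one of the
two threads always makes progress.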
In pseudo code, the computation of the bit mask for a new pair works
as follows:

  // To start with, we assume that the first instance is the smallest (instance 1).
  keypair_info.state = group1;
  if (first instance is read-only)
    keypair_info.state |= (group3 | group5);
  if (second instance is read-only and high priority)
  {
    keypair_info.state |= (group4 | group5);
    if (first instance is read-only)
      keypair_info.state |= group6;
  }
  // Put the smallest instance in instance1.
  if (instance_first < instance_second)
  {
    keypair_key.instance1 = instance_first;
    keypair_key.instance2 = instance_second;
  }
  else // Oh, the first instance was NOT the smallest.
  {
    keypair_key.instance1 = instance_second;
    keypair_key.instance2 = instance_first;
    // Correct the state by swapping groups 1 <-> 2, 3 <-> 4, and 5 <-> 6.
    keypair_info.state = ((keypair_info.state << 1) | (keypair_info.state >> 1))
                         & (group1|group2|group3|group4|group5|group6);
  }

Then, if the same key pair already exists (in arbitrary order: R1,W2
and R2,R1 are both the "pair" with instance numbers (1,2) here), we
do:

  uint16_t prev_state = stored_info.state;
  stored_info.state &= keypair_info.state; // Limit possible groups.
  if (stored_info.state == 0)              // No group left?
  {
    // Possible deadlock!
  }

The REAL challenge is to make this work with arbitrary-priority
mutexes }:-).

Also, there are several complex "rules" missing in my implementation
that dictate implicit mutex pairs (involving more than two threads
that together do something equivalent to a single thread taking two
locks in a certain order). For example, X1,W2 + R2,Y3 implies X1,Y3
(where X and Y are normal mutexes). That means that when a pair R2,Y3
is detected and we already detected a pair X1,W2 (or vice versa), an
implicit pair X1,Y3 should be generated.

I am convinced that a debugging mode like this would be very, very
valuable.
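The pseudo code above can be turned into a small self-contained
sketch. Only the group bit values come from libcwd; the function
names, parameters, and the per-pair storage are invented here for
illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The six "groups" (bit values as in libcwd). */
static uint16_t const group1 = 0x002; /* Instance 1 locked first. */
static uint16_t const group2 = 0x004; /* Instance 2 locked first. */
static uint16_t const group3 = 0x020; /* Instance 1 read-only/high-priority when second. */
static uint16_t const group4 = 0x040; /* Instance 2 read-only/high-priority when second. */
static uint16_t const group5 = 0x200; /* See the group definitions in the text. */
static uint16_t const group6 = 0x400;
static uint16_t const all_groups = 0x002|0x004|0x020|0x040|0x200|0x400;

/* Compute the bit mask for a newly observed pair.  'first_ro' says
   whether the lock taken first is read-only; 'second_ro_hp' whether
   the lock taken second is read-only and high priority;
   'first_is_smallest' whether the lock taken first has the smallest
   instance number. */
static uint16_t pair_mask(bool first_ro, bool second_ro_hp, bool first_is_smallest)
{
    uint16_t state = group1;
    if (first_ro)
        state |= group3 | group5;
    if (second_ro_hp) {
        state |= group4 | group5;
        if (first_ro)
            state |= group6;
    }
    if (!first_is_smallest)  /* Swap groups 1<->2, 3<->4, 5<->6. */
        state = ((state << 1) | (state >> 1)) & all_groups;
    return state;
}

/* AND the new mask into the state stored for this (unordered) pair.
   Returns false when no group is left: a potential deadlock. */
static bool note_pair(uint16_t *stored, uint16_t mask)
{
    *stored &= mask;
    return *stored != 0;
}
```

The shift trick works because each group's bit sits immediately below
its partner's and the stray bits produced by the shifts fall outside
`all_groups`. Two plain mutexes locked in both orders intersect
group1 with group2 and yield 0, while two high-priority read-only
locks taken in both orders keep group5 and group6 alive.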
My implementation for the simple read/write locks has helped me
detect potential deadlocks at a very early stage, before they caused
problems in real-life situations. Since this has everything to do
with robustness, I hope this interests you.

-- 
Carlo Wood <carlo alinoe com>

References:
  [PATCH] Robust and Real-time Mutexes in NPTL
    From: Liu, Bing Wei