[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: Thread starvation with mutex

Perez-Gonzalez, Inaky wrote:
> > > 1. Thread A calls FUTEX_WAKE
> > > 2. Thread A receives 0 from FUTEX_WAKE
> > > 3. Thread A atomically unlocks the user space word
> > >
> > > Now, if some Thread B comes in between 2 and 3 and tries to
> > > lock, it will see the user space word locked and go down to
> > > wait in the kernel. It will sit there for ever because
> > > in (3) the word is locked and nobody knows B is there sleeping.
> > 
> > After step 2, Thread B sees the user space word is locked and does an
> > atomic decrement (or whatever) to indicate that there is a waiter, as
> > usual.
> > 
> > In step 3, Thread A tries to unlock by doing an atomic copmare and
> > exchang, and then it sees that the word indicates there is a waiter,
> > so loops back to 1.
> Ouch, I had forgotten we had the counter, not just a cmp/exchange.
> Thanks for the correction.
> But I still think the deadlock apples (in slightly more twisted fashion): 
> B, in between 2 and 3, has decremented and gone down to the kernel. 
> Before it is able to grab the futex spinlock, A is running, sees there is
> a waiter according to the userspace word, so goes back to one. FUTEX_WAKE
> returns 0 (because B still hasn't had the chance to queue up), so it 
> unlocks. Now B reaches the futex and sleeps for ever.
> Did I miss anything in this scenario?

Thread A cannot unlock as long as the userspace word indicates "there
are waiters".  Therefore it will keep looping calling FUTEX_WAKE It's
a near-livelock not a deadlock, and will resolve eventually but may
take a long time.

The correct version (having just looked at Rusty's code :) is:

   1. Thread A tries cmp/exchange to unlock the word;
      it fails because there are waiters
   2. Thread A calls FUTEX_WAKE to pass ownership
   3. Thread A receives 0 from FUTEX_WAKE
   4. Thread A atomically increments the word; still finds waiters
   5. Thread A calls FUTEX_WAKE to signal change
         -> Thread B will either be woken or FUTEX_WAIT returns -EAGAIN.

Steps 1-3 are the ownership passing "fair" wakup.  Steps 4-5 are the
fallback "unfair" wakeup, that only occurs during the race window.

This provides a high likelihood of ownership passing, but does not
guarantee ownership will be passed.
-- Jamie

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]