Hang in pthread_cond_wait

Sebastien Decugis sebastien.decugis at ext.bull.net
Thu Apr 29 12:36:44 UTC 2004


I think the futex_requeue feature used in pthread_cond_broadcast can
lead to a hang. Please consider the following sequence:

Thread A:
/* please note that this use of pthread_cond_broadcast is legal
according to POSIX */

Thread B and C:
pthread_cond_wait(&cond, &mutex);

  C: locks the mutex
  C: enters cond_wait
  C: locks cond->__lock
  C: releases the mutex
  C: cond->total_seq = 1
  C: val=seq=0
  C: unlocks cond->__lock
  C: futex_wait (@=cond->wake_up)
A  : locks the mutex
A  : unlocks the mutex
A  : enters pthread_cond_broadcast
A  : locks cond->__lock
A  : cond->wake_up=cond->total_seq ( == 1)
A  : unlocks cond->__lock
 B : locks the mutex
 B : enters cond_wait
 B : locks cond->__lock
 B : releases the mutex
 B : cond->total_seq = 2
 B : val=seq=1
 B : unlocks cond->__lock
 B : futex_wait (@=cond->wake_up)
A  : futex_requeue => thread B is woken, thread C is requeued on the mutex.
A  : will try to lock the mutex on next loop
 B : locks cond->__lock
 B : as seq == cond->wake_up, we loop inside the function
 B : unlocks cond->__lock
 B : futex_wait (@=cond->wake_up)

All three threads are now waiting.
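
To make the counter arithmetic in the sequence above easier to check, here is a minimal single-threaded sketch that replays it. It is only a model under stated assumptions: struct cond_model and the helper names are hypothetical, the fields simply mirror the names used in the trace, and none of this is the actual NPTL code. It shows only that B, the one thread woken by the requeue, finds seq == wake_up and goes back to waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the counters named in the trace.
   Field names follow the trace; this is not the NPTL source. */
struct cond_model {
    unsigned total_seq; /* bumped by each thread entering cond_wait */
    unsigned wake_up;   /* value published by the broadcaster */
};

/* A waiter enters cond_wait: bump total_seq, snapshot wake_up as seq. */
static unsigned waiter_enter(struct cond_model *c)
{
    c->total_seq++;
    return c->wake_up;
}

/* The broadcaster publishes the current total_seq as wake_up. */
static void broadcast_publish(struct cond_model *c)
{
    c->wake_up = c->total_seq;
}

/* After a futex wake-up, a waiter loops if its snapshot still equals wake_up. */
static bool must_wait_again(const struct cond_model *c, unsigned seq)
{
    return seq == c->wake_up;
}

/* Replays the interleaving from the trace. Returns true if B, the only
   thread actually woken, has to go back to futex_wait. */
static bool replay_trace(void)
{
    struct cond_model c = {0, 0};

    unsigned seq_c = waiter_enter(&c); /* C: total_seq = 1, seq = 0 */
    broadcast_publish(&c);             /* A: wake_up = 1 */
    unsigned seq_b = waiter_enter(&c); /* B: total_seq = 2, seq = 1 */

    assert(seq_c == 0 && seq_b == 1);

    /* C would be allowed to return, but per the trace it was requeued
       on the mutex instead of being woken. */
    assert(!must_wait_again(&c, seq_c));

    /* B was woken, but seq == wake_up, so it waits again. */
    return must_wait_again(&c, seq_b);
}
```

In this model replay_trace() returns true, which matches the last three steps for B above. Whether C is ever woken from the mutex queue depends on the mutex unlock path, which the model does not cover.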

The only workaround I can think of is to remove the FUTEX_REQUEUE call
from the broadcast function and always do a FUTEX_WAKE (ALL) instead.
This might be bad for performance but will avoid such hangs.
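
As a rough illustration of what "FUTEX_WAKE (ALL)" could look like at the system call level, here is a minimal Linux-only sketch. The helper name futex_wake_all is hypothetical, and this is not how the library wraps the call internally; it only shows the raw futex operation with INT_MAX as the number of waiters to wake, assuming the futex word is a 32-bit integer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration */
#endif
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake every thread blocked in FUTEX_WAIT on *addr.
   Returns the number of waiters woken, or -1 with errno set. */
static long futex_wake_all(uint32_t *addr)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}

/* Usage example: with nobody waiting on the word, zero threads are woken. */
static void futex_wake_all_self_check(void)
{
    uint32_t word = 0;
    assert(futex_wake_all(&word) == 0);
}
```

With this approach every waiter is woken and they all contend for the mutex at once, which is the performance cost mentioned above.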

Please let me know if this sequence is incorrect (and why), as I don't
know the internals of futexes.



Sébastien DECUGIS
Bull S.A.
NPTL Tests & Trace project
