
Re: Poor condvar performance

Ulrich Drepper wrote:
> > If all the waiters are on the same mutex, you want to wake one and
> > requeue the rest.  If they are different mutexes, you want to take
> > as many as there are separate mutexes.
> Look at the current requeue interface.  There are two numbers:
> ~ the number of threads to wake
> ~ the number of threads to requeue
> cond_broadcast always passes 1 and INT_MAX.  The context for the threads
> which are requeued will contain the information with which to locate the
> destination futex.

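For concreteness, the two-number call described above might look roughly
like the sketch below.  It uses the existing FUTEX_REQUEUE operation, not
the proposed REQUEUE2.  The helper names futex_requeue() and
cond_broadcast_requeue(), and the idea that the condvar and the mutex are
each a single 32-bit futex word, are assumptions made for illustration;
this is not the glibc implementation.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: wake up to nr_wake waiters blocked on `from`
 * and move up to nr_requeue of the remaining waiters onto `to`.
 * Returns the number woken plus the number requeued, or -1 with errno set.
 */
static long futex_requeue(uint32_t *from, uint32_t *to,
                          int nr_wake, int nr_requeue)
{
    /* For FUTEX_REQUEUE the fourth syscall argument (normally the
     * timeout pointer) carries nr_requeue instead. */
    return syscall(SYS_futex, from, FUTEX_REQUEUE, nr_wake,
                   (unsigned long)nr_requeue, to, 0);
}

/* Broadcast as described in the quote: wake 1, requeue "all the rest". */
static long cond_broadcast_requeue(uint32_t *cond_word, uint32_t *mutex_word)
{
    return futex_requeue(cond_word, mutex_word, 1, INT_MAX);
}
```

With no waiters on the condvar word the call has nothing to wake or move,
so it should simply return 0.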

> I wouldn't object if in this case all threads are simply woken since the
> POSIX specs require the same mutex to be used.

The man pages that I've read for cond_wait suggest that all waiters
use the same mutex.  But it doesn't seem to be a requirement.

Am I to understand that it's a hard requirement: there _must_ be just
one mutex among all the waiters, and anything else gives undefined
behaviour?

> [if FUTEX_WAIT2 is called with different mutexes...] it might be
> best to fail at the time the FUTEX_WAIT2 call is made.  If there is
> already a waiter which registered a different requeue futex, fail
> the FUTEX_WAIT2 call.

If one mutex is a hard requirement, then failing FUTEX_WAIT2 is fine.

Another possibility is that whichever mutex address is given to the
_last_ call to FUTEX_WAIT2 is used.  That provides a mechanism for
multiple requeueings of a waiter, if that should ever prove to
be useful.

> > On the other hand, getting rid of the list makes it much easier to
> > ensure waiters are woken in FIFO order.  To do FIFO with the list
> > would require two intertwined lists in userspace, and more
> > FUTEX_REQUEUE2 calls in some cases.
> Where does FIFO come in here?  There are no FIFO guarantees.  Ideally
> the futex wakeup functionality will at some point get some intelligence
> and either implement FIFO or, better IMO, a policy which prefers
> threads with hot caches.  The POSIX spec does not guarantee anything
> for regular threads.

FIFO is used by Rusty's "fair mutex" in the futex-2.2 example code,
which cannot starve any threads unlike the simplest mutex type in his

> When it comes to RT you'll have to look at priorities and this is a
> completely different ballgame.

Indeed, futex seems quite useful for RT synchronisation.  If only we
had a good idea what we wanted from the kernel to permit userspace to
implement all the different priority inversion workarounds that
various programs would want.

> > At the moment, futex_requeue returns the number of requeued waiters,
> No, it returns the number of threads woken plus the number of threads
> requeued.

_Effectively_, the number of requeued waiters is easily calculated
from the return value :)
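
The calculation can be sketched as follows.  The helper name
requeued_from_ret() is hypothetical, and the sketch assumes the kernel
wakes up to nr_wake waiters first and only requeues the ones left over,
so that the return value is woken + requeued as stated above.

```c
/*
 * Hypothetical helper: derive the number of requeued waiters from the
 * value returned by a requeue call that was asked to wake nr_wake threads.
 * Assumes waiters are woken first and only the remainder is requeued.
 */
static long requeued_from_ret(long ret, long nr_wake)
{
    if (ret <= nr_wake)
        return 0;           /* error, or every waiter found was woken */
    return ret - nr_wake;   /* the excess over nr_wake was requeued */
}
```

For the cond_broadcast case (nr_wake of 1), a return value of 5 would mean
one thread woken and four requeued.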

> > so the caller of futex_requeue can be responsible for setting the
> > mutex state so as to indicate waiters or not.
> When Ingo and I started I experimented _a lot_ with more elaborate
> mechanisms which do refcounting etc.  They all failed under stress
> sooner or later.  I don't say it's not possible, but it'll require large
> amounts of support code.  You'll once again have to register handlers to
> help in situations like cancellation or signal handling.  Things we had
> to do with LinuxThreads and don't have to do now.  All this overhead
> would slow down the ordinary operations and my expectations are that the
> result is worse than what we have today.

Isn't it safe for the woken thread to be told if any were requeued,
and if so do the atomic word op _as if_ another thread had done "lock
and block"?  Or do signals complicate this picture?
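
To make the question concrete, here is a minimal sketch of the atomic word
op I have in mind.  The three-state mutex word (unlocked, locked, locked
with waiters) and the name try_lock_after_wake() are assumptions for
illustration only; how the woken thread learns whether others were
requeued, and what happens under signals or cancellation, is left open.

```c
#include <stdatomic.h>

/* Assumed mutex word encoding, not taken from any real implementation. */
enum { UNLOCKED = 0, LOCKED = 1, LOCKED_WAITERS = 2 };

/*
 * Hypothetical single attempt by the thread that was just woken.
 * If other waiters were requeued onto this mutex, take the lock in the
 * "locked with waiters" state, as if another thread had already done
 * "lock and block", so that a later unlock knows it must wake someone.
 * Returns 1 if the lock was acquired, 0 if the mutex was already held.
 */
static int try_lock_after_wake(atomic_int *mutex_word, int others_requeued)
{
    int expected = UNLOCKED;
    int desired = others_requeued ? LOCKED_WAITERS : LOCKED;
    return atomic_compare_exchange_strong(mutex_word, &expected, desired);
}
```

If the attempt fails, the thread would still have to fall back to the
normal contended lock path, which is not shown here.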

> Once again, first priority for me is to get the implementation stable
> and well performing.  The simple REQUEUE2 interface can help.  It's a
> very specific interface, just like REQUEUE itself.  But the amount of
> additional code shouldn't be large to justify adding it.

If I understand right that all cond_wait-ers _must_ use the same
mutex, then I agree.  Otherwise I wonder.

Do you still need REQUEUE?

It seems we could drop it and just have WAIT2/REQUEUE2.  We could also
drop WAKE (just use REQUEUE2 with 0 for nr_requeue) but the API hit is
too severe for that ;)

-- Jamie
