Poor condvar performance

I've been getting some poor (and highly variable) results using
condition variables as a notification mechanism for data being available
on a queue. I won't go into details about the implementation at the
moment (except to say this is on a UP machine) as I would initially just
like some clarification on a couple of points:

1. From reading previous postings here (e.g.
https://www.redhat.com/archives/phil-list/2003-October/msg00010.html) it
looks like, currently, when a condition variable is broadcast()ed, a
waiting thread is immediately scheduled. Is this correct?

2. This scheduled thread then immediately sleeps again because it cannot
obtain the mutex associated with the condition variable. So there are
already 2 context switches per pthread_cond_broadcast(). This is
certainly reflected in the tests I have run where I see context switches
of the order of 300000/sec (!).
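
To make the scenario in points 1 and 2 concrete, here is a minimal sketch of the kind of condvar-notified queue being discussed. It is not the actual implementation: the names (struct queue, queue_put, queue_get, QUEUE_CAP) are hypothetical, and it assumes the broadcast is issued while the mutex is still held, which is the case where a waiter that runs immediately would find the mutex taken and go back to sleep.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 16	/* hypothetical fixed capacity, for illustration only */

struct queue {
	pthread_mutex_t lock;
	pthread_cond_t  nonempty;
	int             items[QUEUE_CAP];
	size_t          head, count;
};

static void queue_init(struct queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->nonempty, NULL);
	q->head = q->count = 0;
}

/* Producer side: returns 0 on success, -1 if the queue is full. */
static int queue_put(struct queue *q, int item)
{
	int rc = -1;

	pthread_mutex_lock(&q->lock);
	if (q->count < QUEUE_CAP) {
		q->items[(q->head + q->count) % QUEUE_CAP] = item;
		q->count++;
		rc = 0;
		/*
		 * Assumption: broadcast with the lock still held. A waiter
		 * that is woken and scheduled at this point cannot take
		 * q->lock yet, so it has to block a second time.
		 */
		pthread_cond_broadcast(&q->nonempty);
	}
	pthread_mutex_unlock(&q->lock);
	return rc;
}

/* Consumer side: blocks until an item is available, then returns it. */
static int queue_get(struct queue *q)
{
	int item;

	pthread_mutex_lock(&q->lock);
	while (q->count == 0)	/* loop guards against spurious wakeups */
		pthread_cond_wait(&q->nonempty, &q->lock);
	item = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->count--;
	pthread_mutex_unlock(&q->lock);
	return item;
}
```

With one thread looping in queue_get() and another calling queue_put(), every put on an empty queue would go through the wake-then-block sequence described in point 2.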

The immediate scheduling of a waiting thread seems a little odd, and
there was a suggestion in the same thread to replace wake_up_all() with
wake_up_all_sync() in the futex code.

I tried that and it made no difference. Apologies for going slightly off
topic here into kernel land, but anyway, delving a bit further I see in

#define wake_up_all_sync(x) __wake_up_sync((x),TASK_UNINTERRUPTIBLE |

and in kernel/sched.c:

void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int

if (likely(nr_exclusive))
	__wake_up_common(q, mode, nr_exclusive, 1);
else
	__wake_up_common(q, mode, nr_exclusive, 0);

So it seems to me that wake_up_all_sync() is always calling
__wake_up_common() with 0 for the sync flag (though I do not pretend to
understand the behaviour of likely() / __builtin_expect so maybe that
makes a difference).
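
On the likely() question, a small user-space sketch may help. It assumes the usual definition of likely() as __builtin_expect(!!(x), 1), that the second __wake_up_common() call sits in an else branch, and that wake_up_all_sync() ends up passing 0 for nr_exclusive (the macro is cut off above, so that last part is an assumption). The functions wake_common_stub() and wake_sync_model() are hypothetical stand-ins, not kernel code.

```c
#include <stdio.h>

/* Branch-prediction hint only: the value of the condition is unchanged. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical stand-in for __wake_up_common(): reports the sync flag. */
static int wake_common_stub(int nr_exclusive, int sync)
{
	(void)nr_exclusive;
	return sync;
}

/* Models only the if/else in __wake_up_sync(); returns the sync flag used. */
static int wake_sync_model(int nr_exclusive)
{
	if (likely(nr_exclusive))
		return wake_common_stub(nr_exclusive, 1);
	else
		return wake_common_stub(nr_exclusive, 0);
}

/* Prints the sync flag chosen for a wake-all (0) and a wake-one (1) call. */
static void show_sync_flags(void)
{
	printf("nr_exclusive=0 -> sync=%d\n", wake_sync_model(0));
	printf("nr_exclusive=1 -> sync=%d\n", wake_sync_model(1));
}
```

Since __builtin_expect() only tells the compiler which branch to optimise for, the branch taken still depends purely on whether nr_exclusive is zero. Under the assumptions above, a wake-all call always ends up with sync set to 0.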

If I change the __wake_up_sync() code to always pass 1 to
__wake_up_common() (and keep the futex code calling
wake_up_all_sync()), I get far better (and repeatable) performance,
with context switches of the sensible order of 500/second.

Ulrich stated that "The best solution is to get requeue working
(although two syscalls are made)." I would also appreciate some
clarification of this. Does it mean that the immediate scheduling of
the woken thread would not happen, in a similar way to the hack to
sched.c above? I ask because the immediate scheduling seems to be a
bad thing.

I'm using Fedora Core 1 with a 2.6.0-test10 kernel and the NPTL from
CVS with the cross-process condvar fix.

Again, thanks for any insight.


Luke Elliott.
