
Re: Barrier reinit? (prev: Thread starvation with mutex)



> > it does pthread_barrier_destroy and then
> > pthread_barrier_init, 
> 
> Check the results of each of the function calls.  The
> pthread_barrier_destroy call will fail if the barrier is still in use.
> Therefore you re-init an active barrier.

Actually, no.
The thread calling pthread_barrier_destroy() is one of the waiters, so
it has already returned from pthread_barrier_wait(), and I assumed all
the other threads had returned as well...

From my point of view, here is what happens.
The code is something like:
-> Thread B: 
pthread_barrier_wait(barrier1);

-> Thread A:
pthread_barrier_init(barrier1, NULL, 2);
pthread_create(... ThreadB ...);
pthread_barrier_wait(barrier1);
pthread_barrier_destroy(barrier1);
pthread_barrier_init(barrier1, NULL, 3);


When it runs, the barrier is initialized properly and then thread B is
created. Thread B enters pthread_barrier_wait() before thread A returns
from pthread_create(). Referring to the pthread_barrier_wait() source,
B reaches the line "lll_futex_wait(&ibarrier->curr_event, event);" with
event = 0 and left = 1, and then waits on the futex.

Then thread A resumes execution, returns from pthread_create(), and
enters pthread_barrier_wait().
As left now equals 0, it restores the barrier to its original count,
increments curr_event, wakes the futex, and returns.
But there is no reason why thread B should resume execution before
thread A goes on, is there?
So let's say thread A goes on...
It enters pthread_barrier_destroy().
As left == init_count, destroy assumes the barrier is no longer in
use, so the returned value is 0!
Next, pthread_barrier_init() is called, which puts 0 back in curr_event.
At this point thread B can resume execution. The futex was woken, so B
re-checks the loop condition "while (event == ibarrier->curr_event)",
which is true again (both values are 0), and B goes back to waiting. The
whole process now hangs because this thread will never return...
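
To make the interleaving above easier to follow, here is a minimal
single-threaded sketch that replays it step by step. It is only a model
built from the description in this message, not the actual library
source: the struct and all model_* function names are hypothetical, and
only the field names curr_event, left and init_count come from the text
above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the barrier state described above. */
struct model_barrier {
    unsigned curr_event;  /* value the sleeping waiters compare against */
    unsigned left;        /* arrivals still needed in the current round */
    unsigned init_count;  /* count given at init time */
};

void model_init(struct model_barrier *b, unsigned count)
{
    b->curr_event = 0;
    b->left = count;
    b->init_count = count;
}

/* One arrival at the barrier. Stores the event value the caller saw.
   Returns true if the caller was the last arrival and released the round. */
bool model_arrive(struct model_barrier *b, unsigned *event)
{
    *event = b->curr_event;
    if (--b->left == 0) {
        b->left = b->init_count;  /* restore the original count */
        b->curr_event++;          /* signal the sleeping waiters */
        return true;
    }
    return false;
}

/* A woken waiter leaves only if curr_event differs from what it saw. */
bool model_may_leave(const struct model_barrier *b, unsigned event)
{
    return event != b->curr_event;
}

/* Destroy succeeds whenever the counter is back at its initial value. */
int model_destroy(const struct model_barrier *b)
{
    return b->left == b->init_count ? 0 : EBUSY;
}

/* Replays the sequence from this message; true means thread B is stuck. */
bool model_replay_b_stuck(void)
{
    struct model_barrier b;
    unsigned event_a, event_b;

    model_init(&b, 2);
    model_arrive(&b, &event_b);   /* B: sees event 0, left becomes 1, sleeps */
    model_arrive(&b, &event_a);   /* A: last arrival, curr_event becomes 1 */
    if (model_destroy(&b) != 0)   /* A: destroy reports success */
        return false;
    model_init(&b, 3);            /* A: re-init puts curr_event back to 0 */
    return !model_may_leave(&b, event_b);  /* B wakes and compares 0 with 0 */
}
```

In this model, model_replay_b_stuck() returns true: destroy succeeds
even though B has not yet left, and after the re-init B's saved event
value matches curr_event again.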

The only workaround I found is to put some otherwise useless code between
pthread_barrier_wait() and pthread_barrier_destroy(). I think this is not
acceptable: how can one be sure the scheduler will run thread B in time?
And how are people supposed to know that they should yield, or whatever,
if they have not read the source?

Thank you for reading...

Regards
Sebastien Decugis.










> 
> -- 
> ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-- 
Sébastien DECUGIS
Bull S.A.
Tel: 04 76 29 74 93



