Barriers hanging

Sebastien Decugis sebastien.decugis at ext.bull.net
Thu Feb 19 10:51:29 UTC 2004


> You call it weakness because you want to depend on other functionality.
>  In reality, the only weak code is your's since you depend on
> non-standard functionality.
I am sorry, I don't understand which functionality you mean. Barriers?
Or the fact that, after having passed pthread_barrier_wait(), I assume
the barrier is free and can be destroyed?

> 
> The proposed changes really look bad.  You wake all the threads just to
> have them run into the next lock.  The scheduling of this is killing the
> performance, especially with many threads and many processors.
I was aware of this but could not find a better idea... That is the
purpose of this list, isn't it? I mean, that experienced people correct
and help newcomers, to improve the quality of NPTL?

> 
> 
> Having said this, I did make some changes which are far less intrusive
> and which still guarantee that no pthread_barrier_destroy succeeds if
> there is still a thread which hasn't returned from a previous
> pthread_barrier_wait call. 

I think this is far better.

But it still means that an ordinary programmer who wants to use a
barrier will have to write things like:

thread_A:
pthread_barrier_init(&b, NULL, 2);
pthread_create(..thread_B...);
pthread_barrier_wait(&b);
do {
  rc = pthread_barrier_destroy(&b);
} while (rc != 0);

thread_B: 
pthread_barrier_wait(&b);

Don't you think that everyone who knows nothing about the internals of
NPTL will assume that calling pthread_barrier_destroy() after
pthread_barrier_wait() has returned is a good way to ensure that
pthread_barrier_destroy() will not fail, even if the standard is
unclear? (I read the following lines many times before understanding
their meaning...)

"The results are undefined if pthread_barrier_destroy() is called when
any thread is blocked on the barrier"

"[EBUSY]
        The implementation has detected an attempt to destroy a barrier
        while it is in use (for example, while being used in a
        pthread_barrier_wait() call) by another thread"
        

I may have misunderstood the meaning of EBUSY, but at first I thought it
meant that a thread is still blocked on the barrier... 


>  The code is written so that no part of the
> barrier object is touched after the last thread leaves.  Your test
> program, although broken, runs with the changed code.  The new
> tst-barrier4 test exercises the new functionality but, as it is clearly
> noted, this is no test for POSIX conformance.  It tests additional
> functionality of NPTL.
Once more, I don't understand why you say this is not about POSIX
conformance... The standard says at least that EBUSY should be returned
when the barrier data is still being used inside a function, which is
precisely the problem I ran into.


Additionally, it seems that
"if (atomic_exchange_and_add (ibarrier->left, 1) == init_count - 1)"
does not compile on ia64, but I am not sure of this since it is the
first time I have tried to compile on this target. I get the following
errors:
../nptl/sysdeps/pthread/pthread_barrier_wait.c: In function `pthread_barrier_wait':
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: error: invalid type argument of `unary *'
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: warning: type defaults to `int' in declaration of `__result'
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: error: invalid type argument of `unary *'
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: warning: cast to pointer from integer of different size
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: error: invalid type argument of `unary *'
../nptl/sysdeps/pthread/pthread_barrier_wait.c:76: warning: cast to pointer from integer of different size
make[2]: *** [/mnt/home1/home/decugiss/libc/build/nptl/pthread_barrier_wait.o] Error 1


Can someone confirm that it compiles correctly? Thank you!
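
A guess at the cause, not a confirmed diagnosis: the "unary *" errors
would be consistent with a macro that dereferences its first argument
being handed the integer ibarrier->left instead of its address. The
sketch below uses a made-up macro, demo_exchange_and_add, built on GCC's
__atomic_fetch_add builtin, and a made-up struct demo_barrier. Neither
is the real glibc code; they only illustrate the pointer-versus-value
point:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in, NOT the glibc macro.  It dereferences its
   first argument to find the result type, so it must be given a
   pointer to the counter.  Uses GNU C statement expressions.  */
#define demo_exchange_and_add(mem, value)                              \
  ({ __typeof__ (*(mem)) old_value_;                                   \
     old_value_ = __atomic_fetch_add ((mem), (value),                  \
                                      __ATOMIC_SEQ_CST);               \
     old_value_; })

/* Hypothetical layout, for illustration only.  */
struct demo_barrier
{
  unsigned int left;
  unsigned int init_count;
};

/* Returns nonzero for the caller whose increment brings 'left' up to
   init_count, i.e. the last of init_count callers.  */
static int
demo_last_leaver (struct demo_barrier *bar)
{
  assert (bar != NULL);
  /* Passing bar->left (an integer) here instead of &bar->left would
     make the macro apply unary * to a non-pointer.  */
  return demo_exchange_and_add (&bar->left, 1) == bar->init_count - 1;
}
```

With left starting at 0 and init_count set to 2, the first call to
demo_last_leaver() returns 0 and the second returns 1.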




Best regards,
Sébastien Decugis.






