
Question about pthread_cond_broadcast

I have no idea if this is the right list to send this to, but I am trying to 
find out if the current RedHat9.2 has a bug in the pthreads library on SMP, 
or if in fact I am using pthread_cond_broadcast incorrectly.

The attached program is run as "broadcast 10 < file", where 10 is the number 
of parallel threads to create and file is any file of about 10000 characters. 
On all other versions of Linux, on Windows under pthreads emulation, on Irix, 
and on RedHat9.2 with only 2 threads or on a single-processor machine, it 
runs to completion. However, on RedHat9.2 on a dual-processor SMP machine 
with hyperthreading enabled and 3 or more threads, it eventually hangs with 
all threads in the pthread_cond_wait() call. I am quite certain that the 
program is written so that this can only happen if pthread_cond_broadcast() 
does not actually make all pending pthread_cond_wait() calls return.

The problem seems to be that the thread that calls wait() also calls 
broadcast() on the *same* condition, even though the mutexes are set up so 
that the two can never happen at the same time (and I have confirmed that the 
mutexes are working perfectly). Recompiling the program with -DFIXED makes it 
work by using a different condition for the other direction.
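
For illustration, here is a minimal sketch of the two-condition arrangement 
that -DFIXED selects: one condition carries wakeups from the main thread to 
the workers, a second one carries acknowledgements back. It is not the 
attached broadcast.c. The names go_cond, done_cond, generation, acks, quit, 
NUM_WORKERS, NUM_ROUNDS and run_rounds are assumptions made for this sketch, 
and it uses a plain (non-recursive) mutex and joinable threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4    /* hypothetical worker count */
#define NUM_ROUNDS  1000 /* hypothetical number of rounds */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t go_cond = PTHREAD_COND_INITIALIZER;   /* main -> workers */
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER; /* workers -> main */
static int generation = 0; /* bumped once per round (like n in the post) */
static int acks = 0;       /* total acknowledgements (like m in the post) */
static int quit = 0;       /* set once all rounds are finished */

static void *worker(void *arg) {
  int seen = 0; /* last generation this worker handled */
  (void)arg;
  pthread_mutex_lock(&lock);
  for (;;) {
    /* Always re-test the predicate: wakeups may be spurious. */
    while (generation == seen && !quit)
      pthread_cond_wait(&go_cond, &lock);
    if (generation == seen)
      break; /* woken only to quit, no new round pending */
    seen = generation;
    acks++;
    pthread_cond_signal(&done_cond); /* only the main thread waits here */
  }
  pthread_mutex_unlock(&lock);
  return NULL;
}

/* Runs all rounds and returns the total number of acknowledgements. */
static int run_rounds(void) {
  pthread_t tid[NUM_WORKERS];
  int i, round, total;

  for (i = 0; i < NUM_WORKERS; i++)
    pthread_create(&tid[i], NULL, worker, NULL);

  pthread_mutex_lock(&lock);
  for (round = 1; round <= NUM_ROUNDS; round++) {
    generation++;
    pthread_cond_broadcast(&go_cond); /* wake every worker */
    while (acks < round * NUM_WORKERS)
      pthread_cond_wait(&done_cond, &lock);
  }
  assert(acks == NUM_ROUNDS * NUM_WORKERS);
  quit = 1;
  pthread_cond_broadcast(&go_cond);
  total = acks;
  pthread_mutex_unlock(&lock);

  for (i = 0; i < NUM_WORKERS; i++)
    pthread_join(tid[i], NULL);
  return total;
}
```

Calling run_rounds() from main() and linking with -lpthread is enough to try 
it. Because the main thread does not start a new round until every worker 
has acknowledged the previous one, no worker can miss a generation.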

If in fact it is illegal to use a condition bidirectionally like this, I 
believe it is possible to rewrite any program to get around this, but it can 
be a pain. In our real application, which uses the condition for parallel 
update to millions of different objects, I replaced each wait() call with a 
call that creates a new condition on the stack and adds it to a linked list. 
I then replaced the broadcast() call with a call that does signal() to each 
condition on the list. This appears to work fine.
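
The workaround described above might look roughly like the following sketch. 
It is an assumption-laden illustration, not the code from the real 
application: struct waiter, struct wait_list, list_wait, list_wake_all, 
NUM_WAITERS and wake_parked_waiters_demo are hypothetical names, and the 
per-waiter woken flag is an addition here to guard against spurious wakeups.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct waiter {
  pthread_cond_t cond; /* private condition, lives on the waiter's stack */
  int woken;           /* predicate for this waiter only */
  struct waiter *next;
};

struct wait_list {
  pthread_mutex_t lock;
  struct waiter *head;
};

/* Replaces wait(). The caller must hold wl->lock. */
static void list_wait(struct wait_list *wl) {
  struct waiter self;
  pthread_cond_init(&self.cond, NULL);
  self.woken = 0;
  self.next = wl->head;
  wl->head = &self;
  while (!self.woken)
    pthread_cond_wait(&self.cond, &wl->lock);
  pthread_cond_destroy(&self.cond);
}

/* Replaces broadcast(). The caller must hold wl->lock, so no waiter can
   return (and free its stack node) while the list is being walked. */
static int list_wake_all(struct wait_list *wl) {
  int count = 0;
  struct waiter *w = wl->head;
  wl->head = NULL;
  while (w != NULL) {
    struct waiter *next = w->next;
    w->woken = 1;
    pthread_cond_signal(&w->cond);
    w = next;
    count++;
  }
  return count;
}

#define NUM_WAITERS 8 /* hypothetical */

static struct wait_list wl = { PTHREAD_MUTEX_INITIALIZER, NULL };
static int parked = 0; /* threads that have entered list_wait() */

static void *waiter_thread(void *arg) {
  (void)arg;
  pthread_mutex_lock(&wl.lock);
  parked++;
  list_wait(&wl);
  pthread_mutex_unlock(&wl.lock);
  return NULL;
}

/* Parks NUM_WAITERS threads, wakes them all, returns how many were woken. */
static int wake_parked_waiters_demo(void) {
  pthread_t tid[NUM_WAITERS];
  int i, woken, ready = 0;

  for (i = 0; i < NUM_WAITERS; i++)
    pthread_create(&tid[i], NULL, waiter_thread, NULL);
  while (!ready) { /* poll until every thread is on the list */
    pthread_mutex_lock(&wl.lock);
    ready = (parked == NUM_WAITERS);
    pthread_mutex_unlock(&wl.lock);
    if (!ready)
      sched_yield();
  }
  pthread_mutex_lock(&wl.lock);
  woken = list_wake_all(&wl);
  assert(wl.head == NULL);
  pthread_mutex_unlock(&wl.lock);
  for (i = 0; i < NUM_WAITERS; i++)
    pthread_join(tid[i], NULL);
  return woken;
}
```

Each condition has exactly one waiter and is signalled exactly once, so the 
sketch never relies on pthread_cond_broadcast() at all.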

However, I have been unable to find out whether I am in fact using 
pthread_cond incorrectly, or whether this is a bug in the current 
pthreads. Extensive searches of the internet and the pthreads documentation 
have not revealed anything. I was hoping to find somebody in the know who can 
answer this question.

If this is not your area, any pointers to a pthreads developers' or users' 
mailing list would be appreciated.


=== cut here for broadcast.c ===
/* Test of pthread_cond_broadcast failing */
/* Compile with -DFIXED to make a working program */

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>  /* atoi */

pthread_mutex_t mutex;
pthread_mutexattr_t attrib = {PTHREAD_MUTEX_RECURSIVE_NP};
pthread_cond_t cond;

#ifdef FIXED
pthread_cond_t cond2;
#define COND2 cond2
#else
#define COND2 cond
#endif

int n = 0;
int m = 0;

void* thread_proc(void* v) {
  int i = (int)v;
  int pn = 0;
  for (;;) {
    while (n == pn) pthread_cond_wait(&cond, &mutex);
    pn = n;
    printf("%*d %d\n", 10*i, i, n);

pthread_attr_t attr;
pthread_t threads[100];

int main(int argc, char** argv) {
  int pm = 0;
  int numthreads = atoi(argv[1]);
  int i;

  pthread_mutex_init(&mutex, &attrib);
  pthread_cond_init(&cond, 0);
#ifdef FIXED
  pthread_cond_init(&cond2, 0);
#endif

  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

  for (i = 1; i < numthreads; i++)
    pthread_create(&threads[i], &attr, thread_proc, (void*)(i));

  while (getchar() >= 0) {
    n = n+1;
    printf("%d\n", n);
    while (m < pm+numthreads) pthread_cond_wait(&COND2, &mutex);
    pm = m;
=== cut here ===

                   ,~,~,~,~ ~ ~ ~ ~
     /\_       _|_========___         Bill Spitzak
 ~~~/\/\\~~~~~~\____________/~~~~~~~~ spitzak d2 com
