[Freeipa-devel] [PATCH] 294 sleep before doing a task

Nathan Kinder nkinder at redhat.com
Thu Oct 15 15:15:59 UTC 2009


On 10/15/2009 06:40 AM, Simo Sorce wrote:
> On Thu, 2009-10-15 at 15:28 +0200, Pavel Zuna wrote:
>    
>> Rob Crittenden wrote:
>>      
>>> One of the last steps of an install is to run through any updates. This
>>> change adds a sleep() prior to calling tasks to ensure postop writes are
>>> done
>>>
>>> We were seeing a rare deadlock of DS when creating the memberOf task
>>> because one thread was adding memberOf in a postop while another was
>>> trying to create an index and this was causing a PRLock deadlock.
>>>
>>> rob
>>>
>>>        
>> sleep might not be the best synchronization mechanism out there, but I think
>> that in this case it is pretty much the only one available and it gets the job
>> done, so ack.
>>      
> So are we covering a DS bug here ? Or are we doing an asynchronous ldap
> request when we should do a synchronous one and wait for it to finish
> (I've fixed another place where we were doing that and racing against
> our own requests) ?
>    
It has nothing to do with the way you are performing your LDAP operations.

The issue really stems from what I consider to be a bug in NSPR's 
implementation of reader-writer locks.  It is documented that a single 
thread can hold multiple reader locks safely, but I've found that to not 
exactly be the case.  The NSPR implementation favors writers, so a 
thread waiting for the writer lock will block attempts by other threads 
to get a reader lock.  The problem is that we use reader locks in a 
re-entrant fashion, so a thread that already has a reader lock can be 
blocked when attempting to get a second reader lock due to a waiting 
writer.  This thread in turn blocks the writer thread since it already 
holds a reader lock.

I have proposed a solution to the NSPR developers that would allow an 
attempt to get a reader lock to go through even if a writer is waiting 
if the requesting thread already has another reader lock.  I'm hoping 
that this can be resolved in NSPR, otherwise we may have to change DS to 
use the pthread_rwlock_* interfaces instead.

The sleep is a temporary workaround.  This issue should not arise in 
normal operation since the lock in question is around the backend 
struct, which is only modified when there is some sort of database 
maintenance operation (such as the reindexing task that Rob triggered it 
with).
> Simo.
>
>    




More information about the Freeipa-devel mailing list