[libvirt] [PATCH v5 1/3] nwfilter: Convert _virNWFilterObj to use virObjectRWLockable

John Ferlan jferlan at redhat.com
Fri Feb 9 13:00:13 UTC 2018



On 02/09/2018 03:41 AM, Michal Privoznik wrote:
> On 02/08/2018 04:06 PM, John Ferlan wrote:
>> [...]
>>
>>>>>> +static void
>>>>>> +virNWFilterObjPromoteToWrite(virNWFilterObjPtr obj)
>>>>>> +{
>>>>>> +    virObjectRWUnlock(obj);
>>>>>> +    virObjectRWLockWrite(obj);
>>>>>> +}
>>>>>
>>>>> How can this not deadlock? This will work only if @obj is locked exactly
>>>>> once. But since we allow recursive locks any lock() that happens in the
>>>>> 2nd level must deadlock with this. On the other hand, there's no such
>>>>> case in the code currently. But that doesn't stop anybody from calling
>>>>> PromoteWrite() without understanding its limitations.
>>>>>
>>>>> Maybe the fact that we need recursive lock for NWFilterObj means it's
>>>>> not a good candidate for virObjectRWLocable? It is still a good
>>>>> candidate for virObject though.
>>>>>
>>>>> Or if you want to spend extra time on this, maybe introducing
>>>>> virObjectRecursiveLockable may be worth it (terrible name, I know).
>>>>>
>>>>>
>>>>
>>>> I dunno, worked for me. The helper is being called from a thread that
>>>> has taken out a READ lock at most one time and needs to get the WRITE
>>>> lock in order to change things. If something else has the READ lock that
>>>> thread waits until the other thread releases the READ lock as far as I
>>>> understand things.
>>>
>>> Yes, I can see that. It's just that since the original lock is recursive
>>> I expected things to get recursive and thus I've been thinking how would
>>> this work there, nested in 2nd, 3rd, .. level. But as noted earlier, the
>>> places where lock promoting is used are kind of safe since all the
>>> locking is done at the first level.
>>>
>>
>> The reason I chose rwlocks and taking a read lock by default in the
>> FindByUUID and FindByName (as opposed to other vir*FindBy* API's which
>> return a write locked object) is because I know there's the potential
>> for other code to be doing Reads and rwlocks allowed for that without
>> the need to create any sort of recursive lockable object or any sort of
>> API to perform "trylock"'s (all things I tried until the epiphany to use
>> rwlocks was reached when you added them for other drivers and their list
>> protection).
>>
>> In the long/short run I figured that having a common way to get the lock
>> and then deciding to change it from Read to Write as necessary would
>> work just fine for the purpose that was needed.
> 
> This is where I disagree. I don't think it's safe to promote a lock by
> unlocking it. The moment the lock is unlocked another thread will lock
> the object and all the assumptions we made/checked about the object we
> made when we had the object locked are gone. So we would need to
> reiterate the decision making process.
> 

OK - so let's think about where/why/when we promote the lock and what
the checks/balances/consequences are.  There are 3 things being
protected by the lock in _virNWFilterObj (@wantRemoved, @def, and
@newDef). Each of promotion opportunities occur because the
virNWFilterTriggerVMFilterRebuild must be called when we attempt to
update or remove a filter.

1. virNWFilterObjTestUnassignDef

Changes the obj->wantRemoved flag. This is the only API that sets that
flag. The flag is checked by the virNWFilterObjWantRemoved that checked
as part of the virNWFilterObjListFindInstantiateFilter processing that
would would fail if it was determined the flag was set.

This essentially allows/disallows the nwfilterUndefine code to proceed
or not. The nwfilterUndefine code is further protected by a series of
locks that ensures one Undefine at a time. IOW: It's not possible to
have two threads mucking with this (from the comments in Undefine):

/* Serialization of *one* Undefine consumer is extremely important
 * as it relates to virNWFilterObjListFindInstantiateFilter processing
 * via virNWFilterTriggerVMFilterRebuild that occurs during
 * virNWFilterObjTestUnassignDef */
nwfilterDriverLock();
virNWFilterWriteLockFilterUpdates();
virNWFilterCallbackDriversLock();

2. virNWFilterObjListAssignDef

Changes to the obj->def/obj->newDef are made here.

Calls to the function are made from:

  2a. nwfilterDefineXML

Which not surprisingly has the following similar sequence:

/* Serialization of *one* DefineXML consumer is extremely important
 * as it relates to virNWFilterObjListFindInstantiateFilter processing
 * via virNWFilterTriggerVMFilterRebuild that occurs during
 * virNWFilterObjListAssignDef */
nwfilterDriverLock();
virNWFilterWriteLockFilterUpdates();
virNWFilterCallbackDriversLock();


  2b. virNWFilterObjListLoadConfig via virNWFilterObjListLoadAllConfigs

  Code path is from StateInitialization or StateReload. For Initialize I
think/hope we can agree - it's one path, no ability for define/undefine
to impede.

  For the Reload path, our similar sequence:

/* Serialization of virNWFilterObjListLoadAllConfigs is extremely
 * important as it relates to virNWFilterObjListFindInstantiateFilter
 * processing via virNWFilterTriggerVMFilterRebuild that occurs during
 * virNWFilterObjListAssignDef */
nwfilterDriverLock();
virNWFilterWriteLockFilterUpdates();
virNWFilterCallbackDriversLock();


So in the long run I'm not sure this Promotion means much of anything.
We could keep RWRead locks and not have a single chance that something
is altering the fields we're touching.  At least the Promotion shows
when we do touch fields.

>>
>>>>
>>>> Back when I first did this I had quite a lot of debug code and I have a
>>>> vague recollection of seeing output from this particular path and seeing
>>>> a couple of unlocks before the WRITE was taken while running the avocado
>>>> tests.
>>>>
>>>> I have zero interest in spending more time on this. That ship sailed. If
>>>> the decision is to drop the patches, then fine. I tried.  I disagree
>>>> that NWFilterObj is not a candidate for RWLockable - in fact it's quite
>>>> the opposite *because* of the recursive locks.
>>>
>>> Well, there's a difference between recursive locks and rwlocks. And it's
>>> exactly in what happens in recursion. With recursive locks we don't need
>>> any lock promoting and thus it's safe to do read/write after lock().
>>> With lock promoting (esp. done in unlock()+lockWrite() manner) we get
>>> TOCTOU bugs.
>>>
>>
>> If we had a real lock manager available this wouldn't be a problem ;-)
>> [sorry just my OpenVMS roots and bias showing].  Still if you read up a
>> bit on "dlm" you'll get an idea of what I mean.
>>
>> There's so many locks in the nwfilter code it's a wonder it all works.
> 
> Totally agreed.
> 
>>
>> It's truly unfortunate that the wrlock mechanism is undefined for
>> promoting a read lock to write leaving the consumer with few options.
> 
> I believe pthread developers tried to avoid having any lock promoting
> because it is hard to do right. It might break the definition of mutual
> exclusion of readers and writers. I mean, imagine one thread would do:
> 
> lockRead();
> lockRead();
> lockPromote();
> 
> There are two possible solutions:
> a) both read locks are promoted to write locks
> b) only the second lock is promoted.
> 
> At any rate, either the lock is locked two times for writing, or for
> reading and writing at the same time. On the other hand, since this is
> done in one thread I guess it doesn't matter that much.
> 
> 

Yep it's hard, but not impossible. Still a lot easier to pass the buck
to someone else to figure out how/what to do for their own application
or library needs.

The really hard part is deadlock detection problems where multiple locks
are in play. It's not the single lock that's so hard. Attempting to
promote a lock while holding a write lock where if you have to wait to
get the promotion because some other thread is doing it's lock promotion
in a different order.

>>
>> Since we can limit the "updating" of the fields in @obj and getting a
>> write lock to very specific code paths, hopefully we can limit damage
>> and/or deadlock type and/or TOCTOU issues
> 
> I'm working on what I suggested earlier - making this
> virObjectRecursiveLockable. Let's see how that works out.
> 

And why is it felt it will be different than the last time I tried this
as I noted in my last response in this thread which was below the hunk
that you snipped out, see:

https://www.redhat.com/archives/libvir-list/2017-May/msg01183.html

?

John




More information about the libvir-list mailing list