[Cluster-devel] [PATCH 1/2] gfs2: Fix occasional glock use-after-free

Ross Lagerwall ross.lagerwall at citrix.com
Tue Apr 9 15:36:07 UTC 2019


On 4/5/19 6:50 PM, Andreas Gruenbacher wrote:
> Hi Ross,
> 
> On Tue, 2 Apr 2019 at 00:59, Andreas Gruenbacher <agruenba at redhat.com> wrote:
>> thanks for the great analysis.
>>
>> On Tue, 26 Mar 2019 at 20:14, Bob Peterson <rpeterso at redhat.com> wrote:
>>> ----- Original Message -----
>>>> 6. gfs2_log_flush() continues and calls revoke_lo_after_commit() and
>>>> uses the freed glock (stack trace above).
>>>>
>>>> Should evict_inode call gfs2_ail_flush() or something so that the revoke
>>>> is written before it drops its reference to the glock?
>>>>
>>>> Or is there something else that is meant to keep the glock around if the
>>>> inode is evicted while there is a linked gfs2_bufdata sitting on some
>>>> AIL list?
>>>>
>>>> Let me know if any more specific information is needed since I now have
>>>> a test setup that can usually reproduce this within 10 minutes.
>>>
>>> Very interesting.
>>>
>>> It's not unusual for glocks to outlive their inodes, but I'm not sure
>>> what the right answer is in this case. Either the revoke should
>>> take an additional reference to the glock, and not let go until the
>>> revoke is written by some log flush, or else the evict needs to do the
>>> log flush to make sure the revoke is committed. But we've had issues with
>>> evict in the past, so we need to be careful about how we fix it.
>>> Andreas and I will look into the best way to fix it. Thanks again for your help.
>>
>> Usually, glocks are attached to inodes or other objects. When the
>> state of the glock changes, the go_lock, go_sync, and go_inval
>> operations determine what happens to the attached object. In
>> gfs2_evict_inode, we disassociate the inode from the glock, do the
>> necessary cleanup work locally, and put the glock asynchronously when
>> needed, though. We do that in PF_MEMALLOC context to avoid
>> deadlocking; see commit 71c1b2136835 ("gfs2: gfs2_evict_inode: Put
>> glocks asynchronously"). Even if we didn't do that, there could still
>> be other glock holders like the glock state machine. There is no
>> guarantee that the glock will go away anytime soon, or actually at
>> all: it could get reused by another instance of the same inode. So
>> waiting for the glock to go away first in gfs2_evict_inode definitely
>> is not an option.
>>
>> This basically answers your above question: gfs2_evict_inode should
>> definitely clean up before putting the glock. I'm just not sure how to
>> best achieve that, yet.
> 
> after more discussions, Bob convinced me that it makes perfect sense
> to not write out outstanding revokes in gfs2_evict_inode. We really
> want to delay writing revokes as long as possible so that as many of
> the revokes as possible will go away (either because the blocks are
> re-added to the journal and the revokes are "unrevoked", or because
> the journal wraps around). So the right fix here is indeed taking
> additional glock references.
> 
> I've come up with a patch that takes one additional reference for each
> glock instead of one reference for each bufdata object; that should
> scale better. I'll post that and a follow-up patch separately.
> 
> Could you please verify?
> 

I've tested the patches and no longer see the use-after-free.

Thanks for working on this!

-- 
Ross Lagerwall




More information about the Cluster-devel mailing list