[Cluster-devel] [PATCH 1/2] gfs2: Fix occasional glock use-after-free

Ross Lagerwall ross.lagerwall at citrix.com
Tue Mar 26 18:49:41 UTC 2019


On 1/31/19 5:18 PM, Andreas Gruenbacher wrote:
> Hi Ross,
> 
> On Thu, 31 Jan 2019 at 11:56, Ross Lagerwall <ross.lagerwall at citrix.com> wrote:
>> Each gfs2_bufdata stores a reference to a glock but the reference count
>> isn't incremented. This causes an occasional use-after-free of the
>> glock. Fix by taking a reference on the glock during allocation and
>> dropping it when freeing.
>>
>> Found by KASAN:
>>
>> BUG: KASAN: use-after-free in revoke_lo_after_commit+0x8e/0xe0 [gfs2]
>> Write of size 4 at addr ffff88801aff6134 by task kworker/0:2H/20371
>>
>> CPU: 0 PID: 20371 Comm: kworker/0:2H Tainted: G O 4.19.0+0 #1
>> Hardware name: Dell Inc. PowerEdge R805/0D456H, BIOS 4.2.1 04/14/2010
>> Workqueue: glock_workqueue glock_work_func [gfs2]
>> Call Trace:
>>   dump_stack+0x71/0xab
>>   print_address_description+0x6a/0x270
>>   kasan_report+0x258/0x380
>>   ? revoke_lo_after_commit+0x8e/0xe0 [gfs2]
>>   revoke_lo_after_commit+0x8e/0xe0 [gfs2]
>>   gfs2_log_flush+0x511/0xa70 [gfs2]
>>   ? gfs2_log_shutdown+0x1f0/0x1f0 [gfs2]
>>   ? __brelse+0x48/0x50
>>   ? gfs2_log_commit+0x4de/0x6e0 [gfs2]
>>   ? gfs2_trans_end+0x18d/0x340 [gfs2]
>>   gfs2_ail_empty_gl+0x1ab/0x1c0 [gfs2]
>>   ? inode_go_dump+0xe0/0xe0 [gfs2]
>>   ? inode_go_sync+0xe4/0x220 [gfs2]
>>   inode_go_sync+0xe4/0x220 [gfs2]
>>   do_xmote+0x12b/0x290 [gfs2]
>>   glock_work_func+0x6f/0x160 [gfs2]
>>   process_one_work+0x461/0x790
>>   worker_thread+0x69/0x6b0
>>   ? process_one_work+0x790/0x790
>>   kthread+0x1ae/0x1d0
>>   ? kthread_create_worker_on_cpu+0xc0/0xc0
>>   ret_from_fork+0x22/0x40
> 
> thanks for tracking this down, very interesting.
> 
> The consistency model here is that every buffer head that a struct
> gfs2_bufdata object is attached to is protected by a glock. Before a
> glock can be released, all the buffers under that glock have to be
> flushed out and released; this is what allows another node to access
> the same on-disk location without causing inconsistencies. When there
> is a bufdata object that points to a glock that has already been
> freed, this consistency model is broken. Taking an additional refcount
> as this patch does may make the use-after-free go away, but it doesn't
> fix the underlying problem. So I think we'll need a different fix
> here.
> 
> Did you observe this problem in a real-world scenario, or with KASAN
> only? It might be that we're looking at a small race that is unlikely
> to trigger in the field. In any case, I think we need to understand
> better what't actually going on.
> 

I finally got time to look into this further. The following is what 
happens as far as I can tell though my understanding is a bit limited at 
this point:

1. A file is opened, truncated and closed which results in a metadata 
write so a gfs2_bufdata object is created and associated with the inode 
glock.

2. The gfs2_bufdata is written out and placed on an AIL list.

3. The VFS layer calls gfs2_evict_inode() and the inode is evicted which 
drops a reference on the glock. It takes case 3 (i_nlink > 0) so no log 
flush or ail_flush happens.

4. The gfs2_bufdata object is however still on one of the AIL lists and 
the next gfs2_log_flush() begins to write a revoke entry.

5. At about the same time the glock is freed. At the point of freeing, 
gl_revokes is 1 (should probably have a GLOCK_BUG_ON for this).

6. gfs2_log_flush() continues and calls revoke_lo_after_commit() and 
uses the freed glock (stack trace above).

Should evict_inode call gfs2_ail_flush() or something so that the revoke 
is written before it drops its reference to the glock?

Or is there something else that is meant to keep the glock around if the 
inode is evicted while there is a linked gfs2_bufdata sitting on some 
AIL list?

Let me know if any more specific information is needed since I now have 
a test setup that can usually reproduce this within 10 minutes.

Thanks,
-- 
Ross Lagerwall




More information about the Cluster-devel mailing list