[Cluster-devel] [PATCH] gfs2: Fix uaf for qda in gfs2_quota_sync

Bob Peterson rpeterso at redhat.com
Tue Aug 22 19:32:28 UTC 2023


On 1/26/23 11:10 PM, eadavis at sina.com wrote:
> From: Edward Adam Davis <eadavis at sina.com>
> 
> [   81.372851][ T5532] CPU: 1 PID: 5532 Comm: syz-executor.0 Not tainted 6.2.0-rc1-syzkaller-dirty #0
> [   81.382080][ T5532] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
> [   81.392343][ T5532] Call Trace:
> [   81.395654][ T5532]  <TASK>
> [   81.398603][ T5532]  dump_stack_lvl+0x1b1/0x290
> [   81.418421][ T5532]  gfs2_assert_warn_i+0x19a/0x2e0
> [   81.423480][ T5532]  gfs2_quota_cleanup+0x4c6/0x6b0
> [   81.428611][ T5532]  gfs2_make_fs_ro+0x517/0x610
> [   81.457802][ T5532]  gfs2_withdraw+0x609/0x1540
> [   81.481452][ T5532]  gfs2_inode_refresh+0xb2d/0xf60
> [   81.506658][ T5532]  gfs2_instantiate+0x15e/0x220
> [   81.511504][ T5532]  gfs2_glock_wait+0x1d9/0x2a0
> [   81.516352][ T5532]  do_sync+0x485/0xc80
> [   81.554943][ T5532]  gfs2_quota_sync+0x3da/0x8b0
> [   81.559738][ T5532]  gfs2_sync_fs+0x49/0xb0
> [   81.564063][ T5532]  sync_filesystem+0xe8/0x220
> [   81.568740][ T5532]  generic_shutdown_super+0x6b/0x310
> [   81.574112][ T5532]  kill_block_super+0x79/0xd0
> [   81.578779][ T5532]  deactivate_locked_super+0xa7/0xf0
> [   81.584064][ T5532]  cleanup_mnt+0x494/0x520
> [   81.593753][ T5532]  task_work_run+0x243/0x300
> [   81.608837][ T5532]  exit_to_user_mode_loop+0x124/0x150
> [   81.614232][ T5532]  exit_to_user_mode_prepare+0xb2/0x140
> [   81.619820][ T5532]  syscall_exit_to_user_mode+0x26/0x60
> [   81.625287][ T5532]  do_syscall_64+0x49/0xb0
> [   81.629710][ T5532]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   81.636292][ T5532] RIP: 0033:0x7efdd688d517
> [   81.640728][ T5532] Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> [   81.660550][ T5532] RSP: 002b:00007fff34520ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [   81.669413][ T5532] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007efdd688d517
> [   81.677403][ T5532] RDX: 00007fff34520db9 RSI: 000000000000000a RDI: 00007fff34520db0
> [   81.685388][ T5532] RBP: 00007fff34520db0 R08: 00000000ffffffff R09: 00007fff34520b80
> [   81.695973][ T5532] R10: 0000555555ca38b3 R11: 0000000000000246 R12: 00007efdd68e6b24
> [   81.704152][ T5532] R13: 00007fff34521e70 R14: 0000555555ca3810 R15: 00007fff34521eb0
> [   81.712868][ T5532]  </TASK>
> 
> The function "gfs2_quota_cleanup()" may be called in the function "do_sync()",
> This will cause the qda obtained in the function "qd_check_sync" to be released, resulting in the occurrence of uaf.
> In order to avoid this uaf, we can increase the judgment of "sdp->sd_quota_bitmap" released in the function
> "gfs2_quota_cleanup" to confirm that "sdp->sd_quota_list" has been released.
> 
> Link: https://lore.kernel.org/all/0000000000002b5e2405f14e860f@google.com
> Reported-and-tested-by: syzbot+3f6a670108ce43356017 at syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis at sina.com>
> ---
>   fs/gfs2/quota.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index 1ed1722..4cf66bd 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -1321,6 +1321,9 @@ int gfs2_quota_sync(struct super_block *sb, int type)
>   					qda[x]->qd_sync_gen =
>   						sdp->sd_quota_sync_gen;
>   
> +			if (!sdp->sd_quota_bitmap)
> +				break;
> +
>   			for (x = 0; x < num_qd; x++)
>   				qd_unlock(qda[x]);
>   		}

Hi Edward,

Can you try to recreate this problem on a newer version of the gfs2 code?

In the linux-gfs2 repository I've got a branch called "bobquota" that 
has a bunch of patches related to quota syncing. I don't know if these 
will fix your problem, but it's worth trying.

The thing is, the qda array should have been populated by previous 
calls, like qd_fish and such, and should be okay to release by 
quota_cleanup.

I can tell you this:

In the call trace above, function do_sync tried to lock an inode glock, 
which tried to instantiate it, and that caused a withdraw.
The thing is, the only inode glock used by do_sync is for the system 
quota inode. If it had a problem instantiating that, your file system is 
corrupt and needs to be run through fsck.gfs2. It could indicate a 
hardware problem reading the system quota dinode from the storage media.

If possible I'd like to know how you cause this problem to occur. What 
were you doing to get this to happen? And how can I recreate it?

GFS2 might have a problem with withdrawing during this sequence, but I 
don't think it has much to do with the sd_quota_bitmap.

Regards,

Bob Peterson
GFS2 File System



More information about the Cluster-devel mailing list