[Cluster-devel] [PATCH] gfs2: Flag a withdraw if init_threads() fails

Andrew Price anprice at redhat.com
Mon Mar 15 15:05:57 UTC 2021


On 15/03/2021 14:32, Andreas Gruenbacher wrote:
> On Mon, Mar 15, 2021 at 1:24 PM Andrew Price <anprice at redhat.com> wrote:
>> Interrupting mount with ^C quickly enough can cause the kthread_run()
>> calls in gfs2's init_threads() to fail and the error path leads to a
>> deadlock on the s_umount rwsem. The abridged chain of events is:
>>
>>    [mount path]
>>    get_tree_bdev()
>>      sget_fc()
>>        alloc_super()
>>          down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); [acquired]
>>      gfs2_fill_super()
>>        gfs2_make_fs_rw()
>>          init_threads()
>>            kthread_run()
>>              ( Interrupted )
>>        [Error path]
>>        gfs2_gl_hash_clear()
>>          flush_workqueue(glock_workqueue)
>>            wait_for_completion()
>>
>>    [workqueue context]
>>    glock_work_func()
>>      run_queue()
>>        do_xmote()
>>          freeze_go_sync()
>>            freeze_super()
>>              down_write(&sb->s_umount) [deadlock]
>>
>> In freeze_go_sync() there is a gfs2_withdrawn() check that we can use to
>> make sure freeze_super() is not called in the error path, so add a
>> gfs2_withdraw_delayed() call when init_threads() fails.
>>
>> Ref: https://bugzilla.kernel.org/show_bug.cgi?id=212231
>>
>> Reported-by: Alexander Aring <aahringo at redhat.com>
>> Signed-off-by: Andrew Price <anprice at redhat.com>
>> ---
>>   fs/gfs2/super.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
>> index 97076d3f562f..9e91c9d92bd6 100644
>> --- a/fs/gfs2/super.c
>> +++ b/fs/gfs2/super.c
>> @@ -162,8 +162,10 @@ int gfs2_make_fs_rw(struct gfs2_sbd *sdp)
>>          int error;
>>
>>          error = init_threads(sdp);
>> -       if (error)
>> +       if (error) {
>> +               gfs2_withdraw_delayed(sdp);
> 
> Hmm, marking the filesystem as withdrawing before we've even started
> looks a bit odd,

I agree, it's not elegant. I'm not confident that I understand why there
is work queued to freeze the fs at this point, but this is the cleanest
way I can think of to prevent that right now. Perhaps we can come up
with something better.

Andy

> but given that we're already checking for withdrawing
> / withdrawn filesystems all over the place, it should be okay. I'll
> push this to for-next.
> 
>>                  return error;
>> +       }
>>
>>          j_gl->gl_ops->go_inval(j_gl, DIO_METADATA);
>>          if (gfs2_withdrawn(sdp)) {
>> --
>> 2.29.2
> 
> Thanks,
> Andreas
> 
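
For readers following the thread, the effect of the gfs2_withdrawn() check
described in the commit message can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not kernel code: the
struct and every function prefixed model_ are hypothetical stand-ins, and the
locking is reduced to a boolean. The real functions involved are
freeze_go_sync(), freeze_super(), gfs2_withdrawn() and gfs2_withdraw_delayed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the superblock state; not struct gfs2_sbd. */
struct model_sbd {
	bool withdrawn;       /* models the state that gfs2_withdrawn() tests */
	bool s_umount_held;   /* models s_umount held for write by the mount path */
	int freeze_attempts;  /* how often the model tried to take s_umount again */
};

/*
 * Models freeze_super(): it needs s_umount for write. Returns false when
 * the lock is already held, which in the kernel would block forever.
 */
static bool model_freeze_super(struct model_sbd *sdp)
{
	assert(sdp != NULL);
	sdp->freeze_attempts++;
	return !sdp->s_umount_held;
}

/* Models the check in freeze_go_sync(): skip the freeze once withdrawn. */
static bool model_freeze_go_sync(struct model_sbd *sdp)
{
	if (sdp->withdrawn)
		return true;
	return model_freeze_super(sdp);
}

/*
 * Models the error path after init_threads() fails during mount.
 * Returns true if the queued glock work can finish, false if it would
 * deadlock on s_umount.
 */
bool model_mount_error_path(struct model_sbd *sdp, bool with_patch)
{
	sdp->s_umount_held = true;      /* taken earlier, in alloc_super() */
	if (with_patch)
		sdp->withdrawn = true;  /* stands in for gfs2_withdraw_delayed() */
	/* flush_workqueue() waits for the glock work, which may try to freeze */
	return model_freeze_go_sync(sdp);
}
```

Calling model_mount_error_path() on a zero-initialised struct with
with_patch set to false reports that the work cannot finish, after one
attempt to take the lock. With with_patch set to true the freeze is skipped
and the lock is never attempted.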



