[Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

Bob Peterson rpeterso at redhat.com
Wed Aug 25 16:11:13 UTC 2021


On 8/24/21 5:27 PM, Andreas Gruenbacher wrote:
> On Tue, Aug 24, 2021 at 6:48 PM Bob Peterson <rpeterso at redhat.com> wrote:
>> On 8/24/21 11:12 AM, Andreas Gruenbacher wrote:
>>> On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson <rpeterso at redhat.com> wrote:
>>>> Before this patch, the go_xmote_bh function was passed gl, the glock
>>>> pointer. This patch switches it to gh, the holder, which points to the gl.
>>>> This facilitates improvements for the next patch.
>>>>
>>>> Signed-off-by: Bob Peterson <rpeterso at redhat.com>
>>>> ---
>>>>    fs/gfs2/glock.c  | 4 ++--
>>>>    fs/gfs2/glops.c  | 5 +++--
>>>>    fs/gfs2/incore.h | 2 +-
>>>>    3 files changed, 6 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
>>>> index e0eaa9cf9fb6..d43eed1696ab 100644
>>>> --- a/fs/gfs2/glock.c
>>>> +++ b/fs/gfs2/glock.c
>>>> @@ -562,9 +562,9 @@ static void finish_xmote(struct gfs2_glock *gl, unsigned int ret)
>>>>           if (test_and_clear_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags))
>>>>                   gfs2_demote_wake(gl);
>>>>           if (state != LM_ST_UNLOCKED) {
>>>> -               if (glops->go_xmote_bh) {
>>>> +               if (gh && glops->go_xmote_bh) {
>>>
>>> This changes when the callback is called. Please explain why that's okay.
>>
>> This is okay because patch 3 eliminates go_xmote_bh() completely anyway.
>> I just threw the "gh &&" as an abundance of precaution.
> 
> I didn't mean this as a joke. This patch changes when the go_xmote_bh
> hook is called, and there is no mention why that's a safe thing to do.
> Then the go_xmote_bh hook is removed in favor of go_lock, which is yet
> called under different circumstances, and again there is no mention
> why that doesn't matter and we still end up calling freeze_go_xmote_bh
> (now freeze_go_lock) in the right circumstances.
> 
> Also, it's never been okay to break things just because a patch
> further down the line fixes it again (or breaks it in a different
> way). Please explain your changes; this also serves to document that
> you haven't accidentally missed a case.
> 
> Thanks,
> Andreas
> 
Hi Andreas,

The patch doesn't really change when the go_xmote_bh hook is called
because of the various circumstances under which it's called:

First of all, bear in mind that the only user of go_xmote_bh is the 
freeze glock.
There are only three places from which finish_xmote() is called:
(1) do_xmote calls finish_xmote for target == LM_ST_UNLOCKED.
     In this case finish_xmote() only calls glops->go_xmote_bh if state
     != LM_ST_UNLOCKED, so in this case, go_xmote_bh (and also
     do_promote) can never be called.
(2) do_xmote calls finish_xmote but only if lock_nolock locking
     protocol, which grants a holder immediately rather than making a
     lock request to dlm. In this case a holder must exist in this case
     because:
        do_xmote has the following callers:
           (2a) run_queue's call to do_xmote at the end, which passes in
                target as gl->gl_target.
                This is preceded by one of two cases:
                (2aa) If the GLF_DEMOTE bit is set, gh is passed in NULL,
                   but in that case gl->gl_target = gl->gl_demote_state.
                   We know gl->gl_demote_state cannot be LM_ST_EXCLUSIVE
                   or it would BUG_ON before the call. Since the freeze
                   glock is never taken in DEFERRED, it must be either
                   LM_ST_UNLOCKED or LM_ST_SHARED. If it is
                   LM_ST_UNLOCKED, finish_xmote will never call
                   go_xmote_bh because of "if (state != LM_ST_UNLOCKED)".
                   So we need only worry about SHARED.
                   (2aaa) There's only one place gl_demote_state is set
                      to a variable that could result in SHARED:
                      handle_callback. There's only one place
                      handle_callback is called with a variable that
                      could result in SHARED: gfs2_glock_cb.
                      There's only one place gfs2_glock_cb is called with
                      the constant LM_ST_SHARED: a callback from dlm,
                      which is impossible because we stated in (2) that
                      this is lock_nolock.
                      There's only one place gfs2_glock_cb is called with
                      a variable that could result in SHARED: sys.c when
                      the special gfs2 sysfs file is used to manually
                      force-promote a glock. This should never be used by
                      customers except in extreme deadlock situations,
                      and under the direction of support. It is not
                      tested, advised, nor recommended. Under normal
                      running conditions, this will never be called.
                (2ab) The GLF_DEMOTE test fails in run_queue. Prior to
                      its call to do_xmote it does:
                      "gh = find_first_waiter(gl);" then immediately
                      dereferences gh. So we know gh must be valid or
                      we would see kernel panic: finish_xmote will surely
                      find the same gh it dereferences here.
           (2b) finish_xmote has two calls back into do_xmote if its lock
                state change was not granted by dlm (conversion
                deadlock). We don't need to worry about these two calls
                because this is lock_nolock as per (2).
(3) glock_work_func calls finish_xmote if the GLF_REPLY_PENDING bit it
     set, passing gl_reply. Essentially this is the async/dlm version of
     the synchronous/nolock found in (1). But gl_reply is only set one
     place: gfs2_glock_complete which is a dlm callback.
     We can only get a dlm callback to gfs2_glock_complete from gdlm_ast,
     which is a reply to a lock (not unlock) request. A lock request is
     only made from do_xmote(), in which case glock_work_func's call to
     finish_xmote is responding to an async reply from dlm to that lock
     request. A lock request implies there's a "waiter" holder record
     which will be found by finish_xmote and passed into go_xmote_bh.

If you like, I can add a comment similar to the above to patch 1, but
I didn't see the great need to document all this, since the whole
go_xmote_bh concept is being replaced by the third patch anyway.

Before the third patch, finish_xmote() called BOTH go_xmote_bh() glop
(if one exists) AND do_promote(), which calls go_lock() glop (if one
exists) when the head-of-the-list holder is promoted. Calling
go_xmote_bh multiple times for multiple holders doesn't even make sense
for the freeze glock: we should not check the journal for a clean
unmount multiple times during a single freeze/thaw process.

The freeze glock will never have multiple SH holders on the same
cluster node because of the way gfs2's freeze/thaw mechanism works.
There are only 2 possible holders: (1) the EX holder while the file
system is frozen and (2) the SH holder that waits for the unfreeze.

So patch 3 switches the freeze glock to use the standard go_lock()
mechanism rather than go_xmote_bh, which is so poorly-thought-out that
patch 2 is needed to fix its short-comings.

Since the freeze glock did not already have a go_lock glop, switching
it from the go_xmote_bh glop to the go_lock glop retains the same
function called at essentially the same time.

Regards,

Bob




More information about the Cluster-devel mailing list