[Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed

Bob Peterson rpeterso at redhat.com
Wed Sep 22 12:47:32 UTC 2021


On 9/22/21 6:57 AM, Andreas Gruenbacher wrote:
> On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson <rpeterso at redhat.com> wrote:
>> Before this patch, when a glock was locked, the very first holder on the
>> queue would unlock the lockref and call the go_lock glops function (if
>> one exists), unless GL_SKIP was specified. When we introduced the new
>> node-scope concept, we allowed multiple holders to lock glocks in EX mode
>> and share the lock, but node-scope introduced a new problem: if the
>> first holder has GL_SKIP and the next one does NOT, since it is not the
>> first holder on the queue, the go_lock op was not called.
> 
> We use go_lock to (re)validate inodes (for inode glocks) and to read
> in bitmaps (for resource group glocks). I can see how calling go_lock
> was originally tied to the first lock holder, but GL_SKIP already
> broke the simple model that the first holder will call go_lock. The
> go_lock_needed callback only makes things worse yet again,
> unfortunately.

In what way does go_lock_needed make things worse?

> How about we introduce a new GLF_REVALIDATE flag that indicates that
> go_lock needs to be called? The flag would be set when instantiating a
> new glock and when dequeuing the last holder, and cleared in go_lock
> (and in gfs2_inode_refresh for GL_SKIP holders). I'm not sure if

That was my original design, and it makes the most sense. I named the
flag GLF_GO_LOCK_SKIPPED, but it was essentially the same thing.
Unfortunately, I ran into all kinds of problems implementing it. In
those patches, the first holder would either call glops->go_lock() or
set GLF_GO_LOCK_SKIPPED. Once the go_lock function completed, it cleared
GLF_GO_LOCK_SKIPPED and called wake_up_bit. Secondary holders called
wait_on_bit and waited for the other process's go_lock to complete.
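
For reference, the flag lifecycle under discussion can be modeled in
userspace roughly as follows. This is only a sketch under stated
assumptions, not the kernel code and not the earlier patch set: every
model_* name and MODEL_GL_SKIP is hypothetical, the single boolean stands
in for GLF_REVALIDATE (or GLF_GO_LOCK_SKIPPED), and it deliberately leaves
out the wait_on_bit/wake_up_bit handshake between concurrent holders.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in gfs2. */
enum { MODEL_GL_SKIP = 1 };  /* stand-in for the GL_SKIP holder flag */

struct model_glock {
    bool revalidate;     /* stand-in for GLF_REVALIDATE */
    int  holders;        /* number of holders currently on the glock */
    int  go_lock_calls;  /* how many times the go_lock stand-in has run */
};

/* The flag is set when the glock is instantiated. */
static void model_glock_init(struct model_glock *gl)
{
    gl->revalidate = true;
    gl->holders = 0;
    gl->go_lock_calls = 0;
}

/* Stand-in for glops->go_lock(): do the (re)validation, clear the flag. */
static void model_go_lock(struct model_glock *gl)
{
    gl->go_lock_calls++;
    gl->revalidate = false;
}

/*
 * Any holder without GL_SKIP that finds the flag set runs go_lock,
 * whether or not it is the first holder on the queue.
 */
static void model_acquire(struct model_glock *gl, int flags)
{
    gl->holders++;
    if (gl->revalidate && !(flags & MODEL_GL_SKIP))
        model_go_lock(gl);
}

/* Dequeuing the last holder sets the flag again. */
static void model_release(struct model_glock *gl)
{
    assert(gl->holders > 0);
    if (--gl->holders == 0)
        gl->revalidate = true;
}
```

In this model, a first holder with GL_SKIP leaves the flag set, so the
next holder without GL_SKIP still triggers go_lock even though it is not
first on the queue, which is the case the original commit message
describes as broken.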

But I had tons of problems getting this to work properly. Processes 
would hang and deadlock for seemingly no reason. Finally I got 
frustrated and sought other solutions.

I'm willing to try to resurrect that patch set and try again. Maybe you 
can help me figure out what I'm doing wrong and why it's not working.

Bob Peterson

> GLF_REVALIDATE can fully replace GIF_INVALID as well, but it looks
> like it at first glance.
> 
> Thanks,
> Andreas



