[Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

Andreas Gruenbacher agruenba at redhat.com
Wed Jul 28 06:50:47 UTC 2021


On Tue, Jul 13, 2021 at 9:34 PM Bob Peterson <rpeterso at redhat.com> wrote:
> On 7/13/21 1:26 PM, Steven Whitehouse wrote:
>
> Hi,
>
> On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote:
>
> Before this patch, gfs2 kept its own address space for rgrps, but this
> caused a lockdep problem because vfs assumes a 1:1 relationship between
> address spaces and their inode. One problematic area is this:
>
> I don't think that is the case. The reason that the address space is a
> separate structure in the first place is to allow them to exist without
> an inode. Maybe that has changed, but we should see why that is, in
> that case rather than just making this change immediately.
>
> I can't see any reason why if we have to have an inode here that it
> needs to be hashed... what would need to look it up via the hashes?
>
> Steve.
>
> Hi,
>
> The actual use case, which is easily demonstrated with lockdep, is given
> in the patch text shortly after where you placed your comment. This goes
> back to this discussion from April 2018:
>
> https://listman.redhat.com/archives/cluster-devel/2018-April/msg00017.html
>
> in which Jan Kara pointed out that:
>
> "The problem is we really do expect mapping->host->i_mapping == mapping as
> we pass mapping and inode interchangeably in the mm code. The address_space
> and inodes are separate structures because you can have many inodes
> pointing to one address space (block devices). However it is not allowed
> for several address_spaces to point to one inode!"

This is fundamentally at odds with how we manage inodes: we have
inode->i_mapping which is the logical address space of the inode, and
we have gfs2_glock2aspace(GFS2_I(inode)->i_gl) which is the metadata
address space of the inode. The most important function of the
metadata address space is to remove the inode's metadata from memory
by truncating the metadata address space (inode_go_inval). We need
that when moving an inode to another node. I don't have the faintest
idea how we could otherwise achieve that in a somewhat efficient way.
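
To make the conflict concrete, here is a rough userspace sketch of the
expectation Jan describes and of why a second, metadata-only address
space cannot satisfy it. This is not kernel code: the structures are cut
down to the fields that matter, nrpages is only a stand-in for cached
pages, and all of the model_* helpers and mapping_is_consistent() are
hypothetical names made up for illustration. Which inode gfs2 actually
uses as the host of a glock address space is not modelled here; the
sketch only assumes that the host's i_mapping points somewhere else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel structures; only the
 * fields needed for the illustration are present. */
struct inode;

struct address_space {
	struct inode *host;	/* inode this mapping claims to belong to */
	unsigned long nrpages;	/* model only: number of cached pages */
};

struct inode {
	struct address_space *i_mapping;	/* mapping used for the inode's data */
	struct address_space i_data;		/* embedded default mapping */
};

/* Hypothetical helper: an inode whose i_mapping is its own i_data. */
static void model_inode_init(struct inode *inode)
{
	inode->i_data.host = inode;
	inode->i_data.nrpages = 0;
	inode->i_mapping = &inode->i_data;
}

/* Hypothetical helper: an extra address space (think: metadata) that
 * names some inode as its host without being that inode's i_mapping. */
static void model_extra_aspace_init(struct address_space *mapping,
				    struct inode *host)
{
	mapping->host = host;
	mapping->nrpages = 0;
}

/* The quoted expectation: mapping->host->i_mapping == mapping. */
static bool mapping_is_consistent(const struct address_space *mapping)
{
	return mapping->host != NULL && mapping->host->i_mapping == mapping;
}

/* Model of dropping cached metadata by truncating one mapping only. */
static void model_truncate(struct address_space *mapping)
{
	assert(mapping != NULL);
	mapping->nrpages = 0;
}
```

In this model the inode's own mapping passes the check while the extra
mapping does not, yet truncating the extra mapping empties it without
touching the data mapping. That independent truncation is the property
we rely on for invalidating metadata.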

Thanks,
Andreas



