[Virtio-fs] [PATCH 2/2] fuse: remove dmap when truncating inode
Tao Peng
bergwolf at hyper.sh
Sat May 11 15:24:44 UTC 2019
On Sat, May 11, 2019 at 1:31 AM Vivek Goyal <vgoyal at redhat.com> wrote:
>
> On Sat, May 11, 2019 at 12:16:23AM +0800, Tao Peng wrote:
> > On Fri, May 10, 2019 at 11:23 PM Vivek Goyal <vgoyal at redhat.com> wrote:
> > >
> > > On Fri, May 10, 2019 at 06:29:57PM +0800, Peng Tao wrote:
> > > > The obseleted dmap entries can be put back to global free list
> > > > immediately and we don't have to rely on reclaim code to free them.
> > > >
> > > > Signed-off-by: Peng Tao <tao.peng at linux.alibaba.com>
> > > > ---
> > > > fs/fuse/dir.c | 5 ++++
> > > > fs/fuse/file.c | 69 +++++++++++++++++++++++++++++++++++++++---------
> > > > fs/fuse/fuse_i.h | 2 ++
> > > > 3 files changed, 63 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> > > > index 3f923fe7841a..233c3ed391f1 100644
> > > > --- a/fs/fuse/dir.c
> > > > +++ b/fs/fuse/dir.c
> > > > @@ -1751,6 +1751,11 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > > > down_write(&fi->i_mmap_sem);
> > > > truncate_pagecache(inode, outarg.attr.size);
> > > > invalidate_inode_pages2(inode->i_mapping);
> > > > + // Free dmap beyond i_size
> > > > + if (IS_DAX(inode) && oldsize > outarg.attr.size)
> > > > + fuse_dax_free_mappings_range(fc, inode,
> > > > + outarg.attr.size, -1);
> > > > +
> > > > up_write(&fi->i_mmap_sem);
> > > > }
> > > >
> > > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > > > index 51faed351c7c..6d130f2f3a23 100644
> > > > --- a/fs/fuse/file.c
> > > > +++ b/fs/fuse/file.c
> > > > @@ -30,6 +30,8 @@ static long __fuse_file_fallocate(struct file *file, int mode,
> > > > loff_t offset, loff_t length);
> > > > static struct fuse_dax_mapping *alloc_dax_mapping_reclaim(struct fuse_conn *fc,
> > > > struct inode *inode);
> > > > +static void fuse_dax_do_remove_mapping_locked(struct fuse_inode *fi,
> > > > + struct fuse_dax_mapping *dmap);
> > > >
> > > > static int fuse_send_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
> > > > int opcode, struct fuse_open_out *outargp)
> > > > @@ -3752,9 +3754,7 @@ int fuse_dax_reclaim_dmap_locked(struct fuse_conn *fc, struct inode *inode,
> > > > return ret;
> > > > }
> > > >
> > > > - /* Remove dax mapping from inode interval tree now */
> > > > - fuse_dax_interval_tree_remove(dmap, &fi->dmap_tree);
> > > > - fi->nr_dmaps--;
> > > > + fuse_dax_do_remove_mapping_locked(fi, dmap);
> > > > return 0;
> > > > }
> > > >
> > > > @@ -3831,6 +3831,58 @@ static struct fuse_dax_mapping *alloc_dax_mapping_reclaim(struct fuse_conn *fc,
> > > > }
> > > > }
> > > >
> > > > +/* Cleanup dmap entry and add back to free list */
> > > > +static void fuse_dax_do_free_mapping_locked(struct fuse_conn *fc, struct fuse_dax_mapping *dmap)
> > > > +{
> > > > + pr_debug("fuse: freed memory range start=0x%llx end=0x%llx "
> > > > + "window_offset=0x%llx length=0x%llx\n", dmap->start,
> > > > + dmap->end, dmap->window_offset, dmap->length);
> > > > + spin_lock(&fc->lock);
> > > > + __dmap_remove_busy_list(fc, dmap);
> > > > + dmap->inode = NULL;
> > > > + dmap->start = dmap->end = 0;
> > > > + __free_dax_mapping(fc, dmap);
> > > > + spin_unlock(&fc->lock);
> > > > +}
> > > > +
> > > > +/* Remove dax mapping from inode interval tree */
> > > > +static void fuse_dax_do_remove_mapping_locked(struct fuse_inode *fi, struct fuse_dax_mapping *dmap)
> > > > +{
> > > > + fuse_dax_interval_tree_remove(dmap, &fi->dmap_tree);
> > > > + fi->nr_dmaps--;
> > > > +}
> > >
> > > this is just a two line funciton. I think lets open code it and we don't
> > > need a separate function for this.
> > ok.
> >
> > >
> > >
> > > > +
> > > > +/*
> > > > + * Free inode dmap entries whose range falls entirely inside [start, end].
> > > > + * Called with inode->i_rwsem and fuse_inode->i_mmap_sem held.
> > > > + *
> > > > + * No need to send FUSE_REMOVEMAPPING to userspace because the userspace mappings
> > > > + * will be overridden next time dmap is used -- same logic is applied in
> > > > + * fuse_dax_reclaim_first_mapping().
> > >
> > > This is true that sending FUSE_REMOVEMAPPING is not strictly needed. But,
> > > thinking more about it, what if range does not get used soon. Then it
> > > will keep resources unnecessarily busy on host? (fd, file, inode, dentry,
> > > vma etc).
> > >
> > > I think it probably is a good idea to send FUSE_REMOVEMAPPING to daemon
> > > when range is being freed.
> > We are under inode->i_rwsem and i_mmap_sem here. It can be very
> > expensive if we wait for userspace holding locks. How about pushing it
> > to a work queue? Still we at least need to grab the dmap lock but it
> > seems less expensive and doesn't block truncate.
>
> How about freeing multiple ranges in single removemapping call. I think
> that's allowed. We can look into making it async also at some point of
> time, but right now it probably is best to keep code as simple as
> possible. Do typical workloads do truncate at high frequencey?
>
Good point! I'll change to do it. Thanks!
Cheers,
Tao
--
bergwolf at hyper.sh
More information about the Virtio-fs
mailing list