[Virtio-fs] [PATCH RFC] virtiofs: use fine-grained lock for dmap reclaim
Vivek Goyal
vgoyal at redhat.com
Tue Jul 16 18:36:35 UTC 2019
On Mon, Jul 15, 2019 at 02:38:06PM -0700, Liu Bo wrote:
> On Mon, Jul 15, 2019 at 04:37:39PM -0400, Vivek Goyal wrote:
> > On Thu, Jul 04, 2019 at 03:25:31PM +0800, Liu Bo wrote:
> > > With free fuse dax mapping reducing, read performance is impacted
> > > significantly because reads need to wait for a free fuse dax mapping.
> > >
> > > Although reads will trigger reclaim work to try to reclaim fuse dax
> > > mapping, reclaim code can barely make any progress if most of fuse dax
> > > mappings are used by the file we're reading since inode lock is required
> > > by reclaim code.
> > >
> > > However, we don't have to take inode lock for reclaiming if dax mapping
> > > has its own reference count, reference counting is to tell reclaim code to
> > > skip those in use dax mappings, such that we can avoid the risk of
> > > accidentally reclaiming a dax mapping that other readers are using.
> > >
> > > On the other hand, holding ->i_dmap_sem during reclaim can be used to
> > > prevent the follwing reads to get a dax mapping under reclaim.
> > >
> > > Another reason is that reads/writes only use fuse dax mapping within
> > > dax_iomap_rw(), so we can do such a trick, while mmap/faulting is a
> > > different story and we have to take ->i_mmap_sem prior to reclaiming a dax
> > > mapping in order to avoid the race.
> > >
> > > This adds reference count for fuse dax mapping and removes the acquisition
> > > of inode lock during reclaim.
> >
> > I am not sure that this reference count implementation is safe. For
> > example, what prevents atomic_dec() from being reordered so that it
> > could be executed before actually copying to user space is finished.
> >
> > Say cpu1 is reading a dax page and cpu 2 is freeing memory.
> >
> > cpu1 read cpu2 free dmap
> > --------- -------------
> > atomic_inc() atomic_read()
> > copy data to user space
> > atomic_dec
> >
> > Say atomic_dec() gets reordered w.r.t copy_data_to_user_space.
> >
> > cpu1 read cpu2 free dmap
> > --------- -------------
> > atomic_inc() atomic_read()
> > atomic_dec
> > copy data to user space
> >
> > Now cpu2 will free up dax range while it is still being read?
>
> Yep, I think this is possible.
>
> For this specific reorder, barriers like smp_mb__{before,after}_atomic could fix
> it.
Hi Liu Bo,
I have modified the patch to use refcount_t. Our use case is little
different than typical reference counts. I hope that are no bugs there.
I have also fixed bunch of other issues and enabled inline range reclaim
for read path. (It became possible with dmap refcount patch).
Pushed my changes here for now.
https://github.com/rhvgoyal/linux/commits/virtio-fs-dev-5.1
Please have a look and test. If everything looks good, I will squash these
new patches into existing patches.
Thanks
Vivek
More information about the Virtio-fs
mailing list