[Virtio-fs] [PATCH RFC] virtiofs: use fine-grained lock for dmap reclaim
Liu Bo
bo.liu at linux.alibaba.com
Tue Jul 16 19:07:38 UTC 2019
Hi Vivek,
On Tue, Jul 16, 2019 at 02:36:35PM -0400, Vivek Goyal wrote:
> On Mon, Jul 15, 2019 at 02:38:06PM -0700, Liu Bo wrote:
> > On Mon, Jul 15, 2019 at 04:37:39PM -0400, Vivek Goyal wrote:
> > > On Thu, Jul 04, 2019 at 03:25:31PM +0800, Liu Bo wrote:
> > > > With free fuse dax mapping reducing, read performance is impacted
> > > > significantly because reads need to wait for a free fuse dax mapping.
> > > >
> > > > Although reads will trigger reclaim work to try to reclaim fuse dax
> > > > mapping, reclaim code can barely make any progress if most of fuse dax
> > > > mappings are used by the file we're reading since inode lock is required
> > > > by reclaim code.
> > > >
> > > > However, we don't have to take inode lock for reclaiming if dax mapping
> > > > has its own reference count, reference counting is to tell reclaim code to
> > > > skip those in use dax mappings, such that we can avoid the risk of
> > > > accidentally reclaiming a dax mapping that other readers are using.
> > > >
> > > > On the other hand, holding ->i_dmap_sem during reclaim can be used to
> > > > prevent the follwing reads to get a dax mapping under reclaim.
> > > >
> > > > Another reason is that reads/writes only use fuse dax mapping within
> > > > dax_iomap_rw(), so we can do such a trick, while mmap/faulting is a
> > > > different story and we have to take ->i_mmap_sem prior to reclaiming a dax
> > > > mapping in order to avoid the race.
> > > >
> > > > This adds reference count for fuse dax mapping and removes the acquisition
> > > > of inode lock during reclaim.
> > >
> > > I am not sure that this reference count implementation is safe. For
> > > example, what prevents atomic_dec() from being reordered so that it
> > > could be executed before actually copying to user space is finished.
> > >
> > > Say cpu1 is reading a dax page and cpu 2 is freeing memory.
> > >
> > > cpu1 read cpu2 free dmap
> > > --------- -------------
> > > atomic_inc() atomic_read()
> > > copy data to user space
> > > atomic_dec
> > >
> > > Say atomic_dec() gets reordered w.r.t copy_data_to_user_space.
> > >
> > > cpu1 read cpu2 free dmap
> > > --------- -------------
> > > atomic_inc() atomic_read()
> > > atomic_dec
> > > copy data to user space
> > >
> > > Now cpu2 will free up dax range while it is still being read?
> >
> > Yep, I think this is possible.
> >
> > For this specific reorder, barriers like smp_mb__{before,after}_atomic could fix
> > it.
>
> Hi Liu Bo,
>
> I have modified the patch to use refcount_t. Our use case is little
> different than typical reference counts. I hope that are no bugs there.
>
> I have also fixed bunch of other issues and enabled inline range reclaim
> for read path. (It became possible with dmap refcount patch).
>
> Pushed my changes here for now.
>
> https://github.com/rhvgoyal/linux/commits/virtio-fs-dev-5.1
>
> Please have a look and test. If everything looks good, I will squash these
> new patches into existing patches.
That's nice, I'll check it out.
Can you please send your patches out to this maillist?
(In fact it's difficult for us to track patches/changes on github,
esp. when they're squashed and we need to rebase our internal virtiofs
base by looking at the code line by line.)
thanks,
-liubo
More information about the Virtio-fs
mailing list