[Virtio-fs] [PATCH RFC] virtiofs: use fine-grained lock for dmap reclaim

Liu Bo bo.liu at linux.alibaba.com
Tue Jul 16 19:07:38 UTC 2019


Hi Vivek,

On Tue, Jul 16, 2019 at 02:36:35PM -0400, Vivek Goyal wrote:
> On Mon, Jul 15, 2019 at 02:38:06PM -0700, Liu Bo wrote:
> > On Mon, Jul 15, 2019 at 04:37:39PM -0400, Vivek Goyal wrote:
> > > On Thu, Jul 04, 2019 at 03:25:31PM +0800, Liu Bo wrote:
> > > > With free fuse dax mapping reducing, read performance is impacted
> > > > significantly because reads need to wait for a free fuse dax mapping.
> > > > 
> > > > Although reads will trigger reclaim work to try to reclaim fuse dax
> > > > mapping, reclaim code can barely make any progress if most of fuse dax
> > > > mappings are used by the file we're reading since inode lock is required
> > > > by reclaim code.
> > > > 
> > > > However, we don't have to take inode lock for reclaiming if dax mapping
> > > > has its own reference count, reference counting is to tell reclaim code to
> > > > skip those in use dax mappings, such that we can avoid the risk of
> > > > accidentally reclaiming a dax mapping that other readers are using.
> > > > 
> > > > On the other hand, holding ->i_dmap_sem during reclaim can be used to
> > > > prevent the follwing reads to get a dax mapping under reclaim.
> > > > 
> > > > Another reason is that reads/writes only use fuse dax mapping within
> > > > dax_iomap_rw(), so we can do such a trick, while mmap/faulting is a
> > > > different story and we have to take ->i_mmap_sem prior to reclaiming a dax
> > > > mapping in order to avoid the race.
> > > > 
> > > > This adds reference count for fuse dax mapping and removes the acquisition
> > > > of inode lock during reclaim.
> > > 
> > > I am not sure that this reference count implementation is safe. For
> > > example, what prevents atomic_dec() from being reordered so that it
> > > could be executed before actually copying to user space is finished.
> > > 
> > > Say cpu1 is reading a dax page and cpu 2 is freeing memory. 
> > > 
> > > cpu1 read				cpu2 free dmap
> > > ---------				-------------
> > > atomic_inc()				atomic_read()			
> > > copy data to user space
> > > atomic_dec
> > > 
> > > Say atomic_dec() gets reordered w.r.t copy_data_to_user_space.
> > > 
> > > cpu1 read				cpu2 free dmap
> > > ---------				-------------
> > > atomic_inc()				atomic_read()			
> > > atomic_dec
> > > copy data to user space
> > > 
> > > Now cpu2 will free up dax range while it is still being read?
> > 
> > Yep, I think this is possible.
> > 
> > For this specific reorder, barriers like smp_mb__{before,after}_atomic could fix
> > it.
> 
> Hi Liu Bo,
> 
> I have modified the patch to use refcount_t. Our use case is little
> different than typical reference counts. I hope that are no bugs there.
> 
> I have also fixed bunch of other issues and enabled inline range reclaim
> for read path. (It became possible with dmap refcount patch).
>
> Pushed my changes here for now. 
> 
> https://github.com/rhvgoyal/linux/commits/virtio-fs-dev-5.1
> 
> Please have a look and test. If everything looks good, I will squash these
> new patches into existing patches.

That's nice, I'll check it out.

Can you please send your patches out to this maillist?

(In fact it's difficult for us to track patches/changes on github,
esp. when they're squashed and we need to rebase our internal virtiofs
base by looking at the code line by line.)

thanks,
-liubo




More information about the Virtio-fs mailing list