[Linux-cachefs] Problems doing DIO to netfs cache on XFS from Ceph

Dave Chinner david at fromorbit.com
Thu Dec 3 22:12:02 UTC 2020


On Thu, Dec 03, 2020 at 02:10:56PM +0000, David Howells wrote:
> Hi Christoph,
> 
> We're having a problem making the fscache/cachefiles rewrite work with XFS, if
> you could have a look?  Jeff Layton just tripped the attached warning from
> this:
> 
> 	/*
> 	 * Given that we do not allow direct reclaim to call us, we should
> 	 * never be called in a recursive filesystem reclaim context.
> 	 */
> 	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> 		goto redirty;

I've pointed out in other threads where issues like this have been
raised that this check is not correct and was broken some time ago
by the PF_FSTRANS removal. The "NOFS" case here was originally using
PF_FSTRANS to protect against recursion from within transaction
contexts, not recursion through memory reclaim.  Doing writeback
from memory reclaim is caught by the preceding PF_MEMALLOC check,
not this one.

What it is supposed to be warning about is that writeback in XFS can
start new transactions, and nesting transactions is a guaranteed way
to deadlock the journal. IOWs, doing writeback from an active
transaction context is a bug in XFS.

So we are waiting on a new version of this patchset to be posted:

https://lore.kernel.org/linux-xfs/20201103131754.94949-1-laoar.shao@gmail.com/

so that we can get rid of this check from iomap and test for the
transaction recursion case directly in the XFS code. Then your problem
goes away completely.
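
To make the distinction concrete, here is a minimal userspace sketch of
the two checks. It is a model and not kernel code: the flag values, the
struct, the in_transaction field and the function names are all
hypothetical, and the linked patchset may well track transaction context
differently. It only shows why a flag-based test fires for a task that
has NOFS set for unrelated reasons, while a test on the transaction
state itself does not.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
#define MODEL_PF_MEMALLOC      0x1u  /* task is running memory reclaim */
#define MODEL_PF_MEMALLOC_NOFS 0x2u  /* task restricted to NOFS allocations */

/* Hypothetical stand-in for the parts of task state that matter here. */
struct model_task {
	unsigned int flags;
	bool in_transaction;  /* assumption: explicit "transaction active" marker */
};

enum wb_result {
	WB_PROCEED,          /* writeback may go ahead */
	WB_REDIRTY_RECLAIM,  /* called from memory reclaim: redirty the page */
	WB_REDIRTY_NESTED,   /* would nest a transaction: redirty the page */
};

/* Models the check as it stands: NOFS is used as a proxy for "in transaction". */
enum wb_result model_check_flag_based(const struct model_task *t)
{
	if (t->flags & MODEL_PF_MEMALLOC)
		return WB_REDIRTY_RECLAIM;
	if (t->flags & MODEL_PF_MEMALLOC_NOFS)
		return WB_REDIRTY_NESTED;  /* false positive if NOFS was set elsewhere */
	return WB_PROCEED;
}

/* Models the intended check: look at the transaction state directly. */
enum wb_result model_check_transaction_based(const struct model_task *t)
{
	if (t->flags & MODEL_PF_MEMALLOC)
		return WB_REDIRTY_RECLAIM;
	if (t->in_transaction)
		return WB_REDIRTY_NESTED;
	return WB_PROCEED;
}
```

In this model, a task with only MODEL_PF_MEMALLOC_NOFS set and no
transaction active is refused by model_check_flag_based() but allowed
through by model_check_transaction_based(), whereas a task with
in_transaction set is refused by the latter regardless of its flags.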

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com