[Cluster-devel] [PATCH v5 00/12] gfs2: Fix mmap + page fault deadlocks

Andreas Gruenbacher agruenba at redhat.com
Tue Aug 3 19:18:06 UTC 2021


Hi all,

here's another update on top of v5.14-rc4.  There seems to be a bug in
get_user_pages_fast when called with FOLL_FAST_ONLY; please see below.

Changes:

 * Change fault_in_pages_{readable,writeable} to return the number of
   bytes that should be accessible instead of failing outright when
   part of the requested region cannot be faulted in.  Change
   iov_iter_fault_in_readable to the same semantics.

 * Add fault_in_iov_iter_writeable for safely faulting in pages for
   writing without modifying the pages.


With this patch queue, fstest generic/208 (aio-dio-invalidate-failure.c)
endlessly spins in gfs2_file_direct_write.  It looks as if there's a bug
in get_user_pages_fast when called with FOLL_FAST_ONLY:

 (1) The test case performs an aio write into a 32 MB buffer.

 (2) The buffer is initially not in memory, so when iomap_dio_rw() ->
     ... -> bio_iov_iter_get_pages() is called with the iter->noio flag
     set, we get to get_user_pages_fast() with FOLL_FAST_ONLY set.
     get_user_pages_fast() returns 0, which causes
     bio_iov_iter_get_pages to return -EFAULT.

 (3) Then gfs2_file_direct_write faults in the entire buffer with
     fault_in_iov_iter_readable(), which succeeds.

 (4) With the buffer in memory, we retry the iomap_dio_rw() ->
     ... -> bio_iov_iter_get_pages() -> ... -> get_user_pages_fast().
     This should succeed now, but get_user_pages_fast() still returns 0.

 (5) Thus we end up in step (3) again.

The buffered writes generic/208 performs are unrelated to this hang.


Apart from the generic/208 hang, gfs2 still needs a better strategy for
faulting in more reasonable chunks of memory at a time and for resuming
requests after faulting in pages.  We've got some of the pieces in place
for safely allowing that, but more work remains to be done.


For immediate consideration by Al Viro:

  iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value


For immediate consideration by Paul Mackerras:

  powerpc/kvm: Fix kvm_use_magic_page


Thanks,
Andreas


Andreas Gruenbacher (12):
  iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
  powerpc/kvm: Fix kvm_use_magic_page
  Turn fault_in_pages_{readable,writeable} into
    fault_in_{readable,writeable}
  Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable
  iov_iter: Introduce fault_in_iov_iter_writeable
  gfs2: Add wrapper for iomap_file_buffered_write
  gfs2: Fix mmap + page fault deadlocks for buffered I/O
  iomap: Fix iomap_dio_rw return value for user copies
  iomap: Support restarting direct I/O requests after user copy failures
  iomap: Add done_before argument to iomap_dio_rw
  iov_iter: Introduce noio flag to disable page faults
  gfs2: Fix mmap + page fault deadlocks for direct I/O

 arch/powerpc/kernel/kvm.c           |   3 +-
 arch/powerpc/kernel/signal_32.c     |   4 +-
 arch/powerpc/kernel/signal_64.c     |   2 +-
 arch/x86/kernel/fpu/signal.c        |   8 +-
 drivers/gpu/drm/armada/armada_gem.c |   7 +-
 fs/btrfs/file.c                     |   8 +-
 fs/btrfs/ioctl.c                    |   7 +-
 fs/ext4/file.c                      |   5 +-
 fs/f2fs/file.c                      |   6 +-
 fs/fuse/file.c                      |   2 +-
 fs/gfs2/file.c                      |  95 ++++++++++++++++++++---
 fs/iomap/buffered-io.c              |   2 +-
 fs/iomap/direct-io.c                |  28 +++++--
 fs/ntfs/file.c                      |   2 +-
 fs/xfs/xfs_file.c                   |   6 +-
 fs/zonefs/super.c                   |   4 +-
 include/linux/iomap.h               |  11 ++-
 include/linux/pagemap.h             |  58 +-------------
 include/linux/uio.h                 |   4 +-
 lib/iov_iter.c                      | 107 ++++++++++++++++++++------
 mm/filemap.c                        |   4 +-
 mm/gup.c                            | 113 ++++++++++++++++++++++++++++
 22 files changed, 360 insertions(+), 126 deletions(-)

-- 
2.26.3
