[Cluster-devel] [PATCH v7 16/19] iomap: Add done_before argument to iomap_dio_rw

Darrick J. Wong djwong at kernel.org
Fri Aug 27 21:32:39 UTC 2021


On Fri, Aug 27, 2021 at 10:15:11PM +0200, Andreas Gruenbacher wrote:
> On Fri, Aug 27, 2021 at 8:30 PM Darrick J. Wong <djwong at kernel.org> wrote:
> > On Fri, Aug 27, 2021 at 06:49:23PM +0200, Andreas Gruenbacher wrote:
> > > Add a done_before argument to iomap_dio_rw that indicates how much of
> > > the request has already been transferred.  When the request succeeds, we
> > > report that done_before additional bytes were tranferred.  This is
> > > useful for finishing a request asynchronously when part of the request
> > > has already been completed synchronously.
> > >
> > > We'll use that to allow iomap_dio_rw to be used with page faults
> > > disabled: when a page fault occurs while submitting a request, we
> > > synchronously complete the part of the request that has already been
> > > submitted.  The caller can then take care of the page fault and call
> > > iomap_dio_rw again for the rest of the request, passing in the number of
> > > bytes already tranferred.
> > >
> > > Signed-off-by: Andreas Gruenbacher <agruenba at redhat.com>
> > > ---
> > >  fs/btrfs/file.c       |  5 +++--
> > >  fs/ext4/file.c        |  5 +++--
> > >  fs/gfs2/file.c        |  4 ++--
> > >  fs/iomap/direct-io.c  | 11 ++++++++---
> > >  fs/xfs/xfs_file.c     |  6 +++---
> > >  fs/zonefs/super.c     |  4 ++--
> > >  include/linux/iomap.h |  4 ++--
> > >  7 files changed, 23 insertions(+), 16 deletions(-)
> > >
> >
> > <snip to the interesting parts>
> >
> > > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > > index ba88fe51b77a..dcf9a2b4381f 100644
> > > --- a/fs/iomap/direct-io.c
> > > +++ b/fs/iomap/direct-io.c
> > > @@ -31,6 +31,7 @@ struct iomap_dio {
> > >       atomic_t                ref;
> > >       unsigned                flags;
> > >       int                     error;
> > > +     size_t                  done_before;
> > >       bool                    wait_for_completion;
> > >
> > >       union {
> > > @@ -126,6 +127,9 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
> > >       if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
> > >               ret = generic_write_sync(iocb, ret);
> > >
> > > +     if (ret > 0)
> > > +             ret += dio->done_before;
> >
> > Pardon my ignorance since this is the first time I've had a crack at
> > this patchset, but why is it necessary to carry the "bytes copied"
> > count from the /previous/ iomap_dio_rw call all the way through to dio
> > completion of the current call?
> 
> Consider the following situation:
> 
>  * A user submits an asynchronous read request.
> 
>  * The first page of the buffer is in memory, but the following
>    pages are not. This isn't uncommon for consecutive reads
>    into freshly allocated memory.
> 
>  * iomap_dio_rw writes into the first page. Then it
>    hits the next page which is missing, so it returns a partial
>    result, synchronously.
> 
>  * We then fault in the remaining pages and call iomap_dio_rw
>    for the rest of the request.
> 
>  * The rest of the request completes asynchronously.
> 
> Does that answer your question?

No, because you totally ignored the second question:

If the directio operation succeeds even partially and the PARTIAL flag
is set, won't that push the iov iter ahead by however many bytes
completed?

We already finished the IO for the first page, so the second attempt
should pick up where it left off, i.e. the second page.

--D

> Thanks,
> Andreas
> 




More information about the Cluster-devel mailing list