[Libguestfs] [PATCH libnbd] copy: Set POSIX_FADV_SEQUENTIAL and POSIX_FADV_DONTNEED.

Richard W.M. Jones rjones at redhat.com
Fri Feb 19 21:36:58 UTC 2021


On Fri, Feb 19, 2021 at 01:58:04PM -0600, Eric Blake wrote:
> On 2/19/21 1:41 PM, Richard W.M. Jones wrote:
> 
> >>> +#if defined (HAVE_POSIX_FADVISE) && defined (POSIX_FADV_DONTNEED)
> >>> +    /* On Linux this will evict the pages we just read from the page cache. */
> >>> +    posix_fadvise (fd, offset, r, POSIX_FADV_DONTNEED);
> >>
> >> I don't think this is a good idea, since this affects the current
> >> page cache, for the entire host.
> >>
> >> So if the host is having an image in cache for good reason,
> >> running nbdcopy will drop the cache since nbdcopy does not need
> >> it, but maybe the host will need that cache after running
> >> nbdcopy.
> >>
> >> The right way to avoid polluting the page cache is to bypass the
> >> cache using O_DIRECT, so nbdcopy is not using or affecting the
> >> page cache.  This can be useful if the user can enable this with
> >> a flag.

After thinking about this a bit more I'm not too convinced this would
affect many people.  It would require a situation where multiple
processes on the host are all reading the same file that is also being
processed by nbdcopy.  It's possible, but for RHV/OpenStack/etc there
are lots of hypervisors running but generally confined to their own
sets of disks.

> > The trouble with O_DIRECT is it's a pain to use correctly, and I guess
> > doesn't use readahead or the existing cache (kind of the opposite
> > problem).
> > 
> > However I take your point that we probably ought not to remove files
> > from the cache that are already there.  I somehow thought that
> > POSIX_FADV_DONTNEED only affected the current 'fd', but looking at the
> > Linux impl I see it affects the whole system.
> > 
> > This argues for having a flag to enable this.

BTW it's extremely hairy, racy and probably slow, but it seems
possible using mmap + mincore to determine which pages are resident,
and so in theory avoid using POSIX_FADV_DONTNEED for those pages.

> If I understand correctly, unconditionally attempting FADV_SEQUENTIAL
> should always should be fine both for reads (from existing file) and
> writes (when copying to a new file), so its only FADV_DONTNEED for
> clearing out the cache that needs a flag?

Looking at the code, FADV_SEQUENTIAL just doubles the size of
readahead:

  https://github.com/torvalds/linux/blob/f40ddce88593482919761f74910f42f4b84c004b/mm/fadvise.c#L91

I assume it's only used for reads and would not affect writes, but
it's hard to follow the code.  I'm sure it should be safe though.

So yes FADV_DONTNEED seems like it's the only one that would need a
command line option to enable or disable.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org




More information about the Libguestfs mailing list