[Libguestfs] [PATCH nbdkit] file: Implement cache=none and fadvise=normal|random|sequential.

Richard W.M. Jones rjones at redhat.com
Mon Aug 10 14:01:07 UTC 2020


On Sat, Aug 08, 2020 at 02:14:22AM +0300, Nir Soffer wrote:
> On Fri, Aug 7, 2020 at 5:36 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> >
> > On Fri, Aug 07, 2020 at 05:29:24PM +0300, Nir Soffer wrote:
> > > On Fri, Aug 7, 2020 at 5:07 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > > > These ones?
> > > > https://www.redhat.com/archives/libguestfs/2020-August/msg00078.html
> > >
> > > No, we had a bug where copying an image from glance caused sanlock timeouts
> > > because of unpredictable page cache flushes.
> > >
> > > We tried to use fadvise but it did not help. The only way to avoid such issues
> > > is with O_SYNC or O_DIRECT. O_SYNC is much slower but this is the path
> > > we took for now in this flow.
> >
> > I'm interested in more background about this, because while it is true
> > that O_DIRECT and POSIX_FADV_DONTNEED are not exactly equivalent, I
> > think I've shown here that DONTNEED can be used to avoid polluting the
> > page cache.
> 
> This fixes the minor issue of polluting the page cache, but it does not help
> to avoid stale data in the cache, or uncontrolled flushes.
> 
> The bug I mentioned is:
> https://bugzilla.redhat.com/1832967
> 
> This explains the issue:
> https://bugzilla.redhat.com/1247135#c29
>
> And here you can see how unrelated I/O is affected by uncontrolled flushes:
> https://bugzilla.redhat.com/1247135#c30
> https://bugzilla.redhat.com/1247135#c36

Thanks for the explanation.

Our use of file_flush (i.e. fdatasync) is less than ideal; we should
probably use sync_file_range, which is what Linus suggested.  However,
in this case it isn't a problem because we are only flushing the few
pages we have just written.  We control the file, so there is no
chance that the flush will cause an uncontrollable flood of data.
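To make that concrete, here is a rough sketch (not the actual nbdkit
file plugin code; fd, offset and count are assumed parameters) of how
a write path can push only the range it has just written to disk with
sync_file_range and then evict those pages with POSIX_FADV_DONTNEED:

#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>

/* After writing [offset, offset+count), flush only that range and then
 * drop it from the page cache.  Unlike fdatasync, sync_file_range does
 * not touch dirty pages outside the given range.
 */
static int
flush_and_drop_range (int fd, off_t offset, off_t count)
{
  if (sync_file_range (fd, offset, count,
                       SYNC_FILE_RANGE_WAIT_BEFORE |
                       SYNC_FILE_RANGE_WRITE |
                       SYNC_FILE_RANGE_WAIT_AFTER) == -1)
    return -1;

  /* The pages are clean now, so DONTNEED can actually evict them. */
  int err = posix_fadvise (fd, offset, count, POSIX_FADV_DONTNEED);
  if (err != 0) {
    errno = err;
    return -1;
  }
  return 0;
}

Because POSIX_FADV_DONTNEED only evicts clean pages, the sync has to
come first; that ordering is what stops our writes from accumulating
in the page cache.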

The bigger issue to me:

Sanlock should really be using cgroups or some other mechanism to
prioritize its block traffic over everything else.  Any solution which
requires every other process in the system to use direct I/O is IMO a
ridiculous hack.  What happens if some other process on the same
machine blocks the NFS server by writing a lot of data?  Perhaps the
admin installs some proprietary backup software that we are unable to
modify?  The same thing would happen.

> Here we compare oflag=nocache,dsync with oflag=direct for NFS and iSCSI disks:
> https://gerrit.ovirt.org/c/108912/
> 
> cache=none is now basically oflag=nocache,dsync

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



