[Libguestfs] [nbdkit PATCH] readahead: Fix multi-conn inconsistency

Mon Feb 22 16:38:10 UTC 2021

On Fri, Feb 19, 2021 at 04:02:48PM -0600, Eric Blake wrote:
> On 2/19/21 3:18 PM, Eric Blake wrote:
> > On 2/19/21 3:03 PM, Eric Blake wrote:
> >> The readahead filter was blindly passing through the plugin's
> >> .can_multi_conn, but does not kill the readahead buffer on a flush.
> >> This is a bug in the following scenario, when the plugin advertised
> >> multi-conn:
> >>
> > 
> >> In order to preserve the plugin's multi-conn status, we MUST drop our
> >> cache any time we flush, so that our next read picks up whatever got
> >> flushed from another connection.
> >>
> >> But when the server lacks multi-conn, we don't have to kill our
> >> readahead buffer on flush, because the client already has no guarantee
> >> that a flush from one connection will affect the read seen by another
> >> (killing on write was already good enough to maintain semantics).
> >> ---
> > 
> >> +static int
> >> +readahead_flush (struct nbdkit_next_ops *next_ops, void *nxdata,
> >> +                 void *handle, uint32_t flags, int *err)
> >> +{
> >> +  if (next_ops->can_multi_conn(nxdata) != 1)
> > 
> > That should be ==, to match my commit message.
> > 
> >> +    kill_readahead ();
> >> +  return next_ops->flush (nxdata, flags, err);
> >> +}
> >> +
> > 
> 
> In fact, the more I think about it, the more I want it to be
> unconditional, killing readahead even when not multi-conn.  Why?
> Because when multi-conn is not advertised, it seems odd that there is no
> way at all to force the NBD server to flush its caches; it seems like
> clients should be able to expect a flush on connA (to write pending data
> to cache), then a flush on connB (to drop any readahead data), prior to
> a read on connB, should let connB see what connA wrote.  All that
> multi-conn buys you is that you can skip one of the two flush calls.

I see the modified version of this patch went upstream - thanks!

The readahead filter is fairly broken at the moment, and we stopped
using it in virt-v2v.  There are several causes of the brokenness:

 - Not having the ability of using a background thread.  (See TODO)

 - Readahead happens synchronously with the request which makes the
   requests run slower.

 - Readahead causes too large plugin requests which breaks VDDK.  Even
   though strictly this is yet another VDDK-is-crap bug, it was the
   final straw that made us drop the readahead filter.

 - Very naive/stupid readahead policy.

 - Related to previous point and to your message: The naive policy is
   practically incompatible with multi-conn, because if two threads
   are reading different parts of the disk then the readahead buffer
   will be dropped between the requests.

So I believe we should also set can_multi_conn = 0 in this filter.

It might also be a good idea to drop the filter until such time as we
have the background threads feature in the server and can do it
properly.  But at the moment this filter has a very narrow and limited
usefulness if you know what you're doing.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/