[Libguestfs] [PATCH 4/6] v2v: rhv-upload-plugin: Support multiple connections
Richard W.M. Jones
rjones at redhat.com
Tue Jan 26 15:40:03 UTC 2021
On Tue, Jan 26, 2021 at 09:20:04AM -0600, Eric Blake wrote:
> On 1/26/21 9:03 AM, Richard W.M. Jones wrote:
>
> >>> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
> >>>
> >>> "bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
> >>> entirely without cache, or that the cache it uses is shared among
> >>> all connections to the given device. In particular, if this flag is
> >>> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST
> >>> be visible across all connections when the server sends its reply to
> >>> that command to the client. In the absence of this flag, clients
> >>> SHOULD NOT multiplex their commands over more than one connection to
> >>> the export."
> >
> > Although the text only mentions flush & FUA, I wonder if there's
> > another case where multi-conn should not be advertised or used: That
> > is where the block or sector size is larger than the size of writes,
> > so writes are being turned into r/m/w cycles. Multiple adjacent (even
> > non-overlapping) writes could then interfere with each other.
>
> That's the complication that explains why qemu-nbd does not advertise it
> on writeable connections. For proper cross-thread consistency when the
> bit is set, and assuming a situation where there are 2 writers and 1
> reader client, if the two writers both issue an overlapping write
> request with CMD_FLAG_FUA or followed by NBD_CMD_FLUSH, then the reader
> must see exactly one of five states: the pre-write state, the state
> after just writer 1, the state after just writer 2, the state after both
> writers with writer 1 completing before writer 2 starts, or the state
> after both writers with writer 2 completing before writer 1 starts. Any
> other result would indicate write shredding (where the two writers were
> not properly interlocked, and therefore the second writer partially
> undoes some of the effects of the first).
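Eric's five allowed states can be enumerated concretely. This is a purely illustrative Python sketch with made-up offsets and region size, not anything from qemu-nbd:

```python
# Two overlapping writes into a 4-byte region: writer 1 puts b"AAA" at
# offset 0, writer 2 puts b"BBB" at offset 1.  With CAN_MULTI_CONN set
# and both writes flushed, a reader must see exactly one of five states.

def apply(state, offset, data):
    """Return the region contents after one complete, atomic write."""
    buf = bytearray(state)
    buf[offset:offset + len(data)] = data
    return bytes(buf)

PRE = b"...."       # pre-write contents
W1 = (0, b"AAA")    # writer 1's request
W2 = (1, b"BBB")    # writer 2's request

allowed = {
    PRE,                          # neither write visible yet
    apply(PRE, *W1),              # b"AAA."  only writer 1
    apply(PRE, *W2),              # b".BBB"  only writer 2
    apply(apply(PRE, *W1), *W2),  # b"ABBB"  writer 1 then writer 2
    apply(apply(PRE, *W2), *W1),  # b"AAAB"  writer 2 then writer 1
}

# A shredded mix of the two writes must never be observed.
assert b"ABBA" not in allowed
assert len(allowed) == 5
```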
>
> > This is vanishingly unlikely to affect nbdcopy which uses a huge block
> > size, but would it be a problem in the NBD protocol itself?
>
> I'm not convinced it is a hole in the protocol, so much as a reason that
> CAN_MULTI_CONN was added in the first place (if you don't advertise the
> bit, clients can't assume consistency at the server, and so have to do
> their own locking or otherwise avoid overlapping writes client-side to
> ensure consistency; if you do advertise the bit, clients can take
> shortcuts). But maybe you do have a point that when a server advertises
> block sizes, a client should pay attention to the recommended block
> size, and assume anything smaller than that may trigger rmw that might
> not be safe even if CAN_MULTI_CONN was advertised. Should we ask for
> clarification on the NBD mailing list?
Looking at the spec, it does already say this, which seems sufficient:
"The preferred block size represents the minimum size at which
aligned requests will have efficient I/O, avoiding behaviour such as
read-modify-write."
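As a hedged sketch of what a client might do with that text — `preferred` here stands in for whatever preferred block size the server reported, and `split_aligned` is a made-up helper, not nbdcopy's actual code — each request can be split into an aligned middle (safe to issue concurrently) and unaligned head/tail pieces (which may trigger server-side read-modify-write and deserve serializing):

```python
# Hypothetical helper: split a (offset, length) write into sub-ranges
# aligned to the server's preferred block size ("middle") and the
# unaligned remainder ("head"/"tail") which may be turned into
# read-modify-write cycles by the server.

def split_aligned(offset, length, preferred):
    end = offset + length
    # First aligned boundary at or after offset (round up).
    head_end = min(end, -(-offset // preferred) * preferred)
    # Last aligned boundary at or before end (round down).
    tail_start = max(head_end, (end // preferred) * preferred)
    parts = {"head": None, "middle": None, "tail": None}
    if head_end > offset:
        parts["head"] = (offset, head_end - offset)
    if tail_start > head_end:
        parts["middle"] = (head_end, tail_start - head_end)
    if end > tail_start:
        parts["tail"] = (tail_start, end - tail_start)
    return parts
```

E.g. a write at offset 1000 of length 10000 with a 4096-byte preferred size splits into head (1000, 3096), middle (4096, 4096) and tail (8192, 2808).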
Is there a particular reason this would only affect multi-conn?  I
think the same thing would be true for a single connection with
multiple adjacent write requests in flight.
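The interleaving in question can be shown in a few lines.  This is a purely illustrative Python simulation, not libguestfs code, and the sector/block sizes are made up: two non-overlapping sub-block writes are each expanded by the "server" into a read-modify-write of the whole block, and without interlocking the second cycle clobbers the first.

```python
# Simulate a server whose smallest I/O unit is a 4096-byte block, so a
# 512-byte write becomes read-modify-write of the containing block.

BLOCK_SIZE = 4096
SECTOR = 512

def rmw_read(block):
    return bytearray(block)      # server reads the whole block

def rmw_write(block, buf):
    block[:] = buf               # server writes the whole block back

block = bytearray(BLOCK_SIZE)    # shared backing store, all zeroes

# Client 1 writes b"A"*512 at sector 0; client 2 writes b"B"*512 at
# sector 1.  The two RMW cycles interleave:
buf1 = rmw_read(block)           # client 1's cycle starts
buf2 = rmw_read(block)           # client 2 reads before client 1 writes
buf1[0:SECTOR] = b"A" * SECTOR
rmw_write(block, buf1)           # client 1's update lands
buf2[SECTOR:2 * SECTOR] = b"B" * SECTOR
rmw_write(block, buf2)           # client 2's stale buffer clobbers it

assert block[SECTOR:2 * SECTOR] == b"B" * SECTOR  # write 2 survives
assert block[0:SECTOR] != b"A" * SECTOR           # write 1 was shredded
```

Note that nothing here depends on how many NBD connections carried the two writes; only on the two cycles being in flight at once.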
So I think I was probably wrong to say there's a hole in the
specification.  Perhaps the point just needs more emphasis.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v