[Libguestfs] [PATCH 4/6] v2v: rhv-upload-plugin: Support multiple connections
Richard W.M. Jones
rjones at redhat.com
Tue Jan 26 15:40:03 UTC 2021
On Tue, Jan 26, 2021 at 09:20:04AM -0600, Eric Blake wrote:
> On 1/26/21 9:03 AM, Richard W.M. Jones wrote:
>
> >>> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
> >>>
> >>> "bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
> >>> entirely without cache, or that the cache it uses is shared among
> >>> all connections to the given device. In particular, if this flag is
> >>> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST
> >>> be visible across all connections when the server sends its reply to
> >>> that command to the client. In the absence of this flag, clients
> >>> SHOULD NOT multiplex their commands over more than one connection to
> >>> the export."
> >
> > Although the text only mentions flush & FUA, I wonder if there's
> > another case where multi-conn should not be advertised or used: That
> > is where the block or sector size is larger than the size of writes,
> > so writes are being turned into r/m/w cycles. Multiple adjacent (even
> > non-overlapping) writes could then interfere with each other.
>
> That's the complication that explains why qemu-nbd does not advertise it
> on writeable connections. For proper cross-thread consistency when the
> bit is set, and assuming a situation where there are 2 writers and 1
> reader client, if the two writers both issue an overlapping write
> request with CMD_FLAG_FUA or followed by NBD_CMD_FLUSH, then the reader
> must see exactly one of five states: the pre-write state, the state
> after just writer 1, the state after just writer 2, the state after both
> writers with writer 1 completing before writer 2 starts, or the state
> after both writers with writer 2 completing before writer 1 starts. Any
> other result would indicate write shredding (where the two writers were
> not properly interlocked, and therefore the second writer partially
> undoes some of the effects of the first).
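Eric's five allowed states can be enumerated concretely. This is a purely illustrative Python sketch with made-up offsets and region size, not anything from qemu-nbd:

```python
# Two overlapping writes into a 4-byte region: writer 1 puts b"AAA" at
# offset 0, writer 2 puts b"BBB" at offset 1.  With CAN_MULTI_CONN set
# and both writes flushed, a reader must see exactly one of five states.

def apply(state, offset, data):
    """Return the region contents after one complete, atomic write."""
    buf = bytearray(state)
    buf[offset:offset + len(data)] = data
    return bytes(buf)

PRE = b"...."       # pre-write contents
W1 = (0, b"AAA")    # writer 1's request
W2 = (1, b"BBB")    # writer 2's request

allowed = {
    PRE,                          # neither write visible yet
    apply(PRE, *W1),              # b"AAA."  only writer 1
    apply(PRE, *W2),              # b".BBB"  only writer 2
    apply(apply(PRE, *W1), *W2),  # b"ABBB"  writer 1 then writer 2
    apply(apply(PRE, *W2), *W1),  # b"AAAB"  writer 2 then writer 1
}

# A shredded mix of the two writes must never be observed.
assert b"ABBA" not in allowed
assert len(allowed) == 5
```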
>
> > This is vanishingly unlikely to affect nbdcopy which uses a huge block
> > size, but would it be a problem in the NBD protocol itself?
>
> I'm not convinced it is a hole in the protocol, so much as a reason that
> CAN_MULTI_CONN was added in the first place (if you don't advertise the
> bit, clients can't assume consistency at the server, and so have to do
> their own locking or otherwise avoid overlapping writes client-side to
> ensure consistency; if you do advertise the bit, clients can take
> shortcuts). But maybe you do have a point that when a server advertises
> block sizes, a client should pay attention to the recommended block
> size, and assume anything smaller than that may trigger rmw that might
> not be safe even if CAN_MULTI_CONN was advertised. Should we ask for
> clarification on the NBD mailing list?
Looking at the spec, it does already say this, which seems sufficient:
"The preferred block size represents the minimum size at which
aligned requests will have efficient I/O, avoiding behaviour such as
read-modify-write."
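As a hedged sketch of what a client might do with that text — `preferred` here stands in for whatever preferred block size the server reported, and `split_aligned` is a made-up helper, not nbdcopy's actual code — each request can be split into an aligned middle (safe to issue concurrently) and unaligned head/tail pieces (which may trigger server-side read-modify-write and deserve serializing):

```python
# Hypothetical helper: split a (offset, length) write into sub-ranges
# aligned to the server's preferred block size ("middle") and the
# unaligned remainder ("head"/"tail") which may be turned into
# read-modify-write cycles by the server.

def split_aligned(offset, length, preferred):
    end = offset + length
    # First aligned boundary at or after offset (round up).
    head_end = min(end, -(-offset // preferred) * preferred)
    # Last aligned boundary at or before end (round down).
    tail_start = max(head_end, (end // preferred) * preferred)
    parts = {"head": None, "middle": None, "tail": None}
    if head_end > offset:
        parts["head"] = (offset, head_end - offset)
    if tail_start > head_end:
        parts["middle"] = (head_end, tail_start - head_end)
    if end > tail_start:
        parts["tail"] = (tail_start, end - tail_start)
    return parts
```

E.g. a write at offset 1000 of length 10000 with a 4096-byte preferred size splits into head (1000, 3096), middle (4096, 4096) and tail (8192, 2808).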
Is there a particular reason this would only affect multi-conn?  I
think the same thing would be true for a single connection with
multiple adjacent write requests in flight.
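The interleaving in question can be shown in a few lines.  This is a purely illustrative Python simulation, not libguestfs code, and the sector/block sizes are made up: two non-overlapping sub-block writes are each expanded by the "server" into a read-modify-write of the whole block, and without interlocking the second cycle clobbers the first.

```python
# Simulate a server whose smallest I/O unit is a 4096-byte block, so a
# 512-byte write becomes read-modify-write of the containing block.

BLOCK_SIZE = 4096
SECTOR = 512

def rmw_read(block):
    return bytearray(block)      # server reads the whole block

def rmw_write(block, buf):
    block[:] = buf               # server writes the whole block back

block = bytearray(BLOCK_SIZE)    # shared backing store, all zeroes

# Client 1 writes b"A"*512 at sector 0; client 2 writes b"B"*512 at
# sector 1.  The two RMW cycles interleave:
buf1 = rmw_read(block)           # client 1's cycle starts
buf2 = rmw_read(block)           # client 2 reads before client 1 writes
buf1[0:SECTOR] = b"A" * SECTOR
rmw_write(block, buf1)           # client 1's update lands
buf2[SECTOR:2 * SECTOR] = b"B" * SECTOR
rmw_write(block, buf2)           # client 2's stale buffer clobbers it

assert block[SECTOR:2 * SECTOR] == b"B" * SECTOR  # write 2 survives
assert block[0:SECTOR] != b"A" * SECTOR           # write 1 was shredded
```

Note that nothing here depends on how many NBD connections carried the two writes; only on the two cycles being in flight at once.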
So I think I was probably wrong to say there's a hole in the
specification.  Perhaps the point just needs more emphasis.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v