[Libguestfs] nbdkit & qemu 2.12: qemu-img: Protocol error: simple reply when structured reply chunk was expected

Richard W.M. Jones rjones at redhat.com
Sat Mar 23 19:40:05 UTC 2019


On Sat, Mar 23, 2019 at 12:48:39PM -0500, Eric Blake wrote:
> So you hack the server into sending adjacent replies that were not
> coalesced, and sending a reply that covers 128k of the image regardless
> of how much length the client requested.  There are several ways in
> which that can violate protocol (but hey - testing qemu resiliency is
> WORTH temporary protocol violations):
>  - If the client requested REQ_ONE, you sent too many extents (sadly,
> qemu still uses REQ_ONE always, even though I'd like to lift that
> restriction in the future; the rest of my bullets are applicable when
> REQ_ONE is not in use)

Yes that correct.  My original block-status patch had a bug which
ignored the REQ_ONE flag by accident, hence it was behaving as you
describe.  This is now fixed in the latest version, but the patch
reintroduces the bug intentionally.

[...]
> > nbdkit: memory.0: debug: pread count=512 offset=0
> > nbdkit: memory.1: debug: extents count=67108864 offset=0 req_one=1
> > nbdkit: memory.3: debug: client sent NBD_CMD_DISC, closing connection
> > nbdkit: memory.3: debug: exiting worker thread memory.3
> > 
> 
> So the immediate NBD_CMD_DISC is reasonable, implying that the client
> flagged one of the ways you could have violated protocol (but whether
> qemu should just hang up, or should just ignore that particular reply
> but still keep the connection alive, is a QoI question for qemu).

I don't know.  If the server is out of spec then disconnecting is
reasonable.  But it depends on whether the qemu community wants to be
resilient against bad servers or wants to use its undoubted influence
to get them fixed.

> > and qemu-img exits with code 1, but there is no message printed.
> > (You can use --run '... || echo FAIL' if you don't believe that
> > qemu-img is exiting with a failure.)
> > 
> > While preparing this email I noticed that v3.1.0 printed an
> > error message:
> > 
> >   qemu-img: Protocol error: invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS
> 
> Which says that qemu used to be noisy at diagnosing the protocol error,
> and now it is not.  Ideally, qemu as a client shouldn't encounter a
> server that is violating protocol, but when it does, it makes sense for
> qemu to call attention to the buggy server.  That sounds like a
> regression in qemu worth fixing, so I'll whip up a patch.

Yes this is my opinion too.  Obviously it shouldn't happen, but in
real life it might happen I can't see any great down-side to
qemu/qemu-img printing something useful on stderr.  We should be
concerned about malicious NBD servers, but I don't think that making
the client print an error is a problem (or it's at least arguable).

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/




More information about the Libguestfs mailing list