[Libguestfs] [libnbd PATCH 1/2] api: Tighter checking of structured read replies

Tue May 31 19:32:23 UTC 2022

On Tue, May 31, 2022 at 10:23 PM Eric Blake <eblake at redhat.com> wrote:
>
> On Tue, May 31, 2022 at 08:40:57PM +0300, Nir Soffer wrote:
> > > > > @@ -364,6 +363,8 @@ STATE_MACHINE {
> > > > >        SET_NEXT_STATE (%.DEAD);
> > > > >        return 0;
> > > > >      }
> > > > > +    if (cmd->data_seen <= cmd->count)
> > > > > +      cmd->data_seen += length;
> > > >
> > > > This does not feel right. if you received more data, it should be counted,
> > > > and if this causes data_seen to be bigger than cmd->count, isn't this a fatal
> > > > error?
> > >
> > > cmd->count is at most 64M; it represents how much we asked the server
> > > to provide.  length was just validated (in the elided statement
> > > between these two hunks) to be <= cmd->count (otherwise, the server is
> > > known-buggy for replying with more than we asked, and we've already
> > > moved to %.DEAD state).  cmd->data_seen is a cumulative count of all
> > > bytes seen in prior chunks, plus the current chunk.  If we have not
> > > yet passed cmd->count, then this chunk counts towards the cumulative
> > > limit (and there is no overflow, since 64M*2 < UINT32_MAX).  If we
> > > have already passed cmd->count (in a previous chunk), then we don't
> > > further increase cmd->count, but we already know that we will fail the
> > > overall read later.  In other words, we can stop incrementing
> > > cmd->data_seen as soon as we know it exceeds cmd->count, and by
> > > stopping early, we still detect server bugs without overflowing
> > > uint32_t.
> >
> > But we can have this case:
> >
> > 1. ask for 32m
> > 2. server sends 16m (data_seen increase to 16m)
> > 3. server sends 16m (data_seen increase to 32m)
> > 4. server sends 1m (data_seen does not increase)
>
> Yes it does. 32m <= cmd->count is true, so we bump data_seen to 33m.

Right! I missed this.

> Then, later on when retiring the command, we note that 33m != 32m and
> fail the read with EIO (if it has not already failed for other
> reasons).
>
> > 5. entire request succeeds
> >
> > Shouldn't we fail if server sends unexpected data?
> >
> > If we detected that all data was received, and we get
> > unexpected data, why not fail immediately?
> >
> >     cmd->data_seen += length
> >     if (cmd->data_seen > cmd->count)
> >         switch to dead state?
>
> Switching immediately to a dead state is also possible, but it's nice
> to try and keep the connection alive as long as we can with a nice
> diagnosis of a failed CMD_READ but still allow further commands,
> rather than an abrupt disconnect that takes out all other use of the
> server.

I agree, this is better.

Reviewed-by: Nir Soffer <nsoffer at redhat.com>