[Libguestfs] [libnbd PATCH 2/3] api: Guarantee sanitized buffer on pread failure

Thu Feb 10 13:51:56 UTC 2022

On 02/09/22 23:07, Eric Blake wrote:
> As mentioned in the previous patch, we left the state of the buffer
> undefined if we fail pread prior to attempting NBD_CMD_READ.  Better
> is to tweak the generator to sanitize the buffer unconditionally, as a
> way of hardening against potential bugs in client applications that
> fail to check for errors, but which should not be leaking
> uninitialized data.  In the process, we can document that the contents
> of buf are now merely unspecified, rather than undefined (valgrind
> will no longer complain if you read buf, regardless of how nbd_pread
> failed).
> 
> As always calling memset() can be a performance hit if the client also
> sanitizes the buffer or is willing to take the audit risk, the next
> patch will add a knob that allows the client to control when this
> happens.
> ---
>  generator/API.ml | 24 ++++++++++++++----------
>  generator/C.ml   | 11 ++++++++++-
>  lib/rw.c         | 18 +++++++-----------
>  tests/errors.c   |  9 ++++++++-
>  4 files changed, 39 insertions(+), 23 deletions(-)
> 
> diff --git a/generator/API.ml b/generator/API.ml
> index 012016bc..98f90031 100644
> --- a/generator/API.ml
> +++ b/generator/API.ml
> @@ -1831,7 +1831,8 @@   "pread", {
>  The C<flags> parameter must be C<0> for now (it exists for future NBD
>  protocol extensions).
> 
> -Note that if this command fails, the contents of C<buf> are undefined."
> +Note that if this command fails, it is unspecified whether the contents
> +of C<buf> will read as zero or as partial results from the server."
>  ^ strict_call_description;
>      see_also = [Link "aio_pread"; Link "pread_structured";
>                  Link "get_block_size"; Link "set_strict_mode"];
> @@ -1918,7 +1919,8 @@   "pread_structured", {
>  this, see L<nbd_can_df(3)>). Libnbd does not validate that the server
>  actually obeys the flag.
> 
> -Note that if this command fails, the contents of C<buf> are undefined."
> +Note that if this command fails, it is unspecified whether the contents
> +of C<buf> will read as zero or as partial results from the server."
>  ^ strict_call_description;
>      see_also = [Link "can_df"; Link "pread";
>                  Link "aio_pread_structured"; Link "get_block_size";
> @@ -2459,10 +2461,11 @@   "aio_pread", {
>  as described in L<libnbd(3)/Completion callbacks>.
> 
>  Note that you must ensure C<buf> is valid until the command has
> -completed.  Furthermore, the contents of C<buf> are undefined if the
> -C<error> parameter to C<completion_callback> is set, or if
> -L<nbd_aio_command_completed(3)> reports failure.  Other parameters behave
> -as documented in L<nbd_pread(3)>."
> +completed.  Furthermore, if the C<error> parameter to
> +C<completion_callback> is set or if L<nbd_aio_command_completed(3)>
> +reports failure, it is unspecified whether the contents
> +of C<buf> will read as zero or as partial results from the server.
> +Other parameters behave as documented in L<nbd_pread(3)>."
>  ^ strict_call_description;
>      example = Some "examples/aio-connect-read.c";
>      see_also = [SectionLink "Issuing asynchronous commands";
> @@ -2487,10 +2490,11 @@   "aio_pread_structured", {
>  as described in L<libnbd(3)/Completion callbacks>.
> 
>  Note that you must ensure C<buf> is valid until the command has
> -completed.  Furthermore, the contents of C<buf> are undefined if the
> -C<error> parameter to C<completion_callback> is set, or if
> -L<nbd_aio_command_completed(3)> reports failure.  Other parameters behave
> -as documented in L<nbd_pread_structured(3)>."
> +completed.  Furthermore, if the C<error> parameter to
> +C<completion_callback> is set or if L<nbd_aio_command_completed(3)>
> +reports failure, it is unspecified whether the contents
> +of C<buf> will read as zero or as partial results from the server.
> +Other parameters behave as documented in L<nbd_pread_structured(3)>."
>  ^ strict_call_description;
>      see_also = [SectionLink "Issuing asynchronous commands";
>                  Link "aio_pread"; Link "pread_structured";
> diff --git a/generator/C.ml b/generator/C.ml
> index 797af531..4a5bb589 100644
> --- a/generator/C.ml
> +++ b/generator/C.ml
> @@ -1,6 +1,6 @@
>  (* hey emacs, this is OCaml code: -*- tuareg -*- *)
>  (* nbd client library in userspace: generate the C API and documentation
> - * Copyright (C) 2013-2020 Red Hat Inc.
> + * Copyright (C) 2013-2022 Red Hat Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public
> @@ -491,6 +491,15 @@ let
>        pr "\n"
>      );
> 
> +    (* Sanitize read buffers before any error is possible. *)
> +    List.iter (
> +      function
> +      | BytesOut (n, count)
> +      | BytesPersistOut (n, count) ->
> +         pr "  memset (%s, 0, %s);\n" n count
> +      | _ -> ()
> +    ) args;
> +
>      (* Check current state is permitted for this call. *)
>      if permitted_states <> [] then (
>        let value = match errcode with

It tends to assist reviewers if you include a diff in the cover letter
(or better yet, in the Notes section of the patch, suitably
quoted/escaped) that shows the effect of the patch on the generated C
source code.

That said:

Acked-by: Laszlo Ersek <lersek at redhat.com>

Thanks,
Laszlo

> diff --git a/lib/rw.c b/lib/rw.c
> index b5d3dc44..1202d71c 100644
> --- a/lib/rw.c
> +++ b/lib/rw.c
> @@ -244,18 +244,14 @@ nbd_internal_command_common (struct nbd_handle *h,
>    if (cb)
>      cmd->cb = *cb;
> 
> -  /* If structured replies were negotiated then we trust the server to
> -   * send back sufficient data to cover the whole buffer.  It's tricky
> -   * to check this, so an easier thing is simply to zero the buffer
> -   * ahead of time which avoids any security problems.  I measured the
> -   * overhead of this and for non-TLS there is no measurable overhead
> -   * in the highly intensive loopback case.  For TLS we get a
> -   * performance gain, go figure.  For an older server with only
> -   * simple replies, it's still better to do the same memset() so we
> -   * don't have behavior that is server-dependent.
> +  /* For NBD_CMD_READ, cmd->data was pre-zeroed in the prologue
> +   * created by the generator.  Thus, if a (non-compliant) server with
> +   * structured replies fails to send back sufficient data to cover
> +   * the whole buffer, we still behave as if it had sent zeroes for
> +   * those portions, rather than leaking any uninitialized data, and
> +   * without having to complicate our state machine to track which
> +   * portions of the read buffer were actually populated.
>     */
> -  if (cmd->data && type == NBD_CMD_READ)
> -    memset (cmd->data, 0, cmd->count);
> 
>    /* Add the command to the end of the queue. Kick the state machine
>     * if there is no other command being processed, otherwise, it will
> diff --git a/tests/errors.c b/tests/errors.c
> index 2c800c7c..f597b7ee 100644
> --- a/tests/errors.c
> +++ b/tests/errors.c
> @@ -1,5 +1,5 @@
>  /* NBD client library in userspace
> - * Copyright (C) 2013-2021 Red Hat Inc.
> + * Copyright (C) 2013-2022 Red Hat Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public
> @@ -214,6 +214,7 @@ main (int argc, char *argv[])
> 
> 
>    /* Issue a connected command when not connected. */
> +  buf[0] = '1';
>    if (nbd_pread (nbd, buf, 512, 0, 0) != -1) {
>      fprintf (stderr, "%s: test failed: "
>               "nbd_pread did not fail on non-connected handle\n",
> @@ -221,6 +222,12 @@ main (int argc, char *argv[])
>      exit (EXIT_FAILURE);
>    }
>    check (ENOTCONN, "nbd_pread: ");
> +  if (buf[0] != '\0') {
> +    fprintf (stderr, "%s: test failed: "
> +             "nbd_pread did not sanitize buffer on error\n",
> +             argv[0]);
> +    exit (EXIT_FAILURE);
> +  }
> 
>    /* Request a name that is too long. */
>    memset (buf, 'a', 4999);
>