[Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails
Nir Soffer
nsoffer at redhat.com
Fri Feb 4 00:08:21 UTC 2022
On Thu, Feb 3, 2022 at 10:26 PM Eric Blake <eblake at redhat.com> wrote:
>
> nbdcopy has a nasty bug when performing multi-threaded copies using
> asynchronous nbd calls - it was blindly treating the completion of an
> asynchronous command as successful, rather than checking the *error
> parameter. This can result in the silent creation of a corrupted
> image in two different ways: when a read fails, we blindly wrote
> garbage to the destination; when a write fails, we did not flag that
> the destination was not written.
>
> Since nbdcopy already calls exit() on a synchronous read or write
> failure to a file, doing the same for an asynchronous op to an NBD
> server is the simplest solution. A nicer solution, but more invasive
> to code and thus not done here, might be to allow up to N retries of
> the transaction (in case the read or write failure was transient), or
> even having a mode where as much data is copied as possible (portions
> of the copy that failed would be logged on stderr, and nbdcopy would
> still fail with a non-zero exit status, but this would copy more than
> just stopping at the first error, as can be done with rsync or
> ddrescue).
Maybe this belongs to the nbdresuce tool :-)
> Note that since we rely on auto-retiring and do NOT call
> nbd_aio_command_completed, our completion callbacks must always return
> 1 (if they do not exit() first), even when acting on *error, so as not
> leave the command allocated until nbd_close. As such, there is no
> sane way to return an error to a manual caller of the callback, and
> therefore we can drop dead code that calls perror() and exit() if the
> callback "failed". It is also worth documenting the contract on when
> we must manually call the callback during the asynch_zero callback, so
> that we do not leak or double-free the command; thankfully, all the
> existing code paths were correct.
>
> The added testsuite script demonstrates several scenarios, some of
> which fail without the rest of this patch in place, and others which
> showcase ways in which sparse images can bypass errors.
>
> Once backports are complete, a followup patch on the main branch will
> edit docs/libnbd-security.pod with the mailing list announcement of
> the stable branch commit ids and release versions that incorporate
> this fix.
>
> Reported-by: Nir Soffer <nsoffer at redhat.com>
> Fixes: bc896eec4d ("copy: Implement multi-conn, multiple threads, multiple requests in flight.", v1.5.6)
> Fixes: https://bugzilla.redhat.com/2046194
> ---
> TODO | 1 +
> copy/Makefile.am | 4 +-
> copy/copy-nbd-error.sh | 78 +++++++++++++++++++++++++++++++++++++
> copy/file-ops.c | 17 +++-----
> copy/multi-thread-copying.c | 15 +++++++
> copy/nbdcopy.h | 7 ++--
> copy/null-ops.c | 10 +----
> 7 files changed, 108 insertions(+), 24 deletions(-)
> create mode 100755 copy/copy-nbd-error.sh
>
> diff --git a/TODO b/TODO
> index da157942..7c9c15e2 100644
> --- a/TODO
> +++ b/TODO
> @@ -33,6 +33,7 @@ nbdcopy:
> - Better page cache usage, see nbdkit-file-plugin options
> fadvise=sequential cache=none.
> - Consider io_uring if there are performance bottlenecks.
> + - Configurable retries in response to read or write failures.
>
> nbdfuse:
> - If you write beyond the end of the virtual file, it returns EIO.
> diff --git a/copy/Makefile.am b/copy/Makefile.am
> index f2100853..e729f86a 100644
> --- a/copy/Makefile.am
> +++ b/copy/Makefile.am
> @@ -1,5 +1,5 @@
> # nbd client library in userspace
> -# Copyright (C) 2020 Red Hat Inc.
> +# Copyright (C) 2020-2022 Red Hat Inc.
> #
> # This library is free software; you can redistribute it and/or
> # modify it under the terms of the GNU Lesser General Public
> @@ -33,6 +33,7 @@ EXTRA_DIST = \
> copy-nbd-to-small-nbd-error.sh \
> copy-nbd-to-sparse-file.sh \
> copy-nbd-to-stdout.sh \
> + copy-nbd-error.sh \
> copy-progress-bar.sh \
> copy-sparse.sh \
> copy-sparse-allocated.sh \
> @@ -124,6 +125,7 @@ TESTS += \
> copy-stdin-to-nbd.sh \
> copy-stdin-to-null.sh \
> copy-nbd-to-stdout.sh \
> + copy-nbd-error.sh \
> copy-progress-bar.sh \
> copy-sparse.sh \
> copy-sparse-allocated.sh \
> diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
> new file mode 100755
> index 00000000..89f0a2f1
> --- /dev/null
> +++ b/copy/copy-nbd-error.sh
> @@ -0,0 +1,78 @@
> +#!/usr/bin/env bash
> +# nbd client library in userspace
> +# Copyright (C) 2022 Red Hat Inc.
> +#
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2 of the License, or (at your option) any later version.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with this library; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> +
> +# Tests several scenarios of handling NBD server errors
> +# Serves as a regression test for the CVE-2022-0485 fix.
> +
> +. ../tests/functions.sh
> +
> +set -e
> +set -x
> +
> +requires nbdkit --exit-with-parent --version
> +
> +fail=0
> +
> +# Failure to get block status should not be fatal, but merely downgrade to
> +# reading the entire image as if data
> +echo "Testing extents failures on source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
> + error-extents-rate=1 ] null: || fail=1
> +
> +# Failure to read should be fatal
> +echo "Testing read failures on non-sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
> + error-pread-rate=0.5 ] null: && fail=1
> +
> +# However, reliable block status on a sparse image can avoid the need to read
> +echo "Testing read failures on sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
> + error-pread-rate=1 ] null: || fail=1
> +
> +# Failure to write data should be fatal
> +echo "Testing write data failures on arbitrary destination"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
> + [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
> + memory 5M error-pwrite-rate=0.5 ] && fail=1
> +
> +# However, writing zeroes can bypass the need for normal writes
> +echo "Testing write data failures from sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> + [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
> + memory 5M error-pwrite-rate=1 ] || fail=1
> +
> +# Failure to write zeroes should be fatal
> +echo "Testing write zero failures on arbitrary destination"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> + [ nbdkit --exit-with-parent -v --filter=error memory 5M \
> + error-zero-rate=1 ] && fail=1
> +
> +# However, assuming/learning destination is zero can skip need to write
> +echo "Testing write failures on pre-zeroed destination"
> +$VG nbdcopy --destination-is-zero -- \
> + [ nbdkit --exit-with-parent -v null 5M ] \
> + [ nbdkit --exit-with-parent -v --filter=error memory 5M \
> + error-pwrite-rate=1 error-zero-rate=1 ] || fail=1
> +
> +# Likewise, when write zero is not advertised, fallback to normal write works
> +echo "Testing write zeroes to destination without zero support"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> + [ nbdkit --exit-with-parent -v --filter=nozero --filter=error memory 5M \
> + error-zero-rate=1 ] || fail=1
Good test!
All this behavior was not tested before?
> +
> +exit $fail
> diff --git a/copy/file-ops.c b/copy/file-ops.c
> index e37a5014..aaf04ade 100644
> --- a/copy/file-ops.c
> +++ b/copy/file-ops.c
> @@ -591,10 +591,8 @@ file_asynch_read (struct rw *rw,
>
> file_synch_read (rw, slice_ptr (command->slice),
> command->slice.len, command->offset);
> - if (cb.callback (cb.user_data, &dummy) == -1) {
> - perror (rw->name);
> - exit (EXIT_FAILURE);
> - }
> + /* file_synch_read called exit() on error */
> + cb.callback (cb.user_data, &dummy);
> }
>
> static void
> @@ -606,10 +604,8 @@ file_asynch_write (struct rw *rw,
>
> file_synch_write (rw, slice_ptr (command->slice),
> command->slice.len, command->offset);
> - if (cb.callback (cb.user_data, &dummy) == -1) {
> - perror (rw->name);
> - exit (EXIT_FAILURE);
> - }
> + /* file_synch_write called exit() on error */
> + cb.callback (cb.user_data, &dummy);
> }
>
> static bool
> @@ -620,10 +616,7 @@ file_asynch_zero (struct rw *rw, struct command *command,
>
> if (!file_synch_zero (rw, command->offset, command->slice.len, allocate))
> return false;
> - if (cb.callback (cb.user_data, &dummy) == -1) {
> - perror (rw->name);
> - exit (EXIT_FAILURE);
> - }
> + cb.callback (cb.user_data, &dummy);
> return true;
> }
>
> diff --git a/copy/multi-thread-copying.c b/copy/multi-thread-copying.c
> index 815b8a02..2aa74d63 100644
> --- a/copy/multi-thread-copying.c
> +++ b/copy/multi-thread-copying.c
> @@ -28,6 +28,7 @@
> #include <errno.h>
> #include <assert.h>
> #include <sys/stat.h>
> +#include <inttypes.h>
>
> #include <pthread.h>
>
> @@ -376,6 +377,13 @@ finished_read (void *vp, int *error)
> {
> struct command *command = vp;
>
> + /* XXX - is it worth retrying a failed command? */
> + if (*error) {
> + fprintf (stderr, "read at offset 0x%" PRIx64 "failed: %s\n",
Good error message, but there is missing space before "failed".
Why use hex offset? I find it very confusing. I don't think nbdcopy users
think in hex.
> + command->offset, strerror (*error));
> + exit (EXIT_FAILURE);
> + }
> +
> if (allocated || sparse_size == 0) {
> /* If sparseness detection (see below) is turned off then we write
> * the whole command.
> @@ -548,6 +556,13 @@ free_command (void *vp, int *error)
> struct command *command = vp;
> struct buffer *buffer = command->slice.buffer;
>
> + /* XXX - is it worth retrying a failed command? */
> + if (*error) {
> + fprintf (stderr, "write at offset 0x%" PRIx64 "failed: %s\n",
> + command->offset, strerror (*error));
> + exit (EXIT_FAILURE);
> + }
> +
> if (buffer != NULL) {
> if (--buffer->refs == 0) {
> free (buffer->data);
> diff --git a/copy/nbdcopy.h b/copy/nbdcopy.h
> index e7fe1eab..c070f8d7 100644
> --- a/copy/nbdcopy.h
> +++ b/copy/nbdcopy.h
> @@ -1,5 +1,5 @@
> /* NBD client library in userspace.
> - * Copyright (C) 2020-2021 Red Hat Inc.
> + * Copyright (C) 2020-2022 Red Hat Inc.
> *
> * This library is free software; you can redistribute it and/or
> * modify it under the terms of the GNU Lesser General Public
> @@ -157,7 +157,8 @@ struct rw_ops {
> bool allocate);
>
> /* Asynchronous I/O operations. These start the operation and call
> - * 'cb' on completion.
> + * 'cb' on completion. 'cb' will return 1, for auto-retiring with
> + * asynchronous libnbd calls.
> *
> * The file_ops versions are actually implemented synchronously, but
> * still call 'cb'.
> @@ -173,7 +174,7 @@ struct rw_ops {
> nbd_completion_callback cb);
>
> /* Asynchronously zero. command->slice.buffer is not used. If not possible,
> - * returns false.
> + * returns false. 'cb' must be called only if returning true.
> */
> bool (*asynch_zero) (struct rw *rw, struct command *command,
> nbd_completion_callback cb, bool allocate);
> diff --git a/copy/null-ops.c b/copy/null-ops.c
> index 11cd5116..5f1fda50 100644
> --- a/copy/null-ops.c
> +++ b/copy/null-ops.c
> @@ -119,10 +119,7 @@ null_asynch_write (struct rw *rw,
> {
> int dummy = 0;
>
> - if (cb.callback (cb.user_data, &dummy) == -1) {
> - perror (rw->name);
> - exit (EXIT_FAILURE);
> - }
> + cb.callback (cb.user_data, &dummy);
> }
>
> static bool
> @@ -131,10 +128,7 @@ null_asynch_zero (struct rw *rw, struct command *command,
> {
> int dummy = 0;
>
> - if (cb.callback (cb.user_data, &dummy) == -1) {
> - perror (rw->name);
> - exit (EXIT_FAILURE);
> - }
> + cb.callback (cb.user_data, &dummy);
> return true;
> }
>
> --
> 2.34.1
>
Looks ready after fixing the error message.
Nir
More information about the Libguestfs
mailing list