[Libguestfs] [nbdkit PATCH v2] blocksize: Avoid losing aligned writes to RMW race

Eric Blake eblake at redhat.com
Fri May 27 15:55:47 UTC 2022


On Fri, May 27, 2022 at 08:59:23AM +0200, Laszlo Ersek wrote:
> > Add a test that fails without the change to blocksize.c.  With some
> > carefully timed setups (using the delay filter to stall writes
> > reaching the plugin, and the plugin itself to stall reads coming back
> > from the plugin, so that we are reasonably sure of the overall
> > timeline), we can demonstrate the bug present in the unpatched code,
> > where an aligned write is lost when it lands in the wrong part of an
> > unaligned RMW cycle; the fixed code instead shows that the overlapping
> > operations were serialized.  We may need to further tweak the test to
> > be more reliable even under heavy load, but hopefully 2 seconds per
> > stall (where a successful test requires 18 seconds) is sufficient for
> > now.
> >
> > Reported-by: Nikolaus Rath <Nikolaus at rath.org>
> > Fixes: 1aadbf9971 ("blocksize: Allow parallel requests", v1.25.3)
> > ---
> >
> > In v2:
> > - Implement the blocksize fix.
> > - Drop RFC; I think this is ready, other than determining if we want
> >   to tag it with a CVE number.
> > - Improve the test: print out more details before assertions, to aid
> >   in debugging if it ever dies under CI resources
> 
> Apparently there was a bug in the zero callback too (which I had
> missed):
> 
>     -+    zero='dd if=/dev/zero skip=$4 count=$3 iflag=count_bytes,skip_bytes' \
>     ++    zero='dd if=/dev/zero of=$tmpdir/img skip=$4 count=$3 \
>     ++      iflag=count_bytes,skip_bytes' \

Yes; I found out after re-arranging the debug print statements that my
original zero= was a no-op (because it wrote to stdout instead of the
intended file).

> 
> After comparing RFC against v2: if you can split the test case to a
> separate patch (= the second patch in a two-part series), then I'd be
> happy for you to carry my R-b forward. I don't feel confident enough to
> review the change to the blocksize filter itself. (My confidence is
> lacking in my knowledge of nbdkit, not in your patch.)

Splitting the test coverage from the patch proper does make it easier
to prove the test fails when applied out-of-order (or when reverting
the patch proper); I will do that.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


More information about the Libguestfs mailing list