[Libguestfs] virt-v2v check-valgrind?

Laszlo Ersek lersek at redhat.com
Tue May 16 14:25:04 UTC 2023


On 5/16/23 15:26, Richard W.M. Jones wrote:

> How many of the tests fail for you?  Just a small number or all of
> them?

Almost all of them fail.

I think I've figured out why.

First, as I mentioned up-thread, there's upstream glibc bug
<https://sourceware.org/bugzilla/show_bug.cgi?id=28256>, reported by
you, fixed in 2.35; but the upstream fix has never been backported to
RHEL-9.

However, there's another piece to the puzzle. In the upstream glibc bug
report, you wrote:

"I'm getting this when I run any program under valgrind with glibc
tunables"

Keyword being "glibc tunables". I don't set them myself -- so why does
my (unfixed) glibc nonetheless trigger the valgrind false positive?

The answer to that is the following: I build all libguestfs projects
from source.

I keep referring to the following dependency graph (constructed earlier
with your help):

                       libvirt-ocaml ---------
                                              \
                       libnbd <--> nbdkit---   \
                                            \   \
  hivex    ----> libguestfs  --------------------> virt-v2v -> virt-p2v
              /              \                  /
   supermin --                -> guestfs-tools -

(It does not show augeas, which I cannot be bothered to build locally.)

Accordingly, I have a good number of shell scripts that are called:

  r-guestfs-tools
  r-libguestfs
  r-nbdkit
  r-virt-p2v
  r-virt-v2v

For example, "r-guestfs-tools" looks like this:

> #!/bin/bash
>
> # enable dependencies needed by guestfs-tools
> SUPERMIN=$HOME/src/v2v/supermin/src/supermin \
> $HOME/src/v2v/hivex/run \
> $HOME/src/v2v/libguestfs/run \
> "$@"

and, for example, "r-virt-v2v" is:

> #!/bin/bash
>
> # enable dependencies needed by virt-v2v
> SUPERMIN=$HOME/src/v2v/supermin/src/supermin \
> $HOME/src/v2v/hivex/run \
> $HOME/src/v2v/libguestfs/run \
> $HOME/src/v2v/libnbd/run \
> $HOME/src/v2v/libvirt-ocaml/run \
> $HOME/src/v2v/guestfs-tools/run \
> "$@"

And when I build virt-v2v locally, I do:

  r-virt-v2v autoreconf -i
  r-virt-v2v ./configure CFLAGS=-fPIC --enable-werror=yes --prefix=/usr
  r-virt-v2v make -j6
  r-virt-v2v make -j6 check
  r-virt-v2v make -j6 check-valgrind

In effect this chains the "run" scripts from all the other local build
trees that virt-v2v depends upon, for building.

Note that virt-v2v's own run script is not chained by my own
"r-virt-v2v" script -- that run script is only needed if someone wants
to run (i.e., not build) virt-v2v locally. (In fact, when I'm gearing up
to autoreconf & configure the virt-v2v tree, the "run" script doesn't
even exist in that tree (only configure will generate it!), so I
couldn't even run it from "r-virt-v2v"!)

Therefore, whenever I also run virt-v2v locally, I spell out the local,
now-existent, "./run" in addition, from the virt-v2v project root:

  r-virt-v2v ./run virt-v2v ...

In effect this chains virt-v2v's own run script in addition to the run
scripts of its dependencies.

Now here's the problem. Consider virt-v2v's own "run.in" script:

> # This is a cheap way to find some use-after-free and uninitialized
> # read problems when using glibc.  But if we are valgrinding then
> # don't use this because it can stop valgrind from working.
> if [ -z "$VG" ]; then
>     random_val="$(@AWK@ 'BEGIN{srand(); print 1+int(255*rand())}' < /dev/null)"
>     LD_PRELOAD="${LD_PRELOAD:+"$LD_PRELOAD:"}libc_malloc_debug.so.0"
>     GLIBC_TUNABLES=glibc.malloc.check=1:glibc.malloc.perturb=$random_val
>     export LD_PRELOAD GLIBC_TUNABLES
> fi

Ouch. GLIBC_TUNABLES are known to break valgrind. Splendid.

It turns out, however, that virt-v2v's own "run" script does not
participate in "make check-valgrind". I verified that by adding an
"else" branch above that prints an error message and exits with status
1; it does not fire. So this is all fine: the above safety check is for
running virt-v2v *manually* under valgrind. "make check-valgrind" sets
VG, but it does not call virt-v2v's own "run", so the VG emptiness check
in that script is never even reached during "make check-valgrind". And
the check handles manual VG settings properly.

But... remember "r-virt-v2v" again:

> #!/bin/bash
>
> # enable dependencies needed by virt-v2v
> SUPERMIN=$HOME/src/v2v/supermin/src/supermin \
> $HOME/src/v2v/hivex/run \
> $HOME/src/v2v/libguestfs/run \
> $HOME/src/v2v/libnbd/run \
> $HOME/src/v2v/libvirt-ocaml/run \
> $HOME/src/v2v/guestfs-tools/run \
> "$@"

It turns out that "guestfs-tools/run" has the exact same logic for
setting GLIBC_TUNABLES! So when I execute

  r-virt-v2v make -j6 check-valgrind

then the environment for "make -j6 check-valgrind" will *inherit* a
GLIBC_TUNABLES variable, from (at least!) "guestfs-tools/run". The VG
variable is only set inside "make check-valgrind", which is too late: by
then "guestfs-tools/run" has already exported GLIBC_TUNABLES. I've
verified this in the output of

  r-virt-v2v env

which does show GLIBC_TUNABLES.
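
The ordering problem can be reproduced in miniature. The following is
only a sketch: "fake_run" and "fake_check_valgrind" are hypothetical
stand-ins (not code from any of the projects) for a dependency's "run"
script and for the point inside "make check-valgrind" where VG gets
set, and the tunable value is a fixed placeholder.

```shell
# Hypothetical stand-in for a dependency's "run" script: export
# GLIBC_TUNABLES unless VG is already set, then run the given command.
fake_run() {
    (
        if [ -z "${VG:-}" ]; then
            GLIBC_TUNABLES=glibc.malloc.check=1
            export GLIBC_TUNABLES
        fi
        "$@"
    )
}

# Hypothetical stand-in for "make check-valgrind": VG is set only here,
# after the wrapper has already made its decision.
fake_check_valgrind() {
    VG=valgrind
    echo "VG=$VG GLIBC_TUNABLES=${GLIBC_TUNABLES:-<unset>}"
}

unset VG GLIBC_TUNABLES
inherited=$(fake_run fake_check_valgrind)
echo "$inherited"
```

The wrapper evaluates its VG guard before the wrapped command starts,
so the command sees both VG and the inherited GLIBC_TUNABLES.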

And that way I hit glibc bug
<https://sourceware.org/bugzilla/show_bug.cgi?id=28256>.

Now here's another interesting difference:

- the "run" scripts in guestfs-tools, virt-p2v, and virt-v2v (1) don't
  touch GLIBC_TUNABLES when valgrinding, and (2) set GLIBC_TUNABLES when
  not valgrinding,

- whereas the "run" script in libnbd (which I also chain in
  "r-virt-v2v", at an earlier stage, see above) (1) *unsets*
  GLIBC_TUNABLES when valgrinding, and (2) doesn't touch GLIBC_TUNABLES
  when not valgrinding. See libnbd commit 2eeb0c693ce1 ("tests: Remove
  GLIBC_TUNABLES when running under valgrind", 2021-08-26) -- it even
  references the same glibc bug.
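
To make the mismatch concrete, here is a sketch of the two policies as
shell functions. These are paraphrases of the behavior described above,
with hypothetical function names and a placeholder tunable value; they
are not the actual code from either project.

```shell
# Policy described for guestfs-tools, virt-p2v and virt-v2v:
# set GLIBC_TUNABLES when not valgrinding, otherwise leave it alone.
policy_set_unless_vg() {
    if [ -z "${VG:-}" ]; then
        GLIBC_TUNABLES=glibc.malloc.check=1
        export GLIBC_TUNABLES
    fi
}

# Policy described for libnbd:
# unset GLIBC_TUNABLES when valgrinding, otherwise leave it alone.
policy_unset_if_vg() {
    if [ -n "${VG:-}" ]; then
        unset GLIBC_TUNABLES
    fi
}

# Chained in the order libnbd first, guestfs-tools later, with VG not
# yet set (as when the chain runs before "make check-valgrind").
chained=$(
    unset VG GLIBC_TUNABLES
    policy_unset_if_vg
    policy_set_unless_vg
    echo "${GLIBC_TUNABLES:-<unset>}"
)
echo "VG unset at chain time: $chained"

# The same chain with VG already present in the environment.
with_vg=$(
    unset GLIBC_TUNABLES
    VG=valgrind
    policy_unset_if_vg
    policy_set_unless_vg
    echo "${GLIBC_TUNABLES:-<unset>}"
)
echo "VG set before the chain: $with_vg"
```

In this sketch both guards only help if VG is already set when the
chain runs; when VG is set later, the earlier "unset" policy has
nothing to remove and the later "set" policy wins.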

Either way, if I use "env -u" to unset LD_PRELOAD and GLIBC_TUNABLES
between the "chain of run scripts" and "make check-valgrind", as in:

  r-virt-v2v \
    env -u LD_PRELOAD -u GLIBC_TUNABLES \
    make -j6 check-valgrind TESTS=test-v2v-fedora-luks-on-lvm-conversion.sh

then the test (test-v2v-fedora-luks-on-lvm-conversion.sh) passes;
valgrind doesn't complain.
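
The effect of the "env -u" step can be checked in isolation. This is a
minimal sketch with a placeholder tunable value; the inner "sh -c"
stands in for the "make" invocation and just reports what it inherits.

```shell
cleaned=$(
    # Simulate what a chained "run" script would have exported.
    GLIBC_TUNABLES=glibc.malloc.check=1
    export GLIBC_TUNABLES
    # Strip both variables before starting the child process.
    env -u LD_PRELOAD -u GLIBC_TUNABLES \
        sh -c 'echo "${GLIBC_TUNABLES:-<unset>}"'
)
echo "child sees GLIBC_TUNABLES: $cleaned"
```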

Therefore, IMO, this is a bug in how the "run" scripts compose.

Arguably, it should be possible to chain any number of those run scripts
(it's a valid use case for a user to depend on all of the build trees at
the same time), and they should all agree about GLIBC_TUNABLES. Namely,
the run scripts should neither set, nor unset, GLIBC_TUNABLES and
LD_PRELOAD; those variables should be ignored altogether in the "run"
scripts.

Should I submit patches to remove the LD_PRELOAD and GLIBC_TUNABLES
tweaking from all the run scripts?

Thanks!
Laszlo

