[Libguestfs] Libguestfs Failure on latest Ubuntu 22.04 LTS

Laszlo Ersek lersek at redhat.com
Mon Mar 20 05:31:43 UTC 2023


On 3/17/23 16:10, Justin Churchey wrote:
> Hello Everyone,
>
> I was having some difficulties converting OVA images yesterday. At
> first, I thought it may have been a compatibility issue with
> VirtualBox 7.0.  However, when I went to run libguestfs-test-tool, it
> began failing with the exact same error as the conversions, which
> leads me to believe the issue may lie with libguestfs and not the
> images themselves.
>
> To test further, I created a fresh install of Ubuntu 22.04, and the
> libguestfs-test-tool seems to fail with the same error, even on a
> fresh install.  I am attaching the libguestfs-test-tool output for
> reference.
>
> Ubuntu 22.04 is running libguestfs-tools 1.46.2-10ubuntu3
>
> If anybody has any insight into the issue, or if you feel a bug report
> needs to be filed, please let me know.

Your appliance kernel crashes.

Here's my theory on why this might happen, based on your log.

The guestfish appliance runs with KVM acceleration.

The crash happens after/while inserting the modules crc32-pclmul.ko and
crct10dif-pclmul.ko.

The "pclmul" in the names of those modules indicates that they calculate
CRC checksums (CRC32 and CRC-T10DIF) with the PCLMULQDQ (carry-less
multiplication) instruction. PCLMULQDQ is an optional instruction set
extension, so not all CPUs support it.

Your appliance guest is started with "-cpu max" on the QEMU command line
(from libguestfs commit 30f74f38bd6e, "appliance: Use -cpu max.",
2021-01-28). This is probably why the appliance kernel thinks PCLMULQDQ
is available.

I think the PCLMULQDQ instruction may cause an issue here. I don't know
why it misbehaves under KVM, but that's my suspicion anyway.
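
One quick way to see whether the host CPU advertises the instruction is
to look at the "flags" line in /proc/cpuinfo. The following is only a
sketch: it assumes a Linux x86 host and bash, and the helper name
has_cpu_flag is made up for illustration. It says nothing about what the
guest actually sees, only about the host.

```shell
#!/usr/bin/env bash
# has_cpu_flag FLAG [FILE]: succeed if FLAG appears on the first "flags" line.
# FILE defaults to /proc/cpuinfo (Linux only); the function name is hypothetical.
has_cpu_flag() {
    local flag=$1 file=${2:-/proc/cpuinfo}
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag pclmulqdq; then
    echo "host CPU advertises pclmulqdq"
else
    echo "pclmulqdq not advertised (or /proc/cpuinfo unavailable)"
fi
```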

Note that the kernel crash log provides the following machine code
dump:

46 70 48 8b 56 68 48 03 97 90 01 00 00 48 c1 e0 06 48 03 46 20 48 89 97
08 02 00 00 48 be ab aa aa aa aa aa aa aa 48 8b 48 10 <48> 89 0a 48 8b
50 20 48 8b 8f 08 02 00 00 48 89 d0 48 f7 e6 48 c1

with the instruction starting at <48> causing the page fault, as the
direct symptom. Now, we can disassemble this:

$ printf \
  '%b' \
  '\x46\x70\x48\x8b\x56\x68\x48\x03\x97\x90\x01\x00\x00\x48\xc1\xe0\x06\x48\x03\x46\x20\x48\x89\x97\x08\x02\x00\x00\x48\xbe\xab\xaa\xaa\xaa\xaa\xaa\xaa\xaa\x48\x8b\x48\x10\x48\x89\x0a\x48\x8b\x50\x20\x48\x8b\x8f\x08\x02\x00\x00\x48\x89\xd0\x48\xf7\xe6\x48\xc1' \
  > bin

$ ndisasm -b64 bin

00000000  467048            jo 0x4b
00000003  8B5668            mov edx,[rsi+0x68]
00000006  48039790010000    add rdx,[rdi+0x190]
0000000D  48C1E006          shl rax,byte 0x6
00000011  48034620          add rax,[rsi+0x20]
00000015  48899708020000    mov [rdi+0x208],rdx
0000001C  48BEABAAAAAAAAAA  mov rsi,0xaaaaaaaaaaaaaaab
         -AAAA
00000026  488B4810          mov rcx,[rax+0x10]
0000002A  48890A            mov [rdx],rcx        <----------- crash
0000002D  488B5020          mov rdx,[rax+0x20]
00000031  488B8F08020000    mov rcx,[rdi+0x208]
00000038  4889D0            mov rax,rdx
0000003B  48F7E6            mul rsi
0000003E  48                rex.w
0000003F  C1                db 0xc1
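
The conversion from the crash-log byte dump to the binary file can also
be scripted, which avoids retyping the escapes by hand. This is a rough
sketch assuming bash (for '%b' with \xHH escapes), tr, sed and wc; the
variable names are arbitrary. It also counts the bytes in front of the
<48> marker, which should agree with the offset of the crashing
instruction in the listing above (0x2A).

```shell
#!/usr/bin/env bash
# Byte dump copied from the crash log; <48> marks the faulting instruction.
code='46 70 48 8b 56 68 48 03 97 90 01 00 00 48 c1 e0 06 48 03 46 20 48 89 97
08 02 00 00 48 be ab aa aa aa aa aa aa aa 48 8b 48 10 <48> 89 0a 48 8b
50 20 48 8b 8f 08 02 00 00 48 89 d0 48 f7 e6 48 c1'

# Bytes before the "<" marker give the offset of the faulting instruction.
prefix=${code%%<*}
fault_offset=$(( $(printf '%s\n' "$prefix" | wc -w) ))

# Remove marker and whitespace, then prefix every byte with "\x".
escaped=$(printf '%s' "$code" | tr -d '<> \n' | sed -E 's/../\\x&/g')

# "bin" is the same output file name as in the printf command above.
printf '%b' "$escaped" > bin

printf 'bytes written: %d\n' "$(( $(wc -c < bin) ))"
printf 'fault offset:  0x%X\n' "$fault_offset"
```

The resulting file can then be fed to "ndisasm -b64 bin" as before.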

Note the constant 0xaaaaaaaaaaaaaaab; that seems very special. We can
search the kernel tree for it (I'm not bothering about checking out the
particular ubuntu kernel version for now):

$ git grep -i aaaaaaaaaaaaaaab
arch/x86/math-emu/poly_atan.c:/*  0xaaaaaaaaaaaaaaabLL,  transferred to fixedpterm[] */
arch/x86/math-emu/poly_sin.c:   0xaaaaaaaaaaaaaaabLL,
arch/x86/math-emu/poly_tan.c:static const unsigned long long twothirds = 0xaaaaaaaaaaaaaaabLL;

In particular, the last file (poly_tan.c) contains a snippet like

        mul64_Xsig(&accum, &twothirds);

which seems vaguely related to

0000001C  48BEABAAAAAAAAAA  mov rsi,0xaaaaaaaaaaaaaaab
         -AAAA
...
0000003B  48F7E6            mul rsi

Now this does not seem connected to PCLMULQDQ, but it does somehow look
connected to multiplication.
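
As a sanity check on the "twothirds" name, the constant can be read as
a 64-bit binary fraction, i.e. divided by 2^64. That reading is an
assumption about how math-emu uses the value, not something taken from
the crash log. A small sketch, assuming bash and awk:

```shell
#!/usr/bin/env bash
export LC_ALL=C   # keep "." as the decimal separator

# Decimal value of the 64-bit constant from the disassembly.
const=$(printf '%u' 0xaaaaaaaaaaaaaaab)

# Interpret it as a fraction of 2^64, as a fixed-point format would.
ratio=$(awk -v c="$const" 'BEGIN { printf "%.6f", c / 2^64 }')

echo "constant = $const"
echo "constant / 2^64 = $ratio"
```

The ratio comes out at about 0.666667, which is consistent with the
name, but it does not by itself show that the crashing code comes from
math-emu.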

I don't really know where to go with this, except for asking KVM experts.

For now, can you try:

  export LIBGUESTFS_BACKEND_SETTINGS=force_tcg

from <https://libguestfs.org/guestfs.3.html#backend-settings>, and see
if that makes a difference?
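
For example, the test could be repeated roughly like this. It is a
sketch only: it assumes bash and that libguestfs-test-tool is in PATH,
and it just re-runs the same tool that produced the attached log.

```shell
#!/usr/bin/env bash
# Force software emulation (TCG) instead of KVM for the libguestfs appliance.
export LIBGUESTFS_BACKEND_SETTINGS=force_tcg

# Re-run the self-test only if the tool is installed on this machine.
if command -v libguestfs-test-tool >/dev/null 2>&1; then
    libguestfs-test-tool
else
    echo "libguestfs-test-tool not found in PATH" >&2
fi
```

If the appliance boots under TCG but not under KVM, that would point
towards the KVM / CPU model side rather than the images.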

Laszlo

