[Libguestfs] Debugging nested KVM guest (L2) booting with libguestfs/gdb

Kashyap Chamarthy kchamart at redhat.com
Wed Feb 12 18:38:37 UTC 2014


Heya,

With the latest Fedora Rawhide kernel, I see a nested KVM guest hanging at
boot (not unusual). Rich once suggested this[1] as a way to attach gdb to
the nested L2 guest and find out _where_ it's stuck. Tonight I set out to
try it (with KVM & TCG).

Below is everything I tried.

In guest hypervisor (L1):

    $ git clone git://github.com/libguestfs/libguestfs.git
    $ cd libguestfs
    $ git log | head -1
    commit 82a4a8f02c5706979d961ad6c4ac767a37a3a7c9

    # Activate the gdb debugging code by flipping its
    # conditional compilation directive: s/#if 0/#if 1/
    $ vi src/launch-direct.c
    [. . .] # Edit, save it
 
Double-check that the gdb preprocessor directive is turned on:

    $ grep "\-S" src/launch-direct.c -A4 -B2
       */
    #if 1
      ADD_CMDLINE ("-S");
      ADD_CMDLINE ("-s");
      warning (g, "qemu debugging is enabled, connect gdb to tcp::1234 to begin");
    #endif
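
For reference, here is a rough sketch of the gdb side, i.e. what
"connect gdb to tcp::1234" could look like. Assumptions on my part: the
helper name `gdb_attach_script` and the temporary command file are made
up for illustration, and the vmlinux path is where I would expect
Fedora's kernel-debuginfo to put it. Note that -S tells QEMU not to
start the CPU at startup, so the guest stays paused until gdb tells it
to continue.

```shell
# Hypothetical helper: print the gdb command that attaches to QEMU's
# gdb stub. QEMU's -s is shorthand for -gdb tcp::1234.
gdb_attach_script() {
    port="${1:-1234}"
    printf 'target remote localhost:%s\n' "$port"
}

# Assumed location of the kernel image with symbols, as installed by
# kernel-debuginfo on Fedora; adjust if it differs on your system.
vmlinux="/usr/lib/debug/lib/modules/$(uname -r)/vmlinux"

# Write the command to a temporary file and show how gdb would be run.
cmdfile="$(mktemp)"
gdb_attach_script 1234 > "$cmdfile"
echo "gdb -x $cmdfile $vmlinux"

# Inside gdb: 'continue' starts the paused guest; once it hangs,
# Ctrl-C followed by 'bt' should show where it is stuck.
```

The command file only holds the attach step; everything after that is
meant to be typed interactively.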

Compile:

    $ ./autogen.sh
    $ make -j4
    [. . .] # Compile, address anything that comes up
    $ echo $?
    0

Install the kernel debuginfo:

    $ yum install --enablerepo=fedora-debuginfo kernel-debuginfo

Enable the 'direct' backend and invoke the appliance using the `run` script:

    $ export LIBGUESTFS_BACKEND=direct        
    $ ./run libguestfs-test-tool 

The `qemu-kvm` command line just hung at:

    $ ./run libguestfs-test-tool
    [. . .]
    -chardev socket,path=/home/tuser1/src/libguestfs/tmp/libguestfspCGc1F/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append 'panic=1 console=ttyS0 udevtimeout=600 no_timer_check lpj=2294686 acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen'


Let's try with TCG
------------------

    $ guestfish get_backend
    direct
    $ export LIBGUESTFS_BACKEND_SETTINGS=force_tcg
    $ guestfish get_backend_settings
    force_tcg

Run the appliance:

    $ ./run libguestfs-test-tool

Again, the `qemu-kvm` command line just hung, same as in the KVM-accelerated case.


Try a couple more things
------------------------

- Run `qemu-sanity-check` in L1 -- Of course, this fails too.

- Invoke (from a different shell, as root) QEMU directly with gdb
  debugging options -s -S with KVM on L1:

   $ qemu-system-x86_64 -s -S -nographic -nodefconfig \
     -nodefaults -machine accel=kvm -m 4000 \
     -drive file=/home/tuser1/vmimages/fedora-20.qcow2,if=ide,format=qcow2,cache=none \
     -serial stdio

  Result: Just hung there.

  Try with TCG:

   $ qemu-system-x86_64 -s -S -nographic -nodefconfig \
     -nodefaults -machine accel=tcg -m 4000 \
     -drive file=/home/tuser1/vmimages/fedora-20.qcow2,if=ide,format=qcow2,cache=none \
     -serial stdio

  To no avail: it just hung there, again.


What else can I try?

Did I get this right, or am I making an elementary mistake here? (I have
yet to try a different L1 guest on different hardware; before that I
just wanted to run this by the list.)


Version info
------------

On L1:

    $ virt-what
    kvm
    
    $ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86 \
      libguestfs kernel-debuginfo
    3.14.0-0.rc2.git0.1.fc21.x86_64
    libvirt-daemon-kvm-1.2.1-2.fc21.x86_64
    qemu-system-x86-1.7.0-4.fc21.x86_64
    libguestfs-1.25.33-1.fc21.x86_64
    kernel-debuginfo-3.11.10-301.fc20.x86_64
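
One thing that stands out in the list above: the installed
kernel-debuginfo (3.11.10-301.fc20) does not match the running kernel
(3.14.0-0.rc2.git0.1.fc21). gdb symbols are only useful if the vmlinux
matches the kernel that actually boots, and I am assuming here that the
appliance boots the L1 kernel. A small sketch to check for this; the
function name `debuginfo_matches` is hypothetical:

```shell
# Hypothetical helper: succeed only if a kernel-debuginfo package name
# matches the given kernel release string (as printed by `uname -r`).
debuginfo_matches() {
    kernel_release="$1"
    debuginfo_pkg="$2"
    [ "$debuginfo_pkg" = "kernel-debuginfo-$kernel_release" ]
}

running="$(uname -r)"
found=no
# rpm -q prints one line per installed version; check each of them.
# If rpm or the package is missing, nothing matches and found stays "no".
for pkg in $(rpm -q kernel-debuginfo 2>/dev/null || true); do
    if debuginfo_matches "$running" "$pkg"; then
        found=yes
    fi
done

if [ "$found" = yes ]; then
    echo "kernel-debuginfo matches running kernel $running"
else
    echo "no kernel-debuginfo installed for running kernel $running"
fi
```

With the versions listed above this would report a mismatch.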


  [1] https://github.com/libguestfs/libguestfs/blob/master/src/launch-direct.c#L404

-- 
/kashyap



