[libvirt] libvirtd crashes

Matthias Bolte matthias.bolte at googlemail.com
Fri Dec 4 00:55:25 UTC 2009


2009/12/4 Shi Jin <jinzishuai at yahoo.com>:
> For me, the error seems to vary.  I am going to show two versions.
>
> This is one.
> The gdb bt is
> #0  0x00007f7ce92986b4 in pthread_mutex_unlock () from /lib/libpthread.so.0
> #1  0x000000000042f661 in ?? ()
> #2  0x000000000043da36 in ?? ()
> #3  0x000000000043ef4b in ?? ()
> #4  0x00007f7ce94f90fb in virDomainCreateXML () from /usr/lib/libvirt.so.0
> #5  0x000000000041f228 in ?? ()
> #6  0x0000000000420e41 in ?? ()
> #7  0x00000000004211f3 in ?? ()
> #8  0x000000000041478c in ?? ()
> #9  0x00007f7ce9294a04 in start_thread () from /lib/libpthread.so.0
> #10 0x00007f7ce8ffe7bd in clone () from /lib/libc.so.6
> #11 0x0000000000000000 in ?? ()

The debug symbols are missing here, but the innermost frame is
pthread_mutex_unlock, as in the other bug report.

> Its corresponding debug is
> 17:09:46.140: debug : virEventRemoveHandleImpl:173 : Remove handle w=81
> 17:09:46.140: debug : virEventRemoveHandleImpl:186 : mark delete 38 68
> 17:09:46.140: debug : virEventInterruptLocked:658 : Skip interrupt, 1 -438118128
> 17:09:46.140: debug : qemuMonitorClose:532 : Mark monitor to be deleted 0x7f7cd80cf480
> 17:09:46.140: debug : qemuDomainSetFileOwnership:1971 : Setting ownership on /srv/cloud/one/var//6666/images/disk.0 to 0:0
> 17:09:46.140: debug : virEventUpdateHandleImpl:146 : Update handle w=81 e=12
> 17:09:46.140: debug : virEventInterruptLocked:662 : Interrupting
> 17:09:46.140: debug : qemuMonitorCommandWithHandler:271 : Receive command reply ret=-1 errno=104 0 bytes '(null)'
> 17:09:46.140: error : qemuMonitorCommandWithHandler:290 : cannot send monitor command 'info cpus': Connection reset by peer
> 17:09:46.140: error : qemuMonitorTextGetCPUInfo:436 : internal error cannot run monitor command to fetch CPU thread info
>
> Here is another one:
> (gdb) bt
> #0  0x00007ff61ca026b4 in pthread_mutex_unlock () from /lib/libpthread.so.0
> #1  0x000000000042f661 in qemuDomainObjExitMonitorWithDriver (driver=0x7ff61000c410, obj=0x1e8c310) at qemu/qemu_driver.c:318
> #2  0x000000000043da36 in qemudStartVMDaemon (conn=<value optimized out>, driver=0x7ff61000c410, vm=0x1e8c310,
>    migrateFrom=<value optimized out>, stdin_fd=<value optimized out>) at qemu/qemu_driver.c:2327
> #3  0x000000000043ef4b in qemudDomainCreate (conn=0x7ff610009a00, xml=<value optimized out>, flags=<value optimized out>)
>    at qemu/qemu_driver.c:2881
> #4  0x00007ff61cc630fb in virDomainCreateXML (conn=0x7ff610009a00,
>    xmlDesc=0x7ff610004d70 "<domain type='kvm'>\n\t<name>one-7238</name>\n\t<vcpu>1</vcpu>\n\t<memory>524288</memory>\n\t<os>\n\t\t<type>hvm</type>\n\t\t<boot dev='hd'/>\n\t</os>\n\t<devices>\n\t\t<emulator>/usr/bin/kvm</emulator>\n\t\t<disk type='file"..., flags=0) at libvirt.c:1745
> #5  0x000000000041f228 in remoteDispatchDomainCreateXml (server=<value optimized out>, client=<value optimized out>,
>    conn=0x7ff610009a00, hdr=0x6572687420555043, rerr=0x6f666e69206461, args=0x616d6d6f6320726f, ret=0x7ff60e7fbed0)
>    at remote.c:873
> #6  0x0000000000420e41 in remoteDispatchClientCall (server=<value optimized out>, client=0x7ff6080c4960, msg=0x7ff6080ca910)
>    at dispatch.c:506
> #7  0x00000000004211f3 in remoteDispatchClientRequest (server=0x1e62070, client=0x7ff6080c4960, msg=0x7ff6080ca910)
>    at dispatch.c:388
> #8  0x000000000041478c in qemudWorker (data=<value optimized out>) at libvirtd.c:1518
> #9  0x00007ff61c9fea04 in start_thread () from /lib/libpthread.so.0
> #10 0x00007ff61c7687bd in clone () from /lib/libc.so.6
> #11 0x0000000000000000 in ?? ()

Yep, it's the same bug that Nikola Ciprich reported, even though the
backtraces are not identical.

The call to qemuDomainObjExitMonitorWithDriver results in trying to
unlock the monitor, but the monitor has been deleted in between
because an error occurred while interacting with QEMU.

So, my initial guess was correct and this is a known bug. As pointed
out in the referenced thread, a preliminary patch is already
available.
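
To illustrate the failure mode described above (unlocking a monitor
that was freed while the caller was still inside it), here is a minimal
sketch of one common way to avoid it: reference counting, so that
closing the monitor on error only drops the owner's reference and the
final unlock still operates on live memory. All names here (Monitor,
monitor_enter, monitor_close, monitor_exit, monitors_freed) are
hypothetical and chosen for illustration; this is not libvirt's code
and not necessarily what the preliminary patch does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a monitor object; not libvirt's real struct. */
typedef struct {
    pthread_mutex_t lock;
    int refs;     /* protected by lock */
    bool closed;  /* set once the connection to QEMU is gone */
} Monitor;

static int monitors_freed = 0; /* demo counter, only for illustration */

Monitor *monitor_new(void)
{
    Monitor *mon = calloc(1, sizeof(*mon));
    if (!mon)
        return NULL;
    pthread_mutex_init(&mon->lock, NULL);
    mon->refs = 1; /* the owner's reference */
    return mon;
}

/* Lock the monitor and take a reference for the duration of the call. */
void monitor_enter(Monitor *mon)
{
    pthread_mutex_lock(&mon->lock);
    mon->refs++;
}

/* Error path, called between enter and exit (lock held): mark the
 * monitor closed and drop the owner's reference instead of freeing it. */
void monitor_close(Monitor *mon)
{
    if (!mon->closed) {
        assert(mon->refs >= 2); /* caller must hold an enter reference */
        mon->closed = true;
        mon->refs--;
    }
}

/* Drop the enter reference and unlock. The memory is released only
 * after the unlock and only when no references remain.
 * Returns true if the monitor was freed. */
bool monitor_exit(Monitor *mon)
{
    assert(mon->refs > 0);
    bool last = (--mon->refs == 0);
    pthread_mutex_unlock(&mon->lock);
    if (last) {
        pthread_mutex_destroy(&mon->lock);
        free(mon);
        monitors_freed++;
    }
    return last;
}
```

With this shape, a sequence of monitor_enter, a failed command that
calls monitor_close, then monitor_exit unlocks a mutex that is still
valid, and the object is freed only afterwards.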

> and its debug:
> 17:33:35.424: debug : virEventUpdateHandleImpl:146 : Update handle w=1389 e=12
> 17:33:35.424: debug : virEventInterruptLocked:662 : Interrupting
> 17:33:35.424: debug : qemuMonitorCommandWithHandler:271 : Receive command reply ret=-1 errno=104 0 bytes '(null)'
> 17:33:35.424: error : qemuMonitorCommandWithHandler:290 : cannot send monitor command 'info cpus': Connection reset by peer
> 17:33:35.424: error : qemuMonitorTextGetCPUInfo:436 : internal error cannot run monitor command to fetch CPU thread info
>
>
> Thanks a lot.
> Shi
> --
> Shi Jin, PhD
>
>
> --- On Thu, 12/3/09, Matthias Bolte <matthias.bolte at googlemail.com> wrote:
>
>> From: Matthias Bolte <matthias.bolte at googlemail.com>
>> Subject: Re: [libvirt] libvirtd crashes
>> To: "Shi Jin" <jinzishuai at yahoo.com>
>> Cc: libvir-list at redhat.com, jinzishuai at gmail.com
>> Date: Thursday, December 3, 2009, 4:20 PM
>> 2009/12/3 Shi Jin <jinzishuai at yahoo.com>:
>> > Hi there,
>> >
>> > My libvirtd built from the latest git code keeps on crashing on all machines.
>> > I turned on debugging and this is the information I have in the log file before crashing:
>> > 14:31:50.828: debug : virEventUpdateHandleImpl:146 : Update handle w=110 e=12
>> > 14:31:50.828: debug : virEventInterruptLocked:662 : Interrupting
>> > 14:31:50.828: debug : qemuMonitorCommandWithHandler:271 : Receive command reply ret=-1 errno=104 0 bytes '(null)'
>> > 14:31:50.828: error : qemuMonitorCommandWithHandler:290 : cannot send monitor command 'info cpus': Connection reset by peer
>> > 14:31:50.828: error : qemuMonitorTextGetCPUInfo:436 : internal error cannot run monitor command to fetch CPU thread info
>> >
>> > I am not sure if there is any other information needed to help identify the problem.  My building options are:
>> > ./autogen.sh  --prefix=/usr --sysconfdir=/etc --localstatedir=/var  --without-xen --with-qemu --with-qemu-user=oneadmin --with-qemu-group=oneadmin --without-uml --without-vbox --without-openvz --without-lxc
>> >
>> > Please help me here. I can accept the service failing queries from time to time since I have error handling written so that they can be re-tried. But a crashing libvirtd takes the whole thing down.
>> >
>> > Thanks a lot.
>> >
>> > Shi
>> > --
>> > Shi Jin, PhD
>> >
>>
>> A GDB backtrace would be helpful. Judging by the debug log alone it
>> could be a known issue, see:
>>
>> https://www.redhat.com/archives/libvir-list/2009-December/msg00063.html
>>
>> Matthias
>>
>

Matthias



