[libvirt] libvirtd segfault
Scott Sullivan
ssullivan at liquidweb.com
Thu Jan 3 17:20:03 UTC 2013
On 01/02/2013 09:45 AM, Scott Sullivan wrote:
> On 12/29/2012 04:09 AM, Michal Privoznik wrote:
>> On 28.12.2012 20:23, Scott Sullivan wrote:
>> <snip/>
>>> I have just now received another SIGSEGV, with your patch applied.
>>>
>>> Here's the info from the GDB session:
>>>
>>> Detaching after fork from child process 11266.
>>> 2012-12-28 18:56:53.261+0000: 29943: error : qemuMonitorIO:614 :
>>> internal error End of file from monitor
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0x7fffec0cd700 (LWP 29955)]
>>> qemuDomainObjBeginJobInternal (driver=0x7fffe4013520,
>>> driver_locked=true, obj=0x7fff7801fc80, job=QEMU_JOB_DESTROY,
>>> asyncJob=QEMU_ASYNC_JOB_NONE) at qemu/qemu_domain.c:780
>>> 780 priv->jobs_queued++;
>>> (gdb) bt
>>> #0 qemuDomainObjBeginJobInternal (driver=0x7fffe4013520,
>>> driver_locked=true, obj=0x7fff7801fc80, job=QEMU_JOB_DESTROY,
>>> asyncJob=QEMU_ASYNC_JOB_NONE) at qemu/qemu_domain.c:780
>>> #1 0x00007fffea599f46 in qemuDomainDestroyFlags (dom=<value
>>> optimized out>, flags=<value optimized out>) at qemu/qemu_driver.c:2189
>>> #2 0x00007ffff7a83587 in virDomainDestroy (domain=0x7fffe414a510)
>>> at libvirt.c:2215
>>> #3 0x00000000004296e2 in remoteDispatchDomainDestroy (server=<value
>>> optimized out>, client=<value optimized out>, msg=<value optimized
>>> out>, rerr=0x7fffec0ccbc0, args=<value optimized out>, ret=<value
>>> optimized out>) at remote_dispatch.h:1277
>>> #4 remoteDispatchDomainDestroyHelper (server=<value optimized out>,
>>> client=<value optimized out>, msg=<value optimized out>,
>>> rerr=0x7fffec0ccbc0, args=<value optimized out>, ret=<value
>>> optimized out>) at remote_dispatch.h:1255
>>> #5 0x00007ffff7ad0d02 in virNetServerProgramDispatchCall
>>> (prog=0x6814d0, server=0x678df0, client=0x693a80, msg=0x6986d0) at
>>> rpc/virnetserverprogram.c:431
>>> #6 virNetServerProgramDispatch (prog=0x6814d0, server=0x678df0,
>>> client=0x693a80, msg=0x6986d0) at rpc/virnetserverprogram.c:304
>>> #7 0x00007ffff7aceaa6 in virNetServerProcessMsg (srv=<value
>>> optimized out>, client=0x693a80, prog=<value optimized out>,
>>> msg=0x6986d0) at rpc/virnetserver.c:173
>>> #8 0x00007ffff7acf5e3 in virNetServerHandleJob (jobOpaque=<value
>>> optimized out>, opaque=0x678df0) at rpc/virnetserver.c:194
>>> #9 0x00007ffff79e8fdc in virThreadPoolWorker (opaque=<value
>>> optimized out>) at util/threadpool.c:144
>>> #10 0x00007ffff79e88c9 in virThreadHelper (data=<value optimized
>>> out>) at util/threads-pthread.c:161
>>> #11 0x000000300a2077f1 in start_thread () from /lib64/libpthread.so.0
>>> #12 0x0000003009ae570d in clone () from /lib64/libc.so.6
>>> (gdb)
>> This means, even though we successfully incremented reference counter on
>> virDomainObjPtr object, somebody free()d it anyway (well, the
>> privateData at least). Looks like a locking/concurrent access issue to
>> me then. Unfortunately, I don't have any suggestions yet, as the domain
>> object is supposed to be locked when entering the
>> qemuDomainObjBeginJobInternal() function so it shouldn't get free()d
>> meanwhile.
>>
>> Michal
> Michal,
>
> I have a faster way to reproduce the crash (~10 minutes). Continue to
> read for new (easier) steps.
> This test was done with the standard v1.0.0 libvirtd code source, with
> no other patches applied.
>
> <snip>
>
Not sure how much this helps, but in my testing I have found that this
issue was introduced in v0.9.12.
I cannot reproduce it under v0.9.11.X or older. Comparing
src/qemu/qemu_domain.c between v0.9.11.X and v0.9.12, I see numerous
changes to the code related to locking/concurrency. For instance, the
introduction of qemuDomainTrackJob is one large difference I see.
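
To illustrate the class of problem Michal describes above (a reference is
held on the object, yet its private data is gone by the time
priv->jobs_queued++ runs), here is a minimal, self-contained sketch. It is
not libvirt code and not a diagnosis of this crash: every type and function
name (demo_obj, demo_priv, demo_begin_job, and so on) is hypothetical, and
it only assumes a refcounted object whose private data can be released by a
separate teardown path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; NOT libvirt types. */
struct demo_priv {
    int jobs_queued;
};

struct demo_obj {
    pthread_mutex_t lock;
    int refs;
    struct demo_priv *priv; /* can be released independently of refs */
};

/* Allocate an object with one reference and fresh private data. */
struct demo_obj *demo_obj_new(void)
{
    struct demo_obj *obj = calloc(1, sizeof(*obj));
    if (!obj)
        return NULL;
    obj->priv = calloc(1, sizeof(*obj->priv));
    if (!obj->priv) {
        free(obj);
        return NULL;
    }
    pthread_mutex_init(&obj->lock, NULL);
    obj->refs = 1;
    return obj;
}

/* Take an extra reference. This keeps obj alive, but says nothing
 * about whether obj->priv is still valid. */
void demo_obj_ref(struct demo_obj *obj)
{
    pthread_mutex_lock(&obj->lock);
    obj->refs++;
    pthread_mutex_unlock(&obj->lock);
}

/* Teardown path that drops the private data while other references
 * to the object may still exist. */
void demo_obj_release_priv(struct demo_obj *obj)
{
    pthread_mutex_lock(&obj->lock);
    free(obj->priv);
    obj->priv = NULL;
    pthread_mutex_unlock(&obj->lock);
}

/* Queue a job. The private data is checked under the object lock
 * before it is dereferenced; returns false if it is already gone. */
bool demo_begin_job(struct demo_obj *obj)
{
    bool queued = false;

    pthread_mutex_lock(&obj->lock);
    if (obj->priv) {
        obj->priv->jobs_queued++;
        queued = true;
    }
    pthread_mutex_unlock(&obj->lock);
    return queued;
}

/* Drop a reference; returns true if this call freed the object. */
bool demo_obj_unref(struct demo_obj *obj)
{
    int remaining;

    pthread_mutex_lock(&obj->lock);
    assert(obj->refs > 0);
    remaining = --obj->refs;
    pthread_mutex_unlock(&obj->lock);

    if (remaining > 0)
        return false;

    pthread_mutex_destroy(&obj->lock);
    free(obj->priv); /* free(NULL) is a no-op */
    free(obj);
    return true;
}
```

Without the NULL check in demo_begin_job(), or if the teardown path freed
priv without holding the lock, the increment would touch freed memory even
though the caller holds a valid reference on the object itself.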