<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:宋体;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@宋体";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
text-align:justify;
text-justify:inter-ideograph;
font-size:10.5pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;}
/* Page Definitions */
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="ZH-CN" link="blue" vlink="purple" style="text-justify-trim:punctuation">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi everyone,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I am doing RAM migration in a KVM environment (libvirt 1.2.4, qemu 1.5.1).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">libvirtd deadlocks or segfaults when I destroy the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">migrating VM on the destination.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I hit the problem with the following steps:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 1: migrate the VM.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 2: immediately run "virsh destroy [VMName]" to destroy the migrating VM on
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> the destination.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 3: libvirtd on the destination will probably deadlock or segfault.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">The deadlock stack is as follows:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#0 0x00007fb5c18132d4 in __lll_lock_wait () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#1 0x00007fb5c180e659 in _L_lock_1008 () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#2 0x00007fb5c180e46e in pthread_mutex_lock () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#3 0x00007fb5c45d175f in virMutexLock (m=0x7fb5b0066ed0) at util/virthread.c:88<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#4 0x00007fb5c45b6b04 in virObjectLock (anyobj=0x7fb5b0066ec0) at<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> util/virobject.c:323<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#5 0x00007fb5b8f4842a in qemuMonitorEmitEvent (mon=0x7fb5b0066ec0,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> event=0x7fb5b00688d0 "SHUTDOWN", seconds=1399374472, micros=509994,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> details=0x0) at qemu/qemu_monitor.c:1185<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#6 0x00007fb5b8f62af2 in qemuMonitorJSONIOProcessEvent (mon=0x7fb5b0066ec0,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> obj=0x7fb5b0069080) at qemu/qemu_monitor_json.c:158<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#7 0x00007fb5b8f62d25 in qemuMonitorJSONIOProcessLine (mon=0x7fb5b0066ec0,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> line=0x7fb5b005bbe0 "{\"timestamp\": {\"seconds\": 1399374472,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> \"microseconds\": 509994}, \"event\": \"SHUTDOWN\"}",msg=0x7fb5bd873c80)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> at qemu/qemu_monitor_json.c:195<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#8 0x00007fb5b8f62f85 in qemuMonitorJSONIOProcess (mon=0x7fb5b0066ec0,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> data=0x7fb5b0060770 "{\"timestamp\": {\"seconds\": 1399374472,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> \"microseconds\": 509994},\"event\": \"SHUTDOWN\"}\r\n", len=85,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> msg=0x7fb5bd873c80) at qemu/qemu_monitor_json.c:237<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#9 0x00007fb5b8f49aa0 in qemuMonitorIOProcess (mon=0x7fb5b0066ec0)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> at qemu/qemu_monitor.c:402<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#10 0x00007fb5b8f4a09b in qemuMonitorIO (watch=20, fd=24, events=0,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> opaque=0x7fb5b0066ec0) at qemu/qemu_monitor.c:651<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#11 0x00007fb5c458c4d9 in virEventPollDispatchHandles (nfds=17, fds=0x7fb5b0068a60)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> at util/vireventpoll.c:510<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#12 0x00007fb5c458decf in virEventPollRunOnce () at util/vireventpoll.c:659<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#13 0x00007fb5c458bfcc in virEventRunDefaultImpl () at util/virevent.c:308<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#14 0x00007fb5c51a17a9 in virNetServerRun (srv=0x7fb5c5411d70)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> at rpc/virnetserver.c:1139<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#15 0x00007fb5c5157f63 in main (argc=3, argv=0x7fff7fc04f48) at libvirtd.c:150<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">After analysis, I found it may be caused by multiple threads simultaneously
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">accessing the shared variable "vm->privateData->mon".<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">When the problem occurs, there are three libvirtd threads at work on the destination<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">host. Suppose:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ThreadA: migration thread, runs qemuProcessStart.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ThreadB: destroy thread, runs qemuDomainDestroy -> qemuProcessStop.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ThreadC: monitor thread, does IOWrite, IORead
and some other operations according to<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">mon->msg when mon->fd changes. When the destroy in threadB happens, this thread<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">handles the SHUTDOWN event.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">In threadA, when it sends a QMP command to QEMU, it takes the vm->privateData->mon<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">lock, for example in the sequence "qemuDomainObjEnterMonitor -> qemuMonitorSetBalloon -><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">qemuDomainObjExitMonitor", but this sequence is not atomic. If "virsh destroy [VMName]"<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">happens during this sequence, threadB sets vm->privateData->mon to NULL in<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">qemuProcessStop. Then, in threadA, qemuDomainObjExitMonitor fails to<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">unlock vm->privateData->mon because it is NULL. So threadC can never acquire the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">vm->privateData->mon lock, and the deadlock occurs.<o:p></o:p></span></p>
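<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">To make the interleaving concrete, here is a minimal single-threaded sketch that replays it with a plain pthread mutex. Monitor, DomainPriv and every function name in it are hypothetical stand-ins, not libvirt code. It also shows, only as an assumption about one possible direction, an exit path that unlocks through the pointer saved at enter time, which presumes the monitor object itself is kept alive.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libvirt types. */
typedef struct {
    pthread_mutex_t lock;
} Monitor;

typedef struct {
    Monitor *mon;               /* plays the role of vm->privateData->mon */
} DomainPriv;

/* Enter the monitor: lock it and return the pointer that was locked. */
static Monitor *enter_monitor(DomainPriv *priv)
{
    Monitor *mon = priv->mon;
    assert(mon != NULL);
    pthread_mutex_lock(&mon->lock);
    return mon;
}

/* Racy exit: re-reads the shared pointer, so it cannot unlock once it is NULL. */
static int exit_monitor_racy(DomainPriv *priv)
{
    if (priv->mon == NULL)
        return -1;              /* the lock stays held */
    pthread_mutex_unlock(&priv->mon->lock);
    return 0;
}

/* Alternative exit: unlocks through the pointer saved at enter time.
 * Assumes the Monitor object is still alive (e.g. reference counted). */
static void exit_monitor_saved(Monitor *mon)
{
    pthread_mutex_unlock(&mon->lock);
}

/* Destroy path: clears the shared pointer while the monitor is in use. */
static void stop_process(DomainPriv *priv)
{
    priv->mon = NULL;
}

/* Returns 1 if the monitor lock is still held, 0 if it is free. */
static int monitor_lock_is_held(Monitor *mon)
{
    if (pthread_mutex_trylock(&mon->lock) != 0)
        return 1;
    pthread_mutex_unlock(&mon->lock);
    return 0;
}

/* Replays enter -> stop -> exit in one thread; returns 1 if the lock leaked. */
static int replay(int use_saved_pointer)
{
    Monitor mon;
    DomainPriv priv = { &mon };
    Monitor *saved;
    int leaked;

    pthread_mutex_init(&mon.lock, NULL);
    saved = enter_monitor(&priv);   /* "threadA" enters the monitor */
    stop_process(&priv);            /* "threadB" destroys the domain */
    if (use_saved_pointer)
        exit_monitor_saved(saved);
    else
        exit_monitor_racy(&priv);

    leaked = monitor_lock_is_held(&mon);
    if (leaked)
        pthread_mutex_unlock(&mon.lock);   /* release before destroying */
    pthread_mutex_destroy(&mon.lock);
    return leaked;
}
```

<p class="MsoNormal"><span lang="EN-US">In this sketch, replay(0) leaves the lock held, which corresponds to threadC blocking forever in the stack above, while replay(1) releases it.<o:p></o:p></span></p>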
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">What is worse, if qemuMonitorSetBalloon succeeds in threadA,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">threadA continues until it enters the function qemuMonitorSetDomainLog,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">where it segfaults at VIR_FORCE_CLOSE(mon->logfd) because mon is NULL.<o:p></o:p></span></p>
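<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">As a rough illustration of the crash path, the sketch below reads the shared pointer once and refuses to continue when it is NULL instead of dereferencing it. The types and the function set_domain_log_checked are hypothetical, not libvirt APIs, and the sketch assumes the caller holds whatever lock protects the pointer, so that it cannot change between the check and the use.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libvirt types. */
typedef struct {
    int logfd;
} Monitor;

typedef struct {
    Monitor *mon;               /* plays the role of vm->privateData->mon */
} DomainPriv;

/* Stores logfd in the monitor, or returns -1 if the monitor is already gone.
 * Assumes the caller holds the lock that protects priv->mon. */
static int set_domain_log_checked(DomainPriv *priv, int logfd)
{
    Monitor *mon;

    assert(priv != NULL);
    mon = priv->mon;            /* read the shared pointer exactly once */
    if (mon == NULL)
        return -1;              /* domain was stopped; do not dereference */
    mon->logfd = logfd;
    return 0;
}
```

<p class="MsoNormal"><span lang="EN-US">A caller would treat the -1 return as "the domain went away" and jump to its cleanup path rather than continue the start sequence.<o:p></o:p></span></p>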
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I could not find a good way to solve this problem. Does anyone have any good ideas?
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Thanks.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">PS:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I found an easy way to reproduce this problem more reliably with the following steps:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 1: fault injection: apply this patch and update libvirtd on the destination host:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">--- src/qemu/qemu_process.c 2014-05-06 19:06:00.000000000 +0800<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">+++ src/qemu/qemu_process.c 2014-05-06 19:07:12.000000000 +0800<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">@@ -4131,6 +4131,8 @@<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> vm->def->mem.cur_balloon);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> goto cleanup;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">+ VIR_DEBUG("Fault Injection, sleep 3 seconds.");<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">+ sleep(3);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> qemuDomainObjEnterMonitor(driver, vm);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> if (vm->def->memballoon && vm->def->memballoon->period)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> qemuMonitorSetMemoryStatsPeriod(priv->mon, vm->def->memballoon->period);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 2: migrate the VM.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 3: run "virsh destroy [VMName]" to destroy the migrating VM on the destination<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> when the log prints "Fault Injection, sleep 3 seconds."<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">step 4: libvirtd deadlocks with the stack shown above.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Regards<o:p></o:p></span></p>
</div>
</body>
</html>