[libvirt] [PATCH v2] daemon: Fix a crash during virNetlinkEventServiceStopAll

Liu Haitao haitao.liu at windriver.com
Wed Jun 12 07:18:52 UTC 2019


When the host is rebooted, the daemon crashes and a core dump file is generated.

The call traces are:

Note: in this case, the main thread is thread 5.

(gdb) thread 5
[Switching to thread 5 (LWP 4142)]
(gdb) bt
0  0x00007f00a6838273 in futex_wait_cancelable (private=<optimized out>, 
    expected=0, futex_word=0x7f004c0125c0)
    at /usr/src/debug/glibc/2.24-r0/git/sysdeps/unix/sysv/linux/futex-internal.h:88
1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f004c012540, 
    cond=0x7f004c012598)
    at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:502
2  __pthread_cond_wait (cond=0x7f004c012598, mutex=0x7f004c012540)
    at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:655
3  0x00007f00aa467246 in virCondWait (c=<optimized out>, m=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:154
4  0x00007f00aa467eb0 in virThreadPoolFree (pool=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:286
5  0x00007f0074349f9d in qemuStateCleanup ()
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:1036
6  0x00007f00aa5e9486 in virStateCleanup ()
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/libvirt.c:682
7  0x000055a687ab86a4 in main (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/remote/remote_daemon.c:1473

(gdb) thread 1
[Switching to thread 1 (LWP 4403)]
(gdb) bt
0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x0)
    at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_mutex_lock.c:67
1  0x00007f00aa467165 in virMutexLock (m=m@entry=0x0)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:89
2  0x00007f00aa43c0f9 in virNetlinkEventServerLock (driver=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:799
3  virNetlinkEventRemoveClient (watch=watch@entry=0,
    macaddr=macaddr@entry=0x7f0088014944, protocol=protocol@entry=0)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:1197
4  0x00007f00aa4341df in virNetDevMacVLanDeleteWithVPortProfile (
    ifname=<optimized out>, macaddr=macaddr@entry=0x7f0088014944,
    linkdev=0x7f0088014920 "eth1", mode=mode@entry=1,
    virtPortProfile=virtPortProfile@entry=0x0,
    stateDir=stateDir@entry=0x7f004c12fa90 "/var/run/libvirt/qemu")
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetdevmacvlan.c:1112
5  0x00007f0074312251 in qemuProcessStop (driver=driver@entry=0x7f004c0ecef0,
    vm=vm@entry=0x7f0088000b00,
    reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN,
    asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_process.c:7291
6  0x00007f007437a5ea in processMonitorEOFEvent (vm=0x7f0088000b00, driver=0x7f004c0ecef0)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4756
7  qemuProcessEventHandler (data=0x55a687d6df10, opaque=0x7f004c0ecef0)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4859
8  0x00007f00aa467c5b in virThreadPoolWorker (
    opaque=opaque@entry=0x55a687d6c110)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:163
9  0x00007f00aa466fe8 in virThreadHelper (data=<optimized out>)
    at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:206
10 0x00007f00a68323f4 in start_thread (arg=0x7f00699df700)
    at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_create.c:456
11 0x00007f00a616e10f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105


1. The execution flow of the main thread (thread 5, LWP 4142):
main()
  -->virNetDaemonRun()
  -->virNetDaemonClose(dmn)  //cleanup 
  -->virNetlinkEventServiceStopAll() 
  -->virStateCleanup() 
	 -->qemuStateCleanup() 
	   -->virThreadPoolFree()
	     -->__pthread_cond_wait()

virNetDaemonRun()
    -->virEventRunDefaultImpl
      -->virEventPollRunOnce
       -->virEventPollDispatchHandles
        -->qemuMonitorIO
          -->qemuProcessHandleMonitorEOF
            -->processEvent->eventType = QEMU_PROCESS_EVENT_MONITOR_EOF
             -->virThreadPoolSendJob()

After the reboot command is issued on the host, the main thread sends an event
message to another thread: it queues a QEMU_PROCESS_EVENT_MONITOR_EOF job so
that thread 1 handles the shutdown of the qemu process. That job is not
necessarily executed immediately.

virNetlinkEventServiceStopAll() 
	--> virNetlinkEventServiceStop()
	  --> server[protocol] = NULL;   // set server to null 

In virNetlinkEventServiceStopAll(), some network-related variables are freed,
such as the static virNetlinkEventSrvPrivatePtr server, which is set to NULL.

virStateCleanup() 
	-->qemuStateCleanup() 
	   -->virThreadPoolFree()
	     -->__pthread_cond_wait()

virThreadPoolFree() then waits for the other threads to finish.

2. The execution flow of thread 1 (LWP 4403):
qemuProcessStop()
   -->virNetDevMacVLanDeleteWithVPortProfile()
	  -->virNetlinkEventRemoveClient()
	     --> srv = server[protocol]


Although the main thread had already sent the message to thread 1 (LWP 4403),
the job was not run instantly. This means that virNetlinkEventServiceStopAll()
can be executed earlier than virNetlinkEventRemoveClient(), as the log file shows:

"
2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStopAll:941 : stopping all netlink event services
2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStop:904 : stopping netlink event service
2019-06-12 00:10:21.165+0000: 4403: debug : virNetlinkEventRemoveClient:1190 : removing client watch=0, mac=0x7f0088014944.
"

In virNetlinkEventRemoveClient() the variable server is used again, but by now
it has been set to NULL by virNetlinkEventServiceStopAll(), so this causes a crash.

virNetlinkEventServiceStopAll() should therefore be executed after virStateCleanup().
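
The ordering problem can be reproduced outside libvirt with a small pthread
sketch. This is only an illustration under stated assumptions, not libvirt
code: demo_server, demo_srv, demo_worker() and demo_shutdown() are hypothetical
names, and the worker returns -1 where the backtrace above shows a crash in
pthread_mutex_lock() with mutex=0x0, so that both orders can be run safely.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the static server[] entry in virnetlink.c. */
struct demo_server {
    pthread_mutex_t lock;
    int clients;
};

static struct demo_server demo_srv_storage;
static struct demo_server *demo_srv;    /* NULL once the service is stopped */

/* Worker job, playing the role of virNetlinkEventRemoveClient().
 * It reports -1 instead of locking through a NULL pointer. */
static void *demo_worker(void *opaque)
{
    int *result = opaque;
    struct demo_server *srv = demo_srv;

    if (!srv) {
        *result = -1;                   /* service already stopped */
        return NULL;
    }

    pthread_mutex_lock(&srv->lock);
    srv->clients--;
    pthread_mutex_unlock(&srv->lock);
    *result = 0;
    return NULL;
}

/* Runs one shutdown sequence and returns the worker's result.
 * stop_first != 0 mimics the old order: the service is stopped before
 * the queued job gets to run. stop_first == 0 mimics the patched order:
 * wait for the worker first, then stop the service. */
static int demo_shutdown(int stop_first)
{
    pthread_t tid;
    int result = 1;

    pthread_mutex_init(&demo_srv_storage.lock, NULL);
    demo_srv_storage.clients = 1;
    demo_srv = &demo_srv_storage;

    if (stop_first)
        demo_srv = NULL;                /* like virNetlinkEventServiceStopAll() */

    if (pthread_create(&tid, NULL, demo_worker, &result) != 0)
        return 1;
    pthread_join(tid, NULL);            /* like virThreadPoolFree() waiting */

    if (!stop_first)
        demo_srv = NULL;                /* stopped only after the worker is done */

    assert(demo_srv == NULL);           /* either way the service ends up stopped */
    pthread_mutex_destroy(&demo_srv_storage.lock);
    return result;
}
```

Built with -pthread, demo_shutdown(1) returns -1 because the worker finds the
server pointer already cleared, while demo_shutdown(0) returns 0 because the
worker finishes before the pointer is cleared.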

Signed-off-by: Liu Haitao <haitao.liu at windriver.com>
---
 src/remote/remote_daemon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c
index c3782971f1..7da20a6644 100644
--- a/src/remote/remote_daemon.c
+++ b/src/remote/remote_daemon.c
@@ -1464,8 +1464,6 @@ int main(int argc, char **argv) {
     /* Keep cleanup order in inverse order of startup */
     virNetDaemonClose(dmn);
 
-    virNetlinkEventServiceStopAll();
-
     if (driversInitialized) {
         /* NB: Possible issue with timing window between driversInitialized
          * setting if virNetlinkEventServerStart fails */
@@ -1473,6 +1471,8 @@ int main(int argc, char **argv) {
         virStateCleanup();
     }
 
+    virNetlinkEventServiceStopAll();
+
     virObjectUnref(adminProgram);
     virObjectUnref(srvAdm);
     virObjectUnref(qemuProgram);
-- 
2.21.0



