[libvirt] Deadlock when using custom handlers

Guido Günther agx at sigxcpu.org
Fri Aug 12 21:54:28 UTC 2011


Hi,
virt-viewer is using it's own virEventRegisterImpl. With current libvirt
this can deadlock when connection to nonexistant URIs like

    qemu+ssh:///unknownhost.example.com/system

like:

23:47:00.338: 1526: debug : doRemoteOpen:503 : proceeding with name = qemu:///system
23:47:00.338: 1526: debug : doRemoteOpen:513 : Connecting with transport 2
23:47:00.338: 1526: debug : virCommandRunAsync:2042 : About to run LC_ALL=C LD_LIBRARY_PATH=/var/scratch/debian/libvirt/upstream/libvirt/src/.libs/ PATH=/usr/local/bin:/usr/bin:/bin:/usr/games:/home/agx/bin:/sbin:/usr/sbin:/usr/lib/git-core HOME=/home/agx USER=agx LOGNAME=agx SSH_AUTH_SOCK=/tmp/keyring-RNvlnT/ssh DISPLAY=:0 ssh pustekiste nc -U /var/run/libvirt/libvirt-sock-ro
23:47:00.339: 1526: debug : virCommandRunAsync:2058 : Command result 0, with PID 1527
23:47:00.339: 1526: debug : virNetSocketNew:115 : localAddr=(nil) remoteAddr=(nil) fd=6 errfd=8 pid=1527
23:47:00.339: 1526: debug : virNetSocketNew:173 : sock=0x8926d20 localAddrStr=(null) remoteAddrStr=(null)
** (virt-viewer:1526): DEBUG: Add handle 6 1 0x8926d20
23:47:00.339: 1526: debug : virNetClientNew:160 : client=0xb5b5f008 refs=2
23:47:00.339: 1526: debug : doRemoteOpen:640 : Trying authentication
23:47:00.339: 1526: debug : virNetMessageNew:44 : msg=0xb5b1e008
23:47:00.340: 1526: debug : virNetMessageEncodePayload:255 : Encode length as 28
23:47:00.340: 1526: debug : virNetClientIO:1035 : program=536903814 version=1 serial=0 proc=66 type=0 length=28 dispatch=(nil)
23:47:00.340: 1526: debug : virNetClientIO:1103 : We have the buck 0x8939940 0x8939940
23:47:00.350: 1526: debug : virNetClientIOEventLoop:979 : Giving up the buck due to I/O error 0x8939940 (nil)
23:47:00.350: 1526: debug : virNetClientIO:1130 : All done with our call (nil) 0x8939940 -1
23:47:00.350: 1526: debug : virNetMessageFree:57 : msg=0xb5b1e008
** (virt-viewer:1526): DEBUG: Remove handle 1 6
<deadlock>

Gdb displays the deadlock nicely:

#0  0xb7710424 in __kernel_vsyscall ()
#1  0xb692cf02 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2  0xb692839b in _L_lock_728 () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3  0xb69281c1 in __pthread_mutex_lock (mutex=0x95e1c30) at pthread_mutex_lock.c:61
#4  0xb698fd07 in virMutexLock (m=0x95e1c30) at util/threads-pthread.c:85
#5  0xb6a5dd56 in virNetSocketEventFree (opaque=0x95e1c30) at rpc/virnetsocket.c:1147
#6  0x080514a1 in virt_viewer_events_remove_handle (watch=1) at virt-viewer-events.c:178
#7  0xb697264e in virEventRemoveHandle (watch=1) at util/event.c:84
#8  0xb6a608ed in virNetSocketRemoveIOCallback (sock=0x95e1c30) at rpc/virnetsocket.c:1221
#9  0xb6a57114 in virNetClientClose (client=0xb5b7e008) at rpc/virnetclient.c:280
#10 0xb6a46fc4 in doRemoteOpen (conn=0x95e1aa8, priv=0x95b1370, auth=0xbfc3b178, flags=1) at remote/remote_driver.c:705
#11 0xb6a49612 in remoteOpen (conn=0x95e1aa8, auth=0xbfc3b178, flags=1) at remote/remote_driver.c:820
#12 0xb69f70c6 in do_open (name=0x95aac10 "qemu+ssh://pustekiste:2222/system", auth=0xbfc3b178, flags=1) at libvirt.c:1054
#13 0xb69f9b88 in virConnectOpenAuth (name=0x95aac10 "qemu+ssh://pustekiste:2222/system", auth=0xbfc3b178, flags=1) at libvirt.c:1280
#14 0x080500cf in virt_viewer_start (app=0x95a1010) at virt-viewer.c:483
#15 0x08053be8 in virt_viewer_app_start (self=0x95a1010) at virt-viewer-app.c:1044
#16 0x0804f54d in main (argc=1, argv=0xbfc3b3f4) at virt-viewer-main.c:120

which is the sock->lock:

    virNetSocketRemoveIOCallback holds sock->lock
	-> virEventRemoveHandle
	    -> impl_remove_handle
		-> opaque->ff
		    -> virNetSocketEventFree want to hold sock->lock

Working around this by removing the locks from
virNetSocketRemoveIOCallback leads to another deadlock:

#0  0xb77de424 in __kernel_vsyscall ()
#1  0xb69faf02 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2  0xb69f639b in _L_lock_728 () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3  0xb69f61c1 in __pthread_mutex_lock (mutex=0xb5c4c00c) at pthread_mutex_lock.c:61
#4  0xb6a5dd07 in virMutexLock (m=0xb5c4c00c) at util/threads-pthread.c:85
#5  0xb6b24a98 in virNetClientLock (client=0xb5c4c008) at rpc/virnetclient.c:99
#6  virNetClientFree (client=0xb5c4c008) at rpc/virnetclient.c:243
#7  0xb6b250b7 in virNetClientEventFree (opaque=0xb5c4c008) at rpc/virnetclient.c:117
#8  0xb6b2bd82 in virNetSocketEventFree (opaque=0x8aabc30) at rpc/virnetsocket.c:1156
#9  0x080514a1 in virt_viewer_events_remove_handle (watch=1) at virt-viewer-events.c:178
#10 0xb6a4064e in virEventRemoveHandle (watch=1) at util/event.c:84
#11 0xb6b2e8e5 in virNetSocketRemoveIOCallback (sock=0x8aabc30) at rpc/virnetsocket.c:1219
#12 0xb6b25114 in virNetClientClose (client=0xb5c4c008) at rpc/virnetclient.c:280
#13 0xb6b14fc4 in doRemoteOpen (conn=0x8aabaa8, priv=0x8a7b370, auth=0xbfbf1798, flags=1) at remote/remote_driver.c:705
#14 0xb6b17612 in remoteOpen (conn=0x8aabaa8, auth=0xbfbf1798, flags=1) at remote/remote_driver.c:820
#15 0xb6ac50c6 in do_open (name=0x8a74c10 "qemu+ssh://pustekiste:2222/system", auth=0xbfbf1798, flags=1) at libvirt.c:1054
#16 0xb6ac7b88 in virConnectOpenAuth (name=0x8a74c10 "qemu+ssh://pustekiste:2222/system", auth=0xbfbf1798, flags=1) at libvirt.c:1280
#17 0x080500cf in virt_viewer_start (app=0x8a6b010) at virt-viewer.c:483
#18 0x08053be8 in virt_viewer_app_start (self=0x8a6b010) at virt-viewer-app.c:1044
#19 0x0804f54d in main (argc=1, argv=0xbfbf1a14) at virt-viewer-main.c:120

which is the virNetClient lock:

   virNetClientClose holds client->lock
    -> virNetSocketRemoveIOCallback 
        -> virEventRemoveHandle
            -> impl_remove_handle 
                -> virNetSocketEventFree
                    -> virNetClientEventFree wants the lock

I didn't see a simple way to fix this but would welcome any suggestions.
Cheers,
 -- Guido




More information about the libvir-list mailing list