[libvirt] [PATCH 0/3] Avoid crash due to race in nwfilter reload/libvirtd startup

Daniel P. Berrange berrange at redhat.com
Thu Oct 3 13:06:57 UTC 2013


From: "Daniel P. Berrange" <berrange at redhat.com>

The past 24 hours have seen a flurry of libvirtd crash reports from
Fedora users.

  https://bugzilla.redhat.com/show_bug.cgi?id=1014933

In one thread we have the libvirtd daemon startup code running, and
it is in the middle of QEMU state initialization.

#9  0xb00882e4 in qemuStateInitialize (privileged=true, callback=0xb77a0420 <daemonInhibitCallback>, opaque=0xb8b1fc98) at qemu/qemu_driver.c:595
        driverConf = 0xaf5afcd8 "/etc/libvirt/qemu.conf"
        conn = 0x0
        ebuf = "\000\260\025\267\024\071P\257\214\000\000\000\360\316\341\257\335\242\023\267\214\000\000\000\210\177X\257\001\000\000\000l\000\000\000\360\316\341\257\000\260\025\267\264\316\341\257\210\177X\257$\316\341\257$\316\341\257l\000\000\000\304\316\341\257\201\321LRl\000\000\000\235R\022\267\000)\233\351\260\316\341\257\000\000\000\000\253G\022\267\000\260\025\267\340\316\341\257\a\000\000\000\v\260\023\267\000\260\025\267\001\000\000\000\254\325\334\266\000\260\025\267\214\261\023\267 :P\257\037:P\257\000\000\000\000/\261\023\267\000\260\025\267uc\334\266\000\260\025\267A\262\023\267\037:P\257\000\000\000\000\001\000\000\000\000\000\000\000\340\316\341\257\334\316\341\257\001\000\000\000\001\000\000\000\033c\024\267"...
        membase = 0x0
        mempath = 0x0
        cfg = 0xaf509050
        run_uid = 4294967295
        run_gid = 4294967295
        __func__ = "qemuStateInitialize"
        __FUNCTION__ = "qemuStateInitialize"
#10 0xb74c5325 in virStateInitialize (privileged=true, callback=callback@entry=0xb77a0420 <daemonInhibitCallback>, opaque=opaque@entry=0xb8b1fc98) at libvirt.c:833
        i = 6
        __func__ = "virStateInitialize"
#11 0xb77a049e in daemonRunStateInit (opaque=opaque@entry=0xb8b1fc98) at libvirtd.c:876
        srv = 0xb8b1fc98
        __func__ = "daemonRunStateInit"



In another thread, we have a DBus event being handled by the nwfilter
driver, and the nwfilter driver calls into the QEMU driver... which
has not finished initializing itself yet!

Thread 1 (Thread 0xb6366ac0 (LWP 7041)):
#0  0xb0052861 in virQEMUCloseCallbacksGetForConn (closeCallbacks=0x0, conn=0xb8b2cc20) at qemu/qemu_conf.c:861
        list = 0xb8ac57e8
        data = {conn = 0xb8b2cc20, list = 0xb8ac57e8, oom = false}
#1  virQEMUCloseCallbacksRun (closeCallbacks=0x0, conn=conn@entry=0xb8b2cc20, driver=0xaf50b350) at qemu/qemu_conf.c:890
        list = 0xb8b2cc20
        i = <optimized out>
        __func__ = "virQEMUCloseCallbacksRun"
#2  0xb009df3b in qemuConnectClose (conn=0xb8b2cc20) at qemu/qemu_driver.c:1057
        driver = <optimized out>
#3  0xb74babc1 in virConnectDispose (obj=0xb8b2cc20) at datatypes.c:159
        conn = 0xb8b2cc20
#4  0xb742f22c in virObjectUnref (anyobj=anyobj@entry=0xb8b2cc20) at util/virobject.c:264
        klass = 0xb8b2cba0
        obj = 0xb8b2cc20
        lastRef = true
        __func__ = "virObjectUnref"
#5  0xb74c5811 in virConnectClose (conn=conn@entry=0xb8b2cc20) at libvirt.c:1503
        __func__ = "virConnectClose"
        __FUNCTION__ = "virConnectClose"
#6  0xb023424e in nwfilterStateReload () at nwfilter/nwfilter_driver.c:301
        conn = 0xb8b2cc20
#7  0xb02342fc in nwfilterFirewalldDBusFilter (connection=0xaf501038, message=0xaf503910, user_data=0x0) at nwfilter/nwfilter_driver.c:90
        __func__ = "nwfilterFirewalldDBusFilter"
#8  0xb711efb9 in dbus_connection_dispatch (connection=0xaf501038) at dbus-connection.c:4631
        filter = <optimized out>
        next = 0x0
        message = 0xaf503910
        link = <optimized out>
        filter_list_copy = 0xaf5009dc
        message_link = 0xaf500a18
        result = DBUS_HANDLER_RESULT_NOT_YET_HANDLED
        pending = <optimized out>
        reply_serial = <optimized out>
        status = <optimized out>
        found_object = 3071507249
        __FUNCTION__ = "dbus_connection_dispatch"
#9  0xb740caeb in virDBusWatchCallback (fdatch=fdatch@entry=8, fd=15, events=1, opaque=0xaf500ca8) at util/virdbus.c:144
        watch = 0xaf500ca8
        info = 0xaf500de0
        dbus_flags = 1


This DBus event is triggered when the firewalld daemon is
reloaded or restarted.

I confirmed this analysis by adding a sleep(10) to the QEMU
driver startup code, and then triggering a firewalld restart.
Sure enough it crashed & burned with the same trace.
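
To make the failure mode concrete, here is a minimal sketch in C. It
is not libvirt code: every demo_* name is hypothetical, and it only
assumes what the trace shows, namely that the close callbacks pointer
is still NULL when the event path runs. Instead of crashing, the event
path reports whether it would have dereferenced NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the QEMU driver state (not a libvirt type). */
struct demo_driver {
    void *close_callbacks;   /* stays NULL until initialization has run */
};

/* Dummy object used only so the pointer has something valid to refer to. */
static int demo_callbacks_storage;

/* Startup path: populates the field that the event path depends on. */
static void demo_driver_init(struct demo_driver *drv)
{
    drv->close_callbacks = &demo_callbacks_storage;
}

/*
 * Event path: models closing a connection.
 * Returns false where the real code would have dereferenced NULL.
 */
static bool demo_close_connection(const struct demo_driver *drv)
{
    return drv->close_callbacks != NULL;
}
```

Calling demo_close_connection() on a zero-initialized driver returns
false; calling it again after demo_driver_init() returns true. The
sleep(10) experiment simply widens the window in which the first case
can happen.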

The reason why it has suddenly hit us is that we are unlucky
enough to have a firewalld update in Fedora repos at the same
time as a libvirt update, and lots of people are pulling both
updates down in one yum transaction!

After wasting time figuring out how to avoid the race condition
with mutexes and other synchronization ideas, I realized that
the nwfilter code was in fact bogus.

The only reason it gets a virConnectPtr is so that the code
for reloading filters can access its nwfilterPrivateData
field to get the virNWFilterDriverStatePtr object instance.

This is insanely convoluted, since the nwfilter driver can
trivially pass the driver state instance into the
virNWFilterConfLayerInit method at startup.
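
The idea can be sketched in C as follows. This is an illustration
under assumptions, not the actual patch: the demo_* names and the
opaque-pointer signature are hypothetical, and the real
virNWFilterConfLayerInit prototype may differ. The point is that the
reload path reaches the driver state without ever opening or closing
a connection.

```c
#include <stddef.h>

/* Hypothetical stand-in for the object behind virNWFilterDriverStatePtr. */
struct demo_nwfilter_state {
    int reload_count;
};

/* Driver state remembered by the conf layer; set once at startup. */
static void *demo_conf_opaque;

/* Analogue of a conf-layer init that is handed the driver state directly. */
static int demo_conf_layer_init(void *opaque)
{
    if (opaque == NULL)
        return -1;
    demo_conf_opaque = opaque;
    return 0;
}

/* Reload path: uses the stored state; no connection object is involved. */
static int demo_reload_filters(void)
{
    struct demo_nwfilter_state *state = demo_conf_opaque;

    if (state == NULL)
        return -1;           /* conf layer not initialized yet */
    state->reload_count++;
    return 0;
}
```

With this shape, a reload that arrives before demo_conf_layer_init()
fails cleanly with -1, and one that arrives afterwards never touches
another driver.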

These patches therefore just rip out all use of virConnectPtr
from the nwfilter driver code, which avoids the race with
the QEMU driver initialization code.

This also fixes the nwfilter driver in cases where the QEMU
driver is disabled, but the LXC driver still wants to use nwfilter.

Daniel P. Berrange (3):
  Remove virConnectPtr arg from virNWFilterDefParse*
  Don't pass virConnectPtr in nwfilter 'struct domUpdateCBStruct'
  Remove use of virConnectPtr from all remaining nwfilter code

 src/conf/nwfilter_conf.c               | 78 ++++++++++++++++------------------
 src/conf/nwfilter_conf.h               | 24 ++++-------
 src/lxc/lxc_driver.c                   |  3 +-
 src/nwfilter/nwfilter_dhcpsnoop.c      | 12 +++---
 src/nwfilter/nwfilter_driver.c         | 49 +++++++++------------
 src/nwfilter/nwfilter_gentech_driver.c | 32 +++++++-------
 src/nwfilter/nwfilter_gentech_driver.h | 10 ++---
 src/nwfilter/nwfilter_learnipaddr.c    |  6 +--
 src/qemu/qemu_driver.c                 |  6 ++-
 src/uml/uml_driver.c                   |  3 +-
 tests/nwfilterxml2xmltest.c            |  2 +-
 11 files changed, 102 insertions(+), 123 deletions(-)

-- 
1.8.3.1



