[libvirt] [PATCH v3] nwfilter: resolve deadlock between VM operations and filter update

Stefan Berger stefanb at linux.vnet.ibm.com
Thu Oct 7 19:07:38 UTC 2010


  V3:
   - removed debugging code
   - I looked through existing code and I could not find any other 
instances where the filter and domain_lock were locked without a 
preceding qemu/uml_driver lock. So the 2nd assumption seems to hold and 
no further changes should be required for this issue.

  V2:
   - remove the locks from qemudVMFilterRebuild & umlVMFilterRebuild

  This is from a bug report and conversation on IRC where Soren reported 
that while a filter update is occurring on one or more VMs (due to a 
rule having been edited for example), a deadlock can occur when a VM 
referencing a filter is started.

The problem is caused by the two locking sequences of

qemu driver, qemu domain, filter             # for the VM start operation
filter, qemu_driver, qemu_domain            # for the filter update 
operation

that obviously don't lock in the same order. The problem is the 2nd lock 
sequence. Here the qemu_driver lock is being grabbed in 
qemu_driver:qemudVMFilterRebuild()

The following solution is based on the idea of trying to re-arrange the 
2nd sequence of locks as follows:

qemu_driver, filter, qemu_driver, qemu_domain

and making the qemu driver recursively lockable so that a second lock 
can occur, this would then lead to the following net-locking sequence

qemu_driver, filter, qemu_domain

where the 2nd qemu_driver lock has been ( logically ) eliminated.

The 2nd part of the idea is that the sequence of locks (filter, 
qemu_domain) and (qemu_domain, filter) becomes interchangeable if all 
code paths where filter AND qemu_domain are locked have a preceding 
qemu_domain lock that basically blocks their concurrent execution

So, the following code paths exist towards 
qemu_driver:qemudVMFilterRebuild where we now want to put a qemu_driver 
lock in front of the filter lock.

-> nwfilterUndefine()   [ locks the filter ]
     -> virNWFilterTestUnassignDef()
         -> virNWFilterTriggerVMFilterRebuild()
             -> qemudVMFilterRebuild()

-> nwfilterDefine()
     -> virNWFilterPoolAssignDef() [ locks the filter ]
         -> virNWFilterTriggerVMFilterRebuild()
             -> qemudVMFilterRebuild()

-> nwfilterDriverReload()
     -> virNWFilterPoolLoadAllConfigs()
         ->virNWFilterPoolObjLoad()
             -> virNWFilterPoolAssignDef() [ locks the filter ]
                 -> virNWFilterTriggerVMFilterRebuild()
                     -> qemudVMFilterRebuild()

-> nwfilterDriverStartup()
     -> virNWFilterPoolLoadAllConfigs()
         ->virNWFilterPoolObjLoad()
             -> virNWFilterPoolAssignDef() [ locks the filter ]
                 -> virNWFilterTriggerVMFilterRebuild()
                     -> qemudVMFilterRebuild()

Qemu is not the only driver using the nwfilter driver, but also the UML 
driver calls into it. Therefore qemuVMFilterRebuild() can be exchanged 
with umlVMFilterRebuild() along with the driver lock of qemu_driver that 
can now be a uml_driver. Further, since UML and Qemu domains can be 
running on the same machine, the triggering of a rebuild of the filter 
can touch both types of drivers and their domains.

In the patch below I am now extending each nwfilter callback driver with 
functions for locking and unlocking the (VM) driver (UML, QEMU) and 
introduce new functions for locking all registered callback drivers and 
unlocking them. Then I am distributing the 
lock-all-cbdrivers/unlock-all-cbdrivers call into the above call paths. 
The last shown callpath starting with nwfilterDriverStart() is 
problematic since it is initialize before the Qemu and UML drives are 
and thus a lock in the path would result in a NULL pointer attempted to 
be locked -- the call to virNWFilterTriggerVMFilterRebuild() is never 
called, so we never lock either the qemu_driver or the uml_driver in 
that path. Therefore, only the first 3 paths now receive calls to lock 
and unlock all callback drivers. Now that the locks are distributed 
where it matters I can remove the qemu_driver and uml_driver lock from 
qemudVMFilterRebuild() and umlVMFilterRebuild() and not requiring the 
recursive locks.

For now I want to put this out as an RFC patch. I have tested it by 
'stretching' the critical section after the define/undefine functions 
each lock the filter so I can (easily) concurrently execute another VM 
operation (suspend,start). That code is in this patch and if you want 
you can de-activate it. It seems to work ok and operations are being 
blocked while the update is being done.
I still also want to verify the other assumption above that locking 
filter and qemu_domain always has a preceding qemu_driver lock.

Signed-off-by: Stefan Berger <stefanb at us.ibm.com>

---
  src/conf/nwfilter_conf.c       |   18 ++++++++++++++++++
  src/conf/nwfilter_conf.h       |    6 ++++++
  src/libvirt_private.syms       |    2 ++
  src/nwfilter/nwfilter_driver.c |   13 +++++++++++++
  src/qemu/qemu_driver.c         |   19 +++++++++++++++----
  src/uml/uml_driver.c           |   18 ++++++++++++++----
  6 files changed, 68 insertions(+), 8 deletions(-)

Index: libvirt-acl/src/conf/nwfilter_conf.h
===================================================================
--- libvirt-acl.orig/src/conf/nwfilter_conf.h
+++ libvirt-acl/src/conf/nwfilter_conf.h
@@ -658,6 +658,8 @@ void virNWFilterConfLayerShutdown(void);

  typedef int (*virNWFilterRebuild)(virConnectPtr conn,
                                    virHashIterator, void *data);
+typedef void (*virNWFilterVoidCall)(void);
+

  typedef struct _virNWFilterCallbackDriver virNWFilterCallbackDriver;
  typedef virNWFilterCallbackDriver *virNWFilterCallbackDriverPtr;
@@ -665,9 +667,13 @@ struct _virNWFilterCallbackDriver {
      const char *name;

      virNWFilterRebuild vmFilterRebuild;
+    virNWFilterVoidCall vmDriverLock;
+    virNWFilterVoidCall vmDriverUnlock;
  };

  void virNWFilterRegisterCallbackDriver(virNWFilterCallbackDriverPtr);
+void virNWFilterCallbackDriversLock(void);
+void virNWFilterCallbackDriversUnlock(void);


  VIR_ENUM_DECL(virNWFilterRuleAction);
Index: libvirt-acl/src/qemu/qemu_driver.c
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_driver.c
+++ libvirt-acl/src/qemu/qemu_driver.c
@@ -12725,11 +12725,7 @@ static int
  qemudVMFilterRebuild(virConnectPtr conn ATTRIBUTE_UNUSED,
                       virHashIterator iter, void *data)
  {
-    struct qemud_driver *driver = qemu_driver;
-
-    qemuDriverLock(driver);
      virHashForEach(qemu_driver->domains.objs, iter, data);
-    qemuDriverUnlock(driver);

      return 0;
  }
@@ -12757,9 +12753,24 @@ qemudVMFiltersInstantiate(virConnectPtr
      return err;
  }

+
+static void
+qemudVMDriverLock(void) {
+    qemuDriverLock(qemu_driver);
+};
+
+
+static void
+qemudVMDriverUnlock(void) {
+    qemuDriverUnlock(qemu_driver);
+};
+
+
  static virNWFilterCallbackDriver qemuCallbackDriver = {
      .name = "QEMU",
      .vmFilterRebuild = qemudVMFilterRebuild,
+    .vmDriverLock = qemudVMDriverLock,
+    .vmDriverUnlock = qemudVMDriverUnlock,
  };

  int qemuRegister(void) {
Index: libvirt-acl/src/uml/uml_driver.c
===================================================================
--- libvirt-acl.orig/src/uml/uml_driver.c
+++ libvirt-acl/src/uml/uml_driver.c
@@ -2202,11 +2202,7 @@ static int
  umlVMFilterRebuild(virConnectPtr conn ATTRIBUTE_UNUSED,
                     virHashIterator iter, void *data)
  {
-    struct uml_driver *driver = uml_driver;
-
-    umlDriverLock(driver);
      virHashForEach(uml_driver->domains.objs, iter, data);
-    umlDriverUnlock(driver);

      return 0;
  }
@@ -2219,9 +2215,23 @@ static virStateDriver umlStateDriver = {
      .active = umlActive,
  };

+static void
+umlVMDriverLock(void)
+{
+    umlDriverLock(uml_driver);
+}
+
+static void
+umlVMDriverUnlock(void)
+{
+    umlDriverUnlock(uml_driver);
+}
+
  static virNWFilterCallbackDriver umlCallbackDriver = {
      .name = "UML",
      .vmFilterRebuild = umlVMFilterRebuild,
+    .vmDriverLock = umlVMDriverLock,
+    .vmDriverUnlock = umlVMDriverUnlock,
  };

  int umlRegister(void) {
Index: libvirt-acl/src/conf/nwfilter_conf.c
===================================================================
--- libvirt-acl.orig/src/conf/nwfilter_conf.c
+++ libvirt-acl/src/conf/nwfilter_conf.c
@@ -2303,6 +2303,24 @@ virNWFilterRegisterCallbackDriver(virNWF
      }
  }

+void
+virNWFilterCallbackDriversLock(void)
+{
+    int i;
+
+    for (i = 0; i < nCallbackDriver; i++)
+        callbackDrvArray[i]->vmDriverLock();
+}
+
+void
+virNWFilterCallbackDriversUnlock(void)
+{
+    int i;
+
+    for (i = 0; i < nCallbackDriver; i++)
+        callbackDrvArray[i]->vmDriverUnlock();
+}
+

  static virHashIterator virNWFilterDomainFWUpdateCB;

Index: libvirt-acl/src/libvirt_private.syms
===================================================================
--- libvirt-acl.orig/src/libvirt_private.syms
+++ libvirt-acl/src/libvirt_private.syms
@@ -535,6 +535,8 @@ virNWFilterConfLayerShutdown;
  virNWFilterLockFilterUpdates;
  virNWFilterUnlockFilterUpdates;
  virNWFilterPrintStateMatchFlags;
+virNWFilterCallbackDriversLock;
+virNWFilterCallbackDriversUnlock;


  # nwfilter_params.h
Index: libvirt-acl/src/nwfilter/nwfilter_driver.c
===================================================================
--- libvirt-acl.orig/src/nwfilter/nwfilter_driver.c
+++ libvirt-acl/src/nwfilter/nwfilter_driver.c
@@ -34,6 +34,7 @@
  #include "memory.h"
  #include "domain_conf.h"
  #include "domain_nwfilter.h"
+#include "nwfilter_conf.h"
  #include "nwfilter_driver.h"
  #include "nwfilter_gentech_driver.h"

@@ -152,9 +153,13 @@ nwfilterDriverReload(void) {
          virNWFilterLearnThreadsTerminate(true);

          nwfilterDriverLock(driverState);
+        virNWFilterCallbackDriversLock();
+
          virNWFilterPoolLoadAllConfigs(conn,
&driverState->pools,
                                        driverState->configDir);
+
+        virNWFilterCallbackDriversUnlock();
          nwfilterDriverUnlock(driverState);

          virConnectClose(conn);
@@ -328,6 +333,8 @@ nwfilterDefine(virConnectPtr conn,
      virNWFilterPtr ret = NULL;

      nwfilterDriverLock(driver);
+    virNWFilterCallbackDriversLock();
+
      if (!(def = virNWFilterDefParseString(conn, xml)))
          goto cleanup;

@@ -347,6 +354,8 @@ cleanup:
      virNWFilterDefFree(def);
      if (pool)
          virNWFilterPoolObjUnlock(pool);
+
+    virNWFilterCallbackDriversUnlock();
      nwfilterDriverUnlock(driver);
      return ret;
  }
@@ -359,6 +368,8 @@ nwfilterUndefine(virNWFilterPtr obj) {
      int ret = -1;

      nwfilterDriverLock(driver);
+    virNWFilterCallbackDriversLock();
+
      pool = virNWFilterPoolObjFindByUUID(&driver->pools, obj->uuid);
      if (!pool) {
          virNWFilterReportError(VIR_ERR_INVALID_NWFILTER,
@@ -385,6 +396,8 @@ nwfilterUndefine(virNWFilterPtr obj) {
  cleanup:
      if (pool)
          virNWFilterPoolObjUnlock(pool);
+
+    virNWFilterCallbackDriversUnlock();
      nwfilterDriverUnlock(driver);
      return ret;
  }




More information about the libvir-list mailing list