[libvirt] [PATCH v2 2/3] qemuProcessHook: Call virNuma*() iff needed

Michal Privoznik mprivozn at redhat.com
Tue Mar 31 11:59:09 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1198645

Once upon a time, there was a little domain. And the domain was pinned
onto a NUMA node and hasn't fully allocated its memory:

  <memory unit='KiB'>2355200</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>

  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>

Oh little me, said the domain, what will I do with so little memory.
If I only had a few megabytes more. But the old admin noticed the
whimpering, barely audible to untrained human ear. And good admin he
was, he gave the domain yet more memory. But the old NUMA topology
witch forbade to allocate more memory on the node zero. So he
decided to allocate it on a different node:

virsh # numatune little_domain --nodeset 0-1

virsh # setmem little_domain 2355200

The little domain was happy. For a while. Until bad, sharp teeth
shaped creature came. Every process in the system was afraid of him.
The OOM Killer they called him. Oh no, he's after the little domain.
There's no escape.

Do you kids know why? Because when the little domain was born, her
father, Libvirt, called numa_set_membind(). So even if the admin
allowed her to allocate memory from other nodes in the cgroups, the
membind() forbid it.

So what's the lesson? Libvirt should rely on cgroups, whenever
possible and use numa_set_membind() as the last ditch effort.

Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
---
 src/qemu/qemu_process.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 2d86eb8..6c955c3 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3166,6 +3166,7 @@ static int qemuProcessHook(void *data)
     int fd;
     virBitmapPtr nodeset = NULL;
     virDomainNumatuneMemMode mode;
+    bool doNuma = true;
 
     /* This method cannot use any mutexes, which are not
      * protected across fork()
@@ -3197,11 +3198,23 @@ static int qemuProcessHook(void *data)
         goto cleanup;
 
     mode = virDomainNumatuneGetMode(h->vm->def->numa, -1);
-    nodeset = virDomainNumatuneGetNodeset(h->vm->def->numa,
-                                          priv->autoNodeset, -1);
 
-    if (virNumaSetupMemoryPolicy(mode, nodeset) < 0)
-        goto cleanup;
+    if (mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT &&
+        h->cfg->cgroupControllers & (1 << VIR_CGROUP_CONTROLLER_CPUSET) &&
+        virCgroupControllerAvailable(VIR_CGROUP_CONTROLLER_CPUSET)) {
+        /* Use virNuma* API iff necessary. Once set and child is exec()-ed,
+         * there's no way for us to change it. Rely on cgroups (if available
+         * and enabled in the config) rather then virNuma*. */
+        doNuma = false;
+    }
+
+    if (doNuma) {
+        nodeset = virDomainNumatuneGetNodeset(h->vm->def->numa,
+                                              priv->autoNodeset, -1);
+
+        if (virNumaSetupMemoryPolicy(mode, nodeset) < 0)
+            goto cleanup;
+    }
 
     ret = 0;
 
-- 
2.0.5




More information about the libvir-list mailing list