[libvirt] [PATCH 6/7] qemu: Set cpuset.cpus for domain process

Daniel P. Berrange berrange at redhat.com
Mon May 20 11:18:19 UTC 2013


On Fri, May 17, 2013 at 07:59:36PM +0800, Osier Yang wrote:
> When either the "cpuset" of <vcpu> is specified, or the "placement" of
> <vcpu> is "auto", setting only cpuset.mems might cause the guest to
> fail to start. E.g. ("placement" of both <vcpu> and <numatune> is
> "auto"):
> 
> 1) Related XMLs
>   <vcpu placement='auto'>4</vcpu>
>   <numatune>
>     <memory mode='strict' placement='auto'/>
>   </numatune>
> 
> 2) Host NUMA topology
>   % numactl --hardware
>   available: 8 nodes (0-7)
>   node 0 cpus: 0 4 8 12 16 20 24 28
>   node 0 size: 16374 MB
>   node 0 free: 11899 MB
>   node 1 cpus: 32 36 40 44 48 52 56 60
>   node 1 size: 16384 MB
>   node 1 free: 15318 MB
>   node 2 cpus: 2 6 10 14 18 22 26 30
>   node 2 size: 16384 MB
>   node 2 free: 15766 MB
>   node 3 cpus: 34 38 42 46 50 54 58 62
>   node 3 size: 16384 MB
>   node 3 free: 15347 MB
>   node 4 cpus: 3 7 11 15 19 23 27 31
>   node 4 size: 16384 MB
>   node 4 free: 15041 MB
>   node 5 cpus: 35 39 43 47 51 55 59 63
>   node 5 size: 16384 MB
>   node 5 free: 15202 MB
>   node 6 cpus: 1 5 9 13 17 21 25 29
>   node 6 size: 16384 MB
>   node 6 free: 15197 MB
>   node 7 cpus: 33 37 41 45 49 53 57 61
>   node 7 size: 16368 MB
>   node 7 free: 15669 MB
> 
> 4) cpuset.cpus will be set as: (from debug log)
> 
> 2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
> Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.cpus'
> to '0-63'
> 
> 5) The advisory nodeset obtained by querying numad (from debug log)
> 
> 2013-05-09 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 :
> Nodeset returned from numad: 1
> 
> 6) cpuset.mems will be set as: (from debug log)
> 
> 2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
> Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.mems'
> to '0-7'
> 
> I.e., the domain process's memory is restricted to the first NUMA node,
> yet it can use all of the CPUs, which will very likely cause the
> domain process to fail to start because the kernel fails to allocate
> memory, given the possible mismatch between CPU nodes and memory nodes.
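
To make the mismatch concrete, here is a rough sketch of which CPUs a
given nodeset covers on the host quoted above. The table of first CPUs
and the stride of 4 are read off the numactl output; the helper name
cpus_for_nodeset() and the bitmask encoding are hypothetical and are not
libvirt API.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_NODES     8
#define CPUS_PER_NODE 8
#define CPU_STRIDE    4

/* First CPU of each node, copied from the numactl output quoted above. */
static const unsigned first_cpu[NUM_NODES] = { 0, 32, 2, 34, 3, 35, 1, 33 };

/* Hypothetical helper: return a 64-bit CPU mask covering every CPU that
 * belongs to a node selected in nodeset (bit n set = node n selected). */
static uint64_t
cpus_for_nodeset(unsigned nodeset)
{
    uint64_t cpus = 0;

    /* Only nodes 0-7 exist on this host. */
    assert(nodeset < (1u << NUM_NODES));

    for (unsigned node = 0; node < NUM_NODES; node++) {
        if (!(nodeset & (1u << node)))
            continue;
        for (unsigned i = 0; i < CPUS_PER_NODE; i++)
            cpus |= UINT64_C(1) << (first_cpu[node] + i * CPU_STRIDE);
    }
    return cpus;
}
```

With only node 1 selected, the mask covers CPUs 32, 36, ..., 60, i.e. 8
of the 64 CPUs, whereas the log above shows cpuset.cpus being set to
'0-63'.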

This is only a problem if the kernel is forced to do allocation
from a memory node which matches the CPU node.

It is perfectly acceptable for the kernel to allocate memory from
a node that is different from the CPU node in general.

i.e., it is the mode='strict' attribute in the XML above that causes
the bug.


> @@ -665,9 +666,35 @@ qemuSetupCpusetCgroup(virDomainObjPtr vm,
>          }
>      }
>  
> +    if (vm->def->cpumask ||
> +        (vm->def->placement_mode ==
> +         VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO)) {

I think you should only be doing this if placement='auto' *and*
mode='strict'.
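
Something along these lines is what I have in mind, as a minimal
standalone sketch only. The enum and struct names below are hypothetical
stand-ins, not the real libvirt definitions; the actual check would use
VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO and whatever field holds the
<numatune> memory mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt types, for illustration only. */
enum cpu_placement { CPU_PLACEMENT_STATIC, CPU_PLACEMENT_AUTO };
enum numatune_mode { NUMATUNE_STRICT, NUMATUNE_PREFERRED, NUMATUNE_INTERLEAVE };

struct domain_def {
    enum cpu_placement placement_mode;  /* <vcpu placement='...'> */
    enum numatune_mode memory_mode;     /* <memory mode='...'> */
};

/* Restrict cpuset.cpus only when the placement is automatic *and* the
 * memory policy is strict; otherwise the kernel is free to allocate
 * memory from a node other than the one the CPU belongs to. */
static bool
need_cpuset_cpus(const struct domain_def *def)
{
    assert(def != NULL);
    return def->placement_mode == CPU_PLACEMENT_AUTO &&
           def->memory_mode == NUMATUNE_STRICT;
}
```

With this, a domain using placement='auto' but mode='preferred' or
mode='interleave' would keep the default cpuset.cpus.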

> +        if (vm->def->placement_mode ==
> +            VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO)
> +            cpu_mask = virBitmapFormat(nodemask);
> +        else
> +            cpu_mask = virBitmapFormat(vm->def->cpumask);
> +
> +        if (!cpu_mask) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("failed to convert memory nodemask"));
> +            goto cleanup;
> +        }
> +
> +        rc = virCgroupSetCpusetCpus(priv->cgroup, cpu_mask);
> +
> +        if (rc != 0) {
> +            virReportSystemError(-rc,
> +                                 _("Unable to set cpuset.cpus for domain %s"),
> +                                 vm->def->name);
> +            goto cleanup;
> +        }
> +    }
> +
>      ret = 0;
>  cleanup:
> -    VIR_FREE(mask);
> +    VIR_FREE(mem_mask);
> +    VIR_FREE(cpu_mask);
>      return ret;


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
