[libvirt] Question : Configuring a VM with backing 1G huge pages across 2 NUMA nodes

Michal Privoznik mprivozn at redhat.com
Fri Sep 12 07:39:41 UTC 2014


[CCing Martin Kletzander]

On 12.09.2014 08:25, Vinod, Chegu wrote:
> Hi Michal,
>
> I have a kernel+qemu+libvirt setup with all recent upstream bits on a
> given host & was trying to configure a VM with backing 1G huge
> pages…spanning 2 NUMA nodes.
>
> The host had 3 1G huge pages on each of the 2 NUMA nodes :
>
> # cat
> /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
>
> 3
>
> # cat
> /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
>
> 3
>
> And I had the following in the /etc/fstab
>
> hugetlbfs      /hugepages_1G    hugetlbfs  pagesize=1GB     0     0
>
> I added the following entries in the xml file for the 4G/4vcpu VM
>
>    <memoryBacking>
>
>      <hugepages>
>
>        <page size='1048576' unit='KiB' nodeset='0'/>
>
>        <page size='1048576' unit='KiB' nodeset='1'/>
>
>      </hugepages>
>
>    </memoryBacking>
>
>    <vcpu placement='static'>4</vcpu>
>
>    <cputune>
>
>      <vcpupin vcpu='0' cpuset='0'/>
>
>      <vcpupin vcpu='1' cpuset='1'/>
>
>      <vcpupin vcpu='2' cpuset='8'/>
>
>      <vcpupin vcpu='3' cpuset='9'/>
>
>    </cputune>
>
> ….
>
>    <numatune>
>
>      <memory node=”strict” nodeset=”0-1”/>

[This is a copy-paste error, right? It should have been s/node/mode/]

This is incomplete. It says nothing more than: all the guest NUMA nodes
must be placed on host NUMA nodes 0-1. And the qemu command line that
libvirt came up with satisfies that constraint.
You may want to pin guest NUMA nodes to host NUMA nodes like this:

a) use <memory mode="interleave" placement="static" nodeset="0-1"/>
    I haven't tested this myself, but IIRC this should start placing
guest NUMA nodes sequentially over host NUMA nodes 0-1. So you'll end up
with:

host0: guest0, guest2
host1: guest1, guest3

b) use the manual guest <-> host pinning:

   <numatune>
     <memory mode='strict' nodeset='0-1'/>
     <memnode cellid='0' mode='strict' nodeset='0'/>
     <memnode cellid='1' mode='strict' nodeset='0'/>
     <memnode cellid='2' mode='strict' nodeset='1'/>
     <memnode cellid='3' mode='strict' nodeset='1'/>
   </numatune>

This will tie guest0 and guest1 onto host0, and guest2 and guest3 onto
host1.
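
The example above uses four guest cells, whereas the guest in the quoted
configuration defines only two (cell ids 0 and 1, 2 GiB each). A minimal,
untested sketch of the same manual pinning for that two-cell guest might
look like the following. The choice of guest cell 0 on host node 0 and
guest cell 1 on host node 1 is an assumption made for illustration, not
something taken from the original configuration:

```xml
<numatune>
  <!-- overall policy: guest memory may only come from host nodes 0-1 -->
  <memory mode='strict' nodeset='0-1'/>
  <!-- assumed mapping: guest cell 0 (2 GiB) bound to host node 0 -->
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <!-- assumed mapping: guest cell 1 (2 GiB) bound to host node 1 -->
  <memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
```

With each 2 GiB cell bound to a single host node, one would expect two 1G
pages to be taken from each node. This can be checked through the per-node
free_hugepages files under /sys/devices/system/node/.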

I must admit this is not the bit I've implemented, so I don't know all 
the details. Therefore I'm CCing Martin Kletzander, who's done the major 
piece of work in this field.

>
>    </numatune>
>
> ….
>
>    <cpu>
>
>      <numa>
>
>        <cell id='0' cpus='0-1' memory='2097152'/>
>
>        <cell id='1' cpus='2-3' memory='2097152'/>
>
>      </numa>
>
>    </cpu>
>
> The resulting qemu command looked like this :
>
> /usr/local/bin/qemu-system-x86_64 -name vm1 -S -machine
> pc-i440fx-2.2,accel=kvm,usb=off \
>
> -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
>
> -object
> memory-backend-file,prealloc=yes,mem-path=/hugepages_1G/libvirt/qemu,size=2048M,id=ram-node0,host-nodes=0-1,policy=bind
> -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
>
> -object
> memory-backend-file,prealloc=yes,mem-path=/hugepages_1G/libvirt/qemu,size=2048M,id=ram-node1,host-nodes=0-1,policy=bind
> -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \
>
> ....
>
> There were 3 1G pages available on each NUMA node on the host as shown
> above... and I noticed that the VM got backed by 3 1G pages from node0
> and 1 1G page from node1.
>
> #cat
> /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
>
> 0
>
> # cat
> /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages
>
> 2
>
> Not sure if this was expected behavior given the options I specified in
> the xml file ? If yes…Is there some additional option to specify (in the
> XML file) such that only a given number of 1Gig huge pages per node are
> picked to back the VM (i.e. in the above case just 2 1G from each node) ?
>
> Thanks!
>
> Vinod
>

Hopefully, my answer is sufficient, Martin?

Michal



