[vfio-users] Allocating hugepages to a specific numa node on boot

Ryan Flagler ryan.flagler at gmail.com
Wed May 17 19:44:30 UTC 2017


So, I just wanted to share the process I found for allocating hugepages
from memory tied to a specific numa node on your system. Previously, my
/etc/default/grub looked like this.

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset quiet splash intel_iommu=on
hugepages=6144 pci_stub.ids=10de:13c2,10de:0fbb,104c:8241"

This would allocate 12GB of hugepages (6144 pages at the default size of
2MB per page). However, the allocation would look like this.

cat /sys/devices/system/node/node*/meminfo | fgrep Huge
Node 0 AnonHugePages:     20480 kB
Node 0 HugePages_Total:  3072
Node 0 HugePages_Free:   3072
Node 0 HugePages_Surp:      0
Node 1 AnonHugePages:         0 kB
Node 1 HugePages_Total:  3072
Node 1 HugePages_Free:   3072
Node 1 HugePages_Surp:      0
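
As a side note, the 12GB figure assumes the default hugepage size. It's
easy to confirm what your system uses (2048 kB on x86 unless configured
otherwise):
grep Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB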

As the output above shows, half of the hugepages were allocated to each of
my numa nodes. In my VM configuration, I had all my CPUs pinned to numa
node 1, and I allocated 8GB of memory to that VM. But node 1 only held 3072
pages (6GB) of the pool, so part of the VM's hugepage memory was coming
from node 1 and the rest from node 0, across the numa boundary.
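
For context, if you're using libvirt, the pinning side of this lives in the
domain XML. Here's a minimal sketch of the relevant pieces. The elements are
standard libvirt syntax, but the host CPU IDs in the vcpupin lines are
illustrative, not my actual mapping:

<memory unit='GiB'>8</memory>
<memoryBacking>
  <hugepages/>                         <!-- back guest RAM with hugepages -->
</memoryBacking>
<numatune>
  <memory mode='strict' nodeset='1'/>  <!-- memory only from host node 1 -->
</numatune>
<cputune>
  <vcpupin vcpu='0' cpuset='8'/>       <!-- pin vcpus to host CPUs on node 1 -->
  <vcpupin vcpu='1' cpuset='9'/>
</cputune>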

If you're allocating hugepages dynamically at runtime, you can easily
specify which numa node the memory comes from with something like this.
echo 6144 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

The problem is that numa node1 may no longer have 6144 pages' worth of
free, unfragmented memory by the time you need them.
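
Worse, the kernel reserves as many pages as it can and silently stops short
when it runs out, so the write "succeeds" even if the reservation didn't.
If you do go the runtime route, it's worth reading the count back. A small
sketch:
echo 6144 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
actual=$(cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages)
[ "$actual" -eq 6144 ] || echo "WARNING: only $actual of 6144 pages reserved on node1"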

I came across the following page, which documents how to allocate hugepages
to specific numa nodes at boot on RHEL7 using systemd.
http://fibrevillage.com/sysadmin/536-how-to-enable-and-config-hugepage-and-transparent-hugepages-on-rhel-centos

Thankfully, I'm running Ubuntu 16.04, which also uses systemd. There were a
couple of differences, so here's what I did.

Create /lib/systemd/system/hugetlb-gigantic-pages.service with the
following contents:
[Unit]
Description=HugeTLB Gigantic Pages Reservation
DefaultDependencies=no
Before=dev-hugepages.mount
ConditionPathExists=/sys/devices/system/node
ConditionKernelCommandLine=hugepagesz=2M
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/hugetlb-reserve-pages
[Install]
WantedBy=sysinit.target
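
One thing to watch: the ConditionKernelCommandLine line means systemd will
skip this unit entirely unless hugepagesz=2M appears on the kernel command
line, so /etc/default/grub needs a matching tweak. Something like this
should do (dropping the old hugepages=6144, since the script below now
handles the reservation), followed by update-grub:

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset quiet splash intel_iommu=on
hugepagesz=2M pci_stub.ids=10de:13c2,10de:0fbb,104c:8241"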

Create /lib/systemd/hugetlb-reserve-pages with the following contents:
#!/bin/sh
nodes_path=/sys/devices/system/node/
if [ ! -d "$nodes_path" ]; then
    echo "ERROR: $nodes_path does not exist"
    exit 1
fi

# Write the requested page count ($1) into the given node's ($2) 2MB pool.
reserve_pages()
{
    echo "$1" > "$nodes_path/$2/hugepages/hugepages-2048kB/nr_hugepages"
}

# This example reserves two 2M pages on node0 and one 2M page on node1.
# You can modify it to your needs or add more lines to reserve memory on
# other nodes. Don't forget to uncomment the lines, otherwise they won't
# be executed.
# reserve_pages 2 node0
# reserve_pages 1 node1
reserve_pages 6144 node1

Note the uncommented line at the bottom, which allocates 6144 pages on numa
node1.

Make the script executable and enable the unit
chmod +x /lib/systemd/hugetlb-reserve-pages
systemctl enable hugetlb-gigantic-pages
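
If systemctl complains that it can't find the new unit, reload systemd's
configuration first. Either way, you can confirm the unit is wired up
before rebooting:
systemctl daemon-reload
systemctl is-enabled hugetlb-gigantic-pages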

Reboot

After reboot, I saw the following.
cat /sys/devices/system/node/node*/meminfo | fgrep Huge
Node 0 AnonHugePages:    169984 kB
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 AnonHugePages:    184320 kB
Node 1 HugePages_Total:  6144
Node 1 HugePages_Free:   6144
Node 1 HugePages_Surp:      0

And after starting my VM, with all CPUs pinned to node 1 and 8GB of memory,
I saw this.
cat /sys/devices/system/node/node*/meminfo | fgrep Huge
Node 0 AnonHugePages:    724992 kB
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 AnonHugePages:    270336 kB
Node 1 HugePages_Total:  6144
Node 1 HugePages_Free:   2048
Node 1 HugePages_Surp:      0
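
The math works out: 6144 total minus 2048 free leaves 4096 pages in use,
and 4096 x 2MB = 8GB, exactly the VM's allocation, all of it served from
node 1.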

Lastly, here is the status output of the systemd job.
service hugetlb-gigantic-pages status
● hugetlb-gigantic-pages.service - HugeTLB Gigantic Pages Reservation
   Loaded: loaded (/lib/systemd/system/hugetlb-gigantic-pages.service;
enabled; vendor preset: enabled)
   Active: active (exited) since Wed 2017-05-17 12:01:49 CDT; 3min 17s ago
  Process: 872 ExecStart=/lib/systemd/hugetlb-reserve-pages (code=exited,
status=0/SUCCESS)
 Main PID: 872 (code=exited, status=0/SUCCESS)
    Tasks: 0
   Memory: 0B
      CPU: 0
   CGroup: /system.slice/hugetlb-gigantic-pages.service

Hopefully this helps someone else!