[libvirt-users] VMs fail to start with NUMA configuration

Wayne Sun gsun at redhat.com
Wed Jan 30 07:21:44 UTC 2013


On 01/30/2013 01:25 PM, Doug Goldstein wrote:
> On Mon, Jan 28, 2013 at 10:23 AM, Osier Yang<jyang at redhat.com>  wrote:
>> On 2013年01月29日 00:17, Doug Goldstein wrote:
>>> On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang<jyang at redhat.com>   wrote:
>>>> On 2013年01月28日 11:47, Osier Yang wrote:
>>>>>
>>>>> On 2013年01月28日 11:44, Osier Yang wrote:
>>>>>>
>>>>>> On 2013年01月26日 01:07, Doug Goldstein wrote:
>>>>>>>
>>>>>>> On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang at redhat.com>   wrote:
>>>>>>>>
>>>>>>>> On 2013年01月24日 14:26, Doug Goldstein wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2013年01月24日 12:11, Doug Goldstein wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe at gentoo.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 +
>>>>>>>>>>>> qemu 1.2.2 applied on top plus a number of stability patches).
>>>>>>>>>>>> I'm having an issue where my VMs fail to start with the
>>>>>>>>>>>> following message:
>>>>>>>>>>>>
>>>>>>>>>>>> kvm_init_vcpu failed: Cannot allocate memory
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Smells like we have a problem setting the NUMA policy (perhaps
>>>>>>>>>> caused by an incorrect host NUMA topology), given that the system
>>>>>>>>>> still has enough memory. Or numad (if it's installed) is doing
>>>>>>>>>> something wrong.
>>>>>>>>>>
>>>>>>>>>> Can you see if there is something about the Nodeset used to set
>>>>>>>>>> the policy in the debug log?
>>>>>>>>>>
>>>>>>>>>> E.g.
>>>>>>>>>>
>>>>>>>>>> % cat libvirtd.debug | grep Nodeset
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Well, I don't see anything, but it's likely because I didn't do
>>>>>>>>> something correctly. I had LIBVIRT_DEBUG=1 exported and ran
>>>>>>>>> libvirtd --verbose from the command line.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> If the process is in the background, it's expected that you can't
>>>>>>>> see anything.
>>>>>>>>
>>>>>>>>
>>>>>>>>> My /etc/libvirt/libvirtd.conf had:
>>>>>>>>>
>>>>>>>>> log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I
>>>>>>>>> didn't get any debug messages.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> log_level=1 has to be set.
>>>>>>>>
>>>>>>>> Anyway, let's simply do this:
>>>>>>>>
>>>>>>>> % service libvirtd stop
>>>>>>>> % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
>>>>>>>>
>>>>>>> That's what I was doing, minus the tee, just to the console, and
>>>>>>> nothing was coming out. Which is why I added the
>>>>>>> 1:file:/tmp/libvirtd.log, which also didn't get any debug messages.
>>>>>>> Turns out this instance must have been built with --disable-debug.
>>>>>>>
>>>>>>> All I've got in the log is:
>>>>>>>
>>>>>>> # grep -i 'numa' libvirtd.debug
>>>>>>> 2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 :
>>>>>>> About to run /usr/bin/numad -w 2:2048
>>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 :
>>>>>>> Nodeset returned from numad: 1
>>>>>>
>>>>>>
>>>>>> This looks right.
>>>>>>
>>>>>>> Immediately below that is
>>>>>>>
>>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3622 :
>>>>>>> Setting up domain cgroup (if required)
>>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupNew:619 : New
>>>>>>> group /libvirt/qemu/bb-2.6.35.9-i686
>>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 :
>>>>>>> Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpuacct in
>>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 :
>>>>>>> Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:537 :
>>>>>>> Make group /libvirt/qemu/bb-2.6.35.9-i686
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 :
>>>>>>> Make controller /sys/fs/cgroup/cpuacct/libvirt/qemu/bb-2.6.35.9-i686/
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 :
>>>>>>> Make controller /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:469
>>>>>>> : Setting up inheritance /libvirt/qemu ->
>>>>>>> /libvirt/qemu/bb-2.6.35.9-i686
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 :
>>>>>>> Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.cpus
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
>>>>>>> fd 39
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482
>>>>>>> : Inherit cpuset.cpus = 0-63
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
>>>>>>> Set value
>>>>>>> '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus'
>>>>>>> to '0-63'
>>>>>>
>>>>>>
>>>>>> This doesn't look right; it should be 0-7 instead.
>>>>>>
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
>>>>>>> fd 39
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 :
>>>>>>> Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.mems
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
>>>>>>> fd 39
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482
>>>>>>> : Inherit cpuset.mems = 0-7
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
>>>>>>> Set value
>>>>>>> '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems'
>>>>>>> to '0-7'
>>>>>>
>>>>>>
>>>>>> This is right.
>>>>>>
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
>>>>>>> fd 39
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 :
>>>>>>> Could not autoset a RSS limit for domain bb-2.6.35.9-i686
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
>>>>>>> Set value
>>>>>>> '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems'
>>>>>>> to '1'
>>>>>>
>>>>>>
>>>>>> And it's strange that the cpuset.mems is changed to '1' here.
>>>>
>>>>
>>>> Oh, actually this is right, cpuset.mems is about the memory nodes.
>>>>
>>>>
>>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
>>>>>>> fd 39
>>>>>>>
>>>>>>> Could the RSS issue be related? Some kernel related option not playing
>>>>>>> nice or enabled?
>>>>>
>>>>>
>>>>> Instead, I'm wondering if the problem is caused by the mismatch
>>>>> (from libvirt's p.o.v.) between cpuset.cpus and cpuset.mems, which
>>>>> in turn causes a problem for the kernel's memory management?
>>>>
>>>>
>>>> So, a simple way to test this guess is to use static placement,
>>>> like:
>>>>
>>>> <vcpu placement='static' cpuset='0-63'>2</vcpu>
>>>> <numatune>
>>>>     <memory placement='static' nodeset='1'/>
>>>> </numatune>
>>>>
>>>> Osier
>>>
>>> Same error. Which I don't know if you expected or didn't expect.
>>>
>> It's expected, as "0-63" is the final result when using "auto"
>> placement.
> Since there's another user on the libvirt-list asking about the exact
> same CPU I've got, I figured I'd do some poking. Oddly enough, he and
> I had different outputs from virsh nodeinfo. Just as background, it's
> AMD 6272 CPUs. I've got 4 of them in the box, but they're organized as
> follows:
>
> Sockets: 4
> Cores: 16
> Threads: 1 per core (16)
> NUMA nodes: 8
> Mem per node: 16GB
> Total: 128GB
>
> # virsh nodeinfo
> CPU model:           x86_64
> CPU(s):              64
> CPU frequency:       2100 MHz
> CPU socket(s):       1
> Core(s) per socket:  64
> Thread(s) per core:  1
> NUMA cell(s):        1
> Memory size:         132013200 KiB
>
> # virsh capabilities
> <snip>
>        <topology sockets='1' cores='64' threads='1'/>
> <snip>
>      <topology>
>        <cells num='8'>
> <snip>
>
> I've hand verified all the values in
> /sys/devices/system/node/nodeX/cpuX/topology/physical_package_id to show
> that the physical package is oriented in pairs (0&1, 2&3, 4&5, 6&7)
> for the NUMA nodes.
>
> Need to give git a whirl as I know that's got a bit different code
> than 1.0.1 but I'll report back.
>
For AMD 62xx CPUs, the output is expected.

Check out this bug:
virsh nodeinfo can't get the right info on AMD Bulldozer cpu
https://bugzilla.redhat.com/show_bug.cgi?id=874050
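
To cross-check what the kernel itself reports, independent of virsh
nodeinfo, the sysfs check from the quoted message could be scripted
roughly as below. This is only a sketch, not part of libvirt: it assumes
the usual Linux sysfs layout (node/nodeN/cpulist and
cpu/cpuN/topology/physical_package_id under /sys/devices/system), and the
function names parse_cpulist and node_topology are made up for
illustration.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a node without CPUs
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def node_topology(sysfs_root="/sys/devices/system"):
    """Map NUMA node id -> (list of CPUs, set of physical package ids)."""
    result = {}
    for node_dir in glob.glob(os.path.join(sysfs_root, "node", "node[0-9]*")):
        node_id = int(re.search(r"node(\d+)$", node_dir).group(1))
        with open(os.path.join(node_dir, "cpulist")) as f:
            cpus = parse_cpulist(f.read())
        packages = set()
        for cpu in cpus:
            pkg_path = os.path.join(
                sysfs_root, "cpu", "cpu%d" % cpu,
                "topology", "physical_package_id")
            try:
                with open(pkg_path) as f:
                    packages.add(int(f.read()))
            except OSError:
                pass  # offline CPU or missing topology directory
        result[node_id] = (cpus, packages)
    return result


if __name__ == "__main__":
    for node_id, (cpus, packages) in sorted(node_topology().items()):
        print("node %d: %d cpus, packages %s"
              % (node_id, len(cpus), sorted(packages)))
```

If each package id shows up under two nodes here while virsh nodeinfo
still reports a single socket and a single NUMA cell, that would match
the behaviour described in the bug above.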

Wayne Sun
2013-01-30



