[libvirt] [PATCH RFC v2] qemu: Support numad

Osier Yang jyang at redhat.com
Wed Mar 7 15:38:56 UTC 2012


On 2012-03-07 21:48, Daniel P. Berrange wrote:
> On Wed, Mar 07, 2012 at 09:55:16PM +0800, Osier Yang wrote:
>> numad is a user-level daemon that monitors NUMA topology and
>> processes resource consumption to facilitate good NUMA resource
>> alignment of applications/virtual machines to improve performance
>> and minimize cost of remote memory latencies. It provides a
>> pre-placement advisory interface, so significant processes can
>> be pre-bound to nodes with sufficient available resources.
>>
>> More details: http://fedoraproject.org/wiki/Features/numad
>>
>> "numad -w ncpus:memory_amount" is the advisory interface numad
>> provides currently.
>>
>> This patch adds the support by introducing a boolean XML element:
>>    <numatune>
>>      <autonuma/>
>>    </numatune>
>>
>> If it's specified, the number of vcpus and the current memory
>> amount specified in domain XML will be used for numad command
>> line (numad uses MB for memory amount):
>>    numad -w $num_of_vcpus:$current_memory_amount / 1024
>>
>> The advisory nodeset returned from numad will then be used to set
>> the domain process CPU affinity (e.g. qemuProcessInitCpuAffinity).
>>
>> If the user specifies both a CPU affinity policy (e.g.
>> <vcpu cpuset="1-10,^7,^8">4</vcpu>) and XML indicating to use
>> numad for the advisory nodeset, the specified CPU affinity will be
>> ignored.
>
> I'm not sure that's a good idea. When we do dynamic generation
> of parts of libvirt XML, we tend to report in the XML what was
> generated, and if 2 parts contradict each other we shouldn't
> silently ignore it.

Agreed. v1 overrides the cpuset="1-10,^6", but I thought it
could be confusing for the user to see that things are different
after the domain is started.
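
To make the cpuset notation above concrete, here is a minimal sketch of
how a string such as "1-10,^7,^8" expands into a list of CPUs. The
function name expand_cpuset is hypothetical, and the left-to-right
handling of "^N" exclusions is an assumption of this sketch, not a
description of libvirt's own parser.

```python
def expand_cpuset(spec: str) -> list[int]:
    """Expand a cpuset string such as "1-10,^7,^8" into a sorted CPU list.

    Assumption: entries are applied left to right, so "^N" only removes
    a CPU that an earlier entry added.
    """
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if part.startswith("^"):
            # Exclusion of a single CPU.
            cpus.discard(int(part[1:]))
        elif "-" in part:
            # Inclusive range "low-high".
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"invalid range: {part}")
            cpus.update(range(low, high + 1))
        else:
            # Single CPU number.
            cpus.add(int(part))
    return sorted(cpus)
```

With this sketch, "1-10,^7,^8" leaves eight CPUs, with 7 and 8 removed
from the 1-10 range.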

>
> eg, with VNC with autoport=yes we then report the generated
> port number.
>
> Similarly with <cpu> mode=host, we then report what the host
> CPU features were.
>
> So, if we want to auto-set placement for a guest we should
> likely do this via the <vcpu> element
>
> eg, Current mode where placement is completely static
>
>   - Input XML:
>
>         <vcpu placement="static" cpuset="1-10" />
>
>   - Output XML:
>
>         <vcpu placement="static" cpuset="1-10" />
>
> Or where we want to use numad:
>
>   - Input XML:
>
>         <vcpu placement="auto"/>
>
>   - Output XML:
>
>         <vcpu placement="auto" cpuset="1-10" />
>

I must admit this is much better. :-)
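
The input/output behaviour suggested above could be sketched roughly as
follows. This is only an illustration in Python, not the libvirt
implementation: the function name report_auto_placement is
hypothetical, and the advisory cpuset is simply passed in as a string
rather than obtained from numad.

```python
import xml.etree.ElementTree as ET


def report_auto_placement(domain_xml: str, advisory_cpuset: str) -> str:
    """Return domain XML with the generated cpuset reported on <vcpu>.

    Only <vcpu placement="auto"> is touched; a static placement keeps
    whatever cpuset the user wrote.
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    if vcpu is None:
        raise ValueError("domain XML has no <vcpu> element")
    if vcpu.get("placement") == "auto":
        # Report what was generated, as with VNC autoport.
        vcpu.set("cpuset", advisory_cpuset)
    return ET.tostring(root, encoding="unicode")
```

For example, passing '<domain><vcpu placement="auto">4</vcpu></domain>'
and "1-10" yields a <vcpu> element that carries both placement="auto"
and cpuset="1-10", while a placement="static" element is returned
unchanged.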

>
> The current numad functionality you propose only sets the initial guest
> placement. Are we likely to have a mode in the future where numad will
> be called to update the placement periodically for existing guests?

Very possible. For now numad just provides the advisory interface;
in the future I guess it could manage the placement dynamically.
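
For reference, the advisory call from the patch description
("numad -w ncpus:memory_amount", with memory in MB) can be sketched as
below. The helper name numad_advisory_argv is hypothetical, and the
sketch assumes the current memory from the domain XML is in KiB, which
is why it is divided by 1024. It only builds the argument list and does
not run numad.

```python
def numad_advisory_argv(vcpus: int, current_memory_kib: int) -> list[str]:
    """Build the argument list for numad's pre-placement advisory query.

    Assumption: current_memory_kib is the domain's current memory in
    KiB; numad expects the amount in MB.
    """
    if vcpus <= 0 or current_memory_kib <= 0:
        raise ValueError("vcpus and memory must be positive")
    memory_mb = current_memory_kib // 1024
    return ["numad", "-w", f"{vcpus}:{memory_mb}"]
```

A guest with 4 vcpus and 4194304 KiB of current memory would give the
arguments "numad -w 4:4096".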

> If so, then "placement" would need to have more enum values.
>

I will post a v3 tomorrow.

Regards,
Osier
