[libvirt] [PATCH 19/27] conf: Add XML for individual vCPU hotplug

Peter Krempa pkrempa at redhat.com
Wed Aug 17 12:42:21 UTC 2016


On Fri, Aug 12, 2016 at 10:29:53 +0200, Pavel Hrdina wrote:
> On Fri, Aug 05, 2016 at 03:56:15PM +0200, Peter Krempa wrote:
> > Individual vCPU hotplug requires us to track the state of any vCPU. To
> > allow this add the following XML:
> > 
> > <domain>
> >   ...
> >   <vcpu current='1'>2</vcpu>
> >   <vcpus>
> >     <vcpu id='0' enabled='no' hotpluggable='yes'/>
> >     <vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>
> >   </vcpus>
> >   ...
> > 
> > The 'enabled' attribute controls the state of the vcpu.
> > 'hotpluggable' controls whether the given vcpu can be hotplugged and
> > 'order' allows specifying the order in which the vcpus are added.
> 
> Based on the CPU arch there are some restrictions on how many vcpus must be
> plugged in together, currently only for the Power arch.  Based on the
> configured topology we can

Yep. I admit though that the documentation is pretty weak since I wanted
to get review started. I plan to add more of it though.
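As an illustration of the proposed semantics, here's a quick sketch (not
libvirt's actual parser; the consistency rules are my reading of the design
above) that checks the example XML: the listed vcpus must match the maximum,
the enabled ones must match 'current', and a disabled vcpu only makes sense
if it's hotpluggable:

```python
# Illustrative only -- not libvirt code.
import xml.etree.ElementTree as ET

XML = """
<domain>
  <vcpu current='1'>2</vcpu>
  <vcpus>
    <vcpu id='0' enabled='no' hotpluggable='yes'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>
  </vcpus>
</domain>
"""

def check_vcpus(xml):
    root = ET.fromstring(xml)
    vcpu = root.find('vcpu')
    maximum = int(vcpu.text)
    current = int(vcpu.get('current', vcpu.text))
    states = root.findall('./vcpus/vcpu')
    enabled = sum(1 for v in states if v.get('enabled') == 'yes')
    # a vcpu that is neither enabled nor hotpluggable could never run
    sane = all(v.get('enabled') == 'yes' or v.get('hotpluggable') == 'yes'
               for v in states)
    return len(states) == maximum and enabled == current and sane

print(check_vcpus(XML))  # True
```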

> plug only whole cores, which means a group of vcpus.  Because of this I would
> suggest that we group those vcpus in the XML like this:
> 
>     <vcpus>
>       <group id='1' enabled='yes' hotpluggable='no' order='1'>
>         <vcpu id='0'/>
>         <vcpu id='1'/>
>       </group>

I don't think this is a good idea:

1) This XML part is not input-only, thus users will need to provide this
   (also see below)

2) it's extremely verbose for non-weird architectures
   granted, for ppc64 it allows displaying the weirdness of the
   core-level hotplug, but for x86 it's 3 times more verbose

   on the other hand, it requires the user to provide this in advance

3) this does not hide the weirdness of the "hotpluggable entities" as
   reported by qemu.

   With this we basically could add the vcpus to <devices> at the
   'group' granularity, which would be basically a dumb wrapper on top
   of the qemu design.

4) It can't be properly verified at define time.
   (okay, my approach can't be validated either, but it's vastly
   simpler and more tolerant of config changes)

>       <group id='2' enabled='yes' hotpluggable='yes' order='2'>
>         <vcpu id='2'/>
>         <vcpu id='3'/>
>       </group>
>       <group id='3' enabled='no' hotpluggable='yes'>
>         <vcpu id='4'/>
>         <vcpu id='5'/>
>       </group>
>       <group id='4' enabled='no' hotpluggable='yes'>
>         <vcpu id='6'/>
>         <vcpu id='7'/>
>       </group>
>     </vcpus>
> 
> we know the topology from XML and if none is provided the default is:
> 
>     <topology sockets='1' cores='n' threads='1'/>

The problem is that if a user specifies a topology with threads > 1 you
will get two behaviors:

1) on x86 the "groups" still need to be one per thread
2) on ppc64 the groups need to be one per core (thus including all
   'threads')
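To make the per-arch difference concrete, a small sketch (the arch strings
and the per-core vs. per-thread rule are assumptions based on the behavior
described above, not libvirt code) that lists the hotpluggable entities for
a given topology:

```python
# Assumed behavior sketch: x86 plugs individual threads,
# ppc64 plugs whole cores (all threads of a core at once).
def hotplug_groups(sockets, cores, threads, arch):
    total = sockets * cores * threads
    size = threads if arch == 'ppc64' else 1  # per-core vs per-thread
    return [list(range(i, i + size)) for i in range(0, total, size)]

# topology: sockets=1 cores=2 threads=4
print(hotplug_groups(1, 2, 4, 'x86_64'))  # 8 single-vcpu groups
print(hotplug_groups(1, 2, 4, 'ppc64'))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

With the flat per-vcpu XML this grouping stays an internal detail; with the
<group> proposal the user would have to spell it out per arch.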

The approach that doesn't include the groups hides this weirdness as an
internal detail. Thus if this ever changes (e.g. gets more lenient) the
users won't need to adjust the grouping to reflect what qemu/arch thinks
is cool at that point.

> If user changes the topology the vcpu configuration must be modified too and
> this grouping would help to easily move/add/remove vcpus from groups or
> add/remove the whole groups.

Well, the grouping would only simplify the case of ppc64 (and any other
weird arch that doesn't support thread-level vcpus). For the regular
case it won't simplify anything.

One shortcoming of the current implementation though (which the proposed
design doesn't solve either) is that it's impossible to detect what the
granularity level is. I plan to add it to the domain capabilities output
(if possible).
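A purely hypothetical sketch of what such reporting might look like (the
'granularity' attribute is invented here for illustration and does not
exist in the domain capabilities schema):

```xml
<!-- hypothetical; 'granularity' is not a real attribute -->
<domainCapabilities>
  <vcpu max='255' granularity='thread'/>
</domainCapabilities>
```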

Peter



