[libvirt] [RFC PATCH 0/2] nodeinfo: PPC64: Fix topology and siblings info on capabilities and nodeinfo
Daniel P. Berrange
berrange at redhat.com
Tue Jul 19 14:35:33 UTC 2016
On Thu, May 05, 2016 at 08:48:05PM +0200, Andrea Bolognani wrote:
> On Fri, 2016-01-29 at 01:32 -0500, Shivaprasad G Bhat wrote:
> > The nodeinfo output was fixed earlier to reflect the actual cpus available in
> > KVM mode on PPC64. The earlier fixes covered the aspect of not making a host
> > look overcommitted when it's not. The current fixes are aimed at helping the
> > users make better decisions on the kind of guest cpu topology that can be
> > supported on the given subcores_per_core setting of the KVM host and also hint the
> > way to pin the guest vcpus efficiently.
> >
> > I am planning to add some test cases once the approach is accepted.
> >
> > With respect to Patch 2:
> > The second patch adds a new element to the cpus tag and I need your input on
> > whether that is okay, or if there is a better way. I am not sure if the existing
> > clients have RNG checks that might fail with this approach, or if the checks
> > are not enforced on the elements but only on the tags.
> >
> > With my approach, if the RNG checks pass, the new element "capacity", even if
> > ignored by many clients, would have no impact except on PPC64.
> >
> > As far as I have looked at the code, the siblings changes don't affect existing
> > libvirt functionality. Please do let me know otherwise.
>
> So, I've been going through this old thread trying to figure out
> a way to improve the status quo. I'd like to collect as much
> feedback as possible, especially from people who have worked in
> this area of libvirt before or have written tools based on it.
>
> As hinted above, this series is really trying to address two
> different issues, and I think it's helpful to reason about them
> separately.
>
>
> ** Guest threads limit **
>
> My dual-core laptop will happily run a guest configured with
>
> <cpu>
> <topology sockets='1' cores='1' threads='128'/>
> </cpu>
>
> but POWER guests are limited to 8/subcores_per_core threads.
>
> We need to report this information to the user somehow, and
> I can't see an existing place where it would fit nicely. We
> definitely don't want to overload the meaning of an existing
> element/attribute with this. It should also only appear in
> the (dom)capabilities XML of ppc64 hosts.
>
> I don't think this is too problematic or controversial, we
> just need to pick a nice place to display this information.
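A minimal sketch of the limit described above (the 8/subcores_per_core rule). The sysfs path in the comment is an assumption about how PowerNV hosts expose the setting, and the helper name is hypothetical:

```python
# On PPC64 KVM HV hosts, a guest core cannot have more threads than
# 8 / subcores_per_core (assumption: the host setting is readable from
# e.g. /sys/devices/system/cpu/subcores_per_core on PowerNV).
PPC64_THREADS_PER_CORE = 8

def max_guest_threads(subcores_per_core):
    """Maximum usable threads per guest core for a given subcore split."""
    return PPC64_THREADS_PER_CORE // subcores_per_core

# subcores_per_core=1 -> 8 threads, =2 -> 4, =4 -> 2
```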
>
>
> ** Efficient guest topology **
>
> To achieve optimal performance, you want to match guest
> threads with host threads.
>
> On x86, you can choose suitable host threads by looking at
> the capabilities XML: the presence of elements like
>
> <cpu id='2' socket_id='0' core_id='1' siblings='2-3'/>
> <cpu id='3' socket_id='0' core_id='1' siblings='2-3'/>
>
> means you should configure your guest to use
>
> <vcpu placement='static' cpuset='2-3'>2</vcpu>
> <cpu>
> <topology sockets='1' cores='1' threads='2'/>
> </cpu>
>
> Notice how siblings can be found either by looking at the
> attribute with the same name, or by matching CPUs that share the
> value of the core_id attribute. Also notice how you are
> supposed to pin as many vCPUs as there are elements in
> the cpuset - one guest thread per host thread.
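The two equivalent sibling-matching approaches just described can be sketched as follows; `parse_cpuset` and `siblings_by_core` are hypothetical helper names, not libvirt API:

```python
import xml.etree.ElementTree as ET

# Fragment modeled on the x86 capabilities XML quoted above.
CAPS = """
<cells>
  <cpu id='2' socket_id='0' core_id='1' siblings='2-3'/>
  <cpu id='3' socket_id='0' core_id='1' siblings='2-3'/>
</cells>
"""

def parse_cpuset(s):
    """Expand a libvirt cpuset string like '2-3' or '0,4' into a set of ids."""
    ids = set()
    for part in s.split(','):
        if '-' in part:
            lo, hi = part.split('-')
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return ids

def siblings_by_core(caps_xml):
    """Group host CPU ids by core_id - the second way to find siblings."""
    cores = {}
    for cpu in ET.fromstring(caps_xml).iter('cpu'):
        cores.setdefault(cpu.get('core_id'), set()).add(int(cpu.get('id')))
    return cores
```

Both routes yield the same answer here: `parse_cpuset('2-3')` and `siblings_by_core(CAPS)['1']` are each `{2, 3}`.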
>
> On POWER, this gets much trickier: only the *primary* thread
> of each (sub)core appears to be online in the host, but all
> threads can actually have a vCPU running on them. So
>
> <cpu id='0' socket_id='0' core_id='32' siblings='0,4'/>
> <cpu id='4' socket_id='0' core_id='32' siblings='0,4'/>
>
> which is what you'd get with subcores_per_core=2, is very
> confusing.
>
> The optimal guest topology in this case would be
>
> <vcpu placement='static' cpuset='4'>4</vcpu>
> <cpu>
> <topology sockets='1' cores='1' threads='4'/>
> </cpu>
>
> but neither of the approaches mentioned above works to figure out the
> correct value for the cpuset attribute.
>
> In this case, a possible solution would be to alter the values
> of the core_id and siblings attributes so that both would be
> the same as the id attribute, which would naturally make both
> approaches described above work.
>
> Additionally, a new attribute would be introduced to serve as
> a multiplier for the "one guest thread per host thread" rule
> mentioned earlier: the resulting XML would look like
>
> <cpu id='0' socket_id='0' core_id='0' siblings='0' capacity='4'/>
> <cpu id='4' socket_id='0' core_id='4' siblings='4' capacity='4'/>
>
> which contains all the information needed to build the right
> guest topology. The capacity attribute would have value 1 on
> all architectures except for ppc64.
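Under that proposal, a client could derive the guest topology as sketched below. The 'capacity' attribute is the hypothetical addition from this RFC, and `guest_topology` is a made-up helper name:

```python
import xml.etree.ElementTree as ET

# Fragment modeled on the proposed capabilities XML quoted above.
PROPOSED = """
<cells>
  <cpu id='0' socket_id='0' core_id='0' siblings='0' capacity='4'/>
  <cpu id='4' socket_id='0' core_id='4' siblings='4' capacity='4'/>
</cells>
"""

def guest_topology(caps_xml, cpu_id):
    """Derive cpuset and thread count for one host CPU under the proposal."""
    cpu = next(c for c in ET.fromstring(caps_xml).iter('cpu')
               if int(c.get('id')) == cpu_id)
    # 'capacity' multiplies the usual one-guest-thread-per-host-thread rule;
    # it would default to 1 on every architecture except ppc64.
    capacity = int(cpu.get('capacity', '1'))
    return {'cpuset': cpu.get('siblings'), 'threads': capacity}

# guest_topology(PROPOSED, 4) -> {'cpuset': '4', 'threads': 4}
```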
I don't really like the fact that with this design, we effectively
have a bunch of invisible <cpu> elements whose existence is just
implied by the 'capacity=4' attribute.
I also don't like tailoring output of capabilities XML for one
specific use case.
IOW, I think we should explicitly represent all the CPUs in the
node capabilities, even if they are offline in the host. We could
introduce a new attribute to indicate the status of CPUs. So
instead of
<cpu id='0' socket_id='0' core_id='0' siblings='0' capacity='4'/>
<cpu id='4' socket_id='0' core_id='4' siblings='4' capacity='4'/>
I'd like to see
<cpu id='0' socket_id='0' core_id='0' siblings='0-3' state="online"/>
<cpu id='1' socket_id='0' core_id='0' siblings='0-3' state="offline"/>
<cpu id='2' socket_id='0' core_id='0' siblings='0-3' state="offline"/>
<cpu id='3' socket_id='0' core_id='0' siblings='0-3' state="offline"/>
<cpu id='4' socket_id='0' core_id='4' siblings='4-7' state="online"/>
<cpu id='5' socket_id='0' core_id='4' siblings='4-7' state="offline"/>
<cpu id='6' socket_id='0' core_id='4' siblings='4-7' state="offline"/>
<cpu id='7' socket_id='0' core_id='4' siblings='4-7' state="offline"/>
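A client consuming this alternative layout might derive a pinning set as sketched below; the 'state' attribute is the hypothetical addition being proposed, and `pin_for_core` is a made-up helper name:

```python
import xml.etree.ElementTree as ET

# Fragment modeled on the alternative XML proposed above:
# all threads listed, with their online/offline state made explicit.
ALT = """
<cells>
  <cpu id='4' socket_id='0' core_id='4' siblings='4-7' state='online'/>
  <cpu id='5' socket_id='0' core_id='4' siblings='4-7' state='offline'/>
  <cpu id='6' socket_id='0' core_id='4' siblings='4-7' state='offline'/>
  <cpu id='7' socket_id='0' core_id='4' siblings='4-7' state='offline'/>
</cells>
"""

def pin_for_core(caps_xml, core_id):
    """Pin to the online (primary) thread; count all threads, online or not."""
    cpus = [c for c in ET.fromstring(caps_xml).iter('cpu')
            if c.get('core_id') == core_id]
    online = [c.get('id') for c in cpus if c.get('state') == 'online']
    return {'cpuset': ','.join(online), 'threads': len(cpus)}

# pin_for_core(ALT, '4') -> {'cpuset': '4', 'threads': 4}
```

This recovers the optimal ppc64 guest configuration from earlier (cpuset='4', threads=4) without any implicit multiplier.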
The domain capabilities XML, meanwhile, is where you'd express any
usage constraints on cores/threads required by QEMU.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|