[libvirt] [RFC] kvm: x86: export vCPU halted state to sysfs

Luiz Capitulino lcapitulino at redhat.com
Mon Feb 5 16:36:26 UTC 2018


On Mon, 5 Feb 2018 17:10:18 +0100
Viktor Mihajlovski <mihajlov at linux.vnet.ibm.com> wrote:

> On 05.02.2018 16:37, Luiz Capitulino wrote:
> > On Mon, 5 Feb 2018 13:47:27 +0000
> > Daniel P. Berrangé <berrange at redhat.com> wrote:
> >   
> >> On Mon, Feb 05, 2018 at 02:43:15PM +0100, Viktor Mihajlovski wrote:  
> >>> On 02.02.2018 21:41, Eduardo Habkost wrote:    
> >>>> On Fri, Feb 02, 2018 at 03:19:45PM -0500, Luiz Capitulino wrote:    
> >>>>> On Fri, 2 Feb 2018 18:09:12 -0200
> >>>>> Eduardo Habkost <ehabkost at redhat.com> wrote:    
> >>>> [...]    
> >>>>>> Your plan above covers what will happen when using newer QEMU
> >>>>>> versions, but libvirt still needs to work sanely if running QEMU
> >>>>>> 2.11.  My suggestion is that libvirt do not run query-cpus to ask
> >>>>>> for the "halted" field on any architecture except s390.    
> >>>>>
> >>>>> My current plan is to ask libvirt to completely remove query-cpus
> >>>>> usage, independent of the arch and use the new command instead.    
> >>>>
> >>>> This would be a regression for people running QEMU 2.11 on s390.
> >>>>
> >>>> (But maybe it would be an acceptable regression?  Viktor, what do
> >>>> you think?  Are there production releases of management systems
> >>>> that already rely on vcpu.<n>.halted?)
> >>>>     
> >>> Unfortunately, there's code out there looking at vcpu.<n>.halted. I've
> >>> informed the product team about the issue.
> >>>
> >>> If we drop/deprecate vcpu.<n>.halted from the domain statistics, this
> >>> should be done for all arches, if there's a replacement mechanism (i.e.
> >>> new VCPU states). As a stop-gap measure we can make the call
> >>> arch-dependent until the new stuff is in place.    
> >>
> >> Yes, I think libvirt should just restrict this 'halted' feature reporting
> >> to s390 only, since the other archs have different semantics for this
> >> item, and the s390 semantics are the ones we want.  
> > 
> > From this whole discussion, there's only one thing that I still don't
> > understand (in a very honest way): what makes s390 halted semantics
> > different?
> 
> One problem is that using the halted property to indicate that the CPU
> has assumed the architected disabled wait state may not have been the
> wisest decision (my fault). If the CPU enters disabled wait, it will
> stay inactive until it is explicitly restarted which is different on x86.

Ah, OK. So, s390 does indeed have different semantics.
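
To make sure I have the difference right, here is a toy model of the two
meanings of "halted" as described above. It is only a sketch under my reading
of this thread, not QEMU or KVM code; the class and method names are made up
for illustration.

```python
from enum import Enum


class Arch(Enum):
    X86 = "x86"
    S390 = "s390"


class ToyVcpu:
    """Toy model of the two 'halted' meanings (hypothetical, not QEMU code)."""

    def __init__(self, arch):
        self.arch = arch
        self.halted = False

    def halt(self):
        # x86: the guest executed HLT; s390: the CPU entered disabled wait.
        self.halted = True

    def interrupt(self):
        # On x86 an interrupt wakes a halted vCPU again.
        # In s390 disabled wait the CPU stays inactive on interrupts.
        if self.arch is Arch.X86:
            self.halted = False

    def restart(self):
        # In this model an explicit restart resumes the CPU on either arch.
        self.halted = False
```

In this model an x86 vCPU flips in and out of "halted" all the time, while an
s390 vCPU only leaves it on an explicit restart, so the flag changes rarely.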

> > By quickly looking at the code, it seems to be very like the x86 one
> > when in kernel irqchip is not used: if a guest vCPU executes HLT, the
> > vCPU exits to userspace and qemu will put the vCPU thread to sleep.
> > This is the semantics I'd expect for HLT, and maybe for all archs.
> > 
> > What makes x86 different, is when the in kernel irqchip is used (which
> > should be the default with libvirt). In this case, the vCPU thread avoids
> > exiting to user-space. So, qemu doesn't know the vCPU halted.
> > 
> > That's only one of the reasons why query-cpus forces vCPUs to user-space.
> > But there are other reasons, and that's why even on s390 query-cpus
> > will also force vCPUs to user-space, which means s390 has the same perf
> > issue but maybe this hasn't been detected yet.
> > 
> > For the immediate term, I still think we should have a query-cpus
> > replacement that doesn't cause vCPUs to go to userspace. I'll work on
> > this this week.
> 
> FWIW: I am currently exploring an extension to query-cpus to report
> s390-specific information, allowing us to ditch halted in the long run.
> Further, I'm considering a new QAPI event along the lines of "CPU info
> has changed" allowing QEMU to announce low-frequency changes of CPU
> state (as is the case for s390) and finally wire up a handler in libvirt
> to update a tbd. property (!= halted).

I very much prefer adding a replacement for query-cpus, which works
for all archs and which doesn't have any performance impact.
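
For context, this is roughly the information a consumer pulls out of
query-cpus today. A minimal sketch, assuming a QMP reply with the "CPU" and
"halted" fields that query-cpus returns; the sample values are made up, the
helper name is hypothetical, and the code only parses text, it does not talk
to QEMU.

```python
import json

# Made-up sample shaped like a query-cpus reply; values are illustrative only.
SAMPLE_REPLY = """
{"return": [
  {"CPU": 0, "current": true, "halted": false, "thread_id": 3134},
  {"CPU": 1, "current": false, "halted": true, "thread_id": 3135}
]}
"""


def halted_by_cpu(reply_text):
    """Map each CPU index in a query-cpus style reply to its 'halted' flag."""
    reply = json.loads(reply_text)
    return {cpu["CPU"]: cpu["halted"] for cpu in reply["return"]}


if __name__ == "__main__":
    print(halted_by_cpu(SAMPLE_REPLY))
```

A replacement command would ideally return the cheap fields (index, thread
id) without needing to kick the vCPUs out to userspace.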

> > 
> > However, IMHO, what we really want is to add an API to the guest agent
> > to export the CPU online bit from the guest userspace sysfs. This will
> > give the ultimate semantics and move us away from this halted mess.
> >   
> 
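
On the guest side, the sysfs part of that idea could look roughly like the
sketch below. It assumes a Linux guest that exposes
/sys/devices/system/cpu/online in the usual CPU list format (for example
"0-3" or "0,2-3"); the function names are hypothetical and this is not the
guest agent's actual implementation.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,5" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            # Empty input (no CPUs listed) yields an empty result.
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_online_cpus(path="/sys/devices/system/cpu/online"):
    """Read the online CPU list from sysfs inside the guest (Linux only)."""
    with open(path) as f:
        return parse_cpu_list(f.read())
```

Reading this inside the guest reports what the guest kernel itself considers
online, without having to interrupt the vCPU threads on the host.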