[libvirt] [RFC] kvm: x86: export vCPU halted state to sysfs

Luiz Capitulino lcapitulino at redhat.com
Fri Feb 2 14:21:59 UTC 2018


On Fri, 2 Feb 2018 14:19:38 +0000
Daniel P. Berrangé <berrange at redhat.com> wrote:

> On Fri, Feb 02, 2018 at 12:15:54PM -0200, Eduardo Habkost wrote:
> > On Fri, Feb 02, 2018 at 02:53:50PM +0100, Viktor Mihajlovski wrote:  
> > > On 01.02.2018 21:26, Eduardo Habkost wrote:  
> > > > On Thu, Feb 01, 2018 at 09:15:15PM +0100, Radim Krčmář wrote:  
> > > >> 2018-02-01 12:54-0500, Luiz Capitulino:  
> > > >>>
> > > >>> Libvirt needs to know when a vCPU is halted. To get this information,  
> > > >>
> > > >> I don't see why upper level management should care about that, a single
> > > >> bit about halted state that can be incorrect at the time it is processed
> > > >> seems of very limited use.  
> > > > 
> > > > I don't see why, either.
> > > > 
> > > > I'm CCing libvir-list and the people involved in the code that
> > > > added halt state to libvirt domain statistics.
> > > >   
> > > I'll try to explain the motivation for the "halted" state exposure and
> > > why it ended up in the libvirt domain stats.
> > > 
> > > s390 CPUs can be present in a system (e.g. after being hotplugged) but
> > > be offline (disabled) in which case they are not used by the operating
> > > system. In Linux disabled CPUs show a value of '0' in
> > > /sys/devices/system/cpu/cpu<n>/online.
> > > 
> > > Higher level management software (on top of libvirt) can take advantage
> > > of knowing whether a guest CPU is online and thus used or not.
> > > Specifically it might not make sense to plug more CPUs if the guest OS
> > > isn't using the CPUs at all.  
> > 
> > Wasn't this already represented on "vcpu.<n>.state"?  Why is
> > "vcpu.<n>.halted" needed?
> >   
> > > 
> > > A disabled guest CPU is represented as halted in the QEMU object model
> > > and can therefore be identified by the QMP query-cpus command.
> > > 
> > > The initial patch proposal to expose this via virsh vcpuinfo was not
> > > considered to be desirable because there was a concern that legacy
> > > management software might be confused seeing halted vcpus. Therefore the
> > > state information was added to the cpu domain statistics.
> > > 
> > > One issue we're facing is that the semantics of "halted" are different
> > > between s390 and at least x86. The question might be whether they are
> > > different enough to warrant a specific "disabled" indicator.
> > 
> > From your description, it looks like they are completely
> > different.  On x86, a CPU that is online and in use can be moved
> > between halted and non-halted state many times a second.
> > 
> > If that's the case, we can probably fix this without breaking
> > existing code: explicitly documenting the semantics of
> > "vcpu.<n>.halted" at virConnectGetAllDomainStats() to mean "not
> > online" (i.e. the s390 semantics, not the x86 one), and making
> > qemuMonitorGetCpuHalted() s390-specific.
> > 
> > Possibly a better long-term solution is to deprecate
> > "vcpu.<n>.halted" and make "vcpu.<n>.state" work correctly on
> > s390.
> > 
> > It would also be interesting to update the QEMU QMP documentation to
> > clarify the arch-specific semantics of "halted".
> 
> And also especially clarify the awful performance implications of running
> this particular query command. In general I would not expect query-xxx
> monitor commands to interrupt all vcpus, so we should clearly warn about
> this!

Or deprecate it...
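
As a rough illustration of the guest-side check Viktor describes above (a
value of '0' in /sys/devices/system/cpu/cpu<n>/online means the CPU is
disabled), here is a minimal Python sketch. The function name cpu_is_online
and the sysfs_root parameter are made up for this example, and treating a
missing file as "unknown" is an assumption, not something stated in this
thread.

```python
from pathlib import Path


def cpu_is_online(n, sysfs_root="/sys/devices/system/cpu"):
    """Return True if CPU n is online, False if offline, None if unknown.

    Hypothetical helper: reads <sysfs_root>/cpu<n>/online inside the guest.
    """
    path = Path(sysfs_root) / f"cpu{n}" / "online"
    try:
        # The file holds "1" for an online CPU and "0" for a disabled one.
        return path.read_text().strip() == "1"
    except FileNotFoundError:
        # Assumption: if the file does not exist we cannot tell the state
        # this way, so report "unknown" instead of guessing.
        return None
```

This only tells you what the guest OS has done with the CPU; it says
nothing about whether the vCPU is momentarily idle, which is the x86
meaning of "halted" discussed above.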



