[libvirt] [RFC] kvm: x86: export vCPU halted state to sysfs

Viktor Mihajlovski mihajlov at linux.vnet.ibm.com
Fri Feb 2 15:08:25 UTC 2018


On 02.02.2018 15:15, Eduardo Habkost wrote:
> On Fri, Feb 02, 2018 at 02:53:50PM +0100, Viktor Mihajlovski wrote:
>> On 01.02.2018 21:26, Eduardo Habkost wrote:
>>> On Thu, Feb 01, 2018 at 09:15:15PM +0100, Radim Krčmář wrote:
>>>> 2018-02-01 12:54-0500, Luiz Capitulino:
>>>>>
>>>>> Libvirt needs to know when a vCPU is halted. To get this information,
>>>>
>>>> I don't see why upper level management should care about that, a single
>>>> bit about halted state that can be incorrect at the time it is processed
>>>> seems of very limited use.
>>>
>>> I don't see why, either.
>>>
>>> I'm CCing libvir-list and the people involved in the code that
>>> added halt state to libvirt domain statistics.
>>>
>> I'll try to explain the motivation for the "halted" state exposure and
>> why it ended up in the libvirt domain stats.
>>
>> s390 CPUs can be present in a system (e.g. after being hotplugged) but
>> be offline (disabled) in which case they are not used by the operating
>> system. In Linux disabled CPUs show a value of '0' in
>> /sys/devices/system/cpu/cpu<n>/online.
>>
>> Higher level management software (on top of libvirt) can take advantage
>> of knowing whether a guest CPU is online and thus used or not.
>> Specifically it might not make sense to plug more CPUs if the guest OS
>> isn't using the CPUs at all.
> 
> Wasn't this already represented on "vcpu.<n>.state"?  Why is
> "vcpu.<n>.halted" needed?
The state would match that of vcpuinfo, and there was consensus not to
change it (on x86 the CPU is in state running, even if halted).
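To illustrate how the two fields sit side by side in the domain stats, here is
a minimal sketch that groups "vcpu.<n>.state" and "vcpu.<n>.halted" per vCPU.
It works on a plain dict; the helper name and the sample values are made up
for illustration and are not taken from a real guest.

```python
import re

# Matches only the two per-vCPU keys discussed in this thread.
_VCPU_KEY = re.compile(r"^vcpu\.(\d+)\.(state|halted)$")


def vcpu_state_and_halted(stats):
    """Group vcpu.<n>.state and vcpu.<n>.halted by vCPU number.

    Hypothetical helper: 'stats' is a flat dict of domain statistics,
    keyed the way the vcpu.<n>.* fields are named above.
    """
    result = {}
    for key, value in stats.items():
        match = _VCPU_KEY.match(key)
        if match:
            vcpu = int(match.group(1))
            result.setdefault(vcpu, {})[match.group(2)] = value
    return result


# Made-up sample: state 1 is VIR_VCPU_RUNNING in libvirt's virVcpuState.
sample = {
    "vcpu.current": 2,
    "vcpu.0.state": 1,
    "vcpu.0.halted": False,
    "vcpu.1.state": 1,
    "vcpu.1.halted": True,
}

per_vcpu = vcpu_state_and_halted(sample)
```

With x86 semantics, vCPU 1 in this sample would be "running" and "halted" at
the same time, which is why the two fields cannot simply be merged.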
> 
>>
>> A disabled guest CPU is represented as halted in the QEMU object model
>> and can therefore be identified by the QMP query-cpus command.
>>
>> The initial patch proposal to expose this via virsh vcpuinfo was not
>> considered to be desirable because there was a concern that legacy
>> management software might be confused seeing halted vcpus. Therefore the
>> state information was added to the cpu domain statistics.
>>
>> One issue we're facing is that the semantics of "halted" are different
>> between s390 and at least x86. The question might be whether they are
>> different enough to warrant a specific "disabled" indicator.
> 
> From your description, it looks like they are completely
> different.  On x86, a CPU that is online and in use can be moved
> between halted and non-halted state many times a second.
> 
> If that's the case, we can probably fix this without breaking
> existing code: explicitly documenting the semantics of
> "vcpu.<n>.halted" at virConnectGetAllDomainStats() to mean "not
> online" (i.e. the s390 semantics, not the x86 one), and making
> qemuMonitorGetCpuHalted() s390-specific.
> 
> Possibly a better long-term solution is to deprecate
> "vcpu.<n>.halted" and make "vcpu.<n>.state" work correctly on
> s390.
> 
As it seems that nobody was ever *really* interested in x86.halted, one
could also return 0 unconditionally there (and for other
expensive-to-query arches)?
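A rough sketch of what I mean, in Python for brevity. None of these names
exist in libvirt or QEMU; the set of architectures and the function are
assumptions for illustration only.

```python
# Hypothetical: architectures where "halted" means "not online"
# (the s390 semantics discussed above).
ARCHES_WITH_MEANINGFUL_HALTED = {"s390", "s390x"}


def reported_halted(arch, qemu_halted):
    """Return the value to expose as vcpu.<n>.halted.

    On s390 the value reported by QEMU is passed through. Everywhere
    else 0 is returned unconditionally, so the monitor never has to be
    queried for a bit that flips many times a second.
    """
    if arch in ARCHES_WITH_MEANINGFUL_HALTED:
        return 1 if qemu_halted else 0
    return 0
```

Existing consumers of the stats would keep seeing the field; on x86 it would
just never be set.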
> It would also be interesting to update QEMU QMP documentation to
> clarify the arch-specific semantics of "halted".
> 


-- 
Regards,
 Viktor Mihajlovski



