[libvirt] [PATCH] qemu: Restore lost shutdown reason

Thu Nov 1 14:24:21 UTC 2018

On 10/22/18 3:36 AM, Nikolay Shirokovskiy wrote:
> 
> 
> On 18.10.2018 18:28, John Ferlan wrote:
>> When qemuProcessReconnectHelper was introduced (commit d38897a5d)
>> reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that
>> was changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED
>> or VIR_DOMAIN_SHUTOFF_UNKNOWN.
>>
>> When QEMU_CAPS_NO_SHUTDOWN checking was removed in commit fe35b1ad6
>> the conditional state was just left at VIR_DOMAIN_SHUTOFF_CRASHED.
>>
>> Restore the logic adjustment using the same conditions as command
>> line building and alter the comment to describe the reasoning.
>>
>> Signed-off-by: John Ferlan <jferlan at redhat.com>
>> ---
>>
>>  This was "discovered" while reviewing Nikolay's patch related
>>  to adding "DAEMON" as the shutdown reason:
>>
>> https://www.redhat.com/archives/libvir-list/2018-October/msg00493.html
>>
>>  While working through the iterations of that - figured I'd at
>>  least post this.
>>
>>
>>  src/qemu/qemu_process.c | 15 ++++++++++-----
>>  1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 3955eda17c..4a39111d51 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -7985,11 +7985,16 @@ qemuProcessReconnect(void *opaque)
>>      if (virDomainObjIsActive(obj)) {
>>          /* We can't get the monitor back, so must kill the VM
>>           * to remove danger of it ending up running twice if
>> -         * user tries to start it again later
>> -         * If we couldn't get the monitor since QEMU supports
>> -         * no-shutdown, we can safely say that the domain
>> -         * crashed ... */
>> -        state = VIR_DOMAIN_SHUTOFF_CRASHED;
>> +         * user tries to start it again later.
>> +         *
>> +         * If we cannot get to the monitor when the QEMU command
>> +         * line used -no-shutdown, then we can safely say that the
>> +         * domain crashed; otherwise we don't really know. */
>> +        if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
>> +            state = VIR_DOMAIN_SHUTOFF_CRASHED;
>> +        else
>> +            state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
>> +
>>          /* If BeginJob failed, we jumped here without a job, let's hope another
>>           * thread didn't have a chance to start playing with the domain yet
>>           * (it's all we can do anyway).
>>
> 
> This is reasonable but not complete. This is only applied to case when we do connect(2)

I understand your point; however, the goal of this patch wasn't to fix
the various other failure scenarios/reasons. It was to merely restore
the lost UNKNOWN reason in the event that either priv->monJSON == false
(unlikely) or priv->allowReboot != VIR_TRISTATE_BOOL_YES (possible).

Rather than copy-pasta, I'll create qemuDomainIsUsingNoShutdown(priv)
which will perform the condition checks for both command line generation
and reconnection failure paths.

If that's not desirable, then that's fine. I won't spend more cycles on
this.

John

> to qemu and it is failed with ECONNREFUSED which means qemu process does not exists
> (in which case it is either crashed if was configured with -no-shutdown or terminated
> for unknown reason) or just closed monitor connection (in this case we are going
> to terminate it by ourselves).
> 
> But there are other cases. For example we can jump to error path even before connect(2)
> or after successfull connect. See discussion of such cases in [1].
> 
> [1] https://www.redhat.com/archives/libvir-list/2018-October/msg01031.html
> 
> Nikolay
>