[libvirt] [PATCH REBASE 5/5] qemu: fix races in beingDestroyed usage
John Ferlan
jferlan at redhat.com
Fri May 4 14:55:17 UTC 2018
On 05/03/2018 05:26 AM, Nikolay Shirokovskiy wrote:
>
>
> On 01.05.2018 01:03, John Ferlan wrote:
>>
>>
>> On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
>>> Clearing beingDestroyed right after acquiring job condition is racy.
>>> It is not known when EOF will be delivired. Let's keep this flag
>>
>> delivered
>>
>>> set. This makes it possible to draw a clear distinction between monitor
>>> errors/EOFs and the domain being destroyed in qemuDomainObjWait.
>>>
>>> Also let's move setting destroyed flag out of qemuProcessBeginStopJob
>>> as the function is called from eof handler too.
>>>
>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>> ---
>>> src/qemu/qemu_domain.c | 4 ++--
>>> src/qemu/qemu_domain.h | 2 +-
>>> src/qemu/qemu_driver.c | 8 +++++++-
>>> src/qemu/qemu_process.c | 24 ++++++++++++------------
>>> 4 files changed, 22 insertions(+), 16 deletions(-)
>>>
>>
>> This one didn't git am -3 apply as well, but I see you're changing
>> DomainObjWait so that makes sense as to why.
>>
>> I do recall looking at this code at one time, but I cannot recall my
>> exact conclusion given how qemuDomainObjBeginJobInternal allows the
>> QEMU_JOB_DESTROY to be run, but then waits for whatever is taking place
>> now to finish.
>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index 1f40ff1..431901c 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -11913,9 +11913,9 @@ qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until)
>>> return -1;
>>> }
>>>
>>> - if (!virDomainObjIsActive(vm)) {
>>> + if (priv->destroyed) {
>>> virReportError(VIR_ERR_OPERATION_FAILED, "%s",
>>> - _("domain is not running"));
>>> + _("domain is destroyed"));
>>> return -1;
>>> }
>>>
>>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>>> index 494ed35..69112ea 100644
>>> --- a/src/qemu/qemu_domain.h
>>> +++ b/src/qemu/qemu_domain.h
>>> @@ -269,7 +269,7 @@ struct _qemuDomainObjPrivate {
>>> bool agentError;
>>>
>>> bool gotShutdown;
>>> - bool beingDestroyed;
>>> + bool destroyed;
>>> char *pidfile;
>>>
>>> virDomainPCIAddressSetPtr pciaddrs;
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index 03969d8..4356c0d 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -2227,7 +2227,13 @@ qemuDomainDestroyFlags(virDomainPtr dom,
>>> state = virDomainObjGetState(vm, &reason);
>>> starting = (state == VIR_DOMAIN_PAUSED &&
>>> reason == VIR_DOMAIN_PAUSED_STARTING_UP &&
>>> - !priv->beingDestroyed);
>>> + !priv->destroyed);
>>> +
>>> + /* We need to prevent monitor EOF callback from doing our work (and
>>> + * sending misleading events) while the vm is unlocked inside
>>> + * BeginJob/ProcessKill API
>>> + */
>>> + priv->destroyed = true;
>>
>> would this be the right place anyway? especially since you don't clear
>> it in error paths after setting it. Once the job starts and we either
>> goto cleanup or endjob on failure - how does a future call distinguish?
>
> We send SIGTERM/SIGKILL right away after this flag is set, so even if
> this API eventually fails, keeping the destroyed flag set looks right: we
> have already sent a signal to qemu, and there is a good chance qemu will
> die because of it.
>
> It would be nicer, then, to move setting the flag into the
> qemuProcessBeginStopJob function, so that setting the flag and killing the
> domain stay together.
>
> Anyway, we can clear the flag on failure too.
>
>>
>> Not sure this works conceptually. At least with the current model if
>> DestroyFlags finally gets the job, it checks the domain active state,
>> and goes to endjob after printing a message. So if while waiting, EOF
>> occurred, there's no qemuProcessStop
>
> Not true. After we send the signal telling qemu to terminate, EOF will
> occur, but the handler will return right away because of the
> beingDestroyed/destroyed flag check, and then, once DestroyFlags gets the
> job, it will call qemuProcessStop.
>
> That is not the case this patch addresses. Rather, if DestroyFlags is
> called while we are migrating, for example, then without this patch the
> interrupted API call can report 'domain is not active'/'some monitor
> error' instead of the much nicer 'domain is destroyed'. See the
> scenario below.
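
The early-return behavior described above can be sketched in a few lines of C. This is a loose, illustrative model only, not libvirt's actual code: the type `DemoDomain` and the function `demo_handle_monitor_eof` are invented stand-ins for the real EOF handler path, kept just to show why the flag check prevents the handler from duplicating the destroy path's work.

```c
#include <stdbool.h>

/* Invented stand-in for the relevant bits of qemuDomainObjPrivate. */
typedef struct {
    bool destroyed;   /* set by the destroy path before signalling qemu */
    bool active;      /* whether the domain is still considered running */
} DemoDomain;

/* Sketch of the EOF handler's early return: when the destroy path has
 * already claimed the shutdown work, the handler must not stop the
 * domain itself (or emit misleading events); it leaves the cleanup to
 * the destroy job, which will call the real stop routine. */
static const char *
demo_handle_monitor_eof(DemoDomain *dom)
{
    if (dom->destroyed)
        return "ignored: destroy job owns the stop";

    dom->active = false;  /* normal path: stop the domain on EOF */
    return "processed: domain stopped on EOF";
}
```

With `destroyed` set, the handler is a no-op and the domain stays marked active until the destroy job performs the stop; with it clear, the handler performs the stop itself.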
>
>>
>> Perhaps the only thing I recall wondering is whether we clear
>> beingDestroyed too soon... If we're successful, we're still destroying;
>> the flag should not be cleared until the destroy has succeeded, while
>> still holding the job.
>
> It does not matter whether we clear the flag at the beginning or the end
> of the time we hold the job, because during the job nobody else who needs
> the job can make progress.
>
> I propose keeping the flag set after the destroy job is finished (and
> renaming the flag accordingly). Consider the following scenario:
>
> - migrate is called and waits for the migration-complete event from qemu
> - we call destroy, and since destroy can run concurrently with migration,
>   destroy gets the job and executes the domain stop
> - migrate wakes up on monitor EOF; beingDestroyed has already been cleared
>   at this point. We can check whether the domain is active and give a
>   'domain is not active' error, but 'domain is destroyed' is nicer, hence
>   the need for a 'destroyed' flag
>
> Nikolay
>
Honestly - hoping things will be clearer for the next round of this.
John
[...]