[libvirt] [PATCH v2] Close the source fd if the destination qemu exits during tunnelled migration

Wed Sep 23 20:22:15 UTC 2015

On 09/22/2015 07:21 AM, Shivaprasad bhat wrote:
> On Mon, Sep 21, 2015 at 8:04 PM, John Ferlan <jferlan at redhat.com> wrote:
>>
>>
>> On 09/21/2015 05:09 AM, Shivaprasad bhat wrote:
>>> Thanks John for the comments.
>>>
>>>
>>> On Fri, Sep 18, 2015 at 10:34 PM, John Ferlan <jferlan at redhat.com> wrote:
>>>>
>>>>
>>>> On 09/14/2015 10:44 AM, Shivaprasad G Bhat wrote:
>>>>> Tunnelled migration can hang if the destination qemu exits despite all the
>>>>> ABI checks. This happens whenever the destination qemu exits before the
>>>>> complete transfer is noticed by source qemu. The savevm state checks at
>>>>> runtime can fail at destination and cause qemu to error out.
>>>>> The source qemu cant notice it as the EPIPE is not propogated to it.
>>>>> The qemuMigrationIOFunc() notices the stream being broken from virStreamSend()
>>>>> and it cleans up the stream alone. The qemuMigrationWaitForCompletion() would
>>>>> never get to 100% transfer completion.
>>>>> The qemuMigrationWaitForCompletion() never breaks out as well since
>>>>> the ssh connection to destination is healthy, and the source qemu also thinks
>>>>> the migration is ongoing as the Fd to which it transfers, is never
>>>>> closed or broken. So, the migration will hang forever. Even Ctrl-C on the
>>>>> virsh migrate wouldn't be honoured. Close the source side FD when there is
>>>>> an error in the stream. That way, the source qemu updates itself and
>>>>> qemuMigrationWaitForCompletion() notices the failure.
>>>>>
>>>>> Close the FD for all kinds of errors to be sure. The error message is not
>>>>> copied for EPIPE so that the destination error is copied instead later.
>>>>>
>>>>> Note:
>>>>> Reproducible with repeated migrations between Power hosts running in different
>>>>> subcores-per-core modes.
>>>>>
>>>>> Changes from v1 -> v2:
>>>>>    VIR_FORCE_CLOSE() was called twice for this use case which would log
>>>>>    unneccessary warnings. So, move the fd close to qemuMigrationIOFunc
>>>>>    so that there are no unnecessary duplicate attempts.(Would this trigger
>>>>>    a Coverity error? I don't have a setup to check.)
>>>>>
>>>>> Signed-off-by: Shivaprasad G Bhat <sbhat at linux.vnet.ibm.com>
>>>>> ---
>>>>>  src/qemu/qemu_migration.c |    8 ++++++--
>>>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>>>>> index ff89ab5..9602fb2 100644
>>>>> --- a/src/qemu/qemu_migration.c
>>>>> +++ b/src/qemu/qemu_migration.c
>>>>> @@ -4012,6 +4012,7 @@ static void qemuMigrationIOFunc(void *arg)
>>>>>      if (virStreamFinish(data->st) < 0)
>>>>>          goto error;
>>>>>
>>>>> +    VIR_FORCE_CLOSE(data->sock);
>>>>>      VIR_FREE(buffer);
>>>>>
>>>>>      return;
>>>>> @@ -4029,7 +4030,11 @@ static void qemuMigrationIOFunc(void *arg)
>>>>>      }
>>>>>
>>>>>   error:
>>>>> -    virCopyLastError(&data->err);
>>>>> +    /* Let the source qemu know that the transfer cant continue anymore.
>>>>> +     * Don't copy the error for EPIPE as destination has the actual error. */
>>>>> +    VIR_FORCE_CLOSE(data->sock);
>>>>> +    if (!virLastErrorIsSystemErrno(EPIPE))
>>>>> +        virCopyLastError(&data->err);
>>>>>      virResetLastError();
>>>>>      VIR_FREE(buffer);
>>>>>  }
>>>>> @@ -4397,7 +4402,6 @@ qemuMigrationRun(virQEMUDriverPtr driver,
>>>>>          if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0)
>>>>>              ret = -1;
>>>>>      }
>>>>> -    VIR_FORCE_CLOSE(fd);
>>>>
>>>> ^^^
>>>>
>>>> This causes Coverity to claim a RESOURCE_LEAK
>>>>
>>>> Feels like this was a mistake edit...
>>>>
>>>
>>> The VIR_FORCE_CLOSE() inside qemuMigrationIOFunc() alone is sufficient.
>>> Having this again here would lead to Warning in the logs. I too thought coverity
>>> would complain. Is there a way to force ignore a coverity warning?
>>>
>>
>> Typically a marker of sorts, such as
>>
>>     /* coverity[leaked_handle] <some extra comment explaining why> */
>>
>> Although I'm not sure that's the best way to handle this either.
>>
>> The problem I see though is this is an "error path" issue and when
>> perhaps it's safe/required to close fd/io->sock/data->sock.
>>
>> Your commit comment notes that the issue is seen on a fairly specific
>> event (virStreamSend failure) for a specific type of migration.
> 
> I believe the failure can be seen for all types of migration with savestate
> mismatch.
> 

My thoughts were based mostly on your commit message comments:

"The qemuMigrationIOFunc() notices the stream being broken from
virStreamSend() and it cleans up the stream alone. The
qemuMigrationWaitForCompletion() would never get to 100% transfer
completion."

and the belief that the core issue you described is there's no way for
qemuMigrationCompleted to know the qemuMigrationIOFunc thread failed and
thus that's what caused the hang. Since we can pass the iothread "io"
object to the Completion function, I figured we could also use that to
check the status of the thread in the list of things already checked. I
guess it's not clear to me why your test fails below. Maybe it'll help
to attach/show you my thoughts in a patch format. I just took your
changes and "adjusted" them, merging the differences. So you could 'git
am' to top of trunk in order to compare my thoughts.

I'm not opposed to the chosen method, although perhaps Dan or Jiri who
understand more of the inner workings of migration than I do can provide
some comments. Changing where the fd is closed could have some corner
race condition I'm not aware of...

One issue that needs to be resolved with your patch is what happens to
the 'fd' when qemuMigrationIOFunc isn't involved?  IOW: if "fwdType ==
MIGRATION_FWD_DIRECT", then qemuMigrationStartTunnel isn't called and
thus qemuMigrationIOFunc won't be closing the 'fd' in the form of
data->sock. That's perhaps a path Coverity was considering (and I hadn't
considered initially).

I think to resolve this the following should be added to your patch (I
tried it, Coverity is happy too)...

if (spec->fwdType != MIGRATION_FWD_DIRECT) {
    if (!(iothread = qemuMigrationStartTunnel(spec->fwd.stream, fd)))
        goto cancel;
    /* If we've created a tunnel, then the 'fd' will be closed in the
     * qemuMigrationIOFunc as data->sock
     */
    fd = -1;
}

and restore the VIR_FORCE_CLOSE(fd);

>> As I
>> read the code, that failure jumps to error (as does virStreamFinish). So
>> now you have a fairly specific set of instances which perhaps would
>> cause qemuMigrationWaitForCompletion to need to fail. The fix you have
>> proposed is to close the data->sock (io->sock, fd). However, your
>> proposal is a larger hammer. I assume closure of data->sock causes
>> WaitForCompletion to fail (perhaps differently) and that's why you chose it.
>>
>> Going back to the beginning of qemuMigrationRun, setting the 'fd' and
>> using StartTunnel seems to rely on MIGRATION_FWD_DIRECT, but if a tunnel
>> is created an (ugh) 'iothread' variable is NON-NULL. What if 'iothread'
>> were passed to qemuMigrationWaitForCompletion? Then again passed to
>> qemuMigrationCompleted which would check if iothread non-NULL and for
>> some new flag that could be created in _qemuMigrationIOThread, being
>> true. If it were true, then that would cause failure so that
>> WaitForCompletion would return error status. The only way the flag is
>> set to true is in qemuMigrationIOFunc when the code jumps to error.
>> (e.g. add bool stream_abort and data->stream_abort = true in "error:",
>> then check iothread && iothread->stream_abort, force error).
>>
> 
> I tried this. Until the fd is closed, the 'info migrate' wouldn't
> return from the
> qemu. So, the migration hangs. Checking the stream status before/after
> fetching the job status would still leave a race. So, having it in a different
> thread (qemuMigrationIOFunc) here seems inevitable to me.

Although perhaps susceptible to a smaller "timing" window, if we move
the iothread check to before the CheckJobStatus call in Completed, then
that'll work better?

It seems the "query-migrate" command has no timeout... I assume qemu
would indicate that if you call it when the migration pipe had an error,
then you unpredictable results (only a guess - I didn't look at that
code as well). Perhaps that means libvirt needs some code to handle the
condition where the command doesn't finish in a timely manner.

Perhaps worth a shot...  I did change my attached patch to reflect this
thought..

John
> 
> #1  0x00003fff83ff593c in virCondWait (c=<optimized out>, m=<optimized out>)
>     at util/virthread.c:154
> #2  0x00003fff76235544 in qemuMonitorSend (mon=0x3fff54003970,
>     msg=<optimized out>) at qemu/qemu_monitor.c:1035
> #3  0x00003fff7624fb30 in qemuMonitorJSONCommandWithFd (mon=0x3fff54003970,
>     cmd=0x3fff5c001420, scm_fd=-1, reply=0x3fff79fbd388)
>     at qemu/qemu_monitor_json.c:293
> #4  0x00003fff76254b90 in qemuMonitorJSONCommand (reply=0x3fff79fbd388,
>     cmd=0x3fff5c001420, mon=0x3fff54003970) at qemu/qemu_monitor_json.c:323
> #5  qemuMonitorJSONGetMigrationStatus (mon=0x3fff54003970,
>     status=0x3fff79fbd538) at qemu/qemu_monitor_json.c:2620
> #6  0x00003fff7623a664 in qemuMonitorGetMigrationStatus (mon=<optimized out>,
>     status=<optimized out>) at qemu/qemu_monitor.c:2134
> #7  0x00003fff76228e6c in qemuMigrationFetchJobStatus (driver=0x3fff6c118d80,
>     vm=0x3fff6c27faf0, asyncJob=<optimized out>, jobInfo=0x3fff79fbd4f0)
>     at qemu/qemu_migration.c:2528
> #8  0x00003fff76228fb4 in qemuMigrationUpdateJobStatus (driver=0x3fff6c118d80,
>     vm=0x3fff6c27faf0, asyncJob=QEMU_ASYNC_JOB_MIGRATION_OUT)
>     at qemu/qemu_migration.c:2565
> #9  0x00003fff76229200 in qemuMigrationCheckJobStatus (driver=0x3fff6c118d80,
>     vm=0x3fff6c27faf0, asyncJob=QEMU_ASYNC_JOB_MIGRATION_OUT)
> ---Type <return> to continue, or q <return> to quit---cont
>     at qemu/qemu_migration.c:2585
> #10 0x00003fff76229408 in qemuMigrationCompleted (iothread=<optimized out>,
> 
>> The only question then in my mind then is would this also be something
>> that should be done for the virStreamFinish failure path which also
>> jumps to error?
>>
> 
> Yes, I too am not sure if its appropriate for virStreamFinish failure
> case. But, given
> this an error path, I feel we can safely go ahead and close the
> data->sock.
> 
>> Doing this means you shouldn't need the VIR_FILE_CLOSE(data->sock) for
>> either the normal or error path.  Does this seem to be a reasonable
>> approach and solve the issue you're facing?
>>
>> John
>>> Thanks and Regards,
>>> Shivaprasad
>>>
>>>> The rest of the patch looks reasonable; however, I'm far from the expert
>>>> in these matters.
>>>>
>>>> John
>>>>>
>>>>>      if (priv->job.completed) {
>>>>>          qemuDomainJobInfoUpdateTime(priv->job.completed);
>>>>>
>>>>> --
>>>>> libvir-list mailing list
>>>>> libvir-list at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/libvir-list
>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Close-the-source-fd-if-the-destination-qemu-exits-du.patch
Type: text/x-patch
Size: 6035 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20150923/72bfd1b5/attachment-0001.bin>