[libvirt] [PATCH REBASE 2/5] qemu: monitor: set error flag even in OOM conditions

John Ferlan jferlan at redhat.com
Mon Apr 30 22:03:20 UTC 2018



On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
> lastError.code is used as flag in qemu monitor. Unfortunately
> due to temporary OOM conditions (very unlikely through as thread local
> error is allocated on first use) we can fail to set this flag
> in case of monitor eofs/errors. This can cause hangs.
> 
> Let's make sure flag is always set in case eofs/errors.
> 
> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
> ---
>  src/qemu/qemu_monitor.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
> index f642d9a..906a320 100644
> --- a/src/qemu/qemu_monitor.c
> +++ b/src/qemu/qemu_monitor.c
> @@ -757,6 +757,11 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque)
>                  virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>                                 _("Error while processing monitor IO"));
>              virCopyLastError(&mon->lastError);
> +
> +            /* set error code if due to OOM conditions we fail to set it before */
> +            if (mon->lastError.code == VIR_ERR_OK)

So this can only happen if virCopyLastError fails, right?  And code == 0
because of memset(..., 0, ...).  Wouldn't you only be working around the
issue for one of the callers? and only setting *.code and not *.domain?

BTW: I feel we've been this way in rather recent history... See commit
'e711a391' - I recall we had varying opinions on that one.

Anyway, assuming CopyLastError is the issue, then perhaps the right
place to fix this is there and I would think at least the following
would need to be set in the error path:

   to->code = VIR_ERR_NO_MEMORY;
   to->domain = VIR_FROM_THREAD;

We cannot create a to->message...

Of course, how can we be "sure" that the failure is because the
VIR_ALLOC_QUIET in virLastErrorObject caused the issue other than
changing it's return values or passing a "bool &memfail" that's only set
when the VIR_ALLOC_QUIET fails allowing the caller to decide what to do.
It's local, so change is limited. Thus virGetLastErrorMessage and
virCopyLastError could make use of the returned, while other callers
could just pass "&dummy".

In the long run, it's a somewhat interesting corner case problem and we
may have some varying opinions over the "best way" to resolve.  Suffice
to say, libvirt probably isn't going to "survive" much longer, but
unless others read this response and have varying opinions and/or we
come to a resolution - can this be posted/consider on it's own merits?

Tks -

John

> +                mon->lastError.code = VIR_ERR_INTERNAL_ERROR;
> +
>              virResetLastError();
>          }
>  
> 




More information about the libvir-list mailing list