[libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache

Daniel P. Berrangé berrange at redhat.com
Thu Apr 21 16:27:09 UTC 2022


On Tue, Apr 12, 2022 at 11:18:15AM +0200, Claudio Fontana wrote:
> align the "save" with the "restore" code,
> by only using the wrapper when using --bypass-cache.
> 
> This avoids a copy, resulting in better performance.
> 
> Signed-off-by: Claudio Fontana <cfontana at suse.de>
> ---
>  src/qemu/qemu_saveimage.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index 4fd4c5cfcd..5ea1b2fbcc 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>          goto cleanup;
>  
> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> -        goto cleanup;
> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> +            goto cleanup;
> +    }

This effectively reverts:

  commit c4caab538effb57411ad787fedb61e80d557caae
  Author: Jiri Denemark <jdenemar at redhat.com>
  Date:   Wed Feb 8 14:08:54 2012 +0100

    qemu: Always use iohelper for domain save
    
    This is probably not strictly needed as save operation is not live but
    we may have other reasons to avoid blocking qemu's main loop.

As the commit message mentions, using a plain file FD results
in the QEMU thread being blocked, even when O_NONBLOCK is used.

Originally this impacted the main QEMU event loop, which means
things like timers won't fire, or VNC clients won't be serviced,
and the monitor won't respond while the save is taking place.

Today IIUC, the migrate should run in a background thread, so
much of these ill effects go away, as only the migration thread
gets blocked.

This would however impact the ability to cancel the migration
operation. The cancel request could be delayed significantly
if there's alot of pending disk I/O, as the migrate thread
will be in kernel space in an uninterruptable sleep.

Similarly if there is a problem with the host storage,
eg dead NFS server, we'll be unable to cancel migration at
all on the QEMU side.

With the I/O helper, we'll still be unable to kill the I/O
helper for the same reasons, but at least the impact is
isolated from QEMU.

I didn't realize that we actually passed the file FD to
QEMU directly when restoring - I thought  we always used
the I/O helper there too. IMHO this is a bug in libvirt,
as again QEMU has always (implicitly) expected a socket/pipe
FD for fd: protocol with migration, and explicitly never
tried to provide a 'file:' protocol. 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


More information about the libvir-list mailing list