[libvirt] PATCH: Disable QEMU drive caching

Anthony Liguori anthony at codemonkey.ws
Wed Oct 8 15:51:16 UTC 2008


Daniel P. Berrange wrote:
> QEMU defaults to allowing the host OS to cache all disk I/O. This has a
> couple of problems


Oh, say it ain't so.  This is precisely what I didn't want to see happen :-(

>  - It is a waste of memory because the guest already caches I/O ops

Page cache memory is easily reclaimable and has relatively low priority.
If a guest needs memory, the size of the page cache will be reduced.

>  - It is unsafe on host OS crash - all unflushed guest I/O will be
>    lost, and there are no ordering guarantees, so metadata updates could
>    be flushed to disk, while the journal updates were not. Say goodbye
>    to your filesystem.

This has nothing to do with cache=off.  The IDE device defaults to
write-back caching.  As such, IDE makes no guarantee that when a data
write completes, it's actually completed on disk.  That guarantee only
comes into play when write-back is disabled.  I'm perfectly happy to
accept a patch that adds explicit syncs when write-back is disabled.
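
To make the "explicit syncs" idea concrete, here is a minimal sketch of
what such a write path could look like.  It is not QEMU code: the
function guest_write() and its write_cache_enabled parameter are
hypothetical names used only for illustration, and it assumes the
emulated disk is backed by a plain POSIX file descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from QEMU: complete one guest write on
 * the host file descriptor that backs the emulated disk.
 * Returns 0 on success, -1 on failure. */
static int guest_write(int fd, const void *buf, size_t len, off_t offset,
                       bool write_cache_enabled)
{
    const char *p = buf;

    assert(buf != NULL);

    /* pwrite() may write fewer bytes than requested, so loop. */
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
        offset += n;
    }

    /* With write-back caching enabled, the device may report completion
     * while the data still sits in a volatile cache (here, the host page
     * cache).  With it disabled, completion has to mean the data reached
     * stable storage, so flush before reporting success. */
    if (!write_cache_enabled && fsync(fd) < 0)
        return -1;

    return 0;
}
```

The point of the sketch is that the sync is tied to the guest-visible
write-cache setting, not to whether the host page cache is used at all.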

For SCSI, an unordered queue is advertised.  Again, everything depends
on whether write-back caching is enabled.  Again, perfectly happy to
take patches here.

More importantly, the most common journaled filesystem, ext3, does not 
enable write barriers by default (even for journal updates).  This is 
how it ships in Red Hat distros.  So there is no greater risk of 
corrupting a journal in QEMU than there is on bare metal.

>  - It makes benchmarking more or less impossible / worthless because
>    what the benchmark thinks are disk writes just sit around in memory
>    so guest disk performance appears to exceed host disk performance.

It just means you have to understand the extra level of caching.

A great many virtualization users are doing some form of homogeneous 
consolidation.  If they have a good set of management tools or 
sophisticated storage, then their guests will be sharing base images or 
something like that.  Caching in the host will result in major 
performance improvements because otherwise, the same data will be 
fetched multiple times.

> This patch disables caching on all QEMU guests. NB, Xen has long done this
> for both PV & HVM guests

They don't for HVM actually.  When using file: for PV disks, it also 
goes through the host page cache.  For HVM, Xen uses the write-back 
disabled synchronization stuff I mentioned earlier.

This is a really bad thing to do by default.  I don't even think it 
should be an option for users because it's so terribly misunderstood.

Regards,

Anthony Liguori

> - QEMU only gained this ability when -drive was
> introduced, and sadly kept the default to unsafe cache=on settings.
> 
> Daniel
> 
> diff -r 4a0ccc9dc530 src/qemu_conf.c
> --- a/src/qemu_conf.c	Wed Oct 08 11:53:45 2008 +0100
> +++ b/src/qemu_conf.c	Wed Oct 08 11:59:33 2008 +0100
> @@ -460,6 +460,8 @@
>          flags |= QEMUD_CMD_FLAG_DRIVE;
>      if (strstr(help, "boot=on"))
>          flags |= QEMUD_CMD_FLAG_DRIVE_BOOT;
> +    if (strstr(help, "cache=on"))
> +        flags |= QEMUD_CMD_FLAG_DRIVE_CACHE;
>      if (version >= 9000)
>          flags |= QEMUD_CMD_FLAG_VNC_COLON;
>  
> @@ -959,13 +961,15 @@
>                  break;
>              }
>  
> -            snprintf(opt, PATH_MAX, "file=%s,if=%s,%sindex=%d%s",
> +            snprintf(opt, PATH_MAX, "file=%s,if=%s,%sindex=%d%s%s",
>                       disk->src ? disk->src : "", bus,
>                       media ? media : "",
>                       idx,
>                       bootable &&
>                       disk->device == VIR_DOMAIN_DISK_DEVICE_DISK
> -                     ? ",boot=on" : "");
> +                     ? ",boot=on" : "",
> +                     qemuCmdFlags & QEMUD_CMD_FLAG_DRIVE_CACHE
> +                     ? ",cache=off" : "");
>  
>              ADD_ARG_LIT("-drive");
>              ADD_ARG_LIT(opt);
> diff -r 4a0ccc9dc530 src/qemu_conf.h
> --- a/src/qemu_conf.h	Wed Oct 08 11:53:45 2008 +0100
> +++ b/src/qemu_conf.h	Wed Oct 08 11:59:33 2008 +0100
> @@ -44,7 +44,8 @@
>      QEMUD_CMD_FLAG_NO_REBOOT      = (1 << 2),
>      QEMUD_CMD_FLAG_DRIVE          = (1 << 3),
>      QEMUD_CMD_FLAG_DRIVE_BOOT     = (1 << 4),
> -    QEMUD_CMD_FLAG_NAME           = (1 << 5),
> +    QEMUD_CMD_FLAG_DRIVE_CACHE    = (1 << 5),
> +    QEMUD_CMD_FLAG_NAME           = (1 << 6),
>  };
>  
>  /* Main driver state */
> 
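
For readers who want to try the approach outside the libvirt tree, the
following is a standalone sketch of the two steps the quoted patch
touches: probing the help text for -drive sub-options, then appending
the matching suffixes to the option string.  The enum, probe_drive_flags()
and format_drive_opt() are illustrative stand-ins and not the libvirt
definitions.  As an assumption of this sketch, ",cache=off" is gated on
the cache capability flag and ",boot=on" on the boot capability flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative flag values; a standalone stand-in for the real header. */
enum {
    CMD_FLAG_DRIVE_BOOT  = 1 << 4,
    CMD_FLAG_DRIVE_CACHE = 1 << 5,
};

/* Scan the emulator's help text for the -drive sub-options it supports. */
static unsigned int probe_drive_flags(const char *help)
{
    unsigned int flags = 0;

    if (strstr(help, "boot=on"))
        flags |= CMD_FLAG_DRIVE_BOOT;
    if (strstr(help, "cache=on"))
        flags |= CMD_FLAG_DRIVE_CACHE;
    return flags;
}

/* Build the argument that follows "-drive".
 * Returns 0 on success, -1 if the result did not fit in opt. */
static int format_drive_opt(char *opt, size_t len, const char *src,
                            const char *bus, int idx, int bootable,
                            unsigned int flags)
{
    int n = snprintf(opt, len, "file=%s,if=%s,index=%d%s%s",
                     src ? src : "", bus, idx,
                     (bootable && (flags & CMD_FLAG_DRIVE_BOOT))
                     ? ",boot=on" : "",
                     (flags & CMD_FLAG_DRIVE_CACHE)
                     ? ",cache=off" : "");

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping the probe separate from the formatting means an emulator binary
that does not advertise a cache sub-option is never handed one.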



