[libvirt] PATCH: Disable QEMU drive caching

Anthony Liguori anthony at codemonkey.ws
Thu Oct 9 13:43:17 UTC 2008


Daniel Veillard wrote:
> On Wed, Oct 08, 2008 at 10:51:16AM -0500, Anthony Liguori wrote:
>   
>> Daniel P. Berrange wrote:
>>     
>>>  - It is unsafe on host OS crash - all unflushed guest I/O will be
>>>    lost, and there are no ordering guarantees, so metadata updates could
>>>    be flushed to disk while the journal updates were not. Say goodbye
>>>    to your filesystem.
>>>       
>> This has nothing to do with cache=off.  The IDE device defaults to  
>> write-back caching.  As such, IDE makes no guarantee that when a data  
>> write completes, it has actually reached the disk.  That guarantee only
>> comes into play when write-back is disabled.  I'm perfectly happy to accept
>> a patch that adds explicit syncs when write-back is disabled.
>>
>> For SCSI, an unordered queue is advertised.  Again, everything depends  
>> on whether or not write-back caching is enabled.  Again,  
>> perfectly happy to take patches here.
>>
>> More importantly, the most common journaled filesystem, ext3, does not  
>> enable write barriers by default (even for journal updates).  This is  
>> how it ships in Red Hat distros.  So there is no greater risk of  
>> corrupting a journal in QEMU than there is on bare metal.
>>     
>
>   Interesting discussion. I'm wondering about the non-local storage
> effect, though: if the Node is caching writes, how can we ensure a
> coherent view of remote storage, for example when migrating a domain?
>   

In the case of remote storage, cache coherency is part of the network 
storage protocol/architecture.  In NFS for instance, the most common 
coherency model is close-to-open.  Other network storage solutions 
provide stronger coherency models.
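
To make "close-to-open" concrete: under that model, a client that open()s a
file is only guaranteed to see data that another client wrote before its
close(); reads against an already-open descriptor may return stale cached
data.  A minimal sketch of the idea (the path is made up, and the writer and
reader would normally run on different hosts):

/* Close-to-open coherency sketch: a reader that open()s the file after
 * the writer's close() is guaranteed to see the written data; reads on
 * a descriptor opened earlier are not.  Path is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/nfs/disk.img";
    const char msg[] = "guest block data";
    char buf[sizeof(msg)];

    /* Writer (host A): close() pushes the dirty pages to the NFS server. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, msg, sizeof(msg)) < 0) perror("write");
    close(fd);

    /* Reader (normally host B): an open() issued after that close()
     * revalidates the client cache and sees the new data. */
    fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    if (read(fd, buf, sizeof(buf)) < 0) perror("read");
    close(fd);

    printf("read back: %s\n", buf);
    return 0;
}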

> Maybe migration is easy to fix because qemu is aware and can issue a
> sync, but as we start adding cloning APIs to libvirt, we could face the
> same issue when issuing an LVM snapshot operation on the guest storage
> while the Node still caches some of the data. The more layers of caching,
> the harder it is to have predictable behaviour, no?
>   

With respect to migration, QEMU does a flush(), but not an fdatasync().
Even if we did an fdatasync(), I'm not sure that's good enough with NFS,
because I don't know whether an fdatasync() on the source, issued *after*
the target has already opened the file and read data, would guarantee
consistency.
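
Roughly, the distinction in plain POSIX terms (just a sketch, not QEMU's
actual migration code, and the image path is made up): a buffered flush
only moves data out of user-space buffers into the kernel page cache,
while fdatasync() additionally asks the kernel to commit the file data to
stable storage, or to the NFS server.

/* flush vs. fdatasync sketch (hypothetical path, not QEMU code). */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *fp = fopen("/var/lib/libvirt/images/guest.img", "r+");
    if (!fp) { perror("fopen"); return 1; }

    fputs("dirty block data", fp);

    fflush(fp);               /* user-space buffers -> kernel page cache */
    fdatasync(fileno(fp));    /* kernel page cache  -> disk / NFS server */

    fclose(fp);
    return 0;
}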

Regards,

Anthony Liguori

> Daniel
>
>   
