[libvirt] PATCH: Disable QEMU drive caching
Anthony Liguori
anthony at codemonkey.ws
Thu Oct 9 13:43:17 UTC 2008
Daniel Veillard wrote:
> On Wed, Oct 08, 2008 at 10:51:16AM -0500, Anthony Liguori wrote:
>
>> Daniel P. Berrange wrote:
>>
>>> - It is unsafe on host OS crash - all unflushed guest I/O will be
>>> lost, and there are no ordering guarantees, so metadata updates could
>>> be flushed to disk, while the journal updates were not. Say goodbye
>>> to your filesystem.
>>>
>> This has nothing to do with cache=off. The IDE device defaults to
>> write-back caching. As such, IDE makes no guarantee that when a data
>> write completes, it's actually completed on disk. This only comes into
>> play when write-back is disabled. I'm perfectly happy to accept a patch
>> that adds explicit syncs when write-back is disabled.
>>
>> For SCSI, an unordered queue is advertised. Again, everything depends
>> on whether write-back caching is enabled. Again,
>> perfectly happy to take patches here.
>>
>> More importantly, the most common journaled filesystem, ext3, does not
>> enable write barriers by default (even for journal updates). This is
>> how it ships in Red Hat distros. So there is no greater risk of
>> corrupting a journal in QEMU than there is on bare metal.
>>
>
> Interesting discussion. I'm wondering about the effect on non-local
> storage, though: if the Node is caching writes, how can we ensure a
> coherent view of remote storage, for example when migrating a domain?
>
In the case of remote storage, cache coherency is part of the network
storage protocol/architecture. In NFS for instance, the most common
coherency model is close-to-open. Other network storage solutions
provide stronger coherency models.
> Maybe migration is easy to fix because qemu is aware and can issue a
> sync, but as we start adding cloning APIs to libvirt, we could face the
> same issue when issuing an LVM snapshot operation on the guest storage
> while the Node still caches some of the data. The more layers of caching,
> the harder it is to have predictable behaviour, no?
>
With respect to migration, QEMU does a flush(), but not an fdatasync.
Even if we did an fdatasync, I'm not sure that's good enough with NFS
because I don't know if fdatasync on the source *after* the target has
opened a file and read data will guarantee consistency.
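
To make the flush-versus-fdatasync distinction concrete, here is a
minimal Python sketch. It is not QEMU code, and write_durably is a
hypothetical helper named only for illustration. A flush moves data
from the process's own buffer into the host page cache; fdatasync asks
the kernel to push that cached data out to the storage device. The
sketch falls back to os.fsync on platforms that lack os.fdatasync.

```python
import os


def write_durably(path, data):
    """Write data to path and push it through both caching layers.

    Hypothetical helper for illustration only; not taken from QEMU.
    """
    # Fall back to fsync where fdatasync is unavailable (assumption:
    # the stronger call is an acceptable substitute for this sketch).
    sync = getattr(os, "fdatasync", os.fsync)
    with open(path, "wb") as f:
        written = f.write(data)
        # Step 1: user-space buffer -> host page cache.
        f.flush()
        # Step 2: host page cache -> storage device.
        sync(f.fileno())
    return written
```

Stopping after step 1 corresponds to a flush without an fdatasync: the
data is visible to other processes on the same host but may still be
lost if the host crashes. The sketch says nothing about what a remote
NFS client that has already opened the file will observe.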
Regards,
Anthony Liguori
> Daniel
>
>