[linux-lvm] fsync() and LVM

Tim Post echo at echoreply.us
Thu Mar 19 09:20:58 UTC 2009


On Tue, 2009-03-17 at 08:33 -0700, Joshua D. Drake wrote:
> On Mon, 2009-03-16 at 15:51 -0700, Joshua D. Drake wrote:
> > On Mon, 2009-03-16 at 16:53 -0500, Les Mikesell wrote:
> > 
> > > The point of fsync() is for an application to know that a write has been 
> > > safely committed, as for example sendmail would do before acknowledging 
> > > to the sender that a message has been accepted.  The question isn't 
> > > whether an application can call fsync() but rather whether its return 
> > > status is lying, making the underlying storage unsuitable for anything 
> > > that needs reliability.
> > 
> > Right and for databases this is critical. So enlightenment here would be
> > good.
> 
> Anyone?
> 
> Joshua D. Drake

If a logical volume spans physical devices with write caching enabled,
the return status of fsync() cannot be trusted. This is an issue with
device mapper itself; LVM is just one of several consumers of DM.
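To make the stakes concrete, here is roughly the pattern an MTA or a
database relies on (a minimal sketch of the idea only; the spool path is
made up):

/* Minimal sketch of the write-then-fsync pattern under discussion:
 * the caller treats a successful fsync() as proof the data is on
 * stable storage. If a write cache below device mapper absorbs the
 * flush, that "success" proves nothing. The path is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "queued message\n";
    int fd = open("/var/spool/example/msg.tmp",
                  O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }
    if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg)) {
        perror("write");
        return EXIT_FAILURE;
    }
    /* Only after fsync() returns 0 may we tell the sender "accepted". */
    if (fsync(fd) != 0) {
        perror("fsync");
        return EXIT_FAILURE;
    }
    close(fd);
    puts("fsync reported success, but is the data really on the platter?");
    return EXIT_SUCCESS;
}

If anything below device mapper acknowledges the flush out of a volatile
cache, that final success means nothing to the application.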

Now it gets interesting:

Enter virtualization. When you have something like this:

fsync -> guest block device -> block tap driver -> CLVM -> iSCSI ->
storage -> physical disk.

Even if device mapper passed the write barrier along, would it be
reliable? Will every layer in that chain propagate the barrier, and how
many opportunities for re-ordering does the path above present?

So, even once it's fixed in DM, can fsync() be trusted? At the least, I
think more testing should be done across various configurations even
after a suitable patch to DM is merged. What about PGSQL users on some
kind of elastic hosting?
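One crude way to check a given stack (a rough sketch, not a definitive
test): time a loop of tiny write()+fsync() pairs. A single 7200 RPM
spindle without a battery-backed cache can only complete on the order of
100-200 real flushes a second, so rates in the thousands suggest some
layer underneath is absorbing the flush.

/* Rough heuristic sketch: time a stream of small write()+fsync()
 * pairs and report the rate. The file name and iteration count are
 * arbitrary; run it on the filesystem you actually care about. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    const int iterations = 1000;
    char byte = 'x';
    struct timeval start, end;
    int fd = open("fsync_test.dat", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }
    gettimeofday(&start, NULL);
    for (int i = 0; i < iterations; i++) {
        if (pwrite(fd, &byte, 1, 0) != 1 || fsync(fd) != 0) {
            perror("write/fsync");
            return EXIT_FAILURE;
        }
    }
    gettimeofday(&end, NULL);
    double secs = (end.tv_sec - start.tv_sec) +
                  (end.tv_usec - start.tv_usec) / 1e6;
    printf("%d fsyncs in %.2f s (%.0f/s)\n",
           iterations, secs, iterations / secs);
    close(fd);
    return EXIT_SUCCESS;
}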

Given the craze around 'cloud' technology, it's an important question to
ask (and research). 


Cheers,
--Tim
