[linux-lvm] Page cache corruption when creating a snapshot
malahal at us.ibm.com
Fri Feb 29 19:29:20 UTC 2008
Greg Hudson [ghudson at MIT.EDU] wrote:
> On Fri, 2008-02-29 at 18:31 +0000, Alasdair G Kergon wrote:
> > On Fri, Feb 29, 2008 at 12:32:41PM -0500, ghudson at MIT.EDU wrote:
> > > The reproduction recipe looks like:
> > > rm -rf /tmp/test
> > > mkdir /tmp/test
> > > # Put around 60MB of files into /tmp/test
> > > find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
> > > lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot
> > > find /tmp/test -type f | xargs md5sum > /tmp/sum.post
> >
> > Can you do that twice?
> > find /tmp/test -type f | xargs md5sum > /tmp/sum.post2
> > and check the two post files are the same?
>
> In three reproductions of the page cache corruption, sum.post2 was
> always the same as sum.post.
>
> In my experiences with this problem in general, the page cache
> corruption is not particularly transient; once it happens, the file
> continues to appear modified (with the same incorrect contents) for the
> indefinite future, until the machine is rebooted.
>
> > And add some syncs/blockdev --flushbufs at different places
> > in the script to see if you can make it go away.
>
> Nope, that never made it go away. I'm not sure in what situations
> flushing write buffers would have any effect. If I had a way to throw
> away the read-only page cache and force a file reload from disk, I would
> expect that to eliminate the visible effect of the corruption; at the
> moment the only reliable way I know how to do that is to reboot.
I'm not an expert on O_DIRECT, but it is supposed to read from the disk
without populating the page cache. I don't really know what it does if
the page cache already holds pages for the file! The "dd" command
supports O_DIRECT, so try reading the corrupted file with "dd" both with
and without O_DIRECT and see whether the contents differ.
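
Something like the following might do for the comparison. It is only a
rough sketch: compare_reads is a made-up helper name, and it assumes GNU
dd (which spells O_DIRECT as iflag=direct) and a filesystem that accepts
O_DIRECT opens; if the direct read fails it just prints "unsupported".

```shell
# Hypothetical helper (not an existing tool): print the md5 of FILE as read
# through the page cache, then as read with O_DIRECT.
# Assumes GNU dd, md5sum and mktemp are available.
compare_reads() (
    file=$1

    # Ordinary buffered read: served from the page cache if pages exist.
    cached=$(dd if="$file" bs=1M 2>/dev/null | md5sum | cut -d' ' -f1)

    # O_DIRECT read: iflag=direct asks dd to open the input with O_DIRECT.
    # Copy to a temp file first so dd's exit status can be checked.
    tmp=$(mktemp)
    if dd if="$file" of="$tmp" bs=1M iflag=direct 2>/dev/null; then
        direct=$(md5sum < "$tmp" | cut -d' ' -f1)
    else
        direct=unsupported
    fi
    rm -f "$tmp"

    echo "cached $cached"
    echo "direct $direct"
)
```

Run it as "compare_reads /tmp/test/<one of the corrupted files>" after
creating the snapshot. If the two sums differ, and the direct one matches
the entry in sum.pre, that would suggest the data on disk is fine and
only the cached pages are bad.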
--Malahal.