OT - Journaling File Systems?

Wed Apr 28 13:45:14 UTC 2004

On Tue, 2004-04-27 at 18:30, Edwards, Scott (MED, Kelly IT Resouces)
wrote:
> A few minutes ago I wrote:
> 
> > I have started wondering if it is because of write caching on the hard
> > drive.
> 
> I found this article http://sr5tech.com/write_back_cache_experiments.htm
> which says: "The lesson is if write back cache is turned on, it is not
> difficult to create metadata inconsistency or corruption at the file
> system upon power failure.". 
> 
That's why enterprise storage controllers come with battery-backed
cache: it is what makes it safe to enable the write-back function on
their disks/arrays.

I have a Compaq DL380 G2 server to experiment with. It has a Smart
Array 5i controller (32 MB read/write cache) without the memory backup
battery, so the controller's write-back options are disabled (a power
failure can corrupt the array if write-back is on and the controller
has no battery backup).

Compaq/HP requires a memory battery backup module before the
controller's write-back cache can be used. If power goes down, the
battery preserves the contents of the controller's RAM for up to 4
days, so when the system powers on again the controller finishes
flushing the RAM to the arrays and no data is lost.

That is not the case with ATA, SATA and ordinary SCSI: there is no
battery to protect the data in transit from the OS to the disk
platters, so on a power failure that in-flight data ends up lost.
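
To make the "in transit" part concrete, here is a minimal Python sketch
of the OS side of that path. The helper name durable_write and the
scratch file are only illustrative. Note the limit: os.fsync() asks the
kernel to push the data to the storage device, but depending on the
kernel and the drive, the blocks may still be sitting in the drive's
own volatile cache when the call returns.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and ask the kernel to push it to the storage device.

    This only covers the OS side of the path. The drive may still hold
    the blocks in its own cache after os.fsync() returns.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # Python buffer -> kernel page cache
        os.fsync(f.fileno())  # kernel page cache -> storage device
    return len(data)


if __name__ == "__main__":
    # Hypothetical scratch file, used only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo.bin")
        print(durable_write(target, b"journal test\n"))
```

So even a careful application can only guarantee the first two hops;
the last one depends on the drive or controller.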

Some newer hard disks include 2, 4 or 8 MB of DRAM for buffer/cache
operations to speed things up. I don't know whether those disks also
use some sort of write-back function of their own; that could explain
some of the data lost in your tests. Perhaps the OS did its part, but
the hardware, using its own write-back function, messed up the FS.
Could be.
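
As a toy illustration of that suspicion, the Python sketch below
enumerates what can be on the platters at the moment of a power cut.
It is a deliberately simplified model, not how any particular FS or
drive works: one transaction is three writes (the names in ISSUED are
made up), a drive without write-back commits them in the order issued,
and a drive with write-back may have committed any subset.

```python
from itertools import combinations

# Order in which a journaling FS issues the writes of one transaction
# (simplified, hypothetical model).
ISSUED = ("log_entry", "commit_block", "metadata")


def is_consistent(on_disk):
    """Check the ordering rules a journal relies on."""
    # A commit block must never exist without the log entry it seals.
    if "commit_block" in on_disk and "log_entry" not in on_disk:
        return False
    # In-place metadata must never land before the commit block.
    if "metadata" in on_disk and "commit_block" not in on_disk:
        return False
    return True


def crash_states_in_order(writes):
    """Drive commits writes in issue order: only prefixes can be on disk."""
    return [frozenset(writes[:i]) for i in range(len(writes) + 1)]


def crash_states_reordered(writes):
    """Drive may reorder freely: any subset can be on disk at power loss."""
    return [frozenset(c)
            for r in range(len(writes) + 1)
            for c in combinations(writes, r)]


def count_inconsistent(states):
    return sum(1 for s in states if not is_consistent(s))


if __name__ == "__main__":
    ordered = crash_states_in_order(ISSUED)
    reordered = crash_states_reordered(ISSUED)
    print("in order: ", count_inconsistent(ordered), "of", len(ordered))
    print("reordered:", count_inconsistent(reordered), "of", len(reordered))
```

In this model the in-order drive has 0 bad crash states out of 4, while
the reordering drive has 4 bad ones out of 8.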

> There are other links from that article, from the netbsd.org
> http://mail-index.netbsd.org/tech-kern/2002/12/08/0031.html list:
> 
> > just to be specific, i'll post here what i told sean. having
> > writeback cache on allows the drive to delay and reorder writes.
> > softdeps and journaling fses depend on writes occurring in specific
> > order. a drive's writeback cache then obviously defeats the purpose
> > of ordering those writes (which softdeps and reiserfs, etc goto a
> > good deal of trouble being correct).
> >
> > if you want *real* protection (that is, metadata consistency) you
> > must (on netbsd and linux) disable write cache. using writeback
> > cache on the drive, you're only protected from some things
> > (accidently hit reset, kernel panic). you're not protected from
> > power failure. i have a ups, but i still disable write cache. a ups
> > can fail, and a machine's psu can fail as well.
> 
> And this from an article
> http://www.linuxjournal.com/article.php?sid=4466 on ReiserFS in Linux
> Journal:
> 
> > For performance benchmarks, some of the new drives have write-back
> > caching by default. This means the drive reports a write is
> > completed before it is actually on the media. The block is still in
> > the drive's cache, where the writes can be reordered. If this
> > happens, metadata changes might be written before the log commit
> > blocks, leading to corruption if the machine loses power. It is
> > very important to disable write-back caching on both IDE and SCSI
> > drives.
> 
> I have run a few tests with an old IDE drive, which had write caching
> turned off by default (I assume that wcache = 0 in
> /proc/ide/hda/settings means it's off), and I haven't been able to
> corrupt the FS yet.
> 
> I am going to switch now to FC2T3 and test all 4 FS's again.
> 
> Thanks for everyone's suggestions and advice.
> 
> -Scott
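
If you want to script the wcache check from your tests, something like
the Python sketch below might do. It assumes the settings file is a
whitespace-separated table with a header row, a dashed separator row
and then one "name value min max mode" row per setting; the SAMPLE text
is made up to match that assumed layout, and the function names are
only illustrative.

```python
def parse_ide_settings(text):
    """Return {name: value} from text laid out like /proc/ide/hdX/settings."""
    settings = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) < 2 or not fields[1].lstrip("-").isdigit():
            continue
        settings[fields[0]] = int(fields[1])
    return settings


def write_cache_enabled(text):
    """True/False from the wcache row, or None if there is no such row."""
    value = parse_ide_settings(text).get("wcache")
    return None if value is None else value != 0


# Made-up sample in the assumed layout, not output from a real drive.
SAMPLE = """\
name                    value           min             max             mode
----                    -----           ---             ---             ----
multcount               16              0               16              rw
wcache                  0               0               1               rw
"""

if __name__ == "__main__":
    print(write_cache_enabled(SAMPLE))
```

On a real box you would pass it the contents of /proc/ide/hda/settings
instead of SAMPLE.
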
-- 
Christian B. Ellsworth Capo (k at dicec.cl)
Linux Chief Engineer
RedHat Certified Engineer (RHCE)

DICEC Ltda.
Mariano Sanchez Fontecillas 966b, Las Condes, Santiago, Chile.
Phone (56 2) 2633340
Fax (56 2) 2071820
Mobile (56 9) 4195632

All Your Base Are Belong To Tux