barrier and commit options?

Miller, Mike (OS Dev) Mike.Miller at
Mon Feb 2 15:55:50 UTC 2009

Theodore Tso wrote: 

> Well, we still need the barrier on the block I/O elevator 
> side to make sure that requests don't get reordered in the 
> block layer.  But what you're saying is that once the write 
> is posted to the array, it is guaranteed that it is on 
> "stable storage" (even if it is BBWC) such that if someone 
> hits the Big Red Switch at the exit to the data center, and 
> power is forcibly cut from the entire data center in case of 
> a fire, the battery will still keep the cache alive, at least 
> until the sprinklers go off, anyway, right?  :-)

That's an accurate assessment. ;-)

> In that case, I suspect the right thing for the cciss array 
> to do is to ignore the barrier, but not to return an error.  

We agree and will fix the IO error.

> If you return an error, and refuse the write with barrier 
> operation (which is what the cciss driver seems to be doing 
> starting in 2.6.29-rcX), ext4 will retry the write without 
> the barrier, at which point we are vulnerable to the block 
> layer reordering things at the I/O scheduler layer.  In 
> effect, you're claiming that every single write to cciss is 
> implicitly a "barrier write" in that once it is received by 
> the device, it is guaranteed not to be lost even if the power 
> to the entire system is forcibly removed.

Of course, we can't cover every possible scenario, such as the data center exploding. But under _most_ circumstances the data will remain in cache for up to 72 hours without power. So after a complete power outage, the controller will write any cached data (in order) to the disks on the next power-up.
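
As a rough picture of that behaviour, the following toy Python model keeps posted writes in a FIFO and drains them in arrival order on power-up. It is a sketch under stated assumptions: ToyBBWC and its methods are hypothetical names, and the 72-hour battery limit is not modelled.

```python
from collections import deque


class ToyBBWC:
    """Hypothetical model of a battery-backed write cache (BBWC)."""

    def __init__(self):
        self.cache = deque()   # writes posted but not yet on disk
        self.disk = []         # writes that have reached the disks
        self.powered = True

    def post_write(self, data):
        if not self.powered:
            raise RuntimeError("controller has no power")
        # Once posted, the write is treated as being on stable storage.
        self.cache.append(data)

    def power_loss(self):
        # The battery keeps the cache contents; nothing is dropped.
        self.powered = False

    def power_up(self):
        self.powered = True
        # Drain the cache oldest-first so disk order matches post order.
        while self.cache:
            self.disk.append(self.cache.popleft())


ctrl = ToyBBWC()
for block in ("journal", "commit", "metadata"):
    ctrl.post_write(block)
ctrl.power_loss()
ctrl.power_up()
```

After power_up() returns, ctrl.disk holds the three writes in the order they were posted and ctrl.cache is empty.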

-- mikem
