[K12OSN] Re: why RAID?

Fri Dec 10 18:35:08 UTC 2004

>> RAID levels 1 and 5 provide a sufficient redundancy that you
>> can lose a disk without losing data. There are, of course,
>> other tradeoffs; the redundancy comes at the price of
>> additional disks and a performance impact on writes. There
>> is also a gain in read performance.

> Generally, speeds aren't impacted terribly because writes
> are done in parallel.

That depends. 

If the write is large enough that it spans all the data disks
and is aligned so that it covers the entire stripe, then the
parity block can be calculated from the data being written and
all writes can be done in parallel.  This adds a single extra
write to the I/O stream, a relatively small impact. But this is
not the typical situation except for RAID 1 (mirrored drives),
where it is the norm.
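
As a rough illustration of the full-stripe case, here is a small
Python sketch. It assumes a made-up 4+1 RAID 5 layout with 8-byte
blocks and simple XOR parity; the function names are mine, not
from any real RAID implementation.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def full_stripe_parity(data_blocks):
    """Parity for a full-stripe write: XOR of all the new data blocks."""
    return reduce(xor_blocks, data_blocks)

# Hypothetical stripe: four data blocks, one per data disk.
new_data = [bytes([v] * 8) for v in (1, 2, 4, 8)]
parity = full_stripe_parity(new_data)

# No reads are needed: the I/O is the four data writes plus one
# parity write, and all five can be issued in parallel.
reads = 0
writes = len(new_data) + 1

# Losing any one data block is recoverable from parity and the rest.
rebuilt = reduce(xor_blocks, [parity, new_data[0], new_data[1], new_data[3]])
```

Here `rebuilt` comes out equal to `new_data[2]`, which is the
redundancy that lets the array lose a disk without losing data.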

When the write does not completely cover the stripe, then
calculation of the parity requires additional data to be read
from the disks: usually the data to be overwritten and the old
parity are read from disk and used to calculate the new parity,
then new data and new parity are written. That is two reads
followed by two writes in place of a single write: a doubling in
the time to do the write and a quadrupling of the number of I/O's.
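
The partial-stripe update can be sketched the same way. This
assumes XOR parity and a hypothetical 4+1 layout with 8-byte
blocks; the names are illustrative only.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def rmw_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

# Hypothetical stripe already on disk, with its parity.
stripe = [bytes([v] * 8) for v in (1, 2, 4, 8)]
old_parity = reduce(xor_blocks, stripe)

# Overwrite only the third block.
new_block = bytes([7] * 8)
new_parity = rmw_parity(stripe[2], old_parity, new_block)

# Phase 1 (reads): old data, old parity.
# Phase 2 (writes): new data, new parity.
io_count = 2 + 2

# Sanity check: same result as recomputing parity over the whole stripe.
updated = stripe[:2] + [new_block] + stripe[3:]
recomputed = reduce(xor_blocks, updated)
```

The two phases cannot overlap, since the new parity depends on
what was read, which is where the doubling in time comes from.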

So what is the optimal stripe size? Does it make sense to
make it small enough that most writes cover a full stripe?
In most cases the answer is no. When the stripe gets smaller,
more reads will require multiple disk I/O's and this tends to
cut into overall throughput. Making the stripe small is nearly
equivalent to forcing a RAID 5 into RAID 3 behavior. There
are few situations where RAID 3 is practical.
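
To put rough numbers on the read side, here is a sketch that
counts how many data disks one contiguous read touches. It
assumes "stripe size" above means the per-disk chunk (stripe
unit), ignores parity rotation, and uses made-up sizes.

```python
def disks_touched(offset: int, length: int, stripe_unit: int, data_disks: int) -> int:
    """Number of distinct data disks a contiguous read must visit."""
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    units = last_unit - first_unit + 1
    # Consecutive units sit on consecutive disks, so the count is
    # capped at the number of data disks.
    return min(units, data_disks)

KIB = 1024

# A 16 KiB read at offset 0 on a hypothetical array with 4 data disks.
large_unit = disks_touched(0, 16 * KIB, 64 * KIB, 4)  # 64 KiB per disk
small_unit = disks_touched(0, 16 * KIB, 4 * KIB, 4)   # 4 KiB per disk
```

With the 64 KiB unit the read is served by one disk, leaving the
others free for other requests; with the 4 KiB unit the same read
ties up all four.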

-- 
	Mike Wescott
	Wescott_Mike at EMC.COM