RAID5 gets a bad rap
Gordon Messmer
yinyang at eburg.com
Tue Dec 30 09:02:15 UTC 2008
Philip A. Prindeville wrote:
>
> If you're *not* a database weenie, and you're doing usual manly things
> with your filesystem (like lots of compiles, for instance), you're
> typically not going to be modifying files in place at all.
That's not quite it. RAID 5 performance suffers because every write
requires that the entire block that's being written be read from every
drive in the array, parity calculated, and then the data and parity
written out. For each block written, the array has to do N reads plus
two writes.
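
As a rough illustration of the parity bookkeeping described above, here is a minimal sketch. It assumes XOR parity and a hypothetical in-memory `Stripe` class with one block per data disk; the class, its method names, and its counters are made up for illustration and are not taken from any real RAID implementation. Real implementations can differ in which blocks they read, so the counters are only illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together; the result is the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Stripe:
    """Hypothetical in-memory stripe: one block per data disk plus one parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(self.data)
        self.reads = 0   # simulated disk reads
        self.writes = 0  # simulated disk writes

    def write_block(self, index, new_block):
        # Read the other data blocks in the stripe so parity can be recomputed.
        others = []
        for i, block in enumerate(self.data):
            if i != index:
                others.append(block)
                self.reads += 1
        new_parity = xor_blocks(others + [new_block])
        # Write out the new data block and the new parity block.
        self.data[index] = new_block
        self.parity = new_parity
        self.writes += 2


stripe = Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.write_block(1, b"\xaa\xbb")
print("reads:", stripe.reads, "writes:", stripe.writes)
```

In this sketch a single-block update costs extra reads on top of the two writes, whether the block belongs to a new file or an existing one. A plain striped array would need only the one data write.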
It doesn't matter whether you're writing new files or modifying existing
files, because all of this happens at the block level. It's especially
bad on journalled filesystems, where writing to a file will update the
file's blocks, plus the filesystem's journal blocks, and finally the
filesystem's metadata blocks.
> So is it just the database-heads that are maligning RAID5, or are there
> other performance issues I don't know about?
Most of your comments don't reflect the way RAID 5 actually functions.
> Because my empirical experience has always been that when writing large
> files, RAID5 performs on par with RAID0.
The system on which you were testing was probably limited by other
factors, if that was the case. A RAID 0 disk array will be much faster
than a RAID 5 array.
RAID 5 tends to be most appropriate when you're trying to get as much
disk space as you can at the lowest cost, when you won't be running
multiple simultaneous jobs on the same disk array, and when you'll be
collecting data at a relatively low rate. Usually, that's
backups. Your network is probably slower than your disk array (unless
the array is very large -- array speed decreases with array size), so
streaming data in over the network to your disk array won't bog it down.
Virtually any interactive workload will benefit from a better disk
configuration.
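
To put numbers on the "as much disk space as you can" point, here is a small sketch of the usable capacity of RAID 0 and RAID 5 arrays built from identical disks. The function name and the 500 GB disk size are assumptions chosen for illustration only.

```python
def usable_capacity(level, disks, disk_size):
    """Usable space for an array of identical disks (any consistent size unit)."""
    if level == 0:
        # Striping only: every disk holds data, but there is no redundancy.
        return disks * disk_size
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        # One disk's worth of space across the array is used for parity.
        return (disks - 1) * disk_size
    raise ValueError("only RAID 0 and RAID 5 are modelled in this sketch")


# Hypothetical arrays of 500 GB disks.
for n in (3, 4, 8):
    print(n, "disks:",
          "RAID 0 =", usable_capacity(0, n, 500), "GB,",
          "RAID 5 =", usable_capacity(5, n, 500), "GB")
```

RAID 5 gives up only one disk's worth of space for redundancy, which is why it looks attractive when capacity per dollar is the main concern.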