RAID 5 Multiple Hard-drives failure
Reuben D. Budiardja
techlist at pathfinder.phys.utk.edu
Tue Mar 14 15:24:25 UTC 2006
On Tuesday 14 March 2006 08:49, Scot L. Harris wrote:
> On Tue, 2006-03-14 at 08:21 -0500, Reuben D. Budiardja wrote:
<snip>
> > I am hoping for comments, etc. Thank you in advance.
>
> Have you monitored the temperature of the system at all times? Heat can
> play a major role in causing drives and other hardware to fail long
> before it should. Good air flow around the system/drives is critical.
I have not. Are there any tools or methods you would recommend? Thank you.
> And make sure you have a good UPS system connected. Power fluctuations
> can cause all kinds of havoc.
It's connected to a UPS, with a proper shutdown configured in the event of a
power outage.
>
> Also note that using RAID by itself does not replace the need for
> backups. RAID protects against hardware failure. And depending on the
> value of the data it is usually recommended to run RAID with a hot spare
> drive so multiple drive failures won't bring the system down. I am not
> sure if the card you are using allows you to run a hot spare or not.
The machine is a backup machine. Its main job is to back up data from other
machines, so if I have to have a backup for the backup ... well, I am going to
have a hard time justifying that :). Yes, I should have had a hot spare ready,
but resources are limited, so I did not have one. The data lost was
non-critical (I am not losing sleep), but this just indicates that something
is wrong with the system, and it's getting ridiculous to keep replacing
drives under warranty.
> And make sure you have something in place that notifies you that there
> is a problem.
Yes, email notification is in place by default (from mdadm and smartd).
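For anyone who wants a second check alongside the mdadm mail, here is a
rough sketch of how a degraded array could be spotted from the text of
/proc/mdstat. It assumes the usual Linux md status line of the form
"[4/3] [UUU_]", where "_" marks a missing member. The function name
degraded_arrays and the SAMPLE text are made up for illustration and are not
taken from this machine.

```python
import re

# Hypothetical sample in the usual /proc/mdstat layout (not real output
# from the machine discussed in this thread).
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      732571648 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line such as "md0 : active raid5 ..."
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[total/active] [UU_U]"; "_" means missing.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded[current] = status.group(3)
            current = None
    return degraded


print(degraded_arrays(SAMPLE))
```

On a real system the text would come from reading /proc/mdstat, and a
non-empty result could be mailed out from cron. This is only a complement to
mdadm's own monitoring, not a replacement for it.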
Thank you for your response.
RDB
--
Reuben D. Budiardja
Dept. Physics and Astronomy
University of Tennessee, Knoxville, TN