Re: RAID 5 Multiple Hard-drives failure

On Tuesday 14 March 2006 08:49, Scot L. Harris wrote:
> On Tue, 2006-03-14 at 08:21 -0500, Reuben D. Budiardja wrote:
> > I am hoping for comments, etc. Thank you in advance.
> Have you monitored the temperature of the system at all times?  Heat can
> play a major role in causing drives and other hardware to fail long
> before it should.  Good air flow around the system/drives is critical.

I have not. Are there any tools or methods you would recommend? Thank you.

> And make sure you have a good UPS system connected.  Power fluctuations
> can cause all kinds of havoc.

It's connected to a UPS, with proper shutdown in the event of a power outage.

> Also note that using RAID by itself does not replace the need for
> backups.  RAID protects against hardware failure.  And depending on the
> value of the data it is usually recommended to run RAID with a hot spare
> drive so multiple drive failures won't bring the system down.  I am not
> sure if the card you are using allows you to run a hot spare or not.

The machine is a backup machine. Its main job is to back up data from other
machines, so if I have to have a backup for the backup ... well, I am going to
have a hard time justifying that :). Yes, I should have had a hot spare ready,
but resources are limited, so I did not have one. The data lost were
non-critical (I am not losing sleep), but this indicates there is
something wrong with the system, and it's getting ridiculous to keep replacing
drives under warranty.

> And make sure you have something in place that notifies you that there
> is a problem.

Yes, email notification is in place by default (from mdadm and smartd).
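
As a supplement to those mails, a degraded array can also be spotted by
reading /proc/mdstat directly. The following is only a minimal sketch, not
what mdadm itself does: the function name degraded_arrays and the SAMPLE
text are made up for illustration, and it assumes the usual status form
"[3/2] [_UU]", where an underscore marks a missing member.

```python
import re

# Hypothetical sample in the usual /proc/mdstat layout (illustration only).
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 sdc1[2] sdb1[1] sda1[0]
      976767872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid5 sdf1[2] sde1[1]
      976767872 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status string shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # A line such as "md0 : active raid5 ..." starts a new array entry.
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status looks like "[3/3] [UUU]"; "_" marks a missing disk.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE))
```

On a real system the text would come from open("/proc/mdstat").read(), and
the script could be run from cron so that any output gets mailed.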

Thank you for responding.

Reuben D. Budiardja
Dept. Physics and Astronomy
University of Tennessee, Knoxville, TN

