RAID 5 Multiple Hard-drives failure

Chris Wright linux-list at cwic-solutions.co.uk
Tue Mar 14 19:08:13 UTC 2006


Reuben D. Budiardja wrote:
> Hello,
> First, I apologize if this is rather OT. 
> 
> I had a multiple hard-drive failure yesterday on a RAID 5 array. Two out of 
> three drives died at almost the same time, rendering the RAID array useless. 
> I tried to recover using --assemble and even --assemble --force, but it 
> failed. After that, the RAID device /dev/md0 ran with a damaged file system 
> that fsck could not fix, and messages scrolling on the screen and in the log 
> files indicated that writes to the two drives really were failing. SMART 
> reported the same thing. So I lost all the data. 
> 
> There is only a very small probability that something like this can happen. 
> However, in the last two years I've had a string of bad luck with these 
> hard drives, all Maxtor DiamondMax 250 GB IDE drives. In that time I have 
> had 4 drive failures (including the ones yesterday). For the two earlier 
> failures, I had replacement drives sent for both, since they were under 
> warranty and Maxtor's diagnostic software indicated a failed drive.
> 
> I am using these consumer-level drives thinking that I could build a rather 
> cheap backup system. The machine, running RAID 5, backed up some machines 
> on the network using rsnapshot, twice a day. The size of the data being 
> backed up is about 250-300 GB. The hard drives are on a Promise controller 
> running software RAID 5.
> 
> So, having said all that, my questions are: is there anything other than a 
> real hard-drive problem that would cause something like this? 
> In other words, could the problem be in the controller, motherboard, etc., 
> rather than in the hard drives themselves? 
> Or is it just that Maxtor makes bad drives? 
> Or can consumer-level hard drives simply not be used for this kind of work? 
> 
> I am hoping for comments, etc. Thank you in advance.
> 
> RDB

I avoid Maxtor (IBM Maxtor) drives because of a problem I experienced a 
few years back.
After prolonged use they would start getting noisy / generating a lot of 
heat.

They did have a problem before with the wrong type of grease being 
packed into so-called 'lifetime' bearings, which would break down and 
then cause the noise/heat.

To be fair to Maxtor, they did replace them under warranty, but I had 
already lost faith in them after the 10th failure.
(We had over 50 machines running 24/7 controlling test applications).

Because it's a backup PC, it's probably hidden away, out of sight, out of 
mind, so nobody would hear a drive getting noisy.  We tried to spin a few 
of our failed drives up, and it was obvious from the noise coming from 
them that something was bad.

The chances are that your MAXTOR drives are SMART capable (seeing as IBM 
helped develop the concept/application).  So you should consider running 
a SMART-aware application that can read the SMART information and 
forewarn you of any impending problems.  Now the more cynical of us 
might suggest that SMART might be SMART enough not to report problems 
about IBM/IBM Maxtor drives, but since SMART measures things like spin-up 
times, access times, motor RPM, temperature etc., the chances are that if 
your drives have a problem, SMART will report it well before the drive 
dies. (SMART is only useless against catastrophic failures, i.e. those 
with no warning.)  In the case of the 'grease' problem, doing trend 
analysis on the SMART data showed that something was failing.
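
To give an idea of what I mean by trend analysis, here is a rough Python
sketch. It is not the tool we used. It assumes you already log one reading
of some SMART value (temperature, say) per day, and it fits a straight line
through the readings to see whether they are creeping upwards. The function
names, the sample numbers and the 0.5 limit are all made up for
illustration.

```python
from statistics import mean


def slope_per_sample(values):
    """Least-squares slope of the readings against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def is_trending_up(values, limit):
    """True if the readings rise faster than `limit` units per sample."""
    return slope_per_sample(values) > limit


# Hypothetical daily temperature readings in degrees C (not real data).
steady = [38, 39, 38, 37, 38, 39, 38]
rising = [38, 39, 41, 42, 44, 47, 49]

if __name__ == "__main__":
    for label, readings in (("steady", steady), ("rising", rising)):
        print(label, round(slope_per_sample(readings), 2),
              is_trending_up(readings, limit=0.5))
```

A single reading tells you very little; it is the slope over days or weeks
that gives the warning, which is why the readings need to be logged
somewhere other than the array being watched.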

Googling for "SMART reporting" should turn up the names of some suitable 
apps, I think.
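
As an example of what such an app has to do, here is a small Python sketch
that reads an attribute table in the layout printed by "smartctl -A" (from
the smartmontools package) and lists the attributes whose normalised value
is getting close to the vendor threshold. The sample table, the function
names and the margin of 10 are my own assumptions for illustration; in real
use you would feed it the actual smartctl output for each drive.

```python
# Made-up sample in the column layout of "smartctl -A" (not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   180   178   063    Pre-fail  Always       -       5432
  5 Reallocated_Sector_Ct   0x0033   070   070   063    Pre-fail  Always       -       412
194 Temperature_Celsius     0x0022   038   052   000    Old_age   Always       -       38
"""


def parse_attributes(text):
    """Return one dict per attribute row, skipping headers and other lines."""
    rows = []
    for line in text.splitlines():
        # The raw value may contain spaces, so split into at most 10 fields.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": fields[9],
        })
    return rows


def near_threshold(rows, margin=10):
    """Names of attributes whose value is within `margin` of the threshold.

    A threshold of 0 means the attribute has no failure threshold,
    so those rows are skipped.
    """
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] - r["thresh"] <= margin]


if __name__ == "__main__":
    print(near_threshold(parse_attributes(SAMPLE)))
```

Run from cron and mailed to someone who will read it, a check like this
gets around the "out of sight, out of mind" problem.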

Regards

Chris



