Re: RAID 5 Multiple Hard-drives failure

On Tue, 2006-03-14 at 08:21 -0500, Reuben D. Budiardja wrote:
> Hello,
> First, I apologize if this is rather OT. 
> I had a multiple hard-drive failure yesterday on a RAID 5 array. Two out of
> three drives died at almost the same time, rendering the RAID array useless.
> I tried to recover with --assemble and even --assemble --force, but it
> failed. The RAID device /dev/md0 ran with a damaged file system after that,
> fsck could not fix it, and messages scrolling on the screen and in the log
> files indicated that writes to the two drives were really failing. SMART
> reported the same thing. So I lost all the data.
> There is a very small probability that something like this can happen.
> However, in the last two years, I've had a string of bad luck with these
> hard drives: all Maxtor DiamondMax 250 GB IDE HDs. In the last two years, I
> have had 4 drive failures with these drives (including the ones yesterday).
> For the two failures in the past, I had replacement drives sent for both of
> them, since they were under warranty and Maxtor's diagnostic software
> indicated a failed drive.
> I am using these consumer-level drives thinking that I could build a rather
> cheap backup system. The machine, running RAID 5, backed up some
> machines on the network using rsnapshot, twice a day. The size of the data
> being backed up is about 250~300 GB. The hard drives are on a Promise
> controller running software RAID 5.
> So, having said all that, my question is: is there anything other than a
> real hard-drive problem that would cause something like this?
> In other words, could the problem be in the controller, motherboard, etc.,
> rather than the hard drives themselves, that would cause them to fail like that?
> Or is it just that Maxtor makes bad drives?
> Or can a consumer-level hard drive just not be used for this kind of work?
> I am hoping for comments, etc. Thank you in advance.
I have always loved my Maxtor drives, and they have worked for a long time.

That being said, the 200 GB DiamondMax that came out in the last year or
so gave me 3 failures in < 5 months (I bought one and they replaced it
under warranty 3 times).  I suspect that they may have had a glitch in
manufacturing that resulted in a bad batch, although the 3 that failed for
me were made in 2 different locations.

I would venture to guess that you bought the drives in your RAID array
at the same time and they may have been part of the same batch.  A
defect that affects one may have been shared by others made at the same
plant at the same time, so it is quite possible for this to happen.

I have no problem using what you call "consumer level" drives for
anything I do.

> -- 
> Reuben D. Budiardja
> Dept. Physics and Astronomy
> University of Tennessee, Knoxville, TN

