
Re: RAID 5 Multiple Hard-drives failure



Reuben D. Budiardja wrote:
Hello,
First, I apologize if this is rather OT. I had a multiple hard-drive failure yesterday on a RAID 5 array. Two out of three drives died at almost the same time, rendering the RAID array useless. I tried to recover with --assemble and even --assemble --force, but it failed. After that, the RAID device /dev/md0 ran with a damaged file system that fsck could not fix, and messages scrolling on the screen and in the log files indicated that writes to the two drives were really failing. SMART reported the same thing. So I lost all the data.

There is only a very small probability that something like this can happen. However, in the last two years I've had a string of bad luck with these hard drives, all Maxtor DiamondMax 250 GB IDE drives: 4 drive failures, including the two yesterday. For the two earlier failures, Maxtor's diagnostic software indicated a failed drive, and since the drives were under warranty I had replacements sent for both.

I am using these consumer-level drives thinking that I could build a rather cheap backup system. The machine, running RAID 5, backed up some machines on the network using rsnapshot, twice a day. The size of the data being backed up is about 250~300 GB. The hard drives are on a Promise controller, running software RAID 5.

So, having said all that, my question is: is there anything other than a real hard-drive problem that would cause something like this? In other words, could the problem be in the controller, motherboard, etc., rather than in the hard drives themselves? Or is it just that Maxtor makes bad drives? Or can consumer-level hard drives simply not be used for this kind of work?
I am hoping for comments, etc. Thank you in advance.

RDB

I avoid Maxtor (IBM Maxtor) drives because of a problem I experienced a few years back. After prolonged use they would start getting noisy and generating a lot of heat.

They did have a problem before with the wrong type of grease being packed into so-called 'lifetime' bearings; the grease would break down and then cause the noise and heat.

To be fair to Maxtor, they did replace them under warranty, but I had already lost faith in them after the 10th failure.
(We had over 50 machines running 24/7 controlling test applications).

Because it's a backup PC, it's probably hidden away, out of sight, out of mind. We tried to spin a few of the failed drives up, and it was obvious from the noise coming from them that something was bad.

The chances are that your Maxtor drives are SMART capable (seeing as IBM helped develop the concept/application), so you should consider running a SMART-aware application that can read the SMART information and forewarn you of any impending problems. Now, the more cynical of us might suggest that SMART might be smart enough not to report problems with IBM/IBM Maxtor drives. But since SMART measures things like spin-up times, access times, motor RPM, temperature, etc., the chances are that if your drives have a problem, SMART will report it well before the drive dies. (SMART is only useless against catastrophic failures, i.e. those with no warning.) In the case of the 'grease' problem, doing trend analysis on the SMART data showed that something was failing.
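
To give an idea of what I mean by trend analysis, here is a rough Python sketch, not anything we actually ran. It assumes you collect the attribute table printed by "smartctl -A" (from smartmontools) at regular intervals, and that the table has the usual ten-column layout with the raw value last. The sample table, the function names, the readings and the threshold are all made up for illustration, so adjust them to whatever your drives report.

```python
from statistics import mean

# Made-up sample in the ten-column layout assumed for "smartctl -A" output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   180   178   021    Pre-fail  Always       -       5966
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   116   103   000    Old_age   Always       -       34
"""


def parse_attributes(text):
    """Return {attribute_name: raw_value} for rows with an integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID; the header row does not.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def slope(values):
    """Least-squares slope of the readings against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def rising_attributes(history, threshold):
    """Names of attributes whose readings climb faster than threshold per sample."""
    return sorted(name for name, vals in history.items() if slope(vals) > threshold)


if __name__ == "__main__":
    print(parse_attributes(SAMPLE))
    # Hypothetical readings taken once a day for five days.
    history = {
        "Temperature_Celsius": [34, 35, 37, 40, 44],
        "Reallocated_Sector_Ct": [0, 0, 0, 0, 0],
    }
    print(rising_attributes(history, threshold=1.0))
```

The point is only that a steady climb in something like temperature or spin-up time shows up in the slope long before any single reading looks alarming.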

Googling for SMART reporting should turn up the names of some apps, I think.

Regards

Chris

