[K12OSN] RAID1 failure: need help

Les Mikesell les at futuresource.com
Sat Oct 1 17:50:17 UTC 2005


On Sat, 2005-10-01 at 01:00, Robert Arkiletian wrote:

> After looking at the logs it seems this happened when a full class was
> logging on. It's not a power failure situation. Last year (when I was
> running 3.1.2 without raid) I had this same drive flake out on me with
> i/o errors. So I ran multiple checks on it with seagate diagnostic
> utils and also the built-in scsi controller bios tests. Everything
> came up negative. So I figured the drive was okay. I figured it must
> have been the fault of the scsi controller driver. My controller is
> the Adaptec aic7902w. Wondering if you guys have the same controller?

I have had a few machines where the controller or hot-swap backplane
was flaky and caused errors once in a while.  Drives in a raid would
be kicked out; ones not in a raid would hang a while doing retries
and then recover with some errors in the log.  New drives in
those machines would do the same thing, but when the errors are
months apart it is hard to tell which part is the problem.  Still, a
bad drive is much more likely.  I'm a little superstitious about
always doing a low-level format on scsi drives when moving them from
one controller model to another.  It may not matter any more, but
older controllers had small timing differences that made errors more
likely - and it still can't hurt anything (and it eliminates the
problems of labels and raid IDs being confused when you move
drives around).  You might also double-check the cables and
termination.  Newer drives don't have on-board termination and
need a terminator on the end of the cable.  If you mix LVD and
non-LVD components everything shifts down to SE, so you need a combo
LVD/SE terminator and the shorter SE cable length restriction applies.

> Anyway, now I don't trust this drive any more even if the diagnostics
> say it's okay again. I'm going to buy another hd. Question: Do I need
> to get the exact same model? I know they have to be the same size but
> I can't get the older model with 2 heads/ 1 platter. The new ones
> come with 1 head for the same 36gb size. So it's higher density.

The only thing that really matters is that you have the same size
partitions to mirror.  The ones on the new drive can be larger, but
you'll waste the extra space.  If the head/cylinder/sector geometry
is different you'll have more work to compute the appropriate
partition sizes.
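
As a rough sketch of that arithmetic, assuming partition sizes are
counted in 512-byte sectors and the new partition is aligned to
whole cylinders: the helper below (cyls_needed is a made-up name,
and the sizes and geometry are illustrative only) rounds up, so the
new partition is never smaller than the one it has to mirror.

```shell
#!/bin/sh
# Hypothetical helper, not part of any tool: how many whole cylinders
# on the new disk are needed to hold a partition of SECTORS sectors.
# Usage: cyls_needed SECTORS HEADS SECTORS_PER_TRACK
cyls_needed() {
    sectors=$1
    per_cyl=$(( $2 * $3 ))    # sectors in one cylinder on the new disk
    # Round up so the new partition is at least as large as the old one.
    echo $(( (sectors + per_cyl - 1) / per_cyl ))
}

# Illustrative numbers only: a 4194304-sector partition on a disk
# that reports 255 heads and 63 sectors/track.
cyls_needed 4194304 255 63    # prints 262
```

Rounding up wastes at most one cylinder per partition, which is the
"larger but wasted" case rather than a mirror that is too small.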

> BTW have you guys seen this
> 
> sfdisk -d /dev/sda > partition.sda
> sfdisk -d /dev/sdb > partition.sdb
> then to restore
> sfdisk /dev/sdb < partition.sdb
> or
> sfdisk /dev/sda < partition.sda
> 
> Is it better to do it manually with fdisk?

The results should be the same.  I'm not sure if it gets the sector
count right automatically when the geometry is different or not.
If it does, it will save you some work.  If the sectors/cylinder
is the same it is pretty easy to do the fdisk by hand since you
just put in the end cylinder number for each partition (the start
default will be right).
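
A cautious way to try the sfdisk route is sketched below.  The
wrapper name (clone_ptable), the dump file name and the RUN switch
are all made up for illustration; only the two sfdisk invocations
come from the commands quoted above.  By default it just prints what
it would do, since writing a partition table to the wrong disk is
destructive.

```shell
#!/bin/sh
# Hypothetical wrapper around the quoted sfdisk commands.
# Usage: clone_ptable SOURCE_DISK TARGET_DISK
# Prints the commands unless RUN=1 is set in the environment.
clone_ptable() {
    src=$1
    dst=$2
    dump="ptable.$(basename "$src")"    # e.g. ptable.sda
    if [ "${RUN:-0}" = "1" ]; then
        # Dump the table from the good disk, then write it to the new one.
        sfdisk -d "$src" > "$dump" && sfdisk "$dst" < "$dump"
    else
        echo "sfdisk -d $src > $dump"
        echo "sfdisk $dst < $dump"
    fi
}

# Dry run: good disk is sda, replacement is sdb (assumed names).
clone_ptable /dev/sda /dev/sdb
```

After a real run, comparing "sfdisk -d" output for both disks would
show whether the sector counts came out the same on the new geometry.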

> > And if the drive is still under warranty, go to the mfg's web
> > site, put in the serial number, and get an RMA.
> 
> It is. But I'm afraid it may show up as good. I'll try doing
> diagnostics on it again.

Usually they don't second guess it if you say it is broken - and
you have your 'error on sector #####' logs to prove it.

--  
   Les Mikesell
     les at futuresource.com




