[K12OSN] RAID1 failure: need help
Robert Arkiletian
robark at gmail.com
Sat Oct 1 06:00:22 UTC 2005
On 9/30/05, Les Mikesell <les at futuresource.com> wrote:
>
> Assuming you can shut the machine down, swap in a good drive for
> sdb - preferably one with no linux filesystem labels or raid
> partitions that might confuse things on the initial boot.
> The machine should boot normally with all of the md devices
> 'broken' but working anyway.
Thanks for the quick replies guys. I have a few more comments and questions.
After looking at the logs it seems this happened when a full class was
logging on. It's not a power failure situation. Last year (when I was
running 3.1.2 without raid) I had this same drive flake out on me with
I/O errors. So I ran multiple checks on it with Seagate diagnostic
utils and also the built-in SCSI controller BIOS tests. Everything
came back clean, so I figured the drive was okay and that the fault
must lie with the SCSI controller driver. My controller is
the Adaptec aic7902w. Wondering if you guys have the same controller?
Anyway, now I don't trust this drive any more even if the diagnostics
say it's okay again. I'm going to buy another hd. Question: Do I need
to get the exact same model? I know they have to be the same size but
I can't get the older model with 2 heads/1 platter. The new ones
come with 1 head for the same 36 GB size, so they're higher density.
BTW, have you guys seen this?
sfdisk -d /dev/sda > partition.sda
sfdisk -d /dev/sdb > partition.sdb
then to restore
sfdisk /dev/sdb < partition.sdb
or
sfdisk /dev/sda < partition.sda
Is it better to do it manually with fdisk?
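For a blank replacement drive, the dump taken from the healthy disk is the one that would be fed to the new disk. Here is a minimal dry-run sketch of that idea. It only prints the commands; the function name is made up for illustration, and it assumes sda is the survivor and sdb the replacement.

```shell
#!/bin/sh
# Dry-run sketch: only prints commands, never touches a disk.
# Assumption: $1 is the healthy disk, $2 the blank replacement.
clone_table_cmds() {
    src=$1
    dst=$2
    dump="partition.${src##*/}"     # e.g. /dev/sda -> partition.sda
    echo "sfdisk -d $src > $dump"   # save the healthy disk's layout
    echo "sfdisk $dst < $dump"      # write that layout to the new disk
}

clone_table_cmds /dev/sda /dev/sdb
```

Printing first makes it easy to double-check which device is the source and which is the target before running anything as root.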
> Do an 'fdisk -l /dev/sda' to see the partition setup on
> the working disk. fdisk /dev/sdb and duplicate it, setting
> the partition types to 'FD' (linux raid).
> Then for each md device, add the corresponding partition:
> mdadm /dev/md0 --add /dev/sdb1 (etc.)
> cat /proc/mdstat
> will show the resync status.
Will do. Thanks Les.
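The "for each md device" step could be sketched roughly like this. It is a dry run that reads /proc/mdstat-style text and prints one mdadm command per array. It assumes the survivor is sda, that the new disk uses the same partition numbers, and that the arrays are listed in the usual "mdN : active raid1 sdaN[0]" form. The function name and the sample input are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: reads mdstat-style text on stdin, prints commands only.
# Assumptions: surviving members are on sda, and the replacement disk
# (named in $1, e.g. "sdb") has the same partition numbers.
readd_cmds() {
    awk -v disk="$1" '
        $2 == ":" && $3 == "active" {
            for (i = 5; i <= NF; i++) {
                # member fields look like "sda1[0]"
                if ($i ~ /^sda[0-9]+\[/) {
                    part = $i
                    sub(/^sda/, "", part)   # drop the disk name
                    sub(/\[.*/, "", part)   # drop the "[0]" role suffix
                    print "mdadm /dev/" $1 " --add /dev/" disk part
                }
            }
        }'
}

# Made-up sample input; on a real box use:  readd_cmds sdb < /proc/mdstat
readd_cmds sdb <<'EOF'
md0 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
md1 : active raid1 sda2[0]
      35736448 blocks [2/1] [U_]
EOF
```

After running the printed commands, cat /proc/mdstat shows the resync progress, as Les says.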
>
> If you have hot swap disk carriers you can do this without
> shutting down, but you need a few more steps to fail and
> remove the still-working partitions from the raid, and then
> to remove the scsi device and add one back.
>
No hot swapping.
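For anyone reading who does have hot-swap carriers, the extra steps Les mentions might look something like the dry-run sketch below. It only prints commands. The md/partition pairs and the SCSI address "0 0 1 0" (host, channel, id, lun) are purely hypothetical, and the /proc/scsi/scsi interface shown is an assumption about the running kernel.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, does not run them.
# Assumptions: array/partition pairs are passed as "mdX:sdbN", and the
# SCSI address in $1 ("host channel id lun") is hypothetical.
hotswap_cmds() {
    scsi_addr=$1
    shift
    for pair in "$@"; do
        md=${pair%%:*}      # part before the colon, e.g. md0
        part=${pair##*:}    # part after the colon, e.g. sdb1
        echo "mdadm /dev/$md --fail /dev/$part --remove /dev/$part"
    done
    echo "echo \"scsi remove-single-device $scsi_addr\" > /proc/scsi/scsi"
    echo "# ...physically swap the drive, then:"
    echo "echo \"scsi add-single-device $scsi_addr\" > /proc/scsi/scsi"
}

hotswap_cmds "0 0 1 0" md0:sdb1 md1:sdb2
```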
> If you are using grub and want to be able to boot with a
> failed first drive you need to repeat the operation to
> install grub on it.
Thanks. I would have forgotten this. I still have the notes from when
you helped me do this (not that long ago).
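As a rough reminder of what that step can look like with GRUB legacy, here is a dry-run sketch that prints a batch script rather than running it. It assumes /boot is the first partition on the disk, and it maps the second drive to (hd0) on the theory that it would be the first BIOS disk if the original first drive were gone. The function name is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints a GRUB (legacy) batch script instead of running it.
# Assumptions: /boot is the first partition, and the disk should look
# like (hd0) to GRUB, as it would if it were the only drive left.
grub_script() {
    disk=$1
    cat <<EOF
device (hd0) $disk
root (hd0,0)
setup (hd0)
quit
EOF
}

grub_script /dev/sdb
# To apply for real (as root):  grub_script /dev/sdb | grub --batch
```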
>
> And if the drive is still under warranty, go to the mfg's web
> site, put in the serial number, and get an RMA.
It is. But I'm afraid it may show up as good. I'll try doing
diagnostics on it again.
--
Robert Arkiletian
C++ GUI tutorial http://fltk.org/links.php?V19