RAID & HDD failure recovery

Paul Howarth paul at city-fan.org
Thu Nov 16 12:47:57 UTC 2006


Laurence Vanek wrote:
> I thought I knew how to do this.  Thought I was prepared.
> 
> My FC6 box has a simple RAID1 setup with two ATA HDDs.  Three partitions on 
> each drive (/boot, /, swap).  Three RAID devices are defined (i.e. hda1 & 
> hdc1 for md0, hda2 & hdc2 for md1, hda3 & hdc3 for md2).  Works great; 
> I can boot off either drive with the other powered down.
> 
> A week ago hda began to show disk read failures that seemed to increase 
> by the day (reported by smartd).  I checked hda with smartctl & the HDD 
> vendor's test software & sure enough the drive was failing.
> 
> I removed hda from the arrays (marked as failed, then removed with mdadm).  
> Shut down & replaced it with a new identical drive.  Plan was to boot then use:
> 
> sfdisk -d /dev/hdc | sfdisk /dev/hda
> 
> to copy partition table from remaining good drive to newly installed 
> drive.  Then add new drive back into arrays.
> 
> Surprise! Boot hangs; it can't find partitions on hda (of course not).  
> It drops me to a simple shell.
> 
> I do not understand why I was not able to boot using the remaining good 
> drive (hdc).  I had done so previously during RAID testing.  The machine 
> acts as if it doesn't see hdc.

Perhaps it's trying to boot from the first hard disk? With the new hda 
not installed, hdc is the first hard disk the BIOS sees and hence it 
works? With the new hda installed, that's the first hard disk and since 
there's nothing on it, it won't boot. Perhaps you could boot the rescue 
CD and do the "sfdisk" operation from there?
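
As a rough sketch of what that might look like from the rescue 
environment (assuming the device names from the original post, that 
md0-md2 are already assembled from the hdc halves, and that hda is 
really the new blank disk), something like the following could be used. 
The run and rebuild_plan helpers are made up for illustration; by 
default the script only prints the commands and does nothing unless 
RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands rather than running them (set RUN=1 to execute).
# Assumption: hdc is the surviving good disk, hda is the new blank disk.
GOOD=/dev/hdc
NEW=/dev/hda

run() {
    # Echo the command; execute it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$*"
    fi
}

rebuild_plan() {
    # 1. Copy the partition table from the good disk to the new one.
    run "sfdisk -d $GOOD | sfdisk $NEW"
    # 2. Add each new partition back into its array (hda1..hda3 -> md0..md2).
    n=0
    for part in 1 2 3; do
        run "mdadm /dev/md$n --add ${NEW}${part}"
        n=$((n + 1))
    done
    # 3. Show resync progress.
    run "cat /proc/mdstat"
}

rebuild_plan
```

Check the printed commands against your own layout before running 
anything for real, since getting the sfdisk source and target the wrong 
way round would wipe the good disk's partition table.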

Paul.



