software raid with pending failure

Julian De Marchi julian at jdcomputers.com.au
Wed Sep 19 01:37:10 UTC 2007


Aaron Bliss wrote:
> Hi everyone,
> I'm running Red Hat ES 5 with several RAID 1 partitions set up.  It looks like /dev/sdb is getting ready to fail.  I noticed the following in the logwatch report:
> 
> /dev/sdb - 29 Time(s)
>   1 offline uncorrectable sectors detected 
> 
> So, in order to deal with the failing drive, I marked each /dev/sdbX partition that was participating in a RAID 1 array as failed with mdadm, and then removed each /dev/sdbX partition with mdadm.  So I was running the OS from /dev/sda only.  So far so good.
> 
> I then took the box down and unplugged the device that I believed was /dev/sdb; however, the box wouldn't boot.  It just sat at the grub prompt.  So, I thought, maybe the box is seeing the other drive as /dev/sdb.  So, I turned the box back off, plugged in the previous drive, unplugged the other drive, and the box still wouldn't boot.  I got to the grub splash screen; however, the box just kept resetting itself.  So, I plugged that drive back in, and the box booted up fine.  So, I'm now working with what I believe to be a good drive and a soon-to-fail drive.  So, a few questions here.
>
> 1. How do I identify which hard drive is /dev/sda and which is /dev/sdb?
> 2. Why wasn't I able to boot with a single drive (assuming that at least 1 of them is good)?
> 3. How do I go about replacing the bad drive?
>
> Thanks for your help.  Below is a printout of /proc/mdstat before failing and removing /dev/sdb from the mirrors (all RAID partitions were set up during the install of the operating system)

Have you tried using mdadm to remove the bad drive, then replacing it
and using mdadm to add the good drive back into your RAID config (after
creating the Linux RAID partitions on the new drive, of course)?
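
As a rough sketch of that sequence (not a tested procedure for this box),
the small Python script below only prints the commands and runs nothing.
The array names /dev/md0 and /dev/md1 and the partition numbers are
placeholders, since the actual /proc/mdstat output is not shown here. It
also assumes the replacement disk shows up as /dev/sdb again, that the
disks use MBR partition tables, and that copying the partition table from
/dev/sda with sfdisk is acceptable for creating the RAID partitions.

```python
def replacement_commands(mirrors, failing_disk="/dev/sdb", good_disk="/dev/sda"):
    """Build (but do not run) the commands for swapping a RAID 1 member disk.

    mirrors maps each md array to the partition number it uses on both
    disks, e.g. {"/dev/md0": 1}. These names are hypothetical placeholders.
    Returns two lists: commands to run before the physical swap, and
    commands to run after the new disk is installed.
    """
    before_swap = []
    after_swap = []
    for array, part in sorted(mirrors.items()):
        member = f"{failing_disk}{part}"
        # Mark the member as faulty, then detach it from the mirror.
        before_swap.append(f"mdadm --manage {array} --fail {member}")
        before_swap.append(f"mdadm --manage {array} --remove {member}")
        # After the swap, add the new partition so the mirror resyncs.
        after_swap.append(f"mdadm --manage {array} --add {member}")
    # Copy the partition layout (including the RAID partition type)
    # from the good disk to the replacement before re-adding members.
    copy_table = f"sfdisk -d {good_disk} | sfdisk {failing_disk}"
    return before_swap, [copy_table] + after_swap


if __name__ == "__main__":
    # Placeholder layout: two mirrors on partitions 1 and 2.
    before, after = replacement_commands({"/dev/md0": 1, "/dev/md1": 2})
    print("# Before powering down and swapping the disk:")
    print("\n".join(before))
    print("# After installing the replacement disk:")
    print("\n".join(after))
```

Check the printed device names against /proc/mdstat before running any of
the commands by hand; the rebuild progress shows up in /proc/mdstat as well.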

<snip>

Regards,

J



