Replacing failed raid (boot) disk
Mark
msalists at gmx.net
Thu Jan 19 00:43:28 UTC 2006
First piece of the puzzle is solved:
/root/anaconda-ks.cfg says
"bootloader --location=partition"
So that answers the "where?". The remaining question is the "how?": how to get it installed in the same place on the new disk.
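A minimal sketch of one possible way to do the "how?", assuming the boot loader is GRUB legacy and that /boot ends up as the first partition of the first BIOS disk, which GRUB calls (hd0,0). The helper name grub_partition_setup is hypothetical, and the script only prints the GRUB shell commands rather than running them:

```shell
#!/bin/sh
# Sketch only: prints GRUB legacy shell commands; it does not touch any disk.
# grub_partition_setup is a hypothetical helper name, not part of GRUB.
grub_partition_setup() {
    boot_dev="$1"   # GRUB name of the partition holding /boot, e.g. "(hd0,0)" for sda1
    # "root" tells GRUB where its stage files live; "setup" on the same
    # partition writes stage1 to that partition's first sector, not the MBR.
    printf 'root %s\n' "$boot_dev"
    printf 'setup %s\n' "$boot_dev"
    printf 'quit\n'
}

# Assumption: /boot is the first partition of the first BIOS disk.
grub_partition_setup "(hd0,0)"
```

To actually apply it, the output could be piped into "grub --batch" as root, once /boot on the new disk contains the grub directory with the stage files. If /boot were on sda3 instead, the argument would be (hd0,2). Since a new disk starts with an empty MBR, the boot partition probably also needs to be flagged active and the MBR needs generic boot code, which this sketch does not cover.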
> -----Original Message-----
> From: fedora-list-bounces at redhat.com
> [mailto:fedora-list-bounces at redhat.com] On Behalf Of Mark
> Sent: Wednesday, January 18, 2006 4:16 PM
> To: 'For users of Fedora Core releases'
> Subject: RE: Replacing failed raid (boot) disk
>
>
> Actually, I just thought of something:
> Would it be easier to copy the boot partition from the mirror
> server on to the unused partition of the good drive before
> replacing the bad drive?
>
> Here is how the drives are partitioned right now (SDA is the
> bad drive that needs to be replaced):
>
> sda1 -> /boot
> sda2 -> raid
> sda3 -> raid1 (md0)
>
> sdb1 -> swap
> sdb2 -> raid1 (md0)
> sdb3 -> unused (the counterpart of sda1)
>
> BTW, these are SATA drives, in case it matters...
>
> The good drive and the bad drive have identical partitions,
> but the order is different. I did not do this
> intentionally, I tried to keep the order the same, but
> DiskDruid kept switching around the partitions of sdb on me.
>
> Could I use sdb as sda, or would this not work, since /boot
> would then be on sda3, rather than sda1?
>
> If I could switch them around I could save the part with the
> rescue disk and do something like this:
>
> 1. Copy the content of the second server's /boot partition to
>    sdb3
> 2. change /etc/fstab so that /boot is on sda3 rather than sda1
> 3. ?? Where do I define which partitions make up md0?
> 4. Install boot loader onto good disk
> 5. Shut down, replace bad sda drive with good sdb drive, plug
>    new replacement into where sdb used to be.
> 6. Boot (from sda, previously sdb), partition sdb, and get
>    mdadm to resync md0 onto the new drive.
>
> This way I would have less downtime, since I do not need to
> run in rescue mode.
>
> I would still have the same problems in step 4 that I had
> with the first version, of course.
>
> Thanks,
>
> MARK
>
>
> > -----Original Message-----
> > From: fedora-list-bounces at redhat.com
> > [mailto:fedora-list-bounces at redhat.com] On Behalf Of Mark
> > Sent: Wednesday, January 18, 2006 3:54 PM
> > To: fedora-list at redhat.com
> > Subject: Replacing failed raid (boot) disk
> >
> >
> > Hi everybody,
> >
> > I just got this log output a few days ago:
> > Jan 11 15:34:24 webserv1 kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
> > Jan 11 15:34:24 webserv1 kernel: ata1: error=0x10 { SectorIdNotFound }
> > Jan 11 15:34:29 webserv1 kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
> > Jan 11 15:34:29 webserv1 kernel: ata1: error=0x10 { SectorIdNotFound }
> > Jan 11 15:34:59 webserv1 kernel: ata1: command 0xc8 timeout, stat 0x51 host_stat 0x61
> > Jan 11 15:34:59 webserv1 kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
> > Jan 11 15:34:59 webserv1 kernel: ata1: error=0x10 { SectorIdNotFound }
> > Jan 11 15:34:59 webserv1 kernel: SCSI error : <0 0 0 0> return code = 0x8000002
> > Jan 11 15:34:59 webserv1 kernel: sda: Current: sense key: Aborted Command
> > Jan 11 15:34:59 webserv1 kernel: Additional sense: Recorded entity not found
> > Jan 11 15:34:59 webserv1 kernel: end_request: I/O error, dev sda, sector 11217554
> > Jan 11 15:34:59 webserv1 kernel: raid1: Disk failure on sda3, disabling device.
> > Jan 11 15:34:59 webserv1 kernel: Operation continuing on 1 devices
> > Jan 11 15:34:59 webserv1 kernel: raid1: sda3: rescheduling sector 6815744
> > Jan 11 15:34:59 webserv1 kernel: raid1: sdb2: redirecting sector 6815744 to another mirror
> > Jan 11 15:34:59 webserv1 kernel: RAID1 conf printout:
> > Jan 11 15:34:59 webserv1 kernel: --- wd:1 rd:2
> > Jan 11 15:34:59 webserv1 kernel: disk 0, wo:1, o:0, dev:sda3
> > Jan 11 15:34:59 webserv1 kernel: disk 1, wo:0, o:1, dev:sdb2
> > Jan 11 15:34:59 webserv1 kernel: RAID1 conf printout:
> > Jan 11 15:34:59 webserv1 kernel: --- wd:1 rd:2
> > Jan 11 15:34:59 webserv1 kernel: disk 1, wo:0, o:1, dev:sdb2
> >
> >
> > This is on a server with an unraided /boot on sda1 and a
> > software-raid1 raided / partition.
> >
> > Dell says the HD needs to be replaced, so now I got the
> > replacement hard disk. The problem is: the failed disk is the
> > one I boot from and the boot partition is not mirrored. So I
> > cannot copy the content of the boot partition, nor get the
> > fdisk information to partition the new disk the same way as
> > the old one. What is the best and easiest way to get the new
> > system up and running as painlessly as possible?
> >
> > I have a second machine with an identical setup, so I guess I
> > could get the info from that box.
> >
> > I am thinking I need to:
> > 1. Plug the new disk in and boot from the rescue CD
> > 2. Look up the partition info on the mirror box and partition
> >    the new disk accordingly.
> > 3. Copy the content of the boot partition over from the
> >    mirrored box
> > 4. install grub on sda (how!?!?!?)
> > 5. Hopefully boot the machine with the replaced HD and hope
> >    that mdadm will automatically start synching the raid from
> >    the good raid disk (sdb)
> >
> > The problem is mainly step 4: I am not sure what I had picked
> > as boot loader location from the "Advanced Boot Loader
> > Configuration" screen ("MBR" vs. "first sector of boot
> > partition"). So I need to figure out
> > a) what the location was, and
> > b) how to get the boot loader installed there manually (I've
> > always just used the automated install for the boot loader).
> >
> >
> > Is my assumption about steps 1-5 correct?
> > Does anybody have any hints regarding how to do step 4?
> >
> > And then for the future: how can I be better prepared for
> > this next time? Is there a way to capture the partition and
> > boot loader information (at a point before the disk actually
> > goes bad) and then restore it to an identical drive in a more
> > automated fashion?
> >
> > Thanks,
> >
> > MARK
> >
> >
> > --
> > fedora-list mailing list
> > fedora-list at redhat.com
> > To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
> >