bootable failed sw raid 1 with F9

John Whitley jrw at gwsevern.co.uk
Tue Jun 17 15:01:09 UTC 2008


Christopher K. Johnson wrote:
> jrw wrote:
>> Sander Hoentjen wrote:
>>> Hi list,
>>>
>>> For the first time in my life I tried to install Fedora with sw raid.
>>> See below what went wrong.
>>>
>>> Here is what I did:
>>> Start with 2 empty 500GB sata disks.
>>> Make sure nvraid is turned off in my BIOS.
>>> Start an F9 install, creating 2 sw RAID partitions: md0 and md1.
>>> md0 is 100MB and has an ext3 /boot.
>>> md1 has the rest of the space and is LVM.
>>> In the lvm I have created the rest of my partitions.
>>>
>>> Install went great, after reboot my system booted fine, so far so good.
>>> I then shut down my system, pulled out a disk and started again. I got
>>> the message "GRUB Hard Disk Error". So I shut down, plugged the disk
>>> back in, pulled out the other one and started again. This time I was
>>> met by a GRUB shell, no boot logo, no idea what to do (no menu).
>>> Shutdown again, replug the disk, start again, get on IRC, type:
>>> grub
>>> root (hd0,0)
>>> setup (hd0)
>>> root (hd1,0)
>>> setup (hd1)
>>>
>>> After that: reboot minus 1 disk. I can see grub, with logo and boot
>>> options. It starts ok, I even get rhgb for a second and then I see:
>>> "fsck.ext3: Invalid argument while trying to open /dev/md0"
>>> I can go into a maintenance shell and when I do cat /proc/mdstat I
>>> see:
>>> md0 : inactive sda1[0](s)
>>>
>>> "mdadm --assemble /dev/md0" turns it active again, but well I have no
>>> idea how I can continue normal boot, if it is even possible.
>>>
>>> So this is my story, now my questions:
>>> - Did I do anything wrong? I performed the installation twice, with
>>> the same result both times.
>>> - Is this a bug somewhere? Do other people get the same or better
>>> results?
>>> - Is there anything I can do to fix this?
>>>
>>> Thanks for reading this far,
>>>
>>> Sander
>>>
>>>
>>>   
>> I have already experienced this problem and raised a report on Red Hat
>> bugzilla (no. 450722), although there has been no response to it so
>> far. I spent some time pinning the problem down to Fedora 9 (it is
>> OK on Fedora 8 plus updates).
>
> Chances are excellent that the initial problem was grub not writing 
> mbr correctly on both disks of the mirror.  And when you did so in 
> your interactive grub session, I believe you created a dependency on 
> both disks being present through the use of root (hd1,0) - in effect 
> saying look for /boot on the first partition of the second disk.
>
> The subsequent problem with the mirror being broken may have been 
> caused by the process of booting on one disk, not both, depending on 
> the exact sequence of disk removal versus boots.  The raid superblock 
> would be updated on one disk, and be stale on the other, and it is 
> appropriate that you had to re-add the stale disk afterward.
>
> Although this should definitely be addressed as a bug in the 
> installation process, it can also be dealt with pro-actively when 
> booted on the newly installed system and once synchronization of 
> mirrors backing /boot has completed.
>
> If your grub.conf notes use of root (hd0,0), and the md (md0 in your 
> case) for that is mirrored on disks sda and sdb:
> [root@myhost]# grub
> grub> device (hd0) /dev/sdb
> grub> root (hd0,0)
> grub> setup (hd0)
> grub> quit
>
> The difference is that here we are saying to grub, pretend the second 
> disk is your first disk, and then write mbr for a root on this disk 
> accordingly.
>
> Be aware that once you boot on one disk, only the raid superblocks on
> the mirrors there are updated, and they no longer match those on the
> removed disk. You will therefore need to re-synchronize your mirrors
> when booted with both disks present again.
>
When I had the problem, I had already run grub as you suggest (using
hd0) on both /dev/sda and /dev/sdb, and had gone through the process of
removing /dev/sdb, booting onto /dev/sda successfully, re-booting with
both /dev/sda and /dev/sdb, and re-building the mirrors with mdadm
--add. I repeated this cycle successfully on a Fedora 8 install, on
Fedora 8 plus an updated kernel, and on Fedora 8 plus all updates. I
only experienced the problem after a Fedora 9 install, or after
updating from Fedora 8 to Fedora 9.
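
For anyone wanting to repeat the cycle, the check and re-add steps can be
sketched roughly as below. This is only a sketch under assumptions: the
function names inactive_arrays and readd_commands are made up for
illustration, and the layout (partition 1 of each disk backing md0,
partition 2 backing md1) is assumed rather than stated anywhere in this
thread. The second function only prints the mdadm commands so they can be
checked before being run as root.

```shell
# Hypothetical helper: list md arrays reported as inactive.
# Reads /proc/mdstat-style text on stdin, e.g. a line such as
#   md0 : inactive sda1[0](s)
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }'
}

# Hypothetical helper: print (not run) the mdadm commands that would
# re-add the partitions of a returned disk to their mirrors.
# ASSUMPTION: partition 1 backs md0 and partition 2 backs md1.
readd_commands() {
    disk="$1"    # e.g. /dev/sdb
    echo "mdadm /dev/md0 --add ${disk}1"
    echo "mdadm /dev/md1 --add ${disk}2"
}

# Example: show what would be run for the second disk.
readd_commands /dev/sdb
```

On a live system the first helper would be fed with
"inactive_arrays < /proc/mdstat", and the resync can then be watched with
"cat /proc/mdstat" once the printed commands have been run.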


John Whitley



