Problem with raid module starting before scsi module
Aleksandar Milivojevic
amilivojevic at pbl.ca
Mon Jan 10 15:39:23 UTC 2005
google at familycook1.net wrote:
> I am running an updated FC2 system.
>
> My problem is that my raid module is starting before my scsi module and not
> auto-detecting my software raid 5 on SATA disks (with scsi drivers).
Are you sure the md (RAID) module is loaded before the SCSI module? Check
the init script inside the initrd file. Something like:
$ mkdir /tmp/initrd
$ cd /tmp/initrd
$ gzip -dc < /boot/initrd-your-kernel.img | cpio -i
$ less init
(If you extract the initrd cpio archive as a normal user, you'll get some
errors. You can ignore them.)
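Instead of reading init by eye, the load order can be checked with a small helper. This is only a sketch: the function names are made up for illustration, the module names (sd_mod, raid5) are assumptions that depend on your hardware, and it assumes the insmod lines look like "insmod /lib/name.ko".

```shell
#!/bin/sh
# Print the line number of the first insmod line that loads module $2
# in init script $1. Prints nothing if the module is never loaded.
first_insmod_line() {
    grep -n "insmod .*/$2\.ko" "$1" | head -n 1 | cut -d: -f1
}

# Succeed only if module $2 is loaded on an earlier line than module $3.
loaded_before() {
    a=$(first_insmod_line "$1" "$2")
    b=$(first_insmod_line "$1" "$3")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}
```

For example, from /tmp/initrd: loaded_before init sd_mod raid5 && echo "order looks right"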
If the modules in the init script are loaded in the correct order (look
for the insmod lines), then you are one of the few unlucky people hitting
a known race condition. The problem is that device drivers, when loaded,
do not wait for each other to initialize completely. When your SCSI
driver is loaded, it starts to detect disk drives, and the insmod command
that loaded it exits while detection is still in progress. The md driver
is then loaded, and if the SCSI driver hasn't detected your disks by then,
the md driver isn't going to find them. Classic race condition.
If this is what is happening to you, then each time you install a new
kernel you'll have to build a custom initrd image: edit the init script
and insert "sleep 10" (or 20, or 30, whatever works for you) to give the
drivers enough time to initialize before the next stage is invoked.
Inserting a sleep before or after udevstart is sometimes needed as well.
On one system, I even had to do this for the ext3 driver (a sleep between
loading the jbd and ext3 modules, sigh).
To do that, do this (this time as root):
# cp /boot/initrd-your-kernel.img /boot/initrd-your-kernel.img.orig
# mkdir /tmp/initrd
# cd /tmp/initrd
# gzip -dc /boot/initrd-your-kernel.img | cpio -idmv
# vi init (insert some sleeps between invocations of insmod)
# find . -print | cpio -ocv | gzip -c > /boot/initrd-your-kernel.img
If you are using LILO, then you must run:
# /sbin/lilo
If you are using Grub, do not run /sbin/lilo !!!!! (or you'll screw up
Grub).
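If you get tired of doing the vi step by hand for every new kernel, the edit could be scripted before the image is repacked. A rough sketch, with add_sleeps being a made-up helper name: it puts a sleep after every insmod line, which is more than you usually need, so treat it as a starting point and trim the result.

```shell
#!/bin/sh
# Read an init script on stdin and write it to stdout with a
# "sleep DELAY" line added after every insmod line.
# $1 = delay in seconds (10, 20, 30, whatever works for you)
add_sleeps() {
    awk -v delay="$1" '
        { print }
        /^[ \t]*insmod[ \t]/ { print "sleep " delay }
    '
}
```

Usage, inside the extracted /tmp/initrd directory: add_sleeps 10 < init > init.new, then check init.new, move it over init, and make sure it is still executable before running the find/cpio/gzip step.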
This was also discussed on the kernel mailing list. When I asked, one of
the responses was that PCI is a hot-pluggable subsystem, and that the init
script should (after it loads a device driver) poll and wait for the
needed devices to appear. Currently it doesn't; it just blindly loads
driver after driver and hopes that each device will appear before the
next line that depends on it is executed.
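To illustrate what "poll and wait" would look like compared to a fixed sleep, here is a minimal sketch in plain sh. It is not something the stock init script does, the interpreter used inside the initrd may not be able to run a loop like this, and both the function name and the device path in the usage line are only examples.

```shell
#!/bin/sh
# Wait until path $1 exists, checking once per second.
# Gives up after $2 seconds. Returns 0 if the path appeared, 1 on timeout.
wait_for_device() {
    dev=$1
    remaining=$2
    while [ ! -e "$dev" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

Usage would be something like: wait_for_device /dev/sda 30 || echo "disk did not show up". The advantage over a fixed sleep is that it continues as soon as the device is there and only waits the full time when something is actually wrong.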
--
Aleksandar Milivojevic <amilivojevic at pbl.ca> Pollard Banknote Limited
Systems Administrator 1499 Buffalo Place
Tel: (204) 474-2323 ext 276 Winnipeg, MB R3T 1L7