rh9 wont boot past decompressing kernel stage

Pete Nesbitt pete at linux1.ca
Fri Jun 18 05:01:44 UTC 2004


On June 15, 2004 01:13 pm, Chris W. Parker wrote:
> hello.
>
> i've been having some very strange problems with a server of mine
> lately. i'm not sure if it's a hardware or software problem.
>
> the first time i had a problem was last week. i tried to login to the
> server via ssh but nothing happened. after some investigation i was able
> to get the machine to boot again using the 'no-hlt' option. i.e. 'linux
> no-hlt'.
>
> so it stays up for a few days* and i try to login to it again today and
> i am again presented with a dead-in-the-water server. this server has 4
> hd's with an adaptec 39160 scsi controller and is using software raid.
> this time things do not look promising. :(
>
> once i rebooted the system it stopped at the very beginning of loading
> the os. the part *right* after it decompresses the kernel. i tried
> booting with my boot disk (that i made when i installed the system) and
> it got as far as "decompressing vmlinuz..........." and then stopped.
>
> i then attempted reboot after reboot trying different things and it
> seemed to slowly get worse. at one point it said the CPU had changed and
> that i need to go into CMOS and detect it and then save on exit. well it
> wouldn't even go into CMOS.
>
> so at first i thought it might be a software thing, but then all this
> crazy stuff leads me to believe it's a hardware thing, but i have no
> idea what. oh btw, there are four (or maybe there are only three) small,
> flat, square leds right next to the pci slot where the scsi controller
> is and they are all red. whereas in the past i've seen them
> green/yellow.
>
> i know this is pretty vague but i really dont know how to troubleshoot
> something like this so i'm hoping that with some extra brains i'll be
> able to find a solution.
>
> the most important thing is that i get the data off the harddrives.
>
>
>
> thanks,
> chris.


Hi Chris,
It definitely sounds like hardware.
What is the computer (486, PIII...)? How old is the board?
I would be suspicious of the board and maybe the scsi controller.

It would be great if you had a similar (or close) known-good system to test
things on. You could test the SCSI controller and disks in another box; that
way you know definitively whether or not they are the problem.

What happens if you boot the CD in rescue mode and try to access the
existing system?

You could strip the board down to the minimum of devices (just kb, video, ram,
cpu). If it boots up (to a 'no drive' kind of error), add things back one at a
time. This is actually easier if the problem is reliably reproducible;
otherwise you may need to repeat or extend the tests at each level.
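
As a rough sketch of that add-one-at-a-time procedure, here it is written out
in Python. Everything in it is made up for illustration: the part names, the
find_bad_part function, and the boots_ok callback, which stands in for you
powering the box on and watching whether it gets as far as expected. The
trials argument covers the intermittent case by repeating each test.

```python
from typing import Callable, List, Optional


def find_bad_part(base: List[str],
                  extras: List[str],
                  boots_ok: Callable[[List[str]], bool],
                  trials: int = 3) -> Optional[str]:
    """Add parts back one at a time and report the first one that breaks booting.

    base     -- the stripped-down set (e.g. kb, video, ram, cpu)
    extras   -- parts to add back, in the order you want to try them
    boots_ok -- hypothetical stand-in for a manual boot test of the given parts
    trials   -- how many times to repeat each test (raise it for flaky faults)
    """
    def reliable(parts: List[str]) -> bool:
        # A configuration only passes if it boots on every trial.
        return all(boots_ok(parts) for _ in range(trials))

    installed = list(base)
    if not reliable(installed):
        # Even the minimal set fails: suspect the board, cpu, ram or video.
        return "minimal set"

    for part in extras:
        installed.append(part)
        if not reliable(installed):
            return part  # first addition that breaks a working configuration

    return None  # everything booted; the fault did not reproduce


if __name__ == "__main__":
    # Simulated fault: any configuration containing the SCSI controller fails.
    def fake_boot(parts: List[str]) -> bool:
        return "scsi controller" not in parts

    suspect = find_bad_part(["kb", "video", "ram", "cpu"],
                            ["nic", "scsi controller", "disks"],
                            fake_boot)
    print("suspect:", suspect)
```

A None result means the problem did not show up in that many trials, not that
the hardware is proven good.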

Basic rules for troubleshooting:
- Always try to take a suspect part and place it in a known-good environment
(except power supplies!).
- Don't run a test unless it will tell you categorically that a part passes or
fails. For example, say you cannot boot from a floppy, so you try a second
floppy disk in the same box. If that fails too, you still can't say whether
the original disk is good or bad. But if the original disk boots a different
box, you know that disk is definitely good.

Hope that helps.
-- 
Pete Nesbitt, rhce




