[K12OSN] odd reboot issue

Calvin Dodge caldodge at fpcc.net
Thu Oct 20 18:56:35 UTC 2005


On Thu, Oct 20, 2005 at 08:48:31AM -0400, Calvin Park wrote:
> 
> Thanks for the advice. I turned off X on the server and rebooted, and it
> locked up in rhgb. I edited /etc/sysconfig/init and turned off the graphical
> boot option, rebooted, and it booted fine. Terminals are working, and the
> server is running with no out-of-control X process. So, one problem solved.

You can also remove "rhgb" from any "kernel" lines in /etc/grub.conf.

Or ... you can turn it off permanently by uninstalling the package: "rpm -e rhgb".
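
For the grub.conf route, here is a minimal sketch of the edit, assuming GRUB
legacy syntax where "rhgb" appears as a separate word on the "kernel" lines.
The function name strip_rhgb is made up for illustration; it keeps a .bak
copy so the change is easy to undo.

```shell
# Hypothetical helper: remove the "rhgb" argument from "kernel" lines in a
# GRUB (legacy) config file. The original is preserved as FILE.bak.
strip_rhgb() {
    conf="$1"
    # Keep an untouched copy before rewriting the file in place.
    cp -p "$conf" "$conf.bak" || return 1
    # Only touch lines that start with "kernel"; handle "rhgb" both in the
    # middle of the line and at the very end.
    sed '/^[[:space:]]*kernel[[:space:]]/ {
        s/[[:space:]]rhgb[[:space:]]/ /
        s/[[:space:]]rhgb$//
    }' "$conf.bak" > "$conf"
}
```

Run it as root, for example "strip_rhgb /etc/grub.conf", and compare the
result against /etc/grub.conf.bak before rebooting.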

> Now, I'm not sure if that is just coincidence or not. I may go through some
> old logs and see if that was the last message before the crash several weeks
> ago. If it is...is it possible for a terminal to cause the server to crash?

Hmmm ... that's just the terminal shouting "I'm alive!" to the server.

I doubt it's causing the crash.

It's _possible_ for someone using the terminal to cause a system crash, or
at least use enough resources to make it non-responsive (which is why K12LTSP
4.4.1 includes "/etc/sysconfig/k12ltsp-limits"), but it's not likely someone
was in the building at 4 a.m.

> Maybe it was related to X running on the server and such? If that's the case
> it shouldn't be an issue anymore, but what if it was related to something
> else? Just throwing some things out there. Thanks everyone for all your help
> already. Oh, and BTW, I checked the HDD and it still has ~40GB free.

Do you have any other terminals which are calling "MARK" between *:53 and *:00?

Could it be that the terminal was merely the last one to do the "MARK" bit before
the top of the hour?
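
One way to answer that from the logs is to count MARK lines per host in the
last minutes of each hour. This is only a sketch: it assumes the classic
syslog layout ("Oct 20 05:53:01 hostname -- MARK --") and that the terminals
log to a file on the server such as /var/log/messages. The function name is
hypothetical.

```shell
# Hypothetical helper: count "-- MARK --" syslog lines per host whose
# timestamp falls in minutes 53-59 of any hour.
marks_before_hour() {
    awk '/-- MARK --/ {
        split($3, t, ":")          # $3 is HH:MM:SS in the classic syslog format
        if (t[2] + 0 >= 53) count[$4]++
    }
    END { for (h in count) print h, count[h] }' "$1" | sort
}
```

For example, "marks_before_hour /var/log/messages" prints one line per host
with its count. If several terminals show up, the one seen before the crash
was probably just the last to check in.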

I ask this because crashes around 4 a.m. make me suspect the hardware.

Why?

Because 4:02 a.m. is the default time for daily cron jobs on Red Hat/Fedora systems
(4:22 for weekly, 4:42 for monthly).  If your server is set up to do an "updatedb"
every day, then the hard drive subsystem will be heavily stressed at that time.

That's how we identified the hang on an LTSP server: the proof came after we
replaced the "RocketRaid" card with a 3ware one, and the lockups stopped.

Of course, my theory doesn't account for the lockup after 5:53 a.m.  Ummm ... do you
have any cron jobs running around 6 a.m.?
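
To check, something like the following lists the entries scheduled in a
given hour. It is a sketch under assumptions: it expects the system crontab
format with a user field (as in /etc/crontab), and it only matches a plain
numeric hour, so entries using "*" or ranges in the hour field are not
shown. The function name is made up.

```shell
# Hypothetical helper: print entries from a system crontab (the format with
# a user field) whose hour field is exactly the given hour.
cron_jobs_at_hour() {
    awk -v hour="$2" '
        /^[ \t]*#/ || NF < 7 { next }     # skip comments, blanks, VAR=value lines
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $2 + 0 == hour + 0 {
            cmd = $7
            for (i = 8; i <= NF; i++) cmd = cmd " " $i
            printf "%02d:%02d %s\n", $2, $1, cmd
        }' "$1"
}
```

"cron_jobs_at_hour /etc/crontab 4" should show the run-parts entries
mentioned above, and "cron_jobs_at_hour /etc/crontab 6" shows whether
anything is scheduled near the later lockup. Per-user crontabs ("crontab -l")
have no user field, so they need a separate look.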

Or do you just want to ignore the above until the system proves unstable while NOT
running X?

Calvin
-- 
Calvin Dodge
Certified Linux Bigot (tm)
http://www.caldodge.fpcc.net



