[K12OSN] Random system crashes: Linux gurus, what would you do?
Eric Harrison
eharrison at mail.mesd.k12.or.us
Wed Dec 21 05:21:53 UTC 2005
On Tue, 20 Dec 2005, Carl Keil wrote:
> I'm sorry if this is considered spamming the list. (3 emails in rapid
> succession) I just have been slammed with weird problems lately. I promise
> I've been googling and trying to solve them on my own, but I'm at my wit's
> end.
> My k12ltsp 4.2.1 server has been frozen every night when I wake up in the
> morning. The message, when I can read the screen, usually says:
>
> kernel panic - not syncing: fs/dcache.c:413:spin_lock(fs/dcache.c:8837820)
> already locked by fs/dcache.c/158(not tainted)
> [<c0118f47>]error_code+0x4f/0x54
> journal_get_write_access
> do_page_fault
> ll_rw_block
> etc. etc. etc.
>
> I copied more down, but I'm not sure it helps to troubleshoot this. Googling
> on parts of this error message quickly got me into threads about the overall
> instability of the linux kernel lately, rants about Linus' kernel updating
> philosophy, and some very technical bug reports that I couldn't begin to
> understand, much less apply to solving my problem. I run fsck as I reboot
> each morning and sometimes it has to fix some inodes and other times it
> doesn't.
> I should say that I tried rolling back to the previous kernel on boot (from
> the boot menu) and it still crashed. The crashes happen at random times in
> the early morning. The earliest seems to be about 2AM and the latest was at
> around 9AM. I can tell when the server crashes because of logs of an every 5
> minute cron job that moodle runs. I run my backuppc, webalizer and logwatch
> processes between 3-5AM daily. The hardware is a "custom" PIII 500 box that
> is my retired workstation, running 640 megs of RAM from various sources, a 5
> year old boot drive, etc. I would say that the hardware is very suspect, but
> it doesn't, in my mind, account for the server crashing every single night and
> never during the day, when it gets more use.
One thing that does happen late at night is the batch of jobs in
/etc/cron.daily/. Some of these jobs can chew up a lot of memory, so they
may touch RAM that sits idle during the day.
You might want to run memtest on this box to see if you have a bad
stick of RAM.
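Since you already have a cron job logging every 5 minutes, one way to see
whether the crashes line up with the nightly jobs is to look for the point
where those log entries stop. Below is a rough sketch of that idea, not a
finished tool. The function name find_gaps, the 2-minute slack and the
sample timestamps are all made up for illustration; you would have to pull
the real timestamps out of your own log first.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, interval=timedelta(minutes=5),
              slack=timedelta(minutes=2)):
    """Return (last_seen, next_seen) pairs where at least one run was missed.

    find_gaps, interval and slack are hypothetical names for this sketch.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each timestamp with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > interval + slack:
            gaps.append((earlier, later))
    return gaps


if __name__ == "__main__":
    # Made-up sample data, not taken from the server in question.
    fmt = "%Y-%m-%d %H:%M"
    sample = [datetime.strptime(s, fmt) for s in (
        "2005-12-20 03:50", "2005-12-20 03:55", "2005-12-20 04:00",
        "2005-12-20 07:30", "2005-12-20 07:35",
    )]
    for last_seen, next_seen in find_gaps(sample):
        print("no entries between", last_seen, "and", next_seen)
```

If the last entry before each gap keeps landing inside the window when the
cron.daily jobs or your backups run, that points at those jobs (or the
memory they exercise) rather than at random failure.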
> I should also say that I don't
> use this computer for serving thin clients, it runs LAMP with a few moodle
> instances, drupal, wordpress and about 10 different web domains. I believe
> that my computer was broken into via webmin right before all this started
> happening (there was an unauthorized login as root from a church in town), but
> I couldn't find any signs of damage, other than my computer crashed the next
> day.
If you think your computer may have been broken into, it is best to do
a fresh reinstall. If a rootkit was installed, it would hide the signs
of damage and could very well cause stability problems.
> I have also been having what I call dictionary attacks almost every
> night, repeated login attempts via various ports and user names via ssh, all
> from the same IP. But the only validated logins via ssh can all be accounted
> for as coming from me or a trusted user.
If you keep getting attacked from a specific IP address, it would be a good
idea to firewall off that IP.
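To find which address to block, you can count the failed ssh logins per
source IP. The sketch below assumes the usual sshd "Failed password for ...
from <ip>" log lines; the function names, the threshold and the sample
lines (which use documentation-only addresses) are made up for
illustration. It only prints candidate iptables commands and does not run
them.

```python
import re
from collections import Counter

# Matches sshd failure lines such as
# "Failed password for invalid user admin from 192.0.2.10 port 4242 ssh2".
FAILED = re.compile(
    r"Failed password for (?:(?:invalid|illegal) user )?\S+ "
    r"from (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_failed_logins(lines):
    """Count failed ssh password attempts per source IP (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def drop_rules(counts, threshold=10):
    """Build one iptables DROP command per IP at or above the threshold."""
    return ["iptables -I INPUT -s %s -j DROP" % ip
            for ip, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    # Made-up sample lines, not real log data.
    sample = [
        "Dec 20 02:14:01 server sshd[1234]: Failed password for invalid user admin from 192.0.2.10 port 4242 ssh2",
        "Dec 20 02:14:03 server sshd[1235]: Failed password for root from 192.0.2.10 port 4243 ssh2",
        "Dec 20 08:00:00 server sshd[2000]: Accepted password for carl from 198.51.100.7 port 5000 ssh2",
    ]
    for rule in drop_rules(count_failed_logins(sample), threshold=2):
        print(rule)
```

Feed it the lines from wherever sshd logs on your system (on Fedora-based
installs that is often /var/log/secure, but check yours), look over the
printed commands, and run the ones you agree with as root.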
-Eric