
[K12OSN] Random system crashes: Linux gurus, what would you do?



I'm sorry if this is considered spamming the list (3 emails in rapid succession); I've just been slammed with weird problems lately. I promise I've been googling and trying to solve them on my own, but I'm at my wit's end. My K12LTSP 4.2.1 server has been frozen every morning when I wake up. The message, when I can read the screen, usually says:

kernel panic - not syncing: fs/dcache.c:413:spin_lock(fs/dcache.c:8837820) already locked by fs/dcache.c/158(not tainted)
[<c0118f47>]error_code+0x4f/0x54
journal_get_write_access
do_page_fault
ll_rw_block
etc. etc. etc.

I copied more down, but I'm not sure it helps to troubleshoot this. Googling parts of this error message quickly got me into threads about the overall instability of the Linux kernel lately, rants about Linus' kernel updating philosophy, and some very technical bug reports that I couldn't begin to understand, much less apply to solving my problem.

I run fsck as I reboot each morning; sometimes it has to fix some inodes and other times it doesn't. I should say that I tried rolling back to the previous kernel (from the boot menu) and it still crashed. The crashes happen at random times in the early morning. The earliest seems to be about 2AM and the latest was around 9AM. I can tell when the server crashes from the logs of a cron job that Moodle runs every 5 minutes. I run my backuppc, webalizer and logwatch processes between 3 and 5AM daily.

The hardware is a "custom" PIII 500 box that is my retired workstation, running 640 megs of RAM from various sources, a 5-year-old boot drive, etc. I would say that the hardware is very suspect, but it doesn't, in my mind, account for the server crashing every single night and never during the day, when it gets more use. I should also say that I don't use this computer for serving thin clients; it runs LAMP with a few Moodle instances, Drupal, WordPress and about 10 different web domains.

I believe that my computer was broken into via webmin right before all this started happening (there was an unauthorized login as root from a church in town), but I couldn't find any signs of damage, other than that my computer crashed the next day. I have also been having what I call dictionary attacks almost every night: repeated ssh login attempts using various ports and user names, all from the same IP. But the only validated logins via ssh can all be accounted for as coming from me or a trusted user.

So, how do I sort this out? Like I said, I've googled. I've tried different kernels. I've tried warming up the very cold room that the server is in. I've tried restricting services as much as I can. But I'm wondering what real Linux pros would do to get to the bottom of this. The server crashes every single night; there has to be a way to figure out what the cause is.
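
In case it helps anyone reproduce how I estimate the crash times, here is a minimal Python sketch of the idea: look for gaps in the timestamps of the 5-minute cron job log. The log format is an assumption (each line starting with a timestamp like "2006-01-15 02:05:01"), and the function names parse_timestamps and find_gaps are made up for illustration; they are not part of Moodle or cron.

```python
from datetime import datetime, timedelta


def parse_timestamps(lines, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse the leading timestamp of each log line.

    Assumes (hypothetically) that every line starts with a timestamp
    in the given format. Lines that do not match are skipped.
    """
    # Width of a formatted timestamp, e.g. 19 characters for the default format.
    width = len(datetime(2000, 1, 1).strftime(fmt))
    stamps = []
    for line in lines:
        try:
            stamps.append(datetime.strptime(line[:width], fmt))
        except ValueError:
            continue  # not a timestamped line
    return stamps


def find_gaps(stamps, interval=timedelta(minutes=5), slack=timedelta(minutes=1)):
    """Return (last_seen, next_seen) pairs where the job missed a run.

    A gap is any pair of consecutive entries further apart than
    interval + slack. The first element of each pair is the last time
    the job ran before the machine stopped responding.
    """
    ordered = sorted(stamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > interval + slack:
            gaps.append((prev, cur))
    return gaps
```

Feeding the log lines to parse_timestamps and the result to find_gaps gives one pair per outage; the crash falls somewhere in the interval after the first timestamp of each pair, and the second timestamp is roughly when the job resumed after the reboot.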

Thank you very much for any help you can give me,

ck

