redhat 9 dies randomly

kalin mintchev kalin at el.net
Mon Aug 9 18:52:45 UTC 2004


hi all..

i posted this question a few months back here and the only answer i got
was to update the kernel which i did but it didn't solve the problem..

i have a redhat 9 machine that dies randomly. sometimes after a day,
sometimes after 8 days. there isn't any indication of a problem in any of
the logs.
all the machine runs is qmail with spamd and some procmail and a bind name
server. it also has a helix real media server but that was added a few
weeks ago. there is a web server with a webmail interface - squirrelmail -
too. that's all.
none of the logs show anything. there is enough disk space. i did a memory
test which came out fine.

now having said that i need to mention that a few weeks ago the machine
had a second drive on which the main swap partition was living. that drive
died and i had to create a swap file under /var on the main disk. i
thought this would solve the problem but apparently it didn't.

the machine is monitored 24 hours a day but sometimes users notice that the
mail is down before the techs do - just a minute or so after the machine
dies. rebooting it every time is not solving the issue. of course i
can rebuild the machine but it would be nice to know what the problem might be..

can somebody provide some advice on where to start looking for the problem?
i'd appreciate it.

also is there a way to find out what the cpu load or memory usage was at
the time of the crash?
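
in the meantime, one thing i could try is a small logger that records the
load average and free memory every minute and forces each line to disk, so
the last entry before a crash should survive. this is only a rough sketch,
assuming a python 3 interpreter is available (the python that ships with
redhat 9 is much older); the log path, the interval and the function names
are placeholders i made up:

```python
import os
import time


def parse_meminfo(text):
    """Return {field: value_in_kB} for the 'Name:  123 kB' lines of /proc/meminfo."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Only keep lines of the form "Name: <number> kB"; skip anything else.
        if sep and len(parts) >= 2 and parts[0].isdigit() and parts[1] == "kB":
            info[name.strip()] = int(parts[0])
    return info


def format_sample(loadavg_text, meminfo_text, now):
    """Build one log line from the raw contents of /proc/loadavg and /proc/meminfo."""
    load1, load5, load15 = loadavg_text.split()[:3]
    mem = parse_meminfo(meminfo_text)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))  # UTC timestamp
    return "%s load=%s/%s/%s memfree_kb=%s swapfree_kb=%s" % (
        stamp, load1, load5, load15,
        mem.get("MemFree", "?"), mem.get("SwapFree", "?"),
    )


def run(path="/var/log/loadmem.log", interval=60):
    """Append one sample every `interval` seconds (hypothetical path and interval)."""
    with open(path, "a") as log:
        while True:
            with open("/proc/loadavg") as f:
                loadavg_text = f.read()
            with open("/proc/meminfo") as f:
                meminfo_text = f.read()
            log.write(format_sample(loadavg_text, meminfo_text, time.time()) + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before sleeping
            time.sleep(interval)


# run()  # uncomment to start logging; needs write access to the log path
```

after the next crash the last lines of the log would show roughly what the
load and free memory/swap looked like in the final minute, assuming the
disk itself was still accepting writes.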

thank you...




--
Software is like sex: It's better when it's free. (Linus Torvalds)




