Crashing TUX

Tue Nov 8 13:05:52 UTC 2005

Hi Nicolas,

Like Chris says, that find process seems to come from a cron job.  But
contrary to what he says, I think it might be affecting tux (at least
performance-wise) because it is using the disk heavily.  I'm not really
sure that tux crashed: I skimmed through this thread and I don't see
that you posted any crash report.  Can you check /var/log/messages to
see if there is any sign of problems with tux?  If there is a crash, it
would be useful to post the backtrace from that file on this list.
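
If it helps, here is a rough sketch of how to look through that file.
The pattern list is only a guess at the kind of messages a kernel
problem tends to leave behind (it is not a list of messages tux is known
to print), and scan_log is just a made-up name:

```shell
#!/bin/sh
# Print the most recent log lines that mention tux or look like kernel trouble.
# The patterns are assumptions about typical messages, not an exhaustive set.
scan_log() {
    log_file=${1:-/var/log/messages}
    grep -iE 'tux|oops|panic|call trace|out of memory' "$log_file" | tail -n 50
}

# Usage (reading /var/log/messages normally requires root):
#   scan_log /var/log/messages
```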

Another thing: what distro do you have?  Right now I'm using Fedora Core
3 (2.6 kernel) on my desktop and on my production server, and the
system's crontab says something like this:
	# run-parts
	01 * * * * root run-parts /etc/cron.hourly
	02 4 * * * root run-parts /etc/cron.daily
	22 4 * * 0 root run-parts /etc/cron.weekly
	42 4 1 * * root run-parts /etc/cron.monthly

It means cron will execute every script in those directories
(/etc/cron.*), and maybe the 'find' process is started from one of them.
Try to adjust the start times so they fall in a "quiet" hour according
to your server usage.
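
To track down which script starts it, something like the following might
do.  It is only a sketch: cron_scripts_using_find is a made-up helper
name, the default directory list assumes a layout like the crontab
above, and it also searches for updatedb on the assumption that the find
could be launched through that wrapper rather than called directly:

```shell
#!/bin/sh
# List files in the given cron directories that mention find or updatedb.
# With no arguments it falls back to the usual run-parts directories.
cron_scripts_using_find() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/cron.d /etc/cron.hourly /etc/cron.daily \
               /etc/cron.weekly /etc/cron.monthly
    fi
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -l prints only file names, -w matches whole words.
        grep -lwE 'find|updatedb' "$dir"/* 2>/dev/null
    done
    return 0
}

# Usage:
#   cron_scripts_using_find
#   cron_scripts_using_find /etc/cron.weekly
```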

As a blind guess, I think the script might be in /etc/cron.weekly, since
you said the problem happens every 5 days or so.  Besides, you also said
that it happens between 5 and 7 AM, and looking at the ps output, the
find started at 6:25 AM and is running as the 'nobody' user.

The other strange thing I see is that kswapd is in a zombie state (Z).
Maybe there is nothing bad about it, but I can't recall seeing that on
my systems.  How is your RAM coming along?  Are you sure you have enough?
Next time this problem happens, try to check your RAM usage.
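
Since the machine gets hard to work on when this happens, it may be
easier to have a small script collect the numbers.  This is a minimal
sketch under a few assumptions: blocked_procs and snapshot are made-up
names, /tmp/load-snapshot.log is an arbitrary path, and free and awk are
assumed to be installed:

```shell
#!/bin/sh
# Keep the ps header plus every process whose STAT field (column 8 of
# `ps aux`) starts with D, R or Z.
blocked_procs() {
    awk 'NR == 1 || $8 ~ /^[DRZ]/'
}

# Append a timestamped snapshot of memory usage and busy processes.
# The default output path is an arbitrary choice for illustration.
snapshot() {
    out_file=${1:-/tmp/load-snapshot.log}
    {
        date
        free -m
        ps aux | blocked_procs
        echo
    } >> "$out_file"
}

# Usage:
#   snapshot                 # writes to /tmp/load-snapshot.log
#   snapshot /root/load.log  # or to a file of your choice
```

Running snapshot by hand when the load climbs, or from a cron entry
every few minutes around 6 AM, should show whether free memory drops
before tux stalls.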

Good luck,

-William

On Tue, 08-11-2005 at 02:08 +0100, Nicolas Van Eenaeme wrote:
> Hi William,
> 
> Last week I had a crash again and I managed to do the
> ps aux | grep ' [DRZ]'. This is the output:
> 
> poison at static:~$ ps aux | grep ' [DRZ]'
> USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
> root         5  0.0  0.0     0    0 ?        Z    Nov02   1:23 [kswapd] <defunct>
> web        412  0.0  0.0  2260  816 ?        D    Nov02   0:08 [TUX worker 0]
> web        413  3.5  0.0  2260  816 ?        R    Nov02 101:00 [TUX worker 1]
> web        414  0.0  0.0  2260  816 ?        D    Nov02   0:00 [TUX worker 0]
> web        417  0.1  0.0  2260  816 ?        D    Nov02   3:24 [TUX worker 1]
> web        418  0.0  0.0  2260  816 ?        D    Nov02   0:02 [TUX worker 1]
> nobody   18631  0.6  0.0  1532  736 ?        D    06:25   1:13 /usr/bin/find / ( -fstype NFS -o -fstype nfs -o -fstype afs -o -fstype proc -o -fstype smbfs -o -fstype autofs -o -fstype iso9660 -o -fstype ncpfs -o -fstype coda -o -fstype devpts -o -fstype ftpfs -o -fstype devfs -o -fstype mfs -o -fstype shfs -o -fstype sysfs -o -fstype cifs -o -fstype lustre_lite -o -type d -regex \(^/tmp$\)\|\(^/usr/tmp$\)\|\(^/var/tmp$\)\|\(^/afs$\)\|\(^/amd$\)\|\(^/alex$\)\|\(^/var/spool$\)\|\(^/sfs$\)\|\(^/media$\) ) -prune -o -print
> poison   18919  1.0  0.0  2488  872 pts/0    R+   09:22   0:00 ps aux
> poison   18920  0.0  0.0  1544  476 pts/0    R+   09:22   0:00 grep  [DRZ]
> 
> I don't know what that find process is but I think it shouldn't be there.
> Maybe the TUX worker 1 (PID 413) is the problem.
> 
> Do you have any ideas what could be causing these crashes?
> 
> Thanks in advance,
> Nicolas Van Eenaeme
> 
> -----Original Message-----
> From: tux-list-bounces at redhat.com [mailto:tux-list-bounces at redhat.com] On Behalf Of
> William Lovaton
> Sent: Friday, 21 October 2005 14:29
> To: TUX discussion list
> Subject: Re: Crashing TUX
> 
> Hi Nicolas,
> 
> Are you sure your load problems are about tux?  Did you check your cron
> jobs?  Maybe it's logrotate or something like that.  When the load is
> that high you can log into the system and execute this command:
> 	ps aux | grep ' [DRZ]'
> 
> All of the processes with state D or R are the ones generating the high
> load; with this information you can identify the offending processes.
> 
> BTW, what distro and version are you using?  Right now I have a lot of
> experience with Fedora Core 3.
> 
> -William
> 
> 
> On Fri, 21-10-2005 at 10:02 +0200, Nicolas Van Eenaeme wrote:
> > Hi all,
> > 
> >  
> > 
> > I’m running TUX on a popular website here in Belgium (>2.000.000
> > pageviews / day) and I use it to serve all my static content (images,
> > css, …).
> > 
> > This works great. The average load on my server stays below 1. But
> > after each period of 5 days (average) the load suddenly goes to 6 or
> > above (I can see this on my MRTG stats) and it takes ages to serve a
> > simple image. If this happens the only way to fix it is to reboot my
> > server. It’s strange because this happens early in the morning
> > (usually between 5 – 7 am.) when there are almost no users on the
> > site.
> > 
> >  
> > 
> > Is there somebody who can help me or do you have some suggestions for
> > me how I can fix this? I’m running the 2.4 kernel.
> > 
> >  
> > 
> > Thanks in advance,
> > 
> > Nicolas Van Eenaeme
> > 
> > 
> > _______________________________________________
> > tux-list mailing list
> > tux-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/tux-list