Help needed with hanging bash script

Tony Nelson tonynelson at georgeanelson.com
Tue Jun 26 00:36:18 UTC 2007


At 5:24 PM -0400 6/25/07, Matthew J. Roth wrote:
>Bash gurus,
>
>I have a bash script that monitors a directory for files.  Whenever it
>finds files in this directory, it passes them to a support script for
>processing.  The support script moves the files to another directory
>prior to processing them, and it is run in the background to prevent
>blocking the main script.

This is a race condition.  If for any reason a background move happens
while the ls is running, or has not happened yet when the next pass
starts, your script could get very confused.  Move the files in the main
script, synchronously, before starting bg_script.  Also, ensure that
bg_script is not still running before running it again.
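
As a rough sketch of what I mean (dispatch_spool, the work directory and
the handler argument are names I made up, and this assumes bg_script is
changed to take the already-moved path instead of moving the file itself),
the main script could claim each file in the foreground and only then
start the background job:

```shell
#!/bin/bash
# Sketch only: dispatch_spool, the work directory, and the handler
# argument are hypothetical names, not part of the original script.

# Move each spooled file into work_dir in the foreground, then hand the
# moved copy to the handler in the background.
dispatch_spool() {
    local spool_dir=$1 work_dir=$2 handler=$3
    local fname bname

    # A glob in the for list needs no ls and copes with spaces in names.
    for fname in "$spool_dir"/*.ext; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$fname" ] || continue
        bname=${fname##*/}

        # Within one filesystem mv is a rename, so the file has left the
        # spool before the next pass can list it again.
        mv -- "$fname" "$work_dir/$bname" || continue

        "$handler" "$work_dir/$bname" &
    done
}
```

The main loop would then be something like
"dispatch_spool /spool/dir /work/dir bg_script; sleep 10", where /work/dir
is a placeholder.  Putting a "wait" before the sleep is one simple way to
make sure the previous batch has finished before the next pass starts, at
the cost of blocking on slow jobs.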

>A simplified version of the main script loop
>follows:
>
>  # Execute once every 10 seconds
>  while true;
>  do
>     # Fork a background script to process each file in the spool directory
>     for fname in `ls /spool/dir/*.ext 2> /dev/null`
>     do
>        bname=`basename $fname`
>
>        bg_script $bname &
>     done
>
>     sleep 10
>  done
>
>This is pretty simple and it worked flawlessly for over a year on a dual
>processor server running Fedora Core 3.  However, after upgrading to an
>8 core (2 CPUs x 4 cores) server running Fedora Core 6 the script hangs
>a few times a week.  This is a bad thing, so I have to keep a close eye
>on the server until the bug is resolved.
>
>The process tree of the script when it's hanging follows:
>
>  [root@server ~]# ps axjf
>   PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
>      1  3512  3510  2302 ?           -1 S        0   0:59 /bin/bash /usr/local/bin/script
>   3512 21432  3510  2302 ?           -1 R        0  40:50  \_ /bin/bash /usr/local/bin/script
>
>Note that the parent process (PID 3512) is sleeping and has accumulated
>relatively little CPU time since boot.  The child process (PID 21432) is
>running in a hard loop and top shows that it is consuming 100% of one of
>the cores.  It also never terminates, so it permanently blocks the
>parent process.  If the child process is killed, the execution of the
>parent process restarts without any problems.
>
>The interesting thing is that the script never calls itself.
 ...

It does, in effect.  Each command substitution forks a subshell, which ps
shows as another copy of the script:

    `ls /spool/dir/*.ext 2> /dev/null`
    `basename $fname`
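
A small sketch to see the fork for yourself.  It assumes a bash new enough
to have $BASHPID (4.0 or later); main_pid and sub_pid are just names picked
for the example:

```shell
#!/bin/bash
# $$ keeps the PID of the main script even inside a subshell, while
# $BASHPID (bash 4.0 and later) reports the process actually running.
main_pid=$$
sub_pid=$(echo "$BASHPID")

echo "main script PID:          $main_pid"
echo "command substitution PID: $sub_pid"
```

The two numbers differ because the substitution ran in a forked copy of
the shell.  Using a glob directly in the for list and ${fname##*/} in place
of basename would avoid both forks.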
-- 
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>



