[Linux-cluster] problem with deadlocked processes (D)
Bryn M. Reeves
breeves at redhat.com
Wed Apr 4 12:44:38 UTC 2007
Peter Sopko wrote:
> Hi,
>
> today a strange thing occurred - on both of our cluster nodes a lot of
> processes suddenly started to become locked in the D state (i/o lock). This
> thing has already happened once before (six months ago), but a simple reboot
> helped to solve this issue. But as it appeared again, I don't want to solve
> it this way again, I would like to find the reason why this is happening,
> but have no idea where to start. In /var/log/messages there is nothing
> unusual, the only thing is that some directories are unremoveable and a lot
> of processes locked.
When processes are getting stuck in D state, it's usually helpful to
collect sysrq-t data, which dumps the state and kernel stack of every
task to the kernel log, so you can see where the threads are stuck. Grab
two sets of data a few seconds apart so that you can tell whether things
are really stuck or just making slow progress.
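As a rough sketch of the two-sample capture described above (not a tested procedure for your nodes): the function below writes "t" to /proc/sysrq-trigger and saves the kernel log after each dump. The function name, the output file prefix, the 5-second gap and the SYSRQ_LOG_CMD override are all my own illustrative choices. It assumes you run it as root and that the dump fits in the kernel ring buffer; on a busy node the full output may only be complete in syslog.

```shell
# Sketch: take two sysrq-t task dumps a few seconds apart (run as root).
# All names and defaults here are illustrative assumptions.
capture_sysrq_t() {
    trigger="${1:-/proc/sysrq-trigger}"   # where the sysrq command is written
    out="${2:-/tmp/sysrq-t}"              # prefix for the saved dumps
    gap="${3:-5}"                         # seconds between the two samples
    log_cmd="${SYSRQ_LOG_CMD:-dmesg}"     # command that prints the kernel log

    for n in 1 2; do
        # 't' asks the kernel to dump the state of every task
        echo t > "$trigger" || return 1
        # save the kernel log that now contains the dump
        $log_cmd > "$out.$n.txt" || return 1
        # wait before the second sample only
        if [ "$n" -eq 1 ]; then
            sleep "$gap"
        fi
    done
}
```

Calling `capture_sysrq_t` with no arguments should leave /tmp/sysrq-t.1.txt and /tmp/sysrq-t.2.txt; comparing the stack traces of the D state tasks in the two files shows whether they have moved at all.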
You can also get some information from the wchan data exposed in /proc,
which names the kernel function each task is sleeping in. It's easiest
to view with ps:
$ ps ax -ocomm,pid,state,wchan
COMMAND           PID S WCHAN
vim             22322 S -
bash            22471 S -
man             22817 S wait
sh              22820 S wait
sh              22821 S wait
less            22826 S -
bash            22839 S wait
screen          23435 S pause
[...]
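To narrow that listing down to just the stuck tasks, something like the following sketch may help. The function names are made up for illustration, and I have put pid first and comm last (a different column order from the command above) so that command names containing spaces do not shift the state column. It assumes the procps version of ps.

```shell
# Sketch: show only tasks in uninterruptible sleep (state D) with their wchan.
# Function names and column order are illustrative assumptions.

# Keep the header line plus any row whose second field (state) starts with D.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# List every task, widening the wchan column so function names are not cut off.
d_state_tasks() {
    ps -eo pid,state,wchan:32,comm | filter_d_state
}
```

Running `d_state_tasks` a few times in a row gives a quick view of which processes stay in D state and whether they all share the same wait channel.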
Regards,
Bryn.