[Linux-cluster] problem with deadlocked processes (D)

Bryn M. Reeves breeves at redhat.com
Wed Apr 4 12:44:38 UTC 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Peter Sopko wrote:
> Hi,
> 
> today a strange thing occurred - on both of our cluster nodes a lot of
> processes suddenly started to become locked in the D state (i/o lock). This
> thing has already happened once before (six months ago), but a simple reboot
> helped to solve this issue. But as it appeared again, I don't want to solve
> it this way again, I would like to find the reason why this is happening,
> but have no idea where to start. In /var/log/messages there is nothing
> unusual; the only symptoms are that some directories cannot be removed and
> a lot of processes are locked.

For problems where processes are getting stuck in D state it's usually
helpful to get sysrq-t data (the kernel's dump of every task's state and
stack trace) to see where the threads are stuck. Grab two sets of data a
few seconds apart so that you can see if things are really stuck or just
making slow progress.
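
As a rough sketch of the two-dump approach, something like the following
could be run as root. The function name, the file names, and the
SYSRQ_TRIGGER and LOG_CMD variables are made up for illustration (the
variables exist only so the function can be dry-run against a scratch
file); /proc/sysrq-trigger and dmesg are the standard interfaces. On a
busy node the kernel ring buffer may be too small to hold a full dump, in
which case the syslog files may have the complete copy.

```shell
#!/bin/sh
# Sketch: capture two sysrq-t task dumps a few seconds apart (needs root).
# SYSRQ_TRIGGER and LOG_CMD are illustrative overrides, not standard names.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
LOG_CMD="${LOG_CMD:-dmesg}"

capture_sysrq_t() {
    outdir="$1"          # directory that receives the two dump files
    interval="${2:-5}"   # seconds to wait between the two dumps
    mkdir -p "$outdir" || return 1
    for n in 1 2; do
        # Writing 't' asks the kernel to log the state of every task.
        echo t > "$SYSRQ_TRIGGER" || return 1
        sleep 1          # give the kernel a moment to finish logging
        $LOG_CMD > "$outdir/sysrq-t.$n.txt" || return 1
        if [ "$n" -eq 1 ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, `capture_sysrq_t /tmp/sysrq 5` would leave sysrq-t.1.txt and
sysrq-t.2.txt in /tmp/sysrq; comparing the stack traces of the D-state
tasks in the two files shows whether they have moved at all.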

You can also get some information from the wchan data exposed in /proc -
it's easiest to view with ps:

$ ps ax -ocomm,pid,state,wchan
COMMAND           PID S WCHAN
vim             22322 S -
bash            22471 S -
man             22817 S wait
sh              22820 S wait
sh              22821 S wait
less            22826 S -
bash            22839 S wait
screen          23435 S pause
[...]
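
To cut that listing down to just the stuck tasks, a small filter along
these lines might help. This is only a sketch: the function names are
invented, the column order is changed so that the command name comes last
(command names can contain spaces, which would confuse a field-based
filter), and the `wchan:32` width suffix assumes the procps version of ps.

```shell
#!/bin/sh
# Keep the header line plus any row whose state column (2nd field)
# starts with D, i.e. uninterruptible sleep.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Illustrative wrapper: pid, state and wchan first, command name last.
list_d_state() {
    ps -eo pid,state,wchan:32,comm | filter_d_state
}
```

Running `list_d_state` a few times in a row gives a quick view of which
kernel function each D-state process is sleeping in.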

Regards,
Bryn.



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGE5226YSQoMYUY94RAgm0AKDdPg/mcTHilSwMpd6+Meno2zBLtACgt+/j
TT3MsBrg6/gpdBdPDYMEp5Q=
=ADyt
-----END PGP SIGNATURE-----



