[Linux-cluster] Freeze with cluster-2.03.11
s.wendy.cheng at gmail.com
Sun Mar 29 04:05:30 UTC 2009
Wendy Cheng wrote:
> ..... [snip] ... There are many foot-prints of spin_lock - that's
> worrisome. Hit a couple of "sysrq-w" next time when you have hangs,
> other than sysrq-t. This should give traces of the threads that are
> actively on CPUs at that time. Also check your kernel change log (to
> see whether GFS has any new patch touching spin locks that wasn't
> in the previous release).
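If you have no keyboard on the console, the same dumps can usually be requested from a shell. This is only a rough sketch: it assumes you are root and that the kernel was built with magic SysRq support, so that /proc/sysrq-trigger exists. The sysrq_dump helper and the SYSRQ_TRIGGER override are hypothetical names made up for illustration; the override only exists so the function can be tried against a scratch file first.

```shell
#!/bin/sh
# Sketch: request SysRq dumps without a console keyboard.
# Assumption: /proc/sysrq-trigger is available (magic SysRq compiled in).
# SYSRQ_TRIGGER is a hypothetical override for dry runs on a scratch file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

sysrq_dump() {
    # $1 is a single SysRq command letter, e.g. w or t.
    printf '%s' "$1" > "$SYSRQ_TRIGGER"
}

# Example, run as root while the node is hung:
#   sysrq_dump w; sleep 5; sysrq_dump w   # a couple of sysrq-w samples
#   sysrq_dump t                          # full task list
#   dmesg > /tmp/sysrq-traces.txt         # save the traces from the kernel log
```

Taking sysrq-w more than once, a few seconds apart, makes it easier to tell a thread that is really stuck from one that just happened to be on a CPU at that moment.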
I re-read your console log a few minutes ago, followed by a quick browse
through the cluster git tree. A few of the python processes (e.g. pid 4104, 4105,
etc.) are blocked on locks within gfs_readdir(). This somehow relates to
a performance patch committed on 11/6/2008. The gfs_getattr() has a
piece of new code that touches vfs inode operations while the glock is held.
That's an area that needs examination. I don't have the Linux kernel source
handy to see whether that iput() and igrab() can lead to deadlock, though.
If you have the patch in your kernel and if you can, temporarily remove
it (and rebuild the kernel) to see how it goes:
Again, take my advice with a grain of salt :) ... I'll stop here. Good luck!