[Linux-cluster] GFS2 and D state HTTPD processes

Bob Peterson rpeterso at redhat.com
Fri Sep 25 12:26:47 UTC 2009


----- "Gavin Conway" <gavin.conway at uksolutions.co.uk> wrote:
| We'll give this a go and see what it does. We did manage to track down
| the latest issue to a bad script that the customer had written which
| caused one of the nodes to exhaust all of its available memory. That
| then caused a knock-on effect to the lock_dlm process which was unable
| to drop its file locks, which then rolled the effect on to the rest
| of the cluster as they started being unable to open files.

Hi Gavin,

You could also try my hang analyzer to see if it finds anything:

http://people.redhat.com/rpeterso/Experimental/RHEL5.x/gfs2/gfs2_hangalyzer.c

Compile with: gcc -o gfs2_hangalyzer gfs2_hangalyzer.c

Run with: ./gfs2_hangalyzer -n <any node in the cluster>

The tool leaves a bunch of files in /tmp/, so you may want to clean them up afterwards.
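
One way to see what a run left behind, without guessing at file names, is
to drop a marker file first and then list whatever in /tmp/ is newer than
it. This is only a sketch: the helper name list_new_files and the marker
path are made up for illustration, and other processes also write to
/tmp/, so review the list before deleting anything.

```shell
#!/bin/sh
# Sketch: list regular files directly inside a directory that were
# modified after a marker file. It only lists; it never deletes.
list_new_files() {
    dir=$1
    marker=$2
    find "$dir" -maxdepth 1 -type f -newer "$marker" -print
}

# Example workflow (the marker path is hypothetical):
# touch /tmp/hangalyzer.marker
# ./gfs2_hangalyzer -n <any node in the cluster>
# list_new_files /tmp /tmp/hangalyzer.marker
```

The marker itself is not listed, since find's -newer test only matches
files modified strictly later than the reference file.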

But be forewarned: you need RSA keys set up ahead of time so that
you can ssh to every node in your cluster without a password
before running this tool.
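
A quick pre-flight check for the passwordless logins could look like the
sketch below. It assumes OpenSSH; the function name check_nodes and the
node names in the example are hypothetical. BatchMode=yes makes ssh fail
instead of prompting, so a node that still wants a password shows up as
FAIL rather than hanging the check.

```shell
#!/bin/sh
# Sketch: verify that key-based ssh works to every node given as an argument.
check_nodes() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true </dev/null 2>/dev/null; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            failed=1
        fi
    done
    # Non-zero exit status if any node could not be reached without a password.
    return "$failed"
}

# Example (replace with your own cluster members):
# check_nodes node1 node2 node3
```

If a node reports FAIL, the usual fix is to generate a key with
ssh-keygen -t rsa and install it on that node with ssh-copy-id.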

Regards,

Bob Peterson
Red Hat File Systems



