[Linux-cluster] GFS lockups ?

Shawn Hood shawnlhood at gmail.com
Thu Oct 9 05:00:00 UTC 2008


See my thread from yesterday. Same general thing, but in my case the dlm
kernel threads were eating CPU cycles.


On Oct 8, 2008, at 7:24 PM, Janar Kartau <janar.kartau at gmail.com> wrote:

> Hi,
> Recently our three-node webserver cluster started crashing at random. I
> never had time to investigate the cause, because I needed to bring the
> nodes back online right away. It seemed that all the Apache processes
> simply hung (I couldn't even kill them), waiting for something. The only
> thing that helped was rebooting all, or at least a couple, of the nodes.
> Today the problem occurred at night, so I could look into it a little
> more. I noticed that some of the GFS filesystems were inaccessible (we
> have 5 of them, mounted on every node) and that one of the nodes was
> completely inaccessible. So I guessed that this half-dead node was
> holding locks on the filesystems or something similar. I did a hard
> reset on the dead node and everything stabilized.
> There were absolutely no cluster/GFS errors in the logs (besides the ones
> reporting that the half-dead node was leaving the cluster when I reset it).
> The nodes run CentOS 4.6 (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use an EMC CX3-10c for GFS
> storage (over iSCSI) and EMC PowerPath for multipathing. A separate VLAN
> is used for CMAN/DLM traffic.
> Please give me ideas on how to solve this, or at least some debugging
> tips, as it's happening twice a day now and it seems I simply can't do
> anything about it. :(
>
> Janar Kartau
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
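
Processes that cannot be killed, as described in the quoted message, are
typically in uninterruptible sleep (state D), often blocked on I/O or a lock.
As a starting point for debugging, here is a minimal Python sketch that lists
such processes by reading /proc. The function names parse_stat and
blocked_processes are hypothetical, and the script assumes a Linux /proc
layout and a Python 3 interpreter, neither of which is confirmed for the
nodes in this thread.

```python
import os


def parse_stat(text):
    """Parse the start of a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    return int(pid_str), comm, tail.split()[0]


def blocked_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was not parseable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Running this on each node while the hang is in progress would show whether
the Apache processes, or kernel threads such as the dlm ones, are the ones
stuck. It only reports process state and says nothing about which lock or
device they are waiting on.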



