[Linux-cluster] GFS lockups ?

Janar Kartau janar.kartau at gmail.com
Thu Oct 9 18:38:43 UTC 2008


Like I said, I couldn't find anything in the logs besides eviction
messages after I manually reset the server. Yes, we do use PHP, and
its sessions use memcached as a backend.

Janar

Marc Grimme wrote:
> On Thursday 09 October 2008 01:24:51 Janar Kartau wrote:
>   
>> Hi,
>> Recently our three-node webserver cluster started randomly crashing. I
>> never had time to investigate what the problem was, because I needed to
>> bring the nodes back online again. But it seemed like all Apache processes
>> just hung (I couldn't even kill them), waiting for something. The only
>> thing that helped was a reboot of all or a couple of the nodes. Anyway,
>> today I encountered this problem at night and could look into it a
>> little more. I noticed that some of the GFS filesystems were
>> inaccessible (we have 5 of them, mounted on every node) and one of the
>> nodes was completely inaccessible. So I guessed that this half-dead node
>> was holding locks on the filesystems or something. I did a hard reset on
>> this dead node and everything stabilized.
>> There were absolutely no cluster/GFS errors in the logs (besides the ones
>> reporting that the half-dead node was leaving the cluster when I reset it).
>> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
>> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
>> (over iSCSI) and EMC PowerPath for multipathing. A separate VLAN is used
>> for CMAN/DLM traffic.
>> Please give me ideas on how to solve this, or at least some debugging tips,
>> as it's happening twice a day now and it seems I simply can't prevent it. :(
>>     
>
> Could you provide more information like relevant syslogs and console messages?
>
> Are you using php with sessions?
>
>   
