[Linux-cluster] File system slow & crash

Somsak Sriprayoonsakul somsaks at gmail.com
Wed Apr 21 19:27:55 UTC 2010


Hello,

We are using GFS2 on a 3-node cluster, kernel 2.6.18-164.6.1.el5,
RHEL/CentOS 5, x86_64, with 8-12 GB of memory per node. The underlying
storage is an HP 2312fc smart array with 12 15K RPM SAS disks, configured as
RAID10 using 10 HDDs + 2 spares. The array has about 4 GB of cache.
Connectivity is 4 Gbps FC through an HP StorageWorks 8/8 Base e-port SAN
Switch.

Our application is Apache 1.3.41, mostly serving static HTML files plus a
few PHP scripts. Note that we had to downgrade to 1.3.41 due to an
application requirement. Apache is configured with MaxClients 500. Each HTML
file is placed in a different directory. The PHP scripts modify the HTML
files, taking a lock before each modification. We use round-robin DNS to
load-balance across the web servers.
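For clarity, the lock-then-modify pattern described above can be sketched in
shell. The paths and the use of flock(1) are illustrative assumptions; the
real scripts are PHP and may lock differently:

```shell
#!/bin/sh
# Illustrative sketch of the lock-then-modify pattern (assumed paths;
# the actual scripts are PHP). flock(1) takes an exclusive lock on a
# sidecar lock file so concurrent writers serialize.
html=/tmp/page.html
lock=/tmp/page.html.lock
(
    flock -x 9                                # block until we hold the lock
    printf '<p>updated at %s</p>\n' "$(date)" > "$html"
) 9>"$lock"                                   # fd 9 keeps the lock file open
```

Note that whole-file flock() and byte-range fcntl() locks take different
paths on GFS2; the ping_pong test discussed below exercises the latter.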

The GFS2 filesystem was formatted with 4 journals and runs on top of an LVM
volume. We have configured CMAN, QDiskd, and fencing as appropriate, and
everything worked fine. We use QDiskd because the cluster initially had only
2 nodes. We are temporarily using manual_fence since no fencing hardware has
been set up yet. GFS2 is mounted with the noatime,nodiratime options.
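For reference, the mount described above would look roughly like this in
/etc/fstab (the device path and mount point are assumptions):

```
/dev/vg_san/lv_gfs2   /var/www/html   gfs2   noatime,nodiratime   0 0
```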

Initially, the application ran fine. The problem we encountered is that,
over time, the load average on some nodes would gradually climb to about
300-500, whereas under normal workload it should be around 10. When the load
piled up, HTML modifications would mostly fail.

We suspected a plock rate issue, so we modified the cluster.conf
configuration and added some mount options, such as num_glockd=16 and
data=writeback, to improve performance. After rebooting and remounting the
volume, we ran the ping_pong test
(http://wiki.samba.org/index.php/Ping_pong) to measure how fast locking
performs. Lock speed increased greatly, from about 100 to 3,000-5,000
locks/sec. However, after running ping_pong on all 3 nodes simultaneously,
the ping_pong processes hung in the D state and we could not kill them even
with SIGKILL.
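For anyone reproducing the measurement: ping_pong hammers fcntl() byte-range
locks on a shared file from several nodes at once. The single-node sketch
below only shows the shape of the test; it uses flock(1) as a stand-in and an
assumed path, and does not exercise the clustered plock path the way
ping_pong does. (The default plock rate cap is typically lifted with
something like <gfs_controld plock_rate_limit="0"/> in cluster.conf; check
the gfs_controld(8) man page for your release.)

```shell
#!/bin/sh
# Single-node sketch of a ping_pong-style lock-rate measurement.
# Real ping_pong uses fcntl() byte-range locks; flock(1) here is a
# stand-in, and /tmp/pp.lock is an assumed path.
lockfile=/tmp/pp.lock
: > "$lockfile"
count=0
end=$(( $(date +%s) + 2 ))
while [ "$(date +%s)" -lt "$end" ]; do
    flock -x "$lockfile" true        # acquire the lock, run a no-op, release
    count=$((count + 1))
done
echo "lock cycles in 2s: $count"
```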

Due to time constraints, we decided to leave the system as-is, with
ping_pong stuck on all nodes, while continuing to serve web requests. After
running for a few hours, the httpd processes also got stuck in the D state
and could not be killed, and web serving stopped entirely. We had to reset
all the machines (unmounting was not possible). After the reboot, the
machines came back and the GFS2 volume returned to normal.


Since we had to reset all the machines anyway, I decided to run gfs2_fsck on
the volume. I unmounted GFS2 on all nodes, ran gfs2_fsck, answered "y" to
many questions about freeing blocks, and got the volume back. However, the
stuck processes reappeared very quickly. Worse, trying to kill a running
process on GFS2, or to unmount it, caused a kernel panic and suspended the
volume.
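For completeness, the recovery procedure described amounts to the following
(the mount point and device path are assumptions):

```shell
# On every node: stop writers, then unmount the filesystem.
umount /var/www/html

# On ONE node only, with the volume unmounted everywhere:
gfs2_fsck -y /dev/vg_san/lv_gfs2
```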

After this, the volume never returned to normal. It now crashes (kernel
panic) almost immediately whenever we try to write to it. This happens even
after removing the extra mount options and leaving only noatime and
nodiratime. I have not run gfs2_fsck again yet; we decided to leave the
volume as-is and back up as much data as possible.

Sorry for such a long story. In summary, my questions are:


   - What could cause the load average to pile up? Note that it sometimes
   happens only on some nodes, even though round-robin DNS should distribute
   the workload fairly evenly; at the very least, the load difference
   shouldn't be that large.
   - Should we run gfs2_fsck again? Why does the lockup occur?


I have attached our cluster.conf and the kernel panic log to this e-mail.


Thank you very much in advance

Best Regards,

===========================================
Somsak Sriprayoonsakul

INOX
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: application/octet-stream
Size: 1635 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20100422/31dfe867/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: panic.log
Type: text/x-log
Size: 3044 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20100422/31dfe867/attachment.bin>
