[Linux-cluster] GFS filesystem "hang" with cluster-1.03.00

Ramon van Alteren ramon at vanalteren.nl
Fri Oct 20 07:40:40 UTC 2006


Hi List,

I'm hoping someone can provide me with pointers to solve the following 
problem:

I've set up a cluster of 5 nodes with cluster-1.03.00 compiled from 
source. The cluster itself works fine: I can fence the nodes and all 
nodes see each other.

I've created a GFS filesystem on a Coraid shared device using CLVM.

All nodes see the filesystem, and small changes made on one node are 
visible on the others.

Last night I started a stress-test with iozone on all nodes:

mkdir /mnt/$(hostname)
iozone -Rab /home/iozone-$(hostname)-test-${DATE}.xls -i0 -g16G \
    -f /mnt/$(hostname)/iozone-test${DATE}

This test started at 4 AM and is still running on all nodes.
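
For reference, a minimal dry-run sketch of how the command above expands 
on each node, so that every node writes to its own directory and report 
file. The post does not show how DATE is set; the timestamp format below 
is an assumption, and NODE, REPORT and TESTFILE are names introduced only 
for this illustration. The command is printed rather than executed.

```shell
#!/bin/sh
# Assumption: DATE is a timestamp set before the run; the format is a guess.
DATE=$(date +%Y%m%d-%H%M)

# Each node uses its own hostname, so paths do not collide between nodes.
NODE=$(hostname)
REPORT="/home/iozone-${NODE}-test-${DATE}.xls"
TESTFILE="/mnt/${NODE}/iozone-test${DATE}"

# Dry run: print the iozone command that would be started on this node.
echo "iozone -Rab ${REPORT} -i0 -g16G -f ${TESTFILE}"
```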

If I run the same test on a single node, it completes and produces a 
test report showing an average write performance of 35 MB/s, which is 
what we expect from this hardware with the current setup.

Most operations on the GFS filesystem are slow the first time they are 
run: gfs_tool counters /mnt takes roughly a minute on the first call, 
and afterwards responds normally, within a second. The same is true for 
operations like ls, df, etc.
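
The first-call versus repeat-call behaviour described above could be 
measured with a small helper along these lines. This is only a sketch: 
time_runs is a hypothetical function name, and it assumes a date command 
that supports +%s (as GNU date does), giving one-second resolution.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print the wall-clock
# duration and exit status of each run, so a slow first call stands out.
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        status=0
        # Discard the command's output; only timing and status matter here.
        "$@" >/dev/null 2>&1 || status=$?
        end=$(date +%s)
        echo "run $i: $((end - start))s (exit status $status)"
        i=$((i + 1))
    done
}
```

It could then be called as, for example, time_runs 3 gfs_tool counters /mnt 
or time_runs 3 ls /mnt on each node while the iozone test is running.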

I have no clue why concurrent writes hang and would appreciate any 
pointers on where to start looking.

kernel:
2.6.16-gentoo-r13
glibc:
glibc-2.3.6-r4
gcc:
gcc-3.4.6
Hardware:
x86_64 Intel(R) Xeon(R) CPU 5140  @ 2.33GHz

Thank you,

Ramon van Alteren



