[Linux-cluster] RE: GFS Tuning - it's just slow, too slow for production

Steven Whitehouse swhiteho at redhat.com
Fri Mar 5 10:07:40 UTC 2010


Hi,

On Thu, 2010-03-04 at 20:58 +0200, aydin sasmaz wrote:
> Hi all,
> 
>  
> 
> I have a question, and probably need some advice, about gfs relating to
> this performance issue. We use a DDN SA6620 system for storage. It has
> 60 SAS disks in it and is capable of building RAID 6 arrays of either
> 4 data + 2 parity or 8 data + 2 parity disks. The disks are 2 TB SAS
> drives. We have a SAN with 2 SAN switches and 4 HP DL585 G2 servers.
> 
[snip]

> 
> First we deployed GFS2 for the 4 DL585 servers and ran standalone "dd"
> tests, both in serial and in parallel, from different servers. In the
> serial tests we measured 70 GB/s to 96 GB/s; after adding the noatime
> option we got 100 GB/s and 140 GB/s results for writing. In parallel it
> gets much worse.
> 
The question here is what the I/O pattern is with respect to each node.
If each server is writing into files in its own directory, then there
shouldn't really be much of a slowdown.

If, on the other hand, there are a lot of accesses to the same files
(assuming that at least some percentage of the accesses are writes),
then it will slow down a lot.
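
As an illustration only, a per-node write test along these lines (the
device name, mount point and sizes are just assumptions, not your setup)
keeps each server in its own directory and so mostly avoids cross-node
lock contention:

    # on each node, mounted with noatime (hypothetical device/mountpoint)
    mount -o noatime /dev/mapper/ddn_lun /mnt/gfs2

    # each node writes only under its own directory
    mkdir -p /mnt/gfs2/$(hostname)
    dd if=/dev/zero of=/mnt/gfs2/$(hostname)/testfile \
       bs=1M count=4096 conv=fsync

Having all the nodes write into the same file (or create files in the
same directory) is the pattern that forces the locks to bounce between
nodes and slows things down.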

>  
> 
> Secondly we formatted the LUN with GFS instead of GFS2. We got
> 500 GB/s for one server at a time and 450 GB/s in the 4-node I/O tests.
> 
>  
> 
> On this I agree with Corey Kovacs that tuning on the storage side is
> important. But the comparison between formatting with gfs and with
> gfs2 is very interesting, because gfs seems faster than gfs2. I didn't
> expect this result. Is it normal?
> 
That depends on what the test is. If you are referring to large
streaming writes, then we know that at the moment gfs is faster; gfs2,
however, is faster for almost everything else. We are working on fixing
the streaming writes issue, which is a consequence of the "page at a
time" writing currently enforced by the vfs/vm.

>  
> 
> Another important result concerns the number of gfs or gfs2 journals.
> If your gfs volume has more journals than servers (to allow for future
> growth), it affects gfs performance very dramatically. It is better to
> add journals later, when you actually need them.
> 
I suspect that it's not the number of journals which makes the
difference, but the number of resource groups. You ideally want a number
of resource groups which is much greater than the number of nodes
mounting the filesystem. This reduces the probability of lock contention
on any one resource group and improves performance.
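
To give a concrete (and purely illustrative) example, the resource group
size and the journal count are both set at mkfs time; the cluster name,
device and values below are assumptions, not recommendations:

    # a smaller -r (resource group size in MB) means more resource groups
    mkfs.gfs2 -p lock_dlm -t mycluster:myfs -j 4 -r 256 /dev/mapper/ddn_lun

    # extra journals can be added later, while the filesystem is mounted
    gfs2_jadd -j 2 /mnt/gfs2

So you can start with one journal per node and grow the journal count as
you add nodes, rather than over-provisioning journals up front.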

There are, however, other tradeoffs with very large numbers of resource
groups, namely that CPU usage goes up as they are searched when looking
for free space. So there is usually an optimal number for any given
cluster, based upon the characteristics of the storage and the nodes.
Some experimentation is often required to find the optimal settings.
Steve.
