[Linux-cluster] GFS Tuning - it's just slow, too slow for production

Alan A alan.zg at gmail.com
Thu Mar 4 15:17:14 UTC 2010


The application is single threaded; it handles cgi-bin calls from Apache,
opens a file for writing, and writes data. We can have up to 200 concurrent
sessions on a single application instance hitting the GFS mount. We noticed
a major slowdown once we pass 30 concurrent users.

We can run 10 instances of this application on an 8-thread server without
any problem in a non-GFS environment, yet I can't get to 40 users because
GFS slows down the Apache page refresh.
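
To show roughly what I mean, here is a minimal sketch of the kind of test
that reproduces it (the /acct10/testdir path and the file sizes are just
placeholders, not our real workload):

    #!/bin/bash
    # Time N concurrent writers against the GFS mount, stepping N up.
    MOUNT=/acct10
    mkdir -p $MOUNT/testdir
    for N in 10 20 30 40 50; do
        start=$(date +%s)
        for i in $(seq 1 $N); do
            dd if=/dev/zero of=$MOUNT/testdir/writer.$i \
               bs=4k count=256 conv=fsync 2>/dev/null &
        done
        wait
        echo "$N concurrent writers: $(( $(date +%s) - start ))s"
    done
    rm -f $MOUNT/testdir/writer.*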

On Thu, Mar 4, 2010 at 9:12 AM, Alan A <alan.zg at gmail.com> wrote:

> Hello all - GFS2 is what we have deployed. It is a Fibre Channel
> connection/HBA to an HP XP SAN.
>
> *What workload are you tuning for? The chances are that you'll do a lot
> better by adjusting the way in which the application(s) use the
> filesystem rather than tweaking any specific tuning parameters. What
> mount parameters are you using?*
>
> From /etc/fstab:
> /dev/mapper/vg_acct10-lv_acct10 /acct10 gfs
> rw,hostdata=jid=0:id=589826:first=1  0 0
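>
> Side note: noatime/nodiratime are the mount options most often suggested
> for GFS, since otherwise every read ends up updating the inode's atime,
> which on a cluster filesystem means extra lock and journal traffic. A
> sketch of what the fstab line might look like (not something we have
> applied yet):
>
>   /dev/mapper/vg_acct10-lv_acct10 /acct10 gfs rw,noatime,nodiratime  0 0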
>
>
> We decided to delay production (no one is using GFS right now), but here
> are the stats for the GFS mount:
>
> [root@fenclxmrcati11 ~]# gfs_tool counters /acct10
>
>                                   locks 4206
>                              locks held 2046
>                            freeze count 0
>                           incore inodes 2003
>                        metadata buffers 84
>                         unlinked inodes 0
>                               quota IDs 0
>                      incore log buffers 0
>                          log space used 0.10%
>               meta header cache entries 0
>                      glock dependencies 0
>                  glocks on reclaim list 0
>                               log wraps 3
>                    outstanding LM calls 0
>                   outstanding BIO calls 0
>                        fh2dentry misses 0
>                        glocks reclaimed 154841
>                          glock nq calls 15604058
>                          glock dq calls 15600149
>                    glock prefetch calls 3684
>                           lm_lock calls 155504
>                         lm_unlock calls 113531
>                            lm callbacks 290967
>                      address operations 22796796
>                       dentry operations 1532231
>                       export operations 0
>                         file operations 16918046
>                        inode operations 2190281
>                        super operations 10224698
>                           vm operations 201974
>                         block I/O reads 0
>                        block I/O writes 0
> [root@fenclxmrcati11 ~]# gfs_tool stat /acct10
>   mh_magic = 0x01161970
>   mh_type = 4
>   mh_generation = 63
>   mh_format = 400
>   mh_incarn = 0
>   no_formal_ino = 26
>   no_addr = 26
>   di_mode = 0775
>   di_uid = 500
>   di_gid = 500
>   di_nlink = 4
>   di_size = 3864
>   di_blocks = 1
>   di_atime = 1267660812
>   di_mtime = 1265728936
>   di_ctime = 1266338341
>   di_major = 0
>   di_minor = 0
>   di_rgrp = 0
>   di_goal_rgrp = 0
>   di_goal_dblk = 0
>   di_goal_mblk = 0
>   di_flags = 0x00000001
>   di_payload_format = 1200
>   di_type = 2
>   di_height = 0
>   di_incarn = 0
>   di_pad = 0
>   di_depth = 0
>   di_entries = 4
>   no_formal_ino = 0
>   no_addr = 0
>   di_eattr = 0
>   di_reserved =
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00
>
> Flags:
>   jdata
> [root@fenclxmrcati11 ~]# gfs_tool gettune /acct10
> ilimit1 = 100
> ilimit1_tries = 3
> ilimit1_min = 1
> ilimit2 = 500
> ilimit2_tries = 10
> ilimit2_min = 3
> demote_secs = 300
> incore_log_blocks = 1024
> jindex_refresh_secs = 60
> depend_secs = 60
> scand_secs = 5
> recoverd_secs = 60
> logd_secs = 1
> quotad_secs = 5
> inoded_secs = 15
> glock_purge = 0
> quota_simul_sync = 64
> quota_warn_period = 10
> atime_quantum = 3600
> quota_quantum = 60
> quota_scale = 1.0000   (1, 1)
> quota_enforce = 1
> quota_account = 1
> new_files_jdata = 0
> new_files_directio = 0
> max_atomic_write = 4194304
> max_readahead = 262144
> lockdump_size = 131072
> stall_secs = 600
> complain_secs = 10
> reclaim_limit = 5000
> entries_per_readdir = 32
> prefetch_secs = 10
> statfs_slots = 64
> max_mhc = 10000
> greedy_default = 100
> greedy_quantum = 25
> greedy_max = 250
> rgrp_try_threshold = 100
> statfs_fast = 0
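>
> For reference, this is the sort of settune change we had been considering
> (the values are starting points pulled from tuning discussions, not
> anything validated on this system, and they do not persist across a
> remount):
>
>   gfs_tool settune /acct10 glock_purge 50    # trim up to 50% of unused glocks per scan
>   gfs_tool settune /acct10 demote_secs 200   # demote unused locks sooner than the 300s default
>   gfs_tool settune /acct10 statfs_fast 1     # faster (slightly less exact) df/statfs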
>
> On Thu, Mar 4, 2010 at 8:59 AM, Carlos Maiolino <cmaiolino at redhat.com> wrote:
>
>> On Thu, Mar 04, 2010 at 08:33:28AM -0600, Alan A wrote:
>> > We are trying to deploy GFS in production, and are experiencing major
>> > performance issues. What parameters in GFS settune can be changed to
>> > increase I/O and better tune performance? The application we run does a
>> > lot of I/O; please advise.
>> >
>> > We see OK performance at first, but as things ramp up and we get a few
>> > processes going, GFS slows dramatically.
>> >
>> > --
>> > Alan A.
>>
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>> Hi Alan,
>>
>> Are you using GFS1 or GFS2?
>>
>> If you are using GFS1, try GFS2 instead: GFS1 will be deprecated, GFS2
>> has a lot of "auto-tuning" parameters by default, and its performance is
>> better as well.
>> So, take a look at this document:
>>
>>
>> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Global_File_System_2/s1-manage-atimeconf.html
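>>
>> For example (assuming that page still describes the same two options), the
>> simplest change it suggests for a GFS2 mount is remounting with atime
>> updates off:
>>
>>   mount -o remount,noatime /acct10
>>
>> or, if atime is still needed, raising atime_quantum so the updates happen
>> far less often:
>>
>>   gfs2_tool settune /acct10 atime_quantum 86400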
>>
>> see ya
>>
>> --
>> ---
>>
>> Best Regards
>>
>> Carlos Eduardo Maiolino
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
>
> --
> Alan A.
>



-- 
Alan A.