[Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster

Jason Huddleston jason.huddleston at verizon.com
Mon Oct 13 21:38:15 UTC 2008


Shawn,
    Looking at the output below, you may want to try increasing 
statfs_slots to 256. Also, if you have any disk-monitoring utilities 
that poll drive usage, you may want to set statfs_fast to 1.
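For example, something along these lines, run against each GFS mount
point (the /users/svn path below is only a placeholder, and as far as I
recall settune values do not survive a remount, so they need to be
reapplied after each mount):

    gfs_tool settune /users/svn statfs_slots 256
    gfs_tool settune /users/svn statfs_fast 1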

---
Jay
Shawn Hood wrote:
> High priority support request, I mean.
>
> On Mon, Oct 13, 2008 at 5:32 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>   
>> As a heads up, I'm about to open a high priority bug on this.  It's
>> crippling us.  Also, I meant to say it is a 4 node cluster, not a 3
>> node.
>>
>> Please let me know if I can provide any more information in addition
>> to this.  I will provide the information from a time series of
>> gfs_tool counters commands with the support request.
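>> (The time series is gathered with something along these lines -- a rough
>> sketch, with the mount point only as an example:)
>>
>>   while true; do date; gfs_tool counters /users/svn; sleep 60; done | tee gfs-counters.log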
>>
>> Shawn
>>
>> On Tue, Oct 7, 2008 at 1:40 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>>     
>>> More info:
>>>
>>> All filesystems mounted using noatime,nodiratime,noquota.
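>>> (For reference, the mounts look roughly like the following; the mount
>>> point here is only an example, and the device path follows from the
>>> VG/LV names below:)
>>>
>>>   mount -t gfs -o noatime,nodiratime,noquota /dev/hq-san/svn_users /users/svn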
>>>
>>> All filesystems report the same data from gfs_tool gettune:
>>>
>>> ilimit1 = 100
>>> ilimit1_tries = 3
>>> ilimit1_min = 1
>>> ilimit2 = 500
>>> ilimit2_tries = 10
>>> ilimit2_min = 3
>>> demote_secs = 300
>>> incore_log_blocks = 1024
>>> jindex_refresh_secs = 60
>>> depend_secs = 60
>>> scand_secs = 5
>>> recoverd_secs = 60
>>> logd_secs = 1
>>> quotad_secs = 5
>>> inoded_secs = 15
>>> glock_purge = 0
>>> quota_simul_sync = 64
>>> quota_warn_period = 10
>>> atime_quantum = 3600
>>> quota_quantum = 60
>>> quota_scale = 1.0000   (1, 1)
>>> quota_enforce = 0
>>> quota_account = 0
>>> new_files_jdata = 0
>>> new_files_directio = 0
>>> max_atomic_write = 4194304
>>> max_readahead = 262144
>>> lockdump_size = 131072
>>> stall_secs = 600
>>> complain_secs = 10
>>> reclaim_limit = 5000
>>> entries_per_readdir = 32
>>> prefetch_secs = 10
>>> statfs_slots = 64
>>> max_mhc = 10000
>>> greedy_default = 100
>>> greedy_quantum = 25
>>> greedy_max = 250
>>> rgrp_try_threshold = 100
>>> statfs_fast = 0
>>> seq_readahead = 0
>>>
>>>
>>> And data on each FS from gfs_tool counters:
>>>                                  locks 2948
>>>                             locks held 1352
>>>                           freeze count 0
>>>                          incore inodes 1347
>>>                       metadata buffers 0
>>>                        unlinked inodes 0
>>>                              quota IDs 0
>>>                     incore log buffers 0
>>>                         log space used 0.05%
>>>              meta header cache entries 0
>>>                     glock dependencies 0
>>>                 glocks on reclaim list 0
>>>                              log wraps 2
>>>                   outstanding LM calls 0
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 0
>>>                       glocks reclaimed 223287
>>>                         glock nq calls 1812286
>>>                         glock dq calls 1810926
>>>                   glock prefetch calls 101158
>>>                          lm_lock calls 198294
>>>                        lm_unlock calls 142643
>>>                           lm callbacks 341621
>>>                     address operations 502691
>>>                      dentry operations 395330
>>>                      export operations 0
>>>                        file operations 199243
>>>                       inode operations 984276
>>>                       super operations 1727082
>>>                          vm operations 0
>>>                        block I/O reads 520531
>>>                       block I/O writes 130315
>>>
>>>                                  locks 171423
>>>                             locks held 85717
>>>                           freeze count 0
>>>                          incore inodes 85376
>>>                       metadata buffers 1474
>>>                        unlinked inodes 0
>>>                              quota IDs 0
>>>                     incore log buffers 24
>>>                         log space used 0.83%
>>>              meta header cache entries 6621
>>>                     glock dependencies 2037
>>>                 glocks on reclaim list 0
>>>                              log wraps 428
>>>                   outstanding LM calls 0
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 0
>>>                       glocks reclaimed 45784677
>>>                         glock nq calls 962822941
>>>                         glock dq calls 962595532
>>>                   glock prefetch calls 20215922
>>>                          lm_lock calls 40708633
>>>                        lm_unlock calls 23410498
>>>                           lm callbacks 64156052
>>>                     address operations 705464659
>>>                      dentry operations 19701522
>>>                      export operations 0
>>>                        file operations 364990733
>>>                       inode operations 98910127
>>>                       super operations 440061034
>>>                          vm operations 7
>>>                        block I/O reads 90394984
>>>                       block I/O writes 131199864
>>>
>>>                                  locks 2916542
>>>                             locks held 1476005
>>>                           freeze count 0
>>>                          incore inodes 1454165
>>>                       metadata buffers 12539
>>>                        unlinked inodes 100
>>>                              quota IDs 0
>>>                     incore log buffers 11
>>>                         log space used 13.33%
>>>              meta header cache entries 9928
>>>                     glock dependencies 110
>>>                 glocks on reclaim list 0
>>>                              log wraps 2393
>>>                   outstanding LM calls 25
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 55546
>>>                       glocks reclaimed 127341056
>>>                         glock nq calls 867427
>>>                         glock dq calls 867430
>>>                   glock prefetch calls 36679316
>>>                          lm_lock calls 110179878
>>>                        lm_unlock calls 84588424
>>>                           lm callbacks 194863553
>>>                     address operations 250891447
>>>                      dentry operations 359537343
>>>                      export operations 390941288
>>>                        file operations 399156716
>>>                       inode operations 537830
>>>                       super operations 1093798409
>>>                          vm operations 774785
>>>                        block I/O reads 258044208
>>>                       block I/O writes 101585172
>>>
>>>
>>>
>>> On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>>>       
>>>> Problem:
>>>> It seems that IO on one machine in the cluster (not always the same
>>>> machine) will hang and all processes accessing clustered LVs will
>>>> block.  Other machines will follow suit shortly thereafter until the
>>>> machine that first exhibited the problem is rebooted (via fence_drac
>>>> manually).  No messages appear in dmesg, syslog, etc.  Filesystems were
>>>> recently fsck'd.
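>>>> (Manual fencing amounts to feeding the agent the same parameters that
>>>> appear in cluster.conf on stdin, roughly as below; values redacted, and
>>>> the agent's default action is typically a reboot:)
>>>>
>>>>   echo -e "ipaddr=<redacted>\nlogin=root\npasswd=<redacted>" | fence_drac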
>>>>
>>>> Hardware:
>>>> Four Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM),
>>>> running RHEL4 ES U7.
>>>> Onboard gigabit NICs (machines use little bandwidth, and all network
>>>> traffic, including DLM, shares the NICs)
>>>> QLogic 2462 PCI-Express dual channel FC HBAs
>>>> QLogic SANBox 5200 FC switch
>>>> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
>>>> Cisco Catalyst switch
>>>>
>>>> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
>>>> x86_64 with the following packages:
>>>> ccs-1.0.12-1
>>>> cman-1.0.24-1
>>>> cman-kernel-smp-2.6.9-55.13.el4_7.1
>>>> cman-kernheaders-2.6.9-55.13.el4_7.1
>>>> dlm-kernel-smp-2.6.9-54.11.el4_7.1
>>>> dlm-kernheaders-2.6.9-54.11.el4_7.1
>>>> fence-1.32.63-1.el4_7.1
>>>> GFS-6.1.18-1
>>>> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>>>>
>>>> One clustered VG.  Striped across two physical volumes, which
>>>> correspond to each side of an Apple XRAID.
>>>> Clustered volume group info:
>>>>  --- Volume group ---
>>>>  VG Name               hq-san
>>>>  System ID
>>>>  Format                lvm2
>>>>  Metadata Areas        2
>>>>  Metadata Sequence No  50
>>>>  VG Access             read/write
>>>>  VG Status             resizable
>>>>  Clustered             yes
>>>>  Shared                no
>>>>  MAX LV                0
>>>>  Cur LV                3
>>>>  Open LV               3
>>>>  Max PV                0
>>>>  Cur PV                2
>>>>  Act PV                2
>>>>  VG Size               4.55 TB
>>>>  PE Size               4.00 MB
>>>>  Total PE              1192334
>>>>  Alloc PE / Size       905216 / 3.45 TB
>>>>  Free  PE / Size       287118 / 1.10 TB
>>>>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>>>>
>>>> Logical volumes contained within the hq-san VG:
>>>>  cam_development   hq-san                          -wi-ao 500.00G
>>>>  qa            hq-san                          -wi-ao   1.07T
>>>>  svn_users         hq-san                          -wi-ao   1.89T
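>>>> (The two-way stripe across the XRAID LUNs can be double-checked with,
>>>> e.g.:)
>>>>
>>>>   lvs --segments -o +devices hq-san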
>>>>
>>>> All four machines mount svn_users, two machines mount qa, and one
>>>> mounts cam_development.
>>>>
>>>> /etc/cluster/cluster.conf:
>>>>
>>>> <?xml version="1.0"?>
>>>> <cluster alias="tungsten" config_version="31" name="qualia">
>>>>        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>>>        <clusternodes>
>>>>                <clusternode name="odin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="odin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="hugin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="hugin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="munin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="munin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="zeus" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="zeus-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>    </clusternodes>
>>>>        <cman expected_votes="1" two_node="0"/>
>>>>        <fencedevices>
>>>>                <resources/>
>>>>                <fencedevice name="odin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="hugin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="munin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="zeus-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>        </fencedevices>
>>>>        <rm>
>>>>        <failoverdomains/>
>>>>        <resources/>
>>>>    </rm>
>>>> </cluster>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Shawn Hood
>>>> 910.670.1819 m
>>>>
>>>>         
>>>
>>> --
>>> Shawn Hood
>>> 910.670.1819 m
>>>
>>>       
>>
>> --
>> Shawn Hood
>> 910.670.1819 m
>>
>>     
>
>
>
>   
