[Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster

Shawn Hood shawnlhood at gmail.com
Tue Oct 7 17:40:51 UTC 2008


More info:

All filesystems mounted using noatime,nodiratime,noquota.
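
For reference, the mount invocation looks something like this (the mount point is only
an example, not taken from this cluster; the device path follows the hq-san VG /
svn_users LV naming shown further down):

    # mount point is illustrative; device path follows the hq-san/svn_users naming below
    mount -t gfs -o noatime,nodiratime,noquota /dev/hq-san/svn_users /mnt/svn_users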

All filesystems report the same data from gfs_tool gettune:

ilimit1 = 100
ilimit1_tries = 3
ilimit1_min = 1
ilimit2 = 500
ilimit2_tries = 10
ilimit2_min = 3
demote_secs = 300
incore_log_blocks = 1024
jindex_refresh_secs = 60
depend_secs = 60
scand_secs = 5
recoverd_secs = 60
logd_secs = 1
quotad_secs = 5
inoded_secs = 15
glock_purge = 0
quota_simul_sync = 64
quota_warn_period = 10
atime_quantum = 3600
quota_quantum = 60
quota_scale = 1.0000   (1, 1)
quota_enforce = 0
quota_account = 0
new_files_jdata = 0
new_files_directio = 0
max_atomic_write = 4194304
max_readahead = 262144
lockdump_size = 131072
stall_secs = 600
complain_secs = 10
reclaim_limit = 5000
entries_per_readdir = 32
prefetch_secs = 10
statfs_slots = 64
max_mhc = 10000
greedy_default = 100
greedy_quantum = 25
greedy_max = 250
rgrp_try_threshold = 100
statfs_fast = 0
seq_readahead = 0
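
These values were read per mount point with gfs_tool; for anyone checking their own
nodes, the general form is below (the mount point is illustrative, and the settune line
only re-applies the demote_secs value already shown above):

    # dump the tunables for one GFS mount (mount point is illustrative)
    gfs_tool gettune /mnt/svn_users
    # settune changes a tunable at runtime; this just re-applies the value shown above
    gfs_tool settune /mnt/svn_users demote_secs 300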


And data on the filesystems from gfs_tool counters:
                                  locks 2948
                             locks held 1352
                           freeze count 0
                          incore inodes 1347
                       metadata buffers 0
                        unlinked inodes 0
                              quota IDs 0
                     incore log buffers 0
                         log space used 0.05%
              meta header cache entries 0
                     glock dependencies 0
                 glocks on reclaim list 0
                              log wraps 2
                   outstanding LM calls 0
                  outstanding BIO calls 0
                       fh2dentry misses 0
                       glocks reclaimed 223287
                         glock nq calls 1812286
                         glock dq calls 1810926
                   glock prefetch calls 101158
                          lm_lock calls 198294
                        lm_unlock calls 142643
                           lm callbacks 341621
                     address operations 502691
                      dentry operations 395330
                      export operations 0
                        file operations 199243
                       inode operations 984276
                       super operations 1727082
                          vm operations 0
                        block I/O reads 520531
                       block I/O writes 130315

                                  locks 171423
                             locks held 85717
                           freeze count 0
                          incore inodes 85376
                       metadata buffers 1474
                        unlinked inodes 0
                              quota IDs 0
                     incore log buffers 24
                         log space used 0.83%
              meta header cache entries 6621
                     glock dependencies 2037
                 glocks on reclaim list 0
                              log wraps 428
                   outstanding LM calls 0
                  outstanding BIO calls 0
                       fh2dentry misses 0
                       glocks reclaimed 45784677
                         glock nq calls 962822941
                         glock dq calls 962595532
                   glock prefetch calls 20215922
                          lm_lock calls 40708633
                        lm_unlock calls 23410498
                           lm callbacks 64156052
                     address operations 705464659
                      dentry operations 19701522
                      export operations 0
                        file operations 364990733
                       inode operations 98910127
                       super operations 440061034
                          vm operations 7
                        block I/O reads 90394984
                       block I/O writes 131199864

                                  locks 2916542
                             locks held 1476005
                           freeze count 0
                          incore inodes 1454165
                       metadata buffers 12539
                        unlinked inodes 100
                              quota IDs 0
                     incore log buffers 11
                         log space used 13.33%
              meta header cache entries 9928
                     glock dependencies 110
                 glocks on reclaim list 0
                              log wraps 2393
                   outstanding LM calls 25
                  outstanding BIO calls 0
                       fh2dentry misses 55546
                       glocks reclaimed 127341056
                         glock nq calls 867427
                         glock dq calls 867430
                   glock prefetch calls 36679316
                          lm_lock calls 110179878
                        lm_unlock calls 84588424
                           lm callbacks 194863553
                     address operations 250891447
                      dentry operations 359537343
                      export operations 390941288
                        file operations 399156716
                       inode operations 537830
                       super operations 1093798409
                          vm operations 774785
                        block I/O reads 258044208
                       block I/O writes 101585172
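
The counters above come from gfs_tool as well; a simple loop like the one below (mount
point again illustrative) is enough to snapshot them periodically while waiting for the
hang to recur:

    # snapshot the counters for one GFS mount every 30 seconds
    while true; do
        date
        gfs_tool counters /mnt/svn_users
        sleep 30
    done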



On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
> Problem:
> It seems that IO on one machine in the cluster (not always the same
> machine) will hang and all processes accessing clustered LVs will
> block.  Other machines will follow suit shortly thereafter until the
> machine that first exhibited the problem is rebooted (via fence_drac
> manually).  No messages in dmesg, syslog, etc.  Filesystems recently
> fsckd.
>
> Hardware:
> Four Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM),
> running RHEL4 ES U7.
> Onboard gigabit NICs (machines use little bandwidth, and all network
> traffic, including DLM, shares the NICs)
> QLogic 2462 PCI-Express dual channel FC HBAs
> QLogic SANBox 5200 FC switch
> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
> Cisco Catalyst switch
>
> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
> x86_64 with the following packages:
> ccs-1.0.12-1
> cman-1.0.24-1
> cman-kernel-smp-2.6.9-55.13.el4_7.1
> cman-kernheaders-2.6.9-55.13.el4_7.1
> dlm-kernel-smp-2.6.9-54.11.el4_7.1
> dlm-kernheaders-2.6.9-54.11.el4_7.1
> fence-1.32.63-1.el4_7.1
> GFS-6.1.18-1
> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>
> One clustered VG.  Striped across two physical volumes, which
> correspond to each side of an Apple XRAID.
> Clustered volume group info:
>  --- Volume group ---
>  VG Name               hq-san
>  System ID
>  Format                lvm2
>  Metadata Areas        2
>  Metadata Sequence No  50
>  VG Access             read/write
>  VG Status             resizable
>  Clustered             yes
>  Shared                no
>  MAX LV                0
>  Cur LV                3
>  Open LV               3
>  Max PV                0
>  Cur PV                2
>  Act PV                2
>  VG Size               4.55 TB
>  PE Size               4.00 MB
>  Total PE              1192334
>  Alloc PE / Size       905216 / 3.45 TB
>  Free  PE / Size       287118 / 1.10 TB
>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>
> Logical volumes contained within the hq-san VG:
>  cam_development   hq-san                          -wi-ao 500.00G
>  qa            hq-san                          -wi-ao   1.07T
>  svn_users         hq-san                          -wi-ao   1.89T
>
> All four machines mount svn_users, two machines mount qa, and one
> mounts cam_development.
>
> /etc/cluster/cluster.conf:
>
> <?xml version="1.0"?>
> <cluster alias="tungsten" config_version="31" name="qualia">
>         <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>         <clusternodes>
>                 <clusternode name="odin" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename="" name="odin-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="hugin" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename="" name="hugin-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="munin" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename="" name="munin-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="zeus" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename="" name="zeus-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>         </clusternodes>
>         <cman expected_votes="1" two_node="0"/>
>         <fencedevices>
>                 <resources/>
>                 <fencedevice name="odin-drac" agent="fence_drac"
>                              ipaddr="redacted" login="root" passwd="redacted"/>
>                 <fencedevice name="hugin-drac" agent="fence_drac"
>                              ipaddr="redacted" login="root" passwd="redacted"/>
>                 <fencedevice name="munin-drac" agent="fence_drac"
>                              ipaddr="redacted" login="root" passwd="redacted"/>
>                 <fencedevice name="zeus-drac" agent="fence_drac"
>                              ipaddr="redacted" login="root" passwd="redacted"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains/>
>                 <resources/>
>         </rm>
> </cluster>
>
>
>
>
> --
> Shawn Hood
> 910.670.1819 m
>



-- 
Shawn Hood
910.670.1819 m



