[Linux-cluster] GFS filesystem "hang" with cluster-1.03.00

Josef Whiter jwhiter at redhat.com
Fri Oct 20 12:41:18 UTC 2006


In your previous message you asked about the latency.  With gfs1, there is a
certain amount of latency involved with stat calls, so ls -al, du, and df all
take a great deal of time comparatively.  With these calls you first have to
traverse the FS in order to cache all the information about the files, so every
lookup requires a lock on each directory leading to the file, and then a lock on
the file itself in order to read its information off of the disk.  That's just
the lookup; then we have to grab a shared lock again to get the stat information
from the file.  Each lock, mind you, requires exporting the lock to all of the
other nodes so they know about it, and getting confirmation back on that lock.
So for every stat lookup you are looking at, at the very least, 2 separate
locks: one for the lookup and then one for the stat.  Every subsequent call is
faster because the lookups no longer require the locks to look up the file, as
the inode information is now cached, so we just need the lock for the file.
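
To put rough numbers on the description above, here is a small counting sketch.
It is only a simplified model of the wording in this message, not of the actual
gfs1 lock code: the function name locks_for_stat is made up for illustration, the
path is assumed to be relative to the mount point, and the mount's root
directory is not counted.

```python
def locks_for_stat(path, cached):
    """Rough number of lock acquisitions for one stat call.

    Simplified model of the description in this message, not of gfs1
    internals. `path` is taken relative to the mount point (assumption).
    """
    parts = [p for p in path.split("/") if p]
    if cached:
        # Inode information is already cached, so the lookup needs no
        # locks; only the shared lock for the stat itself remains.
        return 1
    dir_locks = len(parts) - 1   # one lock per directory leading to the file
    lookup_file_lock = 1         # lock on the file to read it off the disk
    stat_lock = 1                # shared lock to get the stat information
    return dir_locks + lookup_file_lock + stat_lock


if __name__ == "__main__":
    print(locks_for_stat("data/logs/app.log", cached=False))  # 4
    print(locks_for_stat("data/logs/app.log", cached=True))   # 1
```

Since each of those locks also has to be exported to the other nodes and
confirmed, the first ls -al over a large tree pays this cost once per file.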

If gfs_tool counters is stuck, you'll want to get a couple of instances of
sysrq-t from all nodes and see if you can see who is hanging, whether it's in D
state or if the particular process just isn't making progress.
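
On most kernels sysrq-t can be triggered as root by writing "t" to
/proc/sysrq-trigger, with the task dump landing in the kernel log. As a quick
complement (not a replacement, since it shows no stack traces), the sketch below
lists processes currently in D state by reading /proc/<pid>/stat. The function
names are made up for illustration, and it only sees the node it runs on, so it
would have to be run on every node.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process in uninterruptible sleep."""
    if not os.path.isdir(proc_root):
        return []
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the record was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return tasks


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Running it a few times a couple of seconds apart shows whether the same
processes stay in D state, which is the same idea as taking more than one
sysrq-t.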

Josef

On Fri, Oct 20, 2006 at 12:48:29PM +0200, Ramon van Alteren wrote:
> Some additional info,
> 
> I ran gfs_tool counters <mountpoint> on all stuck nodes
> They all seem to have a large amount of outstanding BIO calls.
> 
> Any way I can find out what is causing this ?
> Any other information to look for ?
> 
> As far as I can see, there's no bottleneck at the shared coraid storage AoE.
> 
> gfs_tool counters output on: mrcluster1
> 
>                                   locks 3024
>                              locks held 108
>                           incore inodes 12
>                        metadata buffers 166
>                         unlinked inodes 0
>                               quota IDs 0
>                      incore log buffers 1
>                          log space used 0.15%
>               meta header cache entries 9998
>                      glock dependencies 33
>                  glocks on reclaim list 0
>                               log wraps 8
>                    outstanding LM calls 0
>                   outstanding BIO calls 805
>                        fh2dentry misses 0
>                        glocks reclaimed 5992
>                          glock nq calls 4784596
>                          glock dq calls 4784236
>                    glock prefetch calls 25
>                           lm_lock calls 6341
>                         lm_unlock calls 5269
>                            lm callbacks 11907
>                      address operations 254155990
>                       dentry operations 1978
>                       export operations 0
>                         file operations 1457916
>                        inode operations 5489
>                        super operations 1577443
>                           vm operations 0
>                         block I/O reads 136200
>                        block I/O writes 81780453
> gfs_tool counters output on: mrcluster2
> 
>                                   locks 3020
>                              locks held 107
>                           incore inodes 10
>                        metadata buffers 604
>                         unlinked inodes 0
>                               quota IDs 0
>                      incore log buffers 1
>                          log space used 0.15%
>               meta header cache entries 10000
>                      glock dependencies 34
>                  glocks on reclaim list 0
>                               log wraps 8
>                    outstanding LM calls 0
>                   outstanding BIO calls 490
>                        fh2dentry misses 0
>                        glocks reclaimed 3725
>                          glock nq calls 4755003
>                          glock dq calls 4754287
>                    glock prefetch calls 12
>                           lm_lock calls 4364
>                         lm_unlock calls 3017
>                            lm callbacks 8140
>                      address operations 252523873
>                       dentry operations 1957
>                       export operations 0
>                         file operations 1444785
>                        inode operations 5425
>                        super operations 1564779
>                           vm operations 0
>                         block I/O reads 135658
>                        block I/O writes 81574696
> gfs_tool counters output on: mrcluster3
> 
>                                   locks 3018
>                              locks held 135
>                           incore inodes 9
>                        metadata buffers 1
>                         unlinked inodes 0
>                               quota IDs 0
>                      incore log buffers 1
>                          log space used 0.15%
>               meta header cache entries 9997
>                      glock dependencies 20
>                  glocks on reclaim list 0
>                               log wraps 25
>                    outstanding LM calls 0
>                   outstanding BIO calls 191
>                        fh2dentry misses 0
>                        glocks reclaimed 11097
>                          glock nq calls 15308139
>                          glock dq calls 15307573
>                    glock prefetch calls 13
>                           lm_lock calls 8734
>                         lm_unlock calls 7813
>                            lm callbacks 17167
>                      address operations 941469125
>                       dentry operations 5308
>                       export operations 0
>                         file operations 4730084
>                        inode operations 17157
>                        super operations 5170894
>                           vm operations 0
>                         block I/O reads 333851
>                        block I/O writes 4449228
> gfs_tool counters output on: mrcluster4
> 
>                                   locks 3017
>                              locks held 206
>                           incore inodes 7
>                        metadata buffers 2945
>                         unlinked inodes 0
>                               quota IDs 2
>                      incore log buffers 5
>                          log space used 0.24%
>               meta header cache entries 9343
>                      glock dependencies 54
>                  glocks on reclaim list 0
>                               log wraps 2
>                    outstanding LM calls 0
>                   outstanding BIO calls 249
>                        fh2dentry misses 0
>                        glocks reclaimed 2075
>                          glock nq calls 1485236
>                          glock dq calls 1485023
>                    glock prefetch calls 0
>                           lm_lock calls 1967
>                         lm_unlock calls 1326
>                            lm callbacks 3657
>                      address operations 66603382
>                       dentry operations 1747
>                       export operations 0
>                         file operations 457700
>                        inode operations 4821
>                        super operations 484938
>                           vm operations 0
>                         block I/O reads 28969
>                        block I/O writes 21973543
> 
> Grtz Ramon
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
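
For comparing the counters above across nodes, a minimal parsing sketch follows.
It assumes the layout shown in the quoted output (a "gfs_tool counters output
on: <node>" line followed by "name value" lines, optionally prefixed with "> ");
the function name parse_counters is made up for illustration and the sample is a
shortened excerpt of the numbers posted above.

```python
MARKER = "gfs_tool counters output on:"

SAMPLE = """\
> gfs_tool counters output on: mrcluster1
>
>                                   locks 3024
>                          log space used 0.15%
>                   outstanding BIO calls 805
> gfs_tool counters output on: mrcluster2
>
>                                   locks 3020
>                          log space used 0.15%
>                   outstanding BIO calls 490
"""


def parse_counters(text):
    """Parse pasted counters output into {node: {counter: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        # Drop the mail quoting prefix and the right-alignment padding.
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        if line.startswith(MARKER):
            current = line[len(MARKER):].strip()
            nodes[current] = {}
            continue
        if current is None:
            continue  # text before the first node header
        # The value is the last whitespace-separated field on the line.
        name, _, value = line.rpartition(" ")
        if not name:
            continue
        # Keep non-integer values such as "0.15%" as strings.
        nodes[current][name.strip()] = int(value) if value.isdigit() else value
    return nodes


if __name__ == "__main__":
    for node, counters in parse_counters(SAMPLE).items():
        print(node, counters["outstanding BIO calls"])
```

Feeding it two snapshots taken a short time apart makes it easy to see which
counters are still moving on a node and which are stuck.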



