[Linux-cluster] GFS filesystem hangs
Fair, Brian
xbfair at citistreetonline.com
Thu Nov 15 08:40:35 UTC 2007
We have a GFS filesystem (1 of 100 on this server in particular) that will consistently hang. I haven't identified the circumstances around it; there is some speculation that it may occur during heavy usage, but that isn't certain. When this happens, the load average on the system skyrockets.
The mountpoint is /omni_mnt/clients/j2
By "hang" I mean that cd sometimes hangs, ls will hang, and programs and file operations certainly hang. Sometimes it happens just cd'ing into the mountpoint, other times into a large subdirectory.
e.g.:
# cd /omni_mnt/clients/j2
root at hlpom500:[/omni_mnt/clients/j2]
# ls
<normal output>
root at hlpom500:[/omni_mnt/clients/j2]
# cd stmt
root at hlpom500:[/omni_mnt/clients/j2/stmt]
# ls
<hangs here, shell must be killed>
In the past, shutting down and rebooting the 2 systems that mount this gfs has cleared the issue.
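Before resorting to a reboot next time, a less disruptive first step (a sketch, assuming root on the affected node) is to identify which processes are stuck in uninterruptible sleep and dump their kernel stacks, which usually shows where in GFS/DLM they are blocked:

```shell
# List processes in uninterruptible sleep (state D) together with the
# kernel function they are blocked in -- hung GFS operations show up here.
ps -eo pid,stat,wchan:30,args | awk '$2 ~ /D/'

# Dump kernel stack traces of all tasks to the kernel log
# (output lands in dmesg / /var/log/messages).
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
dmesg | tail -n 200
```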
Info:
RHEL ES 4 u5
kernel 2.6.9-55.0.2.ELsmp
GFS 2.6.9-72.2.0.2
I'm not sure what is helpful, but here are some outputs from the system while the fs was hung. I have a lockdump also, but it is 4,650 lines; I can send it along if needed. Any suggestions on data to gather in the future are welcomed.
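For future incidents, something along these lines (a sketch only; the mountpoint and output directory are examples) would snapshot the GFS and cluster state while the filesystem is hung, so the lockdump and counters are captured from the same moment:

```shell
#!/bin/sh
# Snapshot GFS/cluster state for one mountpoint while it is hung.
MP=/omni_mnt/clients/j2                       # mountpoint in question
OUT=/tmp/gfs-debug.$(date +%Y%m%d-%H%M%S)     # where to save the snapshot
mkdir -p "$OUT"

gfs_tool counters "$MP" > "$OUT/counters"       # lock/glock statistics
gfs_tool lockdump "$MP" > "$OUT/lockdump"       # full glock dump
cman_tool services      > "$OUT/services" 2>&1  # DLM/fence service state
cman_tool nodes         > "$OUT/nodes"    2>&1  # cluster membership
echo "saved to $OUT"
```

If the lockdump comes back truncated, `lockdump_size` (131072 in the gettune output below) can be raised via `gfs_tool settune` before capturing.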
Thanks!
Brian Fair
gfs_tool gettune ************************************************************************
ilimit1 = 100
ilimit1_tries = 3
ilimit1_min = 1
ilimit2 = 500
ilimit2_tries = 10
ilimit2_min = 3
demote_secs = 300
incore_log_blocks = 1024
jindex_refresh_secs = 60
depend_secs = 60
scand_secs = 5
recoverd_secs = 60
logd_secs = 1
quotad_secs = 5
inoded_secs = 15
glock_purge = 0
quota_simul_sync = 64
quota_warn_period = 10
atime_quantum = 3600
quota_quantum = 60
quota_scale = 1.0000 (1, 1)
quota_enforce = 1
quota_account = 1
new_files_jdata = 0
new_files_directio = 0
max_atomic_write = 4194304
max_readahead = 262144
lockdump_size = 131072
stall_secs = 600
complain_secs = 10
reclaim_limit = 5000
entries_per_readdir = 32
prefetch_secs = 10
statfs_slots = 64
max_mhc = 10000
greedy_default = 100
greedy_quantum = 25
greedy_max = 250
rgrp_try_threshold = 100
statfs_fast = 0
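One thing that stands out above is glock_purge = 0, meaning unused glocks are never proactively purged; given the very large glock nq/dq call counts in the counters output, some sites experiment with a nonzero purge percentage while chasing glock-related stalls. A hedged example (the values are illustrations, not recommendations, and settune changes do not persist across remounts):

```shell
# Purge up to 50% of unused glocks per scan instead of never (0).
gfs_tool settune /omni_mnt/clients/j2 glock_purge 50

# Optionally demote cached locks sooner than the default 300 seconds.
gfs_tool settune /omni_mnt/clients/j2 demote_secs 200
```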
gfs_tool counters ************************************************************************
locks 246
locks held 127
freeze count 0
incore inodes 101
metadata buffers 4
unlinked inodes 2
quota IDs 3
incore log buffers 0
log space used 0.05%
meta header cache entries 0
glock dependencies 0
glocks on reclaim list 0
log wraps 85
outstanding LM calls 2
outstanding BIO calls 0
fh2dentry misses 0
glocks reclaimed 1316856
glock nq calls 194073094
glock dq calls 193851427
glock prefetch calls 102749
lm_lock calls 903612
lm_unlock calls 833348
lm callbacks 1769983
address operations 71707236
dentry operations 23750382
export operations 0
file operations 139487453
inode operations 38356847
super operations 110620113
vm operations 1052447
block I/O reads 241669
block I/O writes 3295626