[Linux-cluster] umount hang on 2.6.10 and latest GFS
Daniel McNeil
daniel at osdl.org
Fri Jan 28 01:41:10 UTC 2005
I hit a umount hang while running my tests. The tests were
running with 2 nodes mounted, cl030 and cl031. A test had
finished and cl030 was unmounting when the umount hung. cl031
seems fine, with the gfs file system still mounted.
On cl030 the gfs file system is unmounted (not in /proc/mounts),
but the umount is hung trying to stop dlm_astd.
Here's the stack trace:
umount D 00000008 0 10453 10447 (NOTLB)
cdaa8de4 00000082 cdaa8dd4 00000008 00000001 000c0000 00000008 00000002
c1bce798 00000286 e8f782e0 cdaa8dc4 c0116871 e9db55e0 960546f9 c170ef60
00000000 0001fba3 0167aae6 00005e6b f74f8080 f74f81ec c170ef60 00000000
Call Trace:
[<c03ce814>] wait_for_completion+0xa4/0xe0
[<c01326a5>] kthread_stop+0x85/0xae
[<f8aca033>] astd_stop+0x13/0x32 [dlm]
[<f8ad1e51>] dlm_release+0x91/0xa0 [dlm]
[<f8ad2832>] release_lockspace+0x222/0x2f0 [dlm]
[<f8c2f22c>] release_gdlm+0x1c/0x30 [lock_dlm]
[<f8c2f55f>] lm_dlm_unmount+0x4f/0x70 [lock_dlm]
[<f881242c>] lm_unmount+0x3c/0xa0 [lock_harness]
[<f8fb60ef>] gfs_lm_unmount+0x2f/0x40 [gfs]
[<f8fc62ab>] gfs_put_super+0x2fb/0x3a0 [gfs]
[<c0165d67>] generic_shutdown_super+0x127/0x140
[<f8fc337e>] gfs_kill_sb+0x2e/0x69 [gfs]
[<c0165b71>] deactivate_super+0x81/0xa0
[<c017c4dc>] sys_umount+0x3c/0xa0
[<c017c559>] sys_oldumount+0x19/0x20
[<c010323d>] sysenter_past_esp+0x52/0x75
dlm_astd D 00000008 0 10264 6 3235 (L-TLB)
dc9c3ee8 00000046 dc9c3ed8 00000008 00000002 00000800 00000008 c8cc35e0
f7bc0568 5f8a4c1c 0179a889 e4676c5a 00004b2d dc9c3f14 c051c000 c1716f60
00000001 000001b0 0167c50b 00005e6b e9db55e0 e9db574c c1714060 00000000
Call Trace:
[<c03cef7c>] rwsem_down_read_failed+0x9c/0x190
[<f8aca119>] .text.lock.ast+0xc7/0x1de [dlm]
[<f8ac9ea5>] dlm_astd+0x1e5/0x210 [dlm]
[<c013245a>] kthread+0xba/0xc0
[<c0101315>] kernel_thread_helper+0x5/0x10
So, it looks like dlm_astd is stuck on a down_read().
The only down_read I see is in process_asts().
down_read(&ls->ls_in_recovery);
So, it looks like it is blocked on recovery of the lockspace,
but the DLM is not listed in /proc/cluster/services and
/proc/cluster/dlm_locks shows no locks.
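
To make the suspected pattern concrete, here is a small userspace
model of it in Python. This is only a sketch and not the DLM code:
simulate_umount_hang, in_recovery, should_stop and exited are names
made up for illustration. It assumes, as guessed above, that
something still holds ls_in_recovery so that the down_read() in
process_asts() never returns; that assumption is not confirmed by
the traces.

```python
import threading
import time


def simulate_umount_hang(recovery_held: bool, timeout: float = 0.5) -> bool:
    """Return True if the stop request times out, i.e. the umount would hang.

    Hypothetical userspace model; none of these names come from the DLM.
    """
    in_recovery = threading.Lock()    # stands in for ls->ls_in_recovery
    should_stop = threading.Event()   # stands in for the kthread stop flag
    exited = threading.Event()        # stands in for the exit completion

    if recovery_held:
        # Assumption: recovery never released the lock before the unmount.
        in_recovery.acquire()

    def astd() -> None:
        while True:
            # Analogue of down_read(&ls->ls_in_recovery) in process_asts().
            in_recovery.acquire()
            in_recovery.release()
            if should_stop.is_set():
                break
            time.sleep(0.01)
        exited.set()

    worker = threading.Thread(target=astd, daemon=True)
    worker.start()

    # Analogue of kthread_stop(): request the stop, then wait for the exit.
    # The kernel waits without a timeout; the timeout here only lets the
    # model report the hang instead of hanging itself.
    should_stop.set()
    hung = not exited.wait(timeout)

    if recovery_held:
        in_recovery.release()         # let the worker finish so we can clean up
    worker.join()
    return hung


if __name__ == "__main__":
    print("lock held by recovery:", simulate_umount_hang(True))
    print("lock free:", simulate_umount_hang(False))
```

In the model the stop request only completes once the lock is
released, which matches the two traces: umount sits in
wait_for_completion() under kthread_stop() while dlm_astd sits in
rwsem_down_read_failed().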
Full info available here:
http://developer.osdl.org/daniel/GFS/test.25jan2005/
Ideas?
Daniel