[Linux-cluster] GFS problem

Paras pradhan pradhanparas at gmail.com
Wed Dec 22 21:21:46 UTC 2010


Hi,

This morning I rebooted one node of our 3-node cluster. It came back
up normally, but I then saw repeated INFO messages from GFS2:

--

INFO: task gfs2_quotad:7957 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
gfs2_quotad   D ffff8800ea0cfd30     0  7957     67          7965  7956 (L-TLB)
 ffff8800ea0cfcd0  0000000000000246  0000000000000000  ffff8800f996d800
 000000000000000a  ffff8800fb3a70c0  ffff8800ffceb7e0  000000000000a429
 ffff8800fb3a72a8  0000000000000000
Call Trace:
 [<ffffffff8886d7b8>] :dlm:dlm_put_lockspace+0x10/0x1f
 [<ffffffff8886be5f>] :dlm:dlm_lock+0x117/0x129
 [<ffffffff88910556>] :lock_dlm:gdlm_ast+0x0/0x311
 [<ffffffff889102c1>] :lock_dlm:gdlm_bast+0x0/0x8d
 [<ffffffff88894e8f>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff88894e98>] :gfs2:just_schedule+0x9/0xe
 [<ffffffff80263825>] __wait_on_bit+0x40/0x6e
 [<ffffffff88894e8f>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff802638bf>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8029c45e>] wake_bit_function+0x0/0x23
 [<ffffffff88894e8a>] :gfs2:gfs2_glock_wait+0x2b/0x30
 [<ffffffff888ab456>] :gfs2:gfs2_statfs_sync+0x3f/0x165
 [<ffffffff888ab44e>] :gfs2:gfs2_statfs_sync+0x37/0x165
 [<ffffffff8025dd7c>] del_timer_sync+0xc/0x16
 [<ffffffff888a5277>] :gfs2:quotad_check_timeo+0x20/0x60
 [<ffffffff888a6d46>] :gfs2:gfs2_quotad+0xde/0x214
 [<ffffffff8029c430>] autoremove_wake_function+0x0/0x2e
 [<ffffffff888a6c68>] :gfs2:gfs2_quotad+0x0/0x214
 [<ffffffff8029c218>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80233be4>] kthread+0xfe/0x132
 [<ffffffff80260b2c>] child_rip+0xa/0x12
 [<ffffffff8029c218>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80233ae6>] kthread+0x0/0x132
 [<ffffffff80260b22>] child_rip+0x0/0x12
--
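
A minimal sketch for pulling these hung-task warnings out of a longer
log, assuming the message keeps the exact wording shown above. The names
HUNG_TASK_RE and find_hung_tasks are hypothetical helpers, not part of
any cluster tool.

```python
import re

# Matches the kernel hung-task warning as worded in the report above.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(lines):
    """Return (task_name, pid, seconds) for each hung-task warning found."""
    hits = []
    for line in lines:
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append(
                (m.group("name"), int(m.group("pid")), int(m.group("secs")))
            )
    return hits


# Sample input copied from the trace above; other lines are ignored.
sample = [
    "INFO: task gfs2_quotad:7957 blocked for more than 120 seconds.",
    " [<ffffffff88894e8a>] :gfs2:gfs2_glock_wait+0x2b/0x30",
]
print(find_hung_tasks(sample))
```

Feeding it the whole kernel log instead of the two sample lines would
show whether tasks other than gfs2_quotad were blocked as well.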

clustat was not listing the services either, saying "Service temporarily
unavailable. Try again later..."

Then I ran gfs2_tool df. It printed a few lines and then stopped. I
couldn't run 'ls' on the mounted GFS file systems on any of the three
nodes. Then I rebooted this node once again. After that everything was
normal.

Just wanted to know what might have caused the problem.

The messages log says:


Dec 22 10:53:57 cvprd2 fenced[7379]: fence "xxxx.xxx.xxx" success
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms5.1: jid=0: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms6.1: jid=0: Trying to acquire journal lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms5.1: jid=0: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms6.1: jid=0: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Looking at journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Acquiring the transaction lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Replaying journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Replayed 0 of 0 blocks
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Found 0 revoke tags
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Journal replayed in 0s
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Done
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Acquiring the transaction lock...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Replaying journal...
Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Replayed 1 of 1 blocks
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Found 0 revoke tags
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Journal replayed in 1s
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms3.1: jid=0: Done
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Acquiring the transaction lock...
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Replaying journal...
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Replayed 5 of 5 blocks
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Found 0 revoke tags
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Journal replayed in 0s
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms2.2: jid=1: Done
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms5.1: jid=0: Done
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms6.1: jid=0: Done
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Acquiring the transaction lock...
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Replaying journal...
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Replayed 0 of 0 blocks
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Found 0 revoke tags
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Journal replayed in 0s
Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms4.1: jid=0: Done
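
To make the recovery log above easier to compare across file systems,
here is a rough sketch that groups the GFS2 messages by fsid and jid. It
assumes the syslog layout shown above and rejoins entries that the mail
client wrapped onto a second line. The names unwrap and summarize are
hypothetical helpers, not part of any GFS2 tool.

```python
import re

# An entry starts with a syslog timestamp such as "Dec 22 10:53:57 ".
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
# GFS2 recovery message: file system id, journal id, then the message text.
FSID_RE = re.compile(r"GFS2: fsid=(?P<fsid>\S+): jid=(?P<jid>\d+): (?P<msg>.*)")
REPLAYED_RE = re.compile(r"Replayed (\d+) of (\d+) blocks")


def unwrap(lines):
    """Rejoin log entries that were wrapped onto a continuation line."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def summarize(lines):
    """Map (fsid, jid) to blocks replayed and whether 'Done' was logged."""
    summary = {}
    for entry in unwrap(lines):
        m = FSID_RE.search(entry)
        if not m:
            continue
        key = (m.group("fsid"), int(m.group("jid")))
        info = summary.setdefault(key, {"replayed": None, "done": False})
        msg = m.group("msg").strip()
        r = REPLAYED_RE.search(msg)
        if r:
            info["replayed"] = (int(r.group(1)), int(r.group(2)))
        elif msg == "Done":
            info["done"] = True
    return summary


# A few entries taken from the log above, one of them still wrapped.
sample_log = [
    "Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2:",
    "Replayed 0 of 0 blocks",
    "Dec 22 10:53:57 cvprd2 kernel: GFS2: fsid=vprd:guest_vms1.1: jid=2: Done",
    "Dec 22 10:53:58 cvprd2 kernel: GFS2: fsid=vprd:guest_vms5.1: jid=0: Done",
]
for key, info in summarize(sample_log).items():
    print(key, info)
```

A journal whose "replayed" value stays None, as for guest_vms5.1 in the
sample, logged "Done" without any "Replayed ... blocks" line.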


OS: RHEL 5.5 64 bit (up to date)

Thanks!
Paras.



