<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial"><div>Hi, guys,</div><div>I have a two-node GFS2 cluster built on a logical volume created on the DRBD block device /dev/drbd0. The GFS2 mount point on each node is exported as a Samba share. Two clients each mount one of the shares and copy data into it. Hours later, one client (call it clientA) had finished all of its tasks, while the other client (call it clientB) was still copying at a very slow write speed (2-3 MB/s; normally it is 40-100 MB/s).</div><div>I then suspected that something was wrong with the GFS2 filesystem on the server node that clientB mounts from, so I tried writing some data to it by executing the following command:</div><div>[root@dcs-229 ~]# dd if=/dev/zero of=./data2 bs=128k count=1000</div><div>1000+0 records in<br>1000+0 records out<br>131072000 bytes (131 MB) copied, 183.152 s, 716 kB/s</div><div>This shows that the write speed is far too slow; it almost hangs. When I ran it again, it hung completely. 
Then I terminated it with Ctrl+C, and the kernel reported the following error messages:</div><div><div>Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block</div><div>Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: bh = 25 (magic number)</div><div>Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393</div><div>Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Trying to acquire journal lock...</div><div>Nov 12 11:50:11 dcs-229 kernel: Pid: 12044, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1</div><div>Nov 12 11:50:11 dcs-229 kernel: Call Trace:</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044be22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044bf75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04367d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0431505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0430b48>] ? inode_go_lock+0x88/0xf0 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f25b>] ? do_promote+0x1bb/0x330 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f548>] ? finish_xmote+0x178/0x410 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04303e3>] ? glock_work_func+0x133/0x1d0 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04302b0>] ? glock_work_func+0x0/0x1d0 [gfs2]</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096c80>] ? 
autoremove_wake_function+0x0/0x40</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0</div><div>Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20</div><div>Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Failed</div></div><div>And the other node also reports error messages:</div><div><div>Nov 12 11:48:50 dcs-226 kernel: Pid: 13784, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1</div><div>Nov 12 11:48:50 dcs-226 kernel: Call Trace:</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478e22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478f75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa04637d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045e505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]</div><div>Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045db48>] ? 
inode_go_lock+0x88/0xf0 [gfs2]</div><div>Nov 12 11:48:50 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: fatal: invalid metadata block</div><div>Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: bh = 66213 (magic number)</div><div>Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393</div><div>Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: about to withdraw this file system</div><div>Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: telling LM to unmount</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c25b>] ? do_promote+0x1bb/0x330 [gfs2]</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c548>] ? finish_xmote+0x178/0x410 [gfs2]</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d3e3>] ? glock_work_func+0x133/0x1d0 [gfs2]</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d2b0>] ? glock_work_func+0x0/0x1d0 [gfs2]</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0</div><div>Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20</div></div><div>After this, the mount points crashed. What should I do? Could anyone help me?</div></div>