[Linux-cluster] GFS fail with iozone

Steven Whitehouse swhiteho at redhat.com
Fri Sep 21 13:56:32 UTC 2012


Hi,

On Fri, 2012-09-21 at 15:48 +0200, Andrew Holway wrote:
> On Sep 21, 2012, at 10:34 AM, Steven Whitehouse wrote:
> 
> > Hi,
> > 
> > On Thu, 2012-09-20 at 16:25 +0200, Andrew Holway wrote:
> >> It seems that my node004 is the problem.
> >> 
> >> I cannot kill the iozone processes and I find this in the logs.
> >> 
> > This looks like there is some problem with the i/o stack below the level
> > of GFS2. What kind of storage are you using? If this is a JBOD then
> > perhaps there is a faulty disk or something like that?
> 
> Why do you say that?
> 
Based on your logs below....

> It did it again, but I have no indication from my storage brick that there is an issue. It does appear that it was the same node (node004) that caused the issue again.
> 
> The other three stopped doing IO for some time and then resumed.
> 
> The node004 died completely
> 
> I will run with loglevel=TRACE now.
> 
> Thanks,
> 
> Andrew
> 
> node004 messages
> Sep 21 11:28:50 node004 dlm_controld[22407]: dlm_controld 3.0.12.1 started
> Sep 21 11:28:51 node004 gfs_controld[22456]: gfs_controld 3.0.12.1 started
> Sep 21 11:28:59 node004 kernel: dlm: Using TCP for communications
> Sep 21 11:29:00 node004 clvmd: Cluster LVM daemon started - connected to CMAN
> Sep 21 11:29:00 node004 kernel: dlm: connecting to 2
> Sep 21 11:29:00 node004 kernel: dlm: connecting to 1
> Sep 21 11:29:01 node004 kernel: dlm: connecting to 3
> Sep 21 11:30:06 node004 kernel: GFS2 (built Jun 22 2012 12:21:46) installed
> Sep 21 11:30:06 node004 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "nimble_cluster:gfs_test"
> Sep 21 11:30:06 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: Joined cluster. Now mounting FS...
> Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2, already locked for use
> Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2: Looking at journal...
> Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2: Done
> Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 11:38:24 node004 xinetd[5015]: START: node-state pid=23019 from=::ffff:10.141.255.254
> Sep 21 11:38:24 node004 xinetd[5015]: EXIT: node-state status=0 pid=23019 duration=0(sec)
> Sep 21 11:39:23 node004 xinetd[5015]: START: node-state pid=23038 from=::ffff:10.141.255.254
> Sep 21 11:39:23 node004 xinetd[5015]: EXIT: node-state status=0 pid=23038 duration=0(sec)
> Sep 21 11:39:25 node004 xinetd[5015]: START: node-state pid=23057 from=::ffff:10.141.255.254
> Sep 21 11:39:25 node004 xinetd[5015]: EXIT: node-state status=0 pid=23057 duration=0(sec)
> Sep 21 11:39:40 node004 xinetd[5015]: START: node-state pid=23075 from=::ffff:10.141.255.254
> Sep 21 11:39:40 node004 xinetd[5015]: EXIT: node-state status=0 pid=23075 duration=0(sec)
> Sep 21 11:39:45 node004 xinetd[5015]: START: node-state pid=23097 from=::ffff:10.141.255.254
> Sep 21 11:39:45 node004 xinetd[5015]: EXIT: node-state status=0 pid=23097 duration=0(sec)
> Sep 21 11:39:53 node004 xinetd[5015]: START: node-state pid=23119 from=::ffff:10.141.255.254
> Sep 21 11:39:53 node004 xinetd[5015]: EXIT: node-state status=0 pid=23119 duration=0(sec)
> Sep 21 11:39:54 node004 rpc.statd[23170]: Version 1.2.3 starting
> Sep 21 11:39:54 node004 sm-notify[23171]: Version 1.2.3 starting
> Sep 21 11:40:50 node004 rpc.statd[23215]: Version 1.2.3 starting
> Sep 21 11:40:50 node004 sm-notify[23216]: Version 1.2.3 starting
> Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]

This is a SCSI error of some kind: the hpsa driver is logging a check
condition for each of these commands.
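
To narrow down which kind of SCSI error this is, the sense fields in the hpsa lines above can be decoded. The following is only a minimal sketch: the function name decode_hpsa_line and the lookup tables are made up for illustration, and the tables cover only the values that appear in this log (names taken from the SCSI standards), not the full code space.

```python
import re

# Lookup tables limited to the values seen in the log above.
# The names are the standard SCSI ones; anything else decodes as "unknown".
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}
OPCODES = {0x4D: "LOG SENSE", 0x37: "READ DEFECT DATA(10)"}

# Matches the sense fields and the CDB bytes of an hpsa check-condition line.
PATTERN = re.compile(
    r"Sense: (0x[0-9a-fA-F]+), ASC: (0x[0-9a-fA-F]+), ASCQ: (0x[0-9a-fA-F]+),"
    r".*cmd=\[([0-9a-fA-F ]+)\]"
)


def decode_hpsa_line(line):
    """Return decoded sense key, ASC/ASCQ and opcode, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    key, asc, ascq = (int(m.group(i), 16) for i in (1, 2, 3))
    # The first byte of the CDB is the SCSI operation code.
    opcode = int(m.group(4).split()[0], 16)
    return {
        "sense_key": SENSE_KEYS.get(key, "unknown"),
        "asc_ascq": ASC_ASCQ.get((asc, ascq), "unknown"),
        "opcode": OPCODES.get(opcode, "unknown"),
    }
```

Feeding it one of the lines above should report an ILLEGAL REQUEST with INVALID COMMAND OPERATION CODE for a LOG SENSE or READ DEFECT DATA(10) command. Whether those rejected commands are related to the later hang cannot be determined from the decode alone.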

> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23804 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23804  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880dd06db958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880dd06db928 000000004280d602 0000000000000000 ffff880ff308f380
> Sep 21 12:54:43 node004 kernel: ffff88100c25b058 ffff880dd06dbfd8 000000000000fb88 ffff88100c25b058
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff81179b51>] ? generic_file_llseek_unlocked+0x1/0x80
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117c4b9>] ? fget_light+0x19/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

The above shows GFS2 stuck doing a direct I/O write. The task is blocked
in io_schedule(), i.e. it is waiting for an I/O request to complete.
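
With this many hung-task reports it can help to summarise them rather than read each trace by hand. The sketch below is an assumption-laden helper (the name blocked_tasks and the output layout are invented for illustration) that groups the "blocked for more than" reports by task and keeps only the reliable stack frames, skipping the ones the kernel marks with "?".

```python
import re

# "INFO: task iozone:23804 blocked for more than 120 seconds."
HUNG = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# "[<ffffffff814fdfc3>] io_schedule+0x73/0xc0" with optional "? " and "[module]"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def blocked_tasks(lines):
    """Collect hung-task reports with their reliable call-trace frames."""
    tasks = []
    for line in lines:
        m = HUNG.search(line)
        if m:
            tasks.append({"comm": m.group(1), "pid": int(m.group(2)), "frames": []})
            continue
        f = FRAME.search(line)
        # Frames prefixed with "?" are unreliable stack leftovers; skip them.
        if f and tasks and not f.group(1):
            tasks[-1]["frames"].append((f.group(2), f.group(3)))
    return tasks
```

For the traces in this mail, the first reliable frame of every task would be io_schedule, reached via gfs2_direct_IO, which is what the comment above is based on.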


> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23805 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23805  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880dd073b958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880dd073b928 000000002c279ff5 0000000000000000 ffff8810048e10c0
> Sep 21 12:54:43 node004 kernel: ffff881015abdaf8 ffff880dd073bfd8 000000000000fb88 ffff881015abdaf8
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811937f0>] ? dput+0x0/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23806 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000008     0 23806  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880dd070d958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880dd070d928 00000000780f0b6e 0000000000000000 ffff882006c04a40
> Sep 21 12:54:43 node004 kernel: ffff88100d04d058 ffff880dd070dfd8 000000000000fb88 ffff88100d04d058
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116681>] ? generic_file_aio_write+0x1/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23807 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23807  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880dd06ab958 0000000000000082 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880dd06ab928 00000000b2eadd67 0000000000000000 ffff881004891ec0
> Sep 21 12:54:43 node004 kernel: ffff881015ff8638 ffff880dd06abfd8 000000000000fb88 ffff881015ff8638
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff810623da>] ? __cond_resched+0x2a/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff811bb89f>] ? inotify_inode_queue_event+0x2f/0x120
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23808 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23808  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff88100eb19958 0000000000000082 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff88100eb19928 00000000291dd6ee 0000000000000000 ffff881004891a40
> Sep 21 12:54:43 node004 kernel: ffff88100bb3b098 ffff88100eb19fd8 000000000000fb88 ffff88100bb3b098
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fea54>] ? mutex_unlock+0x14/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23809 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23809  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880fe76d3958 0000000000000082 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880fe76d3928 0000000075c0b5d9 0000000000000000 ffff880eb83d5bc0
> Sep 21 12:54:43 node004 kernel: ffff880ff10f65f8 ffff880fe76d3fd8 000000000000fb88 ffff880ff10f65f8
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23810 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23810  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880e4e045958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880e4e045928 00000000ab0cddda 0000000000000000 ffff88100d2fd6c0
> Sep 21 12:54:43 node004 kernel: ffff8810079ba5f8 ffff880e4e045fd8 000000000000fb88 ffff8810079ba5f8
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811aa034>] ? generic_write_sync+0x24/0x50
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdee>] ? reschedule_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23811 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000008     0 23811  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff88100c499958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff88100c499928 000000004b5eff48 0000000000000000 ffff88201678ac80
> Sep 21 12:54:43 node004 kernel: ffff88100db61ab8 ffff88100c499fd8 000000000000fb88 ffff88100db61ab8
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811aa01d>] ? generic_write_sync+0xd/0x50
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23813 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23813  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880dd073f958 0000000000000086 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880dd073f928 00000000f66794ee 0000000000000000 ffff881007a7ba80
> Sep 21 12:54:43 node004 kernel: ffff880fe78c7af8 ffff880dd073ffd8 000000000000fb88 ffff880fe78c7af8
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fea41>] ? mutex_unlock+0x1/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811937f0>] ? dput+0x0/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> Sep 21 12:54:43 node004 kernel: INFO: task iozone:23814 blocked for more than 120 seconds.
> Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 21 12:54:43 node004 kernel: iozone        D 0000000000000011     0 23814  22911 0x00000080
> Sep 21 12:54:43 node004 kernel: ffff880e4e119958 0000000000000082 0000000000000000 ffffffffa01bf1fc
> Sep 21 12:54:43 node004 kernel: ffff880e4e119928 00000000fa09a3ba 0000000000000000 ffff88100d2fd300
> Sep 21 12:54:43 node004 kernel: ffff880fe78c7098 ffff880e4e119fd8 000000000000fb88 ffff880fe78c7098
> Sep 21 12:54:43 node004 kernel: Call Trace:
> Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
> Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
> Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
> Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff811aa034>] ? generic_write_sync+0x24/0x50
> Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
> Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
> Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
> Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
> Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
> Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b


The messages below are SCSI errors from sdj: write commands timing out after 180s.

> Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 08 69 0a 88 00 00 20 00
> Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 03 ce 6a c0 00 00 20 00
> Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 81 f2 28 00 00 20 00
> Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 e6 0b 58 00 00 20 00
> Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 48 cb 48 00 00 20 00
> Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 07 64 ad 88 00 00 20 00
> Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 05 79 27 80 00 00 20 00
> Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 6c 26 60 00 00 20 00
> Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 04 de 20 c8 00 00 20 00
> Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 03 2a 6c 20 00 00 20 00
> Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 92 25 28 00 00 20 00
> Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 03 33 90 00 00 20 00
> Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 23 7e 40 00 00 20 00
> Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 07 73 c9 18 00 00 20 00
> Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 4f 6a a8 00 00 20 00
> Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 04 3e 9c c8 00 00 20 00
> Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 00 09 ea c8 00 00 28 00
> Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80985
> Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0
> Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80986
> Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0
> Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80987
> Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0
> Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80988
> Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0
> Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80989
> Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0

Now the buffer layer is complaining that writes to dm-0 are failing,
presumably as a result of the SCSI errors seen above.
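
As a rough cross-check, the Write(10) CDBs in the log can be decoded by
hand and compared with the failing dm-0 blocks. The sketch below is only
an illustration: the function name is made up, and it assumes 512-byte
sectors on sdj and a 4096-byte filesystem block size on dm-0, neither of
which is stated in the logs. The CDB layout itself (opcode 0x2a, 4-byte
big-endian LBA at bytes 2-5, 2-byte transfer length at bytes 7-8) is the
standard Write(10) format.

```python
SECTOR_SIZE = 512      # assumption: sdj uses 512-byte logical sectors
FS_BLOCK_SIZE = 4096   # assumption: dm-0 "logical block" is 4096 bytes


def decode_write10(cdb_hex):
    """Return (lba, sector_count) for a Write(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")    # bytes 7-8: transfer length
    return lba, count


# (CDB from the sdj timeout, first dm-0 block reported right after it)
samples = [
    ("2a 00 00 09 ea c8 00 00 28 00", 80985),  # 12:59:05
    ("2a 00 00 09 ea f0 00 00 08 00", 80990),  # 13:02:05
]

sectors_per_block = FS_BLOCK_SIZE // SECTOR_SIZE
for cdb_hex, first_block in samples:
    lba, count = decode_write10(cdb_hex)
    fs_blocks = count // sectors_per_block
    offset = lba - first_block * sectors_per_block
    print(f"lba={lba} sectors={count} fs_blocks={fs_blocks} offset={offset}")
```

Under those assumptions the first CDB covers 40 sectors, i.e. 5 blocks,
which matches the five dm-0 blocks 80985-80989 reported at 12:59:05, and
both CDBs sit at the same 2048-sector offset from the dm-0 block numbers.
That is consistent with dm-0 being mapped onto sdj, but it is an
inference from the numbers, not something the logs state directly.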


> Sep 21 13:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 13:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
> Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
> Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 00 09 ea f0 00 00 08 00
> Sep 21 13:02:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80990
> Sep 21 13:02:05 node004 kernel: lost page write due to I/O error on dm-0
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: fatal: I/O error
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2:   block = 80990
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2:   function = log_write_header, file = fs/gfs2/log.c, line = 616
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: about to withdraw this file system
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: telling LM to unmount
> Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: withdrawn
> Sep 21 13:02:05 node004 kernel: Pid: 22758, comm: glock_workqueue Not tainted 2.6.32-279.el6.x86_64 #1
> Sep 21 13:02:05 node004 kernel: Call Trace:
> Sep 21 13:02:05 node004 kernel: [<ffffffffa09ad062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffff814fea28>] ? out_of_line_wait_on_bit+0x78/0x90
> Sep 21 13:02:05 node004 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> Sep 21 13:02:05 node004 kernel: [<ffffffffa09ad0d0>] ? gfs2_io_error_bh_i+0x40/0x50 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffff811adfb6>] ? __wait_on_buffer+0x26/0x30
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0995288>] ? log_write_header+0x3a8/0x490 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0995951>] ? gfs2_log_flush+0x301/0x6f0 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0
> Sep 21 13:02:05 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 13:02:05 node004 kernel: [<ffffffffa09927d0>] ? inode_go_sync+0x80/0x160 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0991336>] ? do_xmote+0x156/0x280 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0991551>] ? run_queue+0xf1/0x1d0 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0991d2a>] ? glock_work_func+0x7a/0x1b0 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffffa0991cb0>] ? glock_work_func+0x0/0x1b0 [gfs2]
> Sep 21 13:02:05 node004 kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
> Sep 21 13:02:05 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> Sep 21 13:02:05 node004 kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
> Sep 21 13:02:05 node004 kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0
> Sep 21 13:02:05 node004 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
> Sep 21 13:02:05 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
> Sep 21 13:02:05 node004 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
> Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> 
> 
> 
> > 
> > Steve.
> > 
> >> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 01 94 88 a0 00 00 20 00
> >> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 51 ff 90 00 00 20 00
> >> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 46 d5 c0 00 00 20 00
> >> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 03 da c7 78 00 00 20 00
> >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 f5 8f 60 00 00 20 00
> >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 01 30 7c 90 00 00 20 00
> >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 05 79 8b e0 00 00 20 00
> >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 04 37 13 08 00 00 20 00
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
> >> Sep 20 16:02:15 node004 kernel: INFO: task glock_workqueue:9820 blocked for more than 120 seconds.
> >> Sep 20 16:02:15 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> Sep 20 16:02:15 node004 kernel: glock_workque D 000000000000001b     0  9820      2 0x00000080
> >> Sep 20 16:02:15 node004 kernel: ffff8820150a9c70 0000000000000046 0000000000000004 00000000aa8f20cf
> >> Sep 20 16:02:15 node004 kernel: ffff881fffd050c8 0000000000000441 ffff8820150a9c10 ffffffff811acd5e
> >> Sep 20 16:02:15 node004 kernel: ffff882015b39ab8 ffff8820150a9fd8 000000000000fb88 ffff882015b39ab8
> >> Sep 20 16:02:15 node004 kernel: Call Trace:
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff811acd5e>] ? submit_bh+0x10e/0x150
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff8109cd39>] ? ktime_get_ts+0xa9/0xe0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa094aaca>] gfs2_log_flush+0x47a/0x6f0 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa09477d0>] inode_go_sync+0x80/0x160 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946336>] do_xmote+0x156/0x280 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946551>] run_queue+0xf1/0x1d0 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946d2a>] glock_work_func+0x7a/0x1b0 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946cb0>] ? glock_work_func+0x0/0x1b0 [gfs2]
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff8108c760>] worker_thread+0x170/0x2a0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff81091d66>] kthread+0x96/0xa0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
> >> Sep 20 16:02:15 node004 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
> >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e c0 00 00 20 00
> >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117976
> >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117977
> >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117978
> >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117979
> >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e b8 00 00 08 00
> >> Sep 20 16:02:22 node004 kernel: Buffer I/O error on device dm-6, logical block 117975
> >> Sep 20 16:02:22 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e e0 00 00 08 00
> >> Sep 20 16:05:22 node004 kernel: Buffer I/O error on device dm-6, logical block 117980
> >> Sep 20 16:05:22 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: fatal: I/O error
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3:   block = 117980
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3:   function = log_write_header, file = fs/gfs2/log.c, line = 616
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: about to withdraw this file system
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: telling LM to unmount
> >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: withdrawn
> >> Sep 20 16:05:22 node004 kernel: Pid: 9820, comm: glock_workqueue Not tainted 2.6.32-279.el6.x86_64 #1
> >> Sep 20 16:05:22 node004 kernel: Call Trace:
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0962062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff814fea28>] ? out_of_line_wait_on_bit+0x78/0x90
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa09620d0>] ? gfs2_io_error_bh_i+0x40/0x50 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff811adfb6>] ? __wait_on_buffer+0x26/0x30
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa094a288>] ? log_write_header+0x3a8/0x490 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa094a951>] ? gfs2_log_flush+0x301/0x6f0 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa09477d0>] ? inode_go_sync+0x80/0x160 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946336>] ? do_xmote+0x156/0x280 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946551>] ? run_queue+0xf1/0x1d0 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946d2a>] ? glock_work_func+0x7a/0x1b0 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946cb0>] ? glock_work_func+0x0/0x1b0 [gfs2]
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
> >> Sep 20 16:05:22 node004 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
> >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
> >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
> >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 00 08 b0 00 00 08 00
> >> Sep 20 16:08:22 node004 kernel: Buffer I/O error on device dm-6, logical block 22
> >> Sep 20 16:08:22 node004 kernel: lost page write due to I/O error on dm-6
> >> Sep 20 16:16:05 node004 xinetd[4416]: START: node-state pid=14578 from=::ffff:10.141.255.254
> >> Sep 20 16:16:05 node004 xinetd[4416]: EXIT: node-state status=0 pid=14578 duration=0(sec)
> >> Sep 20 16:17:34 node004 xinetd[4416]: START: node-state pid=14653 from=::ffff:10.141.255.254
> >> Sep 20 16:17:34 node004 xinetd[4416]: EXIT: node-state status=0 pid=14653 duration=0(sec)
> >> Sep 20 16:17:36 node004 xinetd[4416]: START: node-state pid=14671 from=::ffff:10.141.255.254
> >> Sep 20 16:17:36 node004 xinetd[4416]: EXIT: node-state status=0 pid=14671 duration=0(sec)
> >> Sep 20 16:17:39 node004 xinetd[4416]: START: node-state pid=14690 from=::ffff:10.141.255.254
> >> Sep 20 16:17:39 node004 xinetd[4416]: EXIT: node-state status=0 pid=14690 duration=0(sec)
> >> Sep 20 16:17:41 node004 xinetd[4416]: START: node-state pid=14708 from=::ffff:10.141.255.254
> >> Sep 20 16:17:41 node004 xinetd[4416]: EXIT: node-state status=0 pid=14708 duration=0(sec)
> >> On Sep 20, 2012, at 4:14 PM, Andrew Holway wrote:
> >> 
> >>> Also,
> >>> 
> >>> IOzone gave this error: Error writing block 29813, fd= 3
> >>> 
> >>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Trying to acquire journal lock...
> >>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Looking at journal...
> >>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Acquiring the transaction lock...
> >>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Replaying journal...
> >>> 
> >>> 
> >>> GFS seemed to repair itself and things carried on working.
> >>> 
> >>> thanks,
> >>> 
> >>> Andrew
> >>> 
> >>> On Sep 20, 2012, at 4:08 PM, Andrew Holway wrote:
> >>> 
> >>>> Hello,
> >>>> 
> >>>> I have set up a 4-node cluster. The nodes are interconnected with IPoIB (connected mode).
> >>>> 
> >>>> Whilst running a benchmark with IOzone, I got the following errors:
> >>>> 
> >>>> IO seems to have halted.
> >>>> 
> >>>> Thanks,
> >>>> 
> >>>> Andrew
> >>>> 
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15816 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15816  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fd5ebbac8 0000000000000086 ffff880fd5ebba38 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fd5ebba58 ffffffff81091f97
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fe238c638 ffff880fd5ebbfd8 000000000000fb88 ffff880fe238c638
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15818 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15818  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe5b3ac8 0000000000000082 0000000000000000 ffff881ff95587a0
> >>>> Sep 20 16:01:57 node001 kernel: ffff881000000002 ffff88100ee13048 00000000be5b3a58 00000040ffffffff
> >>>> Sep 20 16:01:57 node001 kernel: ffff88100eed1ab8 ffff880fbe5b3fd8 000000000000fb88 ffff88100eed1ab8
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15820 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000008     0 15820  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880ffed7bac8 0000000000000086 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000fed7ba58 00000040ffffffff
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fbdd51ab8 ffff880ffed7bfd8 000000000000fb88 ffff880fbdd51ab8
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15822 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15822  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fd5dddac8 0000000000000086 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fd5ddda58 ffffffff81091f97
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fe238d098 ffff880fd5dddfd8 000000000000fb88 ffff880fe238d098
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15824 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15824  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe5edac8 0000000000000086 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000be5eda58 00000040ffffffff
> >>>> Sep 20 16:01:57 node001 kernel: ffff880ff69085f8 ffff880fbe5edfd8 000000000000fb88 ffff880ff69085f8
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15826 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15826  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe7cfac8 0000000000000086 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fbe7cfa58 ffffffff81091f97
> >>>> Sep 20 16:01:57 node001 kernel: ffff88100ddcbab8 ffff880fbe7cffd8 000000000000fb88 ffff88100ddcbab8
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15828 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000011     0 15828  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff88100684bac8 0000000000000086 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 000000000684ba58 00000040ffffffff
> >>>> Sep 20 16:01:57 node001 kernel: ffff88100edc7af8 ffff88100684bfd8 000000000000fb88 ffff88100edc7af8
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15830 blocked for more than 120 seconds.
> >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Sep 20 16:01:57 node001 kernel: iozone        D 0000000000000000     0 15830  15374 0x00000080
> >>>> Sep 20 16:01:57 node001 kernel: ffff880fbdd0fac8 0000000000000082 0000000000000000 ffffffff81276b66
> >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000bdd0fa58 00000040ffffffff
> >>>> Sep 20 16:01:57 node001 kernel: ffff88100de93098 ffff880fbdd0ffd8 000000000000fb88 ffff88100de93098
> >>>> Sep 20 16:01:57 node001 kernel: Call Trace:
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2]
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
> >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
> >>>> 
> >>>> 
> >>> 
> >> 
> >> 
> >> 
> > 
> > 
> 
> 




