[rhelv6-list] RHEL6.2 Kernel/EXT4 bug

Musayev, Ilya imusayev at webmd.net
Thu Jan 12 17:14:14 UTC 2012


Curious if anyone has seen this in their RHEL6.2 setups, if you have 6.1 or 6.2 please try this out and see what happens. List of commands to reproduce is below, latest iozone required.

https://bugzilla.redhat.com/show_bug.cgi?id=773377


Description of problem:



While in process of bench-marking various file systems, i came across an issue

where EXT4 partition would lock up under specific load. In current setup it was

under iozone testing. The I/O scheduler is set to default "noop".



*This only occurs with EXT4 and not XFS.



A repetitive message i'd see in messages file is attached:



Jan  9 23:38:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more

than 120 seconds.

Jan  9 23:38:51 minigiant5 kernel: "echo 0 >

/proc/sys/kernel/hung_task_timeout_secs" disables this message.

Jan  9 23:38:51 minigiant5 kernel: jbd2/md0-8    D 0000000000000005     0 16169

     2 0x00000000

Jan  9 23:38:51 minigiant5 kernel: ffff88010153bc20 0000000000000046

0000000000000000 ffff88010153bbe4

Jan  9 23:38:51 minigiant5 kernel: 0000000000000000 ffff88012dc24c00

ffff880028215f80 0000000000000400

Jan  9 23:38:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8

000000000000f4e8 ffff880108f230f8

Jan  9 23:38:51 minigiant5 kernel: Call Trace:

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff814edd08>]

out_of_line_wait_on_bit+0x78/0x90

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ?

wake_bit_function+0x0/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811a9236>]

__wait_on_buffer+0x26/0x30

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa022a0e6>]

jbd2_journal_commit_transaction+0xa76/0x14b0 [jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff810096f0>] ?

__switch_to+0xd0/0x320

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ?

try_to_del_timer_sync+0x7b/0xe0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220

[jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ?

autoremove_wake_function+0x0/0x40

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220

[jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20

Jan  9 23:38:51 minigiant5 kernel: INFO: task iozone:17280 blocked for more

than 120 seconds.

Jan  9 23:38:51 minigiant5 kernel: "echo 0 >

/proc/sys/kernel/hung_task_timeout_secs" disables this message.

Jan  9 23:38:51 minigiant5 kernel: iozone        D 0000000000000000     0 17280

 16178 0x00000000

Jan  9 23:38:51 minigiant5 kernel: ffff880127e779a8 0000000000000086

ffff880127e77998 00000000ffffffff

Jan  9 23:38:51 minigiant5 kernel: ffffffff81ea1100 0000000000000286

ffff880127e77958 ffffffff8107c99b

Jan  9 23:38:51 minigiant5 kernel: ffff8800b9071b38 ffff880127e77fd8

000000000000f4e8 ffff8800b9071b38

Jan  9 23:38:51 minigiant5 kernel: Call Trace:

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ?

try_to_del_timer_sync+0x7b/0xe0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090cfe>] ?

prepare_to_wait+0x4e/0x80

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa022909d>]

do_get_write_access+0x29d/0x520 [jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ?

wake_bit_function+0x0/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0229471>]

jbd2_journal_get_write_access+0x31/0x50 [jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa027db78>]

__ext4_journal_get_write_access+0x38/0x80 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0259253>]

ext4_reserve_inode_write+0x73/0xa0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa02592cc>]

ext4_mark_inode_dirty+0x4c/0x1d0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa02595c0>]

ext4_dirty_inode+0x40/0x60 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8119fc6b>]

__mark_inode_dirty+0x3b/0x160

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811901e2>]

file_update_time+0xf2/0x170

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81112b50>]

__generic_file_aio_write+0x220/0x480

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8105e770>] ?

default_wake_function+0x0/0x20

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81112e1f>]

generic_file_aio_write+0x6f/0xe0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0252de1>]

ext4_file_write+0x61/0x1e0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811b4ccd>] ?

fsnotify_add_notify_event+0x12d/0x280

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811760fa>]

do_sync_write+0xfa/0x140

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811b4613>] ? fsnotify+0x113/0x160

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ?

autoremove_wake_function+0x0/0x40

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81218e2b>] ?

selinux_file_permission+0xfb/0x150

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8120c1d6>] ?

security_file_permission+0x16/0x20

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811763f8>] vfs_write+0xb8/0x1a0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81176e01>] sys_write+0x51/0x90

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81176753>] ? sys_lseek+0x53/0x80

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8100b0f2>]

system_call_fastpath+0x16/0x1b

Jan  9 23:38:51 minigiant5 kernel: INFO: task iozone:17282 blocked for more

than 120 seconds.

Jan  9 23:38:51 minigiant5 kernel: "echo 0 >

/proc/sys/kernel/hung_task_timeout_secs" disables this message.

Jan  9 23:38:51 minigiant5 kernel: iozone        D 0000000000000000     0 17282

 16178 0x00000000

Jan  9 23:38:51 minigiant5 kernel: ffff880108fe59a8 0000000000000082

0000000000000000 00000000ffffffff

Jan  9 23:38:51 minigiant5 kernel: ffffffff81ea1100 0000000000000286

ffff880108fe5958 ffffffff8107c99b

Jan  9 23:38:51 minigiant5 kernel: ffff880128e45078 ffff880108fe5fd8

000000000000f4e8 ffff880128e45078

Jan  9 23:38:51 minigiant5 kernel: Call Trace:

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ?

try_to_del_timer_sync+0x7b/0xe0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa022909d>]

do_get_write_access+0x29d/0x520 [jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ?

wake_bit_function+0x0/0x50

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0229471>]

jbd2_journal_get_write_access+0x31/0x50 [jbd2]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa027db78>]

__ext4_journal_get_write_access+0x38/0x80 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0259253>]

ext4_reserve_inode_write+0x73/0xa0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa02592cc>]

ext4_mark_inode_dirty+0x4c/0x1d0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa02595c0>]

ext4_dirty_inode+0x40/0x60 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8119fc6b>]

__mark_inode_dirty+0x3b/0x160

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811901e2>]

file_update_time+0xf2/0x170

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81112b50>]

__generic_file_aio_write+0x220/0x480

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8105e770>] ?

default_wake_function+0x0/0x20

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81112e1f>]

generic_file_aio_write+0x6f/0xe0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffffa0252de1>]

ext4_file_write+0x61/0x1e0 [ext4]

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811b4ccd>] ?

fsnotify_add_notify_event+0x12d/0x280

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811760fa>]

do_sync_write+0xfa/0x140

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811b4613>] ? fsnotify+0x113/0x160

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ?

autoremove_wake_function+0x0/0x40

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81218e2b>] ?

selinux_file_permission+0xfb/0x150

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8120c1d6>] ?

security_file_permission+0x16/0x20

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff811763f8>] vfs_write+0xb8/0x1a0

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff81176e01>] sys_write+0x51/0x90

Jan  9 23:38:51 minigiant5 kernel: [<ffffffff8100b0f2>]

system_call_fastpath+0x16/0x1b

Jan 10 00:06:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more

than 120 seconds.

Jan 10 00:06:51 minigiant5 kernel: "echo 0 >

/proc/sys/kernel/hung_task_timeout_secs" disables this message.

Jan 10 00:06:51 minigiant5 kernel: jbd2/md0-8    D 0000000000000005     0 16169

     2 0x00000000

Jan 10 00:06:51 minigiant5 kernel: ffff88010153bc20 0000000000000046

ffff88010153bbe8 ffff88010153bbe4

Jan 10 00:06:51 minigiant5 kernel: 0000000000000001 ffff88012dc24c00

ffff8800282f5f80 0000000000000400

Jan 10 00:06:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8

000000000000f4e8 ffff880108f230f8

Jan 10 00:06:51 minigiant5 kernel: Call Trace:

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814edd08>]

out_of_line_wait_on_bit+0x78/0x90

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090a50>] ?

wake_bit_function+0x0/0x50

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9236>]

__wait_on_buffer+0x26/0x30

Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022a7a1>]

jbd2_journal_commit_transaction+0x1131/0x14b0 [jbd2]

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff810096f0>] ?

__switch_to+0xd0/0x320

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8107c99b>] ?

try_to_del_timer_sync+0x7b/0xe0

Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220

[jbd2]

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090a10>] ?

autoremove_wake_function+0x0/0x40

Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220

[jbd2]

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0

Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20

Jan 10 00:16:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more

than 120 seconds.

Jan 10 00:16:51 minigiant5 kernel: "echo 0 >

/proc/sys/kernel/hung_task_timeout_secs" disables this message.

Jan 10 00:16:51 minigiant5 kernel: jbd2/md0-8    D 0000000000000005     0 16169

     2 0x00000000

Jan 10 00:16:51 minigiant5 kernel: ffff88010153bc20 0000000000000046

ffff88010153bbe8 ffff88010153bbe4

Jan 10 00:16:51 minigiant5 kernel: 0000000000000001 ffff88012dc24c00

ffff880028235f80 0000000000000400

Jan 10 00:16:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8

000000000000f4e8 ffff880108f230f8

Jan 10 00:16:51 minigiant5 kernel: Call Trace:

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814edd08>]

out_of_line_wait_on_bit+0x78/0x90

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090a50>] ?

wake_bit_function+0x0/0x50

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9236>]

__wait_on_buffer+0x26/0x30

Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022a7a1>]

jbd2_journal_commit_transaction+0x1131/0x14b0 [jbd2]

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff810096f0>] ?

__switch_to+0xd0/0x320

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8107c99b>] ?

try_to_del_timer_sync+0x7b/0xe0

Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220

[jbd2]

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090a10>] ?

autoremove_wake_function+0x0/0x40

Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220

[jbd2]

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0

Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20







Version-Release number of selected component (if applicable):



[root at minigiant5 ~]# uname -a

Linux minigiant5 2.6.32-220.2.1.el6.x86_64 #1 SMP Tue Dec 13 16:21:34 EST 2011

x86_64 x86_64 x86_64 GNU/Linux



[root at minigiant5 ~]# dmsetup version

Library version:   1.02.66-RHEL6 (2011-10-12)

Driver version:    4.22.6



2 x 1 TB SATA drives with software RAID 1.

scsi-SATA_SAMSUNG_HD103SJS246J9KB510183

scsi-SATA_SAMSUNG_HD103SJS246J9KB510187



System Information

        Manufacturer: Dell Inc.

        Product Name: OptiPlex 990



RAM: 4GB

CPU: Quad Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz with HT



How reproducible:

Always



Steps to Reproduce:



mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

mkfs.ext4 /dev/md0

mount /dev/md0 /storage/ -o noatime

iozone -R -l 5 -u 5 -r 4k -s 5000m -F /storage/f1 /storage/f2 /storage/f3

/storage/f4 /storage/f5





Actual results:



Process takes much longer to complete due to EXT4 FS lockups



Expected results:



The partition along with the process should not lock due to FS issues





Additional info:



It takes a to run while but eventually locks up the partition.


Ilya Musayev
- Systems Architect


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/rhelv6-list/attachments/20120112/57252791/attachment.htm>


More information about the rhelv6-list mailing list