Fwd: md_thread and kjournald race w/ raid1 on 2-way x86

Srinivas Murthy codevana at gmail.com
Wed Jun 4 03:04:18 UTC 2008


Sorry. Understand.

Yes, I am told it does. No way to be sure.


On Tue, Jun 3, 2008 at 7:58 PM, Eric Sandeen <sandeen at redhat.com> wrote:
> Srinivas Murthy wrote:
>> The changes we have are in the networking part. Nothing in the fs or
>> block layers.
>>
>> Thanks,
>> _Sri
>>
>
> Ok - and does it still happen without the taint? :)
>
> networking can corrupt memory as well as anything else.
>
> I'm not saying that's it for sure but it's worth testing.
>
> -Eric
>
>>
>>
>>
>> On Tue, Jun 3, 2008 at 7:47 PM, Eric Sandeen <sandeen at redhat.com> wrote:
>>> Srinivas Murthy wrote:
>>>> Hi
>>>>
>>>> I have the following kernel (2.6.23) crash on a 2-cpu smp 32b x86 system.
>>>> <0>Assertion failure in journal_commit_transaction() at
>>>> fs/jbd/commit.c:693: "((&bh->b_count)->counter) == 0"
>>>> <0>------------[ cut here ]------------
>>>> <2>kernel BUG at fs/jbd/commit.c:693!
>>>> <0>invalid opcode: 0000 [#1]
>>>> <0>PREEMPT SMP
>>>> <0>CPU:    1
>>>> <0>EIP:    0060:[<c02c1149>]    Tainted: P        VLI
>>> What's the proprietary kernel; does it happen without the tainted kernel?
>>>
>>> -Eric
>>>
>>>> <0>EFLAGS: 00010296   (2.6.23.waas #4)
>>>> <0>EIP is at journal_commit_transaction+0x879/0xe00
>>>> <0>eax: 00000096   ebx: f76bcf00   ecx: ffffffff   edx: f7588ac0
>>>> <0>esi: f6c66f88   edi: c0651dcd   ebp: c6549f5c   esp: c6549ec4
>>>> <0>ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
>>>> <0>Process kjournald (pid: 1684, ti=c6548000 task=c39db580 task.ti=c6548000)
>>>> <0>Stack: c0651de8 c0651dcd c0651e44 000002b5 c0651fa0 00000000
>>>> 00000000 f7f63414
>>>> <0>       f7f634dc 00000000 00000fcc f7435034 00000000 00000000
>>>> c6402000 00000000
>>>> <0>       f7f63400 f7386fc0 000005d7 f77fb580 c39db580 f70bdd74
>>>> 00000202 c70f8000
>>>> <0>Call Trace:
>>>> <0> [<c0203a1a>] show_trace_log_lvl+0x1a/0x30
>>>> <0> [<c0203aea>] show_stack_log_lvl+0x9a/0xc0
>>>> <0> [<c0203d46>] show_registers+0x1d6/0x340
>>>> <0> [<c020403d>] die+0x10d/0x220
>>>> <0> [<c02041e1>] do_trap+0x91/0xd0
>>>> <0> [<c0204419>] do_invalid_op+0x89/0xa0
>>>> <0> [<c06317e2>] error_code+0x72/0x78
>>>> <0> [<c02c3845>] kjournald+0xb5/0x1f0
>>>> <0> [<c0232a5c>] kthread+0x5c/0xa0
>>>> <0> [<c020388b>] kernel_thread_helper+0x7/0x1c
>>>> <0> =======================
>>>> <0>Code: 65 c0 b8 a0 1f 65 c0 bf cd 1d 65 c0 89 44 24 10 b8 b5 02 00
>>>> 00 89 44 24 0c b8 44 1e 65 c0 89 7c 24 04 89 44 24 08 e8 57 de f5 ff
>>>> <0f> 0b eb fe 8b 43 34 85 c0 75 4e 8b 53 38 85 d2 0f 84 9e 00 00
>>>> <0>EIP: [<c02c1149>] journal_commit_transaction+0x879/0xe00 SS:ESP 0068:c6549ec4
>>>> <6>SysRq : Changing Loglevel
>>>> <4>Loglevel set to 7
>>>>
>>>> [0]kdb> btc
>>>> btc: cpu status: Currently on cpu 0
>>>> Available cpus: 0-1
>>>> Stack traceback for pid 1609
>>>> 0xc69ce000     1609        2  1    0   R  0xc69ce1e0 *md5_resync
>>>> esp        eip        Function (args)
>>>> 0xc69e5d4c 0xc028fef3 bio_alloc_bioset+0xb3 (0x11200, invalid, 0xc70e3060)
>>>> 0xc69e5d78 0xc028ffbe bio_alloc+0xe
>>>> 0xc69e5d80 0xc054f6d7 r1buf_pool_alloc+0x37 (0x11200, 0xc39ca0c0)
>>>> 0xc69e5da4 0xc024aff6 mempool_alloc+0x26 (0xf7e7dcc0, invalid)
>>>> 0xc69e5de0 0xc0552624 sync_request+0x1f4 (0xf7f40a00, 0xa2ce80, 0x0,
>>>> 0xc69e5ea0, 0x0)
>>>> 0xc69e5e40 0xc056667f md_do_sync+0x4ef (0xf7f40a00)
>>>> 0xc69e5f78 0xc0564f55 md_thread+0x35 (0xf7e7dc80)
>>>> 0xc69e5fd0 0xc0232a5c kthread+0x5c (invalid)
>>>> Stack traceback for pid 1684
>>>> 0xc39db580     1684        2  1    1   R  0xc39db760  kjournald
>>>> esp        eip        Function (args)
>>>> kdb_bb: address 0xffffffff not recognised
>>>> Using old style backtrace, unreliable with no arguments
>>>> esp        eip        Function (args)
>>>> 0xc6549eb8 0xc02c1149 journal_commit_transaction+0x879
>>>> 0xc6549f28 0xc0227945 lock_timer_base+0x25
>>>> 0xc6549f40 0xc0227b6a try_to_del_timer_sync+0x4a
>>>> 0xc6549f60 0xc02c3845 kjournald+0xb5
>>>> 0xc6549f88 0xc0233040 autoremove_wake_function
>>>> 0xc6549f94 0xc062f8e1 __sched_text_start+0x1f1
>>>> 0xc6549fa8 0xc0233040 autoremove_wake_function
>>>> [0]kdb>
>>>>
>>>> _______________________________________________
>>>> Ext3-users mailing list
>>>> Ext3-users at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/ext3-users
>>>
>
>




More information about the Ext3-users mailing list