[Crash-utility] Re: Problem with using crash 4.0-2.21 on ppc
Dave Anderson
anderson at redhat.com
Thu Feb 23 19:53:41 UTC 2006
Haren Myneni wrote:
> Rachita Kothiyal wrote:
>
> >On Thu, Feb 23, 2006 at 09:49:37AM -0500, Dave Anderson wrote:
> >
> >
> >>Ok, then I guess I'll take that as a thumbs-up.
> >>
> >>Waiting on Rachita's go-ahead...
> >>
> >>
> >
> >Dave,
> >
> >After the application of the patch (posted by Haren)
> >on crash-4.0-2.21, I am now able to open the dump using crash
> >for analysis.
> >
> >The following may be unrelated to the present discussion, but
> >it is an observation:
> >
> >When I do 'bt -a' I get the following error on one of the cpus:
> >
> >PID: 2871 TASK: c000000161d05800 CPU: 4 COMMAND: "klogd"
> >bt: invalid kernel virtual address: ff807a50 type: "Regs NIP value"
> >
> >
> Rachita,
> As I mentioned before, this task should be running in user space.
> You should see a similar stack trace even with GDB. It would be better
> to print a proper error message here.
>
Is ff807a50 typically a legitimate user-space stack address
in ppc64 user VM? You could probably run the address
through IN_TASK_VMA(), and if it is a valid user-space
stack address, just indicate that the process was running
in user-space.
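
A minimal sketch of that check, assuming the task's user VMAs are already available as a list of address ranges. The helper in_task_vma() below is a hypothetical stand-in for crash's IN_TASK_VMA() macro, not its actual implementation, and the struct and enum names are invented for illustration. The only fact taken from the dump is that ppc64 kernel addresses start at c000000000000000.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ppc64 kernel virtual addresses begin here (see the c0000000... values in the dump). */
#define PPC64_KERNEL_BASE 0xc000000000000000ULL

/* Hypothetical description of one user VMA of a task: [start, end). */
struct vma_range {
    uint64_t start;
    uint64_t end;
};

enum nip_kind { NIP_KERNEL, NIP_USER, NIP_UNKNOWN };

/* Stand-in for IN_TASK_VMA(): true if addr falls inside any of the task's VMAs. */
static bool in_task_vma(const struct vma_range *vmas, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= vmas[i].start && addr < vmas[i].end)
            return true;
    }
    return false;
}

/* Classify a saved NIP so that bt can report user space instead of an error. */
static enum nip_kind classify_nip(const struct vma_range *vmas, size_t n,
                                  uint64_t nip)
{
    if (nip >= PPC64_KERNEL_BASE)
        return NIP_KERNEL;
    if (in_task_vma(vmas, n, nip))
        return NIP_USER;
    return NIP_UNKNOWN;
}

/* Message to print in place of "invalid kernel virtual address". */
static const char *nip_message(enum nip_kind kind)
{
    switch (kind) {
    case NIP_KERNEL:
        return "kernel-space NIP";
    case NIP_USER:
        return "process was running in user space";
    default:
        return "NIP is neither a kernel address nor in a task VMA";
    }
}
```

With a VMA list that happens to cover ff807a50, classify_nip() returns NIP_USER and bt could print the user-space message rather than failing; an address outside every VMA still falls through to the error case.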

Now I understand why you (ppc64) dump the register set
first: all the other processor types would show
a stack trace emanating from user-space down into the
reception of the inter-processor interrupt (IPI) issued by
the panicking processor.
>
> About your other issue: I could not reproduce it.
>
> crash 4.0-2.21
> Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005 Fujitsu Limited
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "powerpc64-unknown-linux-gnu"...
>
> crash: pglist_data.node_mem_map structure member does not exist.
> crash: certain memory-related commands will fail or display invalid data
>
> KERNEL: /home/hbabu/2616-rc2-k1/vmlinux
> DUMPFILE: /home/vmcore_2616_rc2_0207
> CPUS: 2
> DATE: Tue Feb 7 16:56:08 2006
> UPTIME: 00:00:09
> LOAD AVERAGE: 0.05, 0.24, 0.12
> TASKS: 57
> NODENAME: elm3a135
> RELEASE: 2.6.16-rc2-kexec-k1
> VERSION: #6 SMP Tue Feb 7 16:46:10 PST 2006
> MACHINE: ppc64 (unknown Mhz)
> MEMORY: 2.9 GB
> PANIC: "SysRq : Trigger a crashdump"
> PID: 11076
> COMMAND: "kpanic"
> TASK: c00000000bc6d800 [THREAD_INFO: c0000000ac504000]
> CPU: 1
> STATE: TASK_RUNNING (SYSRQ)
>
> crash> bt
> PID: 11076 TASK: c00000000bc6d800 CPU: 1 COMMAND: "kpanic"
>
> R0: 0000000000000000 R1: c0000000ac507970 R2: c00000000077a4a0
> R3: c0000000ac5079e0 R4: 0000000000000000 R5: 0000000000000000
> R6: 756d700d0a657220 R7: 6120637261736864 R8: 0000000000000000
> R9: c0000000007b0fa0 R10: 0000000000000000 R11: c0000000007b0fa8
> R12: 8000000000001032 R13: c0000000005a5d80 R14: 0000000000000000
> R15: 0000000000000000 R16: 00000000100bbf08 R17: 00000000100bbeb8
> R18: 0000000010070000 R19: 0000000000000000 R20: 0000000010046720
> R21: 000000000000001f R22: 00000000100040e8 R23: 0000000010004d74
> R24: 8000000000009032 R25: 0000000000000000 R26: 0000000000000000
> R27: 0000000000000063 R28: 0000000000000009 R29: 0000000000000000
> R30: c0000000005e1560 R31: c0000000b96dd000
> NIP: c0000000000777a8 MSR: 8000000000001032 OR3: c0000000ac7202f8
> CTR: c000000000278b04 LR: c000000000278b18 XER: 0000000000000000
> CCR: c0000000ac507b90 MQ: 0000000000000000 DAR: 0000000000000063
> DSISR: 0000000000000009 Syscall Result: 0000000000000000
> NIP [c0000000000777a8] .crash_kexec
> LR [c000000000278b18] .sysrq_handle_crashdump
>
> #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
> #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
> #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
> #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
> #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
> #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
> #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
> syscall [c01] exception frame:
> R0: 0000000000000004 R1: 00000000ffd109d0 R2: 000000004001ee60
> R3: 0000000000000001 R4: 000000001004f4a8 R5: 0000000000000002
> R6: 000000001004f3a8 R7: 0000000000000011 R8: 000000001004f530
> R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
> R12: 0000000000000000 R13: 000000001004c9d8
> NIP: 000000000ff691e8 MSR: 000000000200f032 OR3: 0000000000000001
> CTR: 00000000100040ec LR: 000000001000432c XER: 0000000020000000
> CCR: 0000000048008448 MQ: c00000000077a4a0 DAR: 00000000100040ec
> DSISR: 0000000040000000 Syscall Result: 0000000000000000
>
> crash> set -c 0
> PID: 0
> COMMAND: "swapper"
> TASK: c0000000005a5050 (1 of 2) [THREAD_INFO: c000000000558000]
> CPU: 0
> STATE: TASK_RUNNING (ACTIVE)
> crash> bt
> PID: 0 TASK: c0000000005a5050 CPU: 0 COMMAND: "swapper"
>
> R0: 0000000000000000 R1: c00000000055bd80 R2: c00000000077a4a0
> R3: 0000000000000000 R4: c0000000005a5350 R5: 0000000000000002
> R6: 0000000024004042 R7: 0000000000000000 R8: c00000000055ba00
> R9: c0000000005a4e88 R10: 0000008000000000 R11: 00003fef00100649
> R12: 0000000028004028 R13: c0000000005a5b80
> NIP: c000000000018648 MSR: 8000000000009032 OR3: 0000000000000000
> CTR: 0000000000000000 LR: c0000000000186b8 XER: 0000000020000000
> CCR: 0000000044004042 MQ: c0000000005a5050 DAR: c0000000b780b780
> DSISR: c0000000000186b8 Syscall Result: 0000000000000000
> NIP [c000000000018648] .default_idle
>
> #0 [c00000000055bd80] .default_idle at c0000000000186b8
> #1 [c00000000055be00] .cpu_idle at c0000000000184f4
> #2 [c00000000055be70] .rest_init at c0000000000092f4
> #3 [c00000000055bef0] .start_kernel at c000000000502760
> #4 [c00000000055bf90] .hmt_init at c000000000008574
> crash> set -c 1
> PID: 11076
> COMMAND: "kpanic"
> TASK: c00000000bc6d800 [THREAD_INFO: c0000000ac504000]
> CPU: 1
> STATE: TASK_RUNNING (SYSRQ)
> crash> bt
> PID: 11076 TASK: c00000000bc6d800 CPU: 1 COMMAND: "kpanic"
>
> R0: 0000000000000000 R1: c0000000ac507970 R2: c00000000077a4a0
> R3: c0000000ac5079e0 R4: 0000000000000000 R5: 0000000000000000
> R6: 756d700d0a657220 R7: 6120637261736864 R8: 0000000000000000
> R9: c0000000007b0fa0 R10: 0000000000000000 R11: c0000000007b0fa8
> R12: 8000000000001032 R13: c0000000005a5d80 R14: 0000000000000000
> R15: 0000000000000000 R16: 00000000100bbf08 R17: 00000000100bbeb8
> R18: 0000000010070000 R19: 0000000000000000 R20: 0000000010046720
> R21: 000000000000001f R22: 00000000100040e8 R23: 0000000010004d74
> R24: 8000000000009032 R25: 0000000000000000 R26: 0000000000000000
> R27: 0000000000000063 R28: 0000000000000009 R29: 0000000000000000
> R30: c0000000005e1560 R31: c0000000b96dd000
> NIP: c0000000000777a8 MSR: 8000000000001032 OR3: c0000000ac7202f8
> CTR: c000000000278b04 LR: c000000000278b18 XER: 0000000000000000
> CCR: c0000000ac507b90 MQ: 0000000000000000 DAR: 0000000000000063
> DSISR: 0000000000000009 Syscall Result: 0000000000000000
> NIP [c0000000000777a8] .crash_kexec
> LR [c000000000278b18] .sysrq_handle_crashdump
>
> #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
> #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
> #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
> #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
> #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
> #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
> #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
> syscall [c01] exception frame:
> R0: 0000000000000004 R1: 00000000ffd109d0 R2: 000000004001ee60
> R3: 0000000000000001 R4: 000000001004f4a8 R5: 0000000000000002
> R6: 000000001004f3a8 R7: 0000000000000011 R8: 000000001004f530
> R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
> R12: 0000000000000000 R13: 000000001004c9d8
> NIP: 000000000ff691e8 MSR: 000000000200f032 OR3: 0000000000000001
> CTR: 00000000100040ec LR: 000000001000432c XER: 0000000020000000
> CCR: 0000000048008448 MQ: c00000000077a4a0 DAR: 00000000100040ec
> DSISR: 0000000040000000 Syscall Result: 0000000000000000
>
> crash>
>
> This issue probably shows up on your system (which has 8 CPUs) but not
> on mine, since my system has only 2 CPUs. We need to investigate.
>
That's all I could think of as well. Rachita also didn't mention
whether doing "set <task|pid>" on that same task and then getting
a backtrace works. Either way, a crash-gdb backtrace would be helpful.
>
> Dave, I tested very few commands on the PPC64 vmcore, whereas Rachita is
> doing more testing. We might see some bugs which I have not encountered.
> We will get back to you with patches as we find bugs.
>
That's understood and not a problem -- especially on kernels
that are beyond the RHEL4 era. Do you want me to go ahead
and put out a new release with your paca fix?
Dave