[Crash-utility] Re: Problem with using crash 4.0-2.21 on ppc

Dave Anderson anderson at redhat.com
Thu Feb 23 19:53:41 UTC 2006


Haren Myneni wrote:

> Rachita Kothiyal wrote:
>
> >On Thu, Feb 23, 2006 at 09:49:37AM -0500, Dave Anderson wrote:
> >
> >
> >>Ok, then I guess I'll take that as a thumbs-up.
> >>
> >>Waiting on Rachita's go-ahead...
> >>
> >>
> >
> >Dave,
> >
> >After the application of the patch (posted by Haren)
> >on crash-4.0-2.21, I am now able to open the dump using crash
> >for analysis.
> >
> >The following may be unrelated to the present discussion, but
> >it is an observation:
> >
> >When I do 'bt -a' I get the following error on one of the cpus:
> >
> >PID: 2871   TASK: c000000161d05800  CPU: 4   COMMAND: "klogd"
> >bt: invalid kernel virtual address: ff807a50  type: "Regs NIP value"
> >
> >
> Rachita,
>     As I mentioned before, this task should be running in user space.
> You should see a similar kind of stack trace even when using GDB. It
> would be better to give a proper error message here.
>

Is ff807a50 typically a legitimate user-space stack address
in ppc64 user VM?  You could probably run the address
through IN_TASK_VMA(), and if it is a valid user-space
stack address, just indicate that the process was running
in user-space.
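
To make that suggestion concrete, here is a minimal sketch of the check
in Python (not crash's C, and not the actual IN_TASK_VMA() code).  The
Vma class, the function names, the example VMA range and the kernel
base constant are all assumptions made up for illustration; none of
them come from the dump above.

```python
from dataclasses import dataclass
from typing import List

# Assumed start of the ppc64 kernel address range (illustrative constant).
PPC64_KERNEL_START = 0xC000000000000000


@dataclass(frozen=True)
class Vma:
    """Hypothetical stand-in for one user VM area of a task: [start, end)."""
    start: int
    end: int


def in_task_vma(vmas: List[Vma], addr: int) -> bool:
    """Return True if addr falls inside any of the task's VM areas."""
    return any(v.start <= addr < v.end for v in vmas)


def classify_nip(nip: int, vmas: List[Vma]) -> str:
    """Classify a saved NIP as 'kernel', 'user' or 'invalid'."""
    if nip >= PPC64_KERNEL_START:
        return "kernel"
    if in_task_vma(vmas, nip):
        return "user"
    return "invalid"


def bt_message(nip: int, vmas: List[Vma]) -> str:
    """Build a hypothetical message for bt to print, based on the NIP."""
    kind = classify_nip(nip, vmas)
    if kind == "user":
        return f"bt: task was running in user space (NIP: {nip:x})"
    if kind == "invalid":
        return f"bt: invalid address: {nip:x}  type: \"Regs NIP value\""
    return f"bt: kernel NIP: {nip:x}"


if __name__ == "__main__":
    # Made-up VMA that happens to cover the address from the report.
    example_vmas = [Vma(0xFF800000, 0xFF810000)]
    print(bt_message(0xFF807A50, example_vmas))
```

The point is only the ordering: rule out a kernel address first, then
look the value up in the task's own VM areas before complaining that
it is invalid.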

Now I understand why you (ppc64) dump the register set
first: all the other processor types would show
a stack trace emanating from user-space down into the
reception of the IPI (inter-processor interrupt) issued
by the panicking processor.


>
> About your other issue: I could not reproduce it.
>
> crash 4.0-2.21
> Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005  Fujitsu Limited
> Copyright (C) 2005  NEC Corporation
> Copyright (C) 1999, 2002  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for details.
>
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "powerpc64-unknown-linux-gnu"...
>
> crash: pglist_data.node_mem_map structure member does not exist.
> crash: certain memory-related commands will fail or display invalid data
>
>       KERNEL: /home/hbabu/2616-rc2-k1/vmlinux
>     DUMPFILE: /home/vmcore_2616_rc2_0207
>         CPUS: 2
>         DATE: Tue Feb  7 16:56:08 2006
>       UPTIME: 00:00:09
> LOAD AVERAGE: 0.05, 0.24, 0.12
>        TASKS: 57
>     NODENAME: elm3a135
>      RELEASE: 2.6.16-rc2-kexec-k1
>      VERSION: #6 SMP Tue Feb 7 16:46:10 PST 2006
>      MACHINE: ppc64  (unknown Mhz)
>       MEMORY: 2.9 GB
>        PANIC: "SysRq : Trigger a crashdump"
>          PID: 11076
>      COMMAND: "kpanic"
>         TASK: c00000000bc6d800  [THREAD_INFO: c0000000ac504000]
>          CPU: 1
>        STATE: TASK_RUNNING (SYSRQ)
>
> crash> bt
> PID: 11076  TASK: c00000000bc6d800  CPU: 1   COMMAND: "kpanic"
>
>  R0:  0000000000000000    R1:  c0000000ac507970    R2:  c00000000077a4a0
>  R3:  c0000000ac5079e0    R4:  0000000000000000    R5:  0000000000000000
>  R6:  756d700d0a657220    R7:  6120637261736864    R8:  0000000000000000
>  R9:  c0000000007b0fa0    R10: 0000000000000000    R11: c0000000007b0fa8
>  R12: 8000000000001032    R13: c0000000005a5d80    R14: 0000000000000000
>  R15: 0000000000000000    R16: 00000000100bbf08    R17: 00000000100bbeb8
>  R18: 0000000010070000    R19: 0000000000000000    R20: 0000000010046720
>  R21: 000000000000001f    R22: 00000000100040e8    R23: 0000000010004d74
>  R24: 8000000000009032    R25: 0000000000000000    R26: 0000000000000000
>  R27: 0000000000000063    R28: 0000000000000009    R29: 0000000000000000
>  R30: c0000000005e1560    R31: c0000000b96dd000
>  NIP: c0000000000777a8    MSR: 8000000000001032    OR3: c0000000ac7202f8
>  CTR: c000000000278b04    LR:  c000000000278b18    XER: 0000000000000000
>  CCR: c0000000ac507b90    MQ:  0000000000000000    DAR: 0000000000000063
>  DSISR: 0000000000000009     Syscall Result: 0000000000000000
>  NIP [c0000000000777a8] .crash_kexec
>  LR  [c000000000278b18] .sysrq_handle_crashdump
>
>  #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
>  #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
>  #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
>  #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
>  #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
>  #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
>  #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
>  syscall  [c01] exception frame:
>  R0:  0000000000000004    R1:  00000000ffd109d0    R2:  000000004001ee60
>  R3:  0000000000000001    R4:  000000001004f4a8    R5:  0000000000000002
>  R6:  000000001004f3a8    R7:  0000000000000011    R8:  000000001004f530
>  R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
>  R12: 0000000000000000    R13: 000000001004c9d8
>  NIP: 000000000ff691e8    MSR: 000000000200f032    OR3: 0000000000000001
>  CTR: 00000000100040ec    LR:  000000001000432c    XER: 0000000020000000
>  CCR: 0000000048008448    MQ:  c00000000077a4a0    DAR: 00000000100040ec
>  DSISR: 0000000040000000     Syscall Result: 0000000000000000
>
> crash> set -c 0
>     PID: 0
> COMMAND: "swapper"
>    TASK: c0000000005a5050  (1 of 2)  [THREAD_INFO: c000000000558000]
>     CPU: 0
>   STATE: TASK_RUNNING (ACTIVE)
> crash> bt
> PID: 0      TASK: c0000000005a5050  CPU: 0   COMMAND: "swapper"
>
>  R0:  0000000000000000    R1:  c00000000055bd80    R2:  c00000000077a4a0
>  R3:  0000000000000000    R4:  c0000000005a5350    R5:  0000000000000002
>  R6:  0000000024004042    R7:  0000000000000000    R8:  c00000000055ba00
>  R9:  c0000000005a4e88    R10: 0000008000000000    R11: 00003fef00100649
>  R12: 0000000028004028    R13: c0000000005a5b80
>  NIP: c000000000018648    MSR: 8000000000009032    OR3: 0000000000000000
>  CTR: 0000000000000000    LR:  c0000000000186b8    XER: 0000000020000000
>  CCR: 0000000044004042    MQ:  c0000000005a5050    DAR: c0000000b780b780
>  DSISR: c0000000000186b8     Syscall Result: 0000000000000000
>  NIP [c000000000018648] .default_idle
>
>  #0 [c00000000055bd80] .default_idle at c0000000000186b8
>  #1 [c00000000055be00] .cpu_idle at c0000000000184f4
>  #2 [c00000000055be70] .rest_init at c0000000000092f4
>  #3 [c00000000055bef0] .start_kernel at c000000000502760
>  #4 [c00000000055bf90] .hmt_init at c000000000008574
> crash> set -c 1
>     PID: 11076
> COMMAND: "kpanic"
>    TASK: c00000000bc6d800  [THREAD_INFO: c0000000ac504000]
>     CPU: 1
>   STATE: TASK_RUNNING (SYSRQ)
> crash> bt
> PID: 11076  TASK: c00000000bc6d800  CPU: 1   COMMAND: "kpanic"
>
>  R0:  0000000000000000    R1:  c0000000ac507970    R2:  c00000000077a4a0
>  R3:  c0000000ac5079e0    R4:  0000000000000000    R5:  0000000000000000
>  R6:  756d700d0a657220    R7:  6120637261736864    R8:  0000000000000000
>  R9:  c0000000007b0fa0    R10: 0000000000000000    R11: c0000000007b0fa8
>  R12: 8000000000001032    R13: c0000000005a5d80    R14: 0000000000000000
>  R15: 0000000000000000    R16: 00000000100bbf08    R17: 00000000100bbeb8
>  R18: 0000000010070000    R19: 0000000000000000    R20: 0000000010046720
>  R21: 000000000000001f    R22: 00000000100040e8    R23: 0000000010004d74
>  R24: 8000000000009032    R25: 0000000000000000    R26: 0000000000000000
>  R27: 0000000000000063    R28: 0000000000000009    R29: 0000000000000000
>  R30: c0000000005e1560    R31: c0000000b96dd000
>  NIP: c0000000000777a8    MSR: 8000000000001032    OR3: c0000000ac7202f8
>  CTR: c000000000278b04    LR:  c000000000278b18    XER: 0000000000000000
>  CCR: c0000000ac507b90    MQ:  0000000000000000    DAR: 0000000000000063
>  DSISR: 0000000000000009     Syscall Result: 0000000000000000
>  NIP [c0000000000777a8] .crash_kexec
>  LR  [c000000000278b18] .sysrq_handle_crashdump
>
>  #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
>  #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
>  #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
>  #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
>  #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
>  #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
>  #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
>  syscall  [c01] exception frame:
>  R0:  0000000000000004    R1:  00000000ffd109d0    R2:  000000004001ee60
>  R3:  0000000000000001    R4:  000000001004f4a8    R5:  0000000000000002
>  R6:  000000001004f3a8    R7:  0000000000000011    R8:  000000001004f530
>  R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
>  R12: 0000000000000000    R13: 000000001004c9d8
>  NIP: 000000000ff691e8    MSR: 000000000200f032    OR3: 0000000000000001
>  CTR: 00000000100040ec    LR:  000000001000432c    XER: 0000000020000000
>  CCR: 0000000048008448    MQ:  c00000000077a4a0    DAR: 00000000100040ec
>  DSISR: 0000000040000000     Syscall Result: 0000000000000000
>
> crash>
>
> This issue is probably showing up on your system (which has 8 CPUs)
> because my system has only 2 CPUs. We need to investigate.
>

That's all I could think of as well.  Rachita also didn't mention
whether doing a "set <task|pid>" of that same task and then getting
a backtrace works.  Either way, a crash-gdb backtrace would be helpful.

>
> Dave, I tested very few commands on the PPC64 vmcore, whereas Rachita
> is doing more testing. We might see some bugs that I have not
> encountered. We will get back to you with patches as we find bugs.
>

That's understood and not a problem -- especially on kernels
that are beyond the RHEL4 era.  Do you want me to go ahead
and put out a new release with your paca fix?

Dave
