[Crash-utility] Re: Problem with using crash 4.0-2.21 on ppc

Haren Myneni haren at us.ibm.com
Wed Feb 22 23:31:31 UTC 2006


Haren Myneni wrote:

>
>
> crash-utility-bounces at redhat.com wrote on 02/22/2006 07:31:43 AM:
>
> > Rachita Kothiyal wrote:
> >
> > >
> > > > Rachita Kothiyal wrote:
> > >
> > >
> > > >
> > > > This happens in get_idle_threads() when perusing the runqueues array,
> > > > where each per-cpu runqueue data structure contains a pointer to the
> > > > idle (swapper) task for that CPU.  Now, this process requires that the
> > > > per-cpu address manipulations are working correctly in order to find
> > > > each cpu's runqueue data structure.  It looks like the ppc64 change
> > > > for per-cpu data accesses is suspect here:
> > > >
> > > > > Fix to recognize post-2.6.15 ppc64 kernels moving the per_cpu_offsets
> > > > > to the "paca" structure.  Without this patch, crash fails with the
> > > > > following error messages: "crash: cannot determine idle task addresses
> > > > > from init_tasks[] or runqueues[]" and "crash: cannot resolve
> > > > > init_task_union".  (pbadari at us.ibm.com)
> > > >
> > >
> > > Right, but I thought this patch fixed this problem.
> > > (I am using crash-4.0-2.21, and it includes this patch)
> >
> > Right -- me too...   ;-)
>
> Badari tested his patch on a live system. He can give more information 
> anyway.
>
> However, I used his patch to test a PPC64 vmcore before I posted my 
> patch, and did not see any issue when invoking the crash tool. Tested on 
> 2.6.16-rc2-gi9.
>
> I will also verify if I have the same vmcore.
>
> Thanks
> Haren
>
Dave,

I used the first version of Badari's patch for my testing on a PPC64 kdump 
vmcore. Later, the patch was changed to use paca[CPU#].hw_cpu_id to 
determine whether a CPU exists (the CPU hotplug case). It fails on a 
vmcore because, when the kdump boot happens, hw_cpu_id is set to -1 for 
the secondary CPUs when they are stopped. As a result, per_cpu_offset is 
never set for those CPUs, which causes this issue.
 
Instead of checking hw_cpu_id, this patch checks the corresponding 
data_offset. If it is 0, the CPU does not exist.
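
As a rough sketch of the check described above (this is not the attached 
patch; the struct layout, NR_CPUS_DEMO, and the function name 
fill_per_cpu_offsets() are made up for illustration, and only the field 
names hw_cpu_id and data_offset come from this thread), the idea is to 
ignore hw_cpu_id and treat a zero data_offset as "CPU does not exist":

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_DEMO 4  /* hypothetical small CPU limit for this sketch */

/* Hypothetical, simplified stand-in for one paca entry. */
struct paca_entry {
    int hw_cpu_id;              /* -1 for secondaries stopped during kdump boot */
    unsigned long data_offset;  /* per-cpu data offset; 0 means no such CPU */
};

/*
 * Copy each CPU's data_offset into per_cpu_offset[], skipping entries
 * whose data_offset is 0.  hw_cpu_id is deliberately not consulted,
 * because it is unreliable in a kdump vmcore.
 * Returns the number of CPUs that were given an offset.
 */
static int fill_per_cpu_offsets(const struct paca_entry *paca, int ncpus,
                                unsigned long *per_cpu_offset)
{
    int cpu, found = 0;

    assert(paca != NULL && per_cpu_offset != NULL);

    for (cpu = 0; cpu < ncpus; cpu++) {
        if (paca[cpu].data_offset == 0) {
            per_cpu_offset[cpu] = 0;    /* CPU does not exist */
            continue;
        }
        per_cpu_offset[cpu] = paca[cpu].data_offset;
        found++;
    }
    return found;
}
```

With this check, a secondary CPU whose hw_cpu_id was reset to -1 during 
the kdump boot still gets its per_cpu_offset, as long as its data_offset 
is non-zero.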

Rachita, please let us know if this is still an issue. Badari, do you see 
any issue with this patch?

Thanks
Haren

Crash output:
/home/hbabu/crash_tool/crash-4.0-2.21 # ./crash /home/hbabu/2616-rc2-k1/vmlinux /home/vmcore_2616_rc2_0207

crash 4.0-2.21
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc64-unknown-linux-gnu"...

crash: pglist_data.node_mem_map structure member does not exist.
crash: certain memory-related commands will fail or display invalid data

      KERNEL: /home/hbabu/2616-rc2-k1/vmlinux
    DUMPFILE: /home/vmcore_2616_rc2_0207
        CPUS: 2
        DATE: Tue Feb  7 16:56:08 2006
      UPTIME: 00:00:09
LOAD AVERAGE: 0.05, 0.24, 0.12
       TASKS: 57
    NODENAME: elm3a135
     RELEASE: 2.6.16-rc2-kexec-k1
     VERSION: #6 SMP Tue Feb 7 16:46:10 PST 2006
     MACHINE: ppc64  (unknown Mhz)
      MEMORY: 2.9 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 11076
     COMMAND: "kpanic"
        TASK: c00000000bc6d800  [THREAD_INFO: c0000000ac504000]
         CPU: 1
       STATE: TASK_RUNNING (SYSRQ)

crash> bt -a
PID: 0      TASK: c0000000005a5050  CPU: 0   COMMAND: "swapper"

 R0:  0000000000000000    R1:  c00000000055bd80    R2:  c00000000077a4a0
 R3:  0000000000000000    R4:  c0000000005a5350    R5:  0000000000000002
 R6:  0000000024004042    R7:  0000000000000000    R8:  c00000000055ba00
 R9:  c0000000005a4e88    R10: 0000008000000000    R11: 00003fef00100649
 R12: 0000000028004028    R13: c0000000005a5b80
 NIP: c000000000018648    MSR: 8000000000009032    OR3: 0000000000000000
 CTR: 0000000000000000    LR:  c0000000000186b8    XER: 0000000020000000
 CCR: 0000000044004042    MQ:  c0000000005a5050    DAR: c0000000b780b780
 DSISR: c0000000000186b8     Syscall Result: 0000000000000000
 NIP [c000000000018648] .default_idle

 #0 [c00000000055bd80] .default_idle at c0000000000186b8
 #1 [c00000000055be00] .cpu_idle at c0000000000184f4
 #2 [c00000000055be70] .rest_init at c0000000000092f4
 #3 [c00000000055bef0] .start_kernel at c000000000502760
 #4 [c00000000055bf90] .hmt_init at c000000000008574

PID: 11076  TASK: c00000000bc6d800  CPU: 1   COMMAND: "kpanic"

 R0:  0000000000000000    R1:  c0000000ac507970    R2:  c00000000077a4a0
 R3:  c0000000ac5079e0    R4:  0000000000000000    R5:  0000000000000000
 R6:  756d700d0a657220    R7:  6120637261736864    R8:  0000000000000000
 R9:  c0000000007b0fa0    R10: 0000000000000000    R11: c0000000007b0fa8
 R12: 8000000000001032    R13: c0000000005a5d80    R14: 0000000000000000
 R15: 0000000000000000    R16: 00000000100bbf08    R17: 00000000100bbeb8
 R18: 0000000010070000    R19: 0000000000000000    R20: 0000000010046720
 R21: 000000000000001f    R22: 00000000100040e8    R23: 0000000010004d74
 R24: 8000000000009032    R25: 0000000000000000    R26: 0000000000000000
 R27: 0000000000000063    R28: 0000000000000009    R29: 0000000000000000
 R30: c0000000005e1560    R31: c0000000b96dd000
 NIP: c0000000000777a8    MSR: 8000000000001032    OR3: c0000000ac7202f8
 CTR: c000000000278b04    LR:  c000000000278b18    XER: 0000000000000000
 CCR: c0000000ac507b90    MQ:  0000000000000000    DAR: 0000000000000063
 DSISR: 0000000000000009     Syscall Result: 0000000000000000
 NIP [c0000000000777a8] .crash_kexec
 LR  [c000000000278b18] .sysrq_handle_crashdump

 #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
 #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
 #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
 #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
 #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
 #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
 #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
 syscall  [c01] exception frame:
 R0:  0000000000000004    R1:  00000000ffd109d0    R2:  000000004001ee60
 R3:  0000000000000001    R4:  000000001004f4a8    R5:  0000000000000002
 R6:  000000001004f3a8    R7:  0000000000000011    R8:  000000001004f530
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
 R12: 0000000000000000    R13: 000000001004c9d8
 NIP: 000000000ff691e8    MSR: 000000000200f032    OR3: 0000000000000001
 CTR: 00000000100040ec    LR:  000000001000432c    XER: 0000000020000000
 CCR: 0000000048008448    MQ:  c00000000077a4a0    DAR: 00000000100040ec
 DSISR: 0000000040000000     Syscall Result: 0000000000000000

crash>                                


>
> >
> > >
> > >
> > > > >
> > > > > But I was able to run it ok on a live system.
> > > > >
> > > >
> > > > Same kernel?  I have no idea why there would be a difference
> > > > between live and vmcore.
> > >
> > > Yes, same kernel (2.6.16-rc4)
> > >
> >
> > I'm sure Badari can give you more information, but you might start
> > by putting some debug printf's in his new ppc64_paca_init() function
> > to see whether it's calculating the same kt->__per_cpu_offset[cpu]
> > values on a live system vs. its associated vmcore?
> >
> > Dave
> >
> >
> >
> > --
> > Crash-utility mailing list
> > Crash-utility at redhat.com
> > https://www.redhat.com/mailman/listinfo/crash-utility
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: crash-percpu-offset-fix.patch
Type: text/x-patch
Size: 1051 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20060222/29fa690e/attachment.bin>

