[Crash-utility] cannot find stack info on ppc64le

Dave Anderson anderson at redhat.com
Mon Jan 19 19:11:49 UTC 2015



----- Original Message -----
> 
> ----- Original Message -----
> > Hello,
> > 
> > I just noticed that on ppc64le, sometimes "bt" cannot find the stack
> > info of the current process. For example, there is a vmcore captured by
> > kdump on a ppc64le system, which is running kernel version 3.10. The
> > vmcore was captured when the kernel oopsed. There is no stack info found by
> > bt:
> 
> Hello Han,
> 
> I've never worked on the backtrace code for ppc64, as it was written
> by (and maintained by) IBM.  From the debug messages, what happened is
> that the starting IP/SP hooks are not being found.  The crash command
> sequence presumably looks like this:
> 
>   cmd_bt
>    back_trace
>     get_kdump_regs
>       get_netdump_regs
>         get_netdump_regs_ppc64   (should set up bt->machdep to point to NT_PRSTATUS note)
>           ppc64_get_stack_frame
>             ppc64_get_dumpfile_stack_frame
>                ppc64_kdump_stack_frame (should get IP/SP pair based upon NT_PRSTATUS note contents)
>     ppc64_back_trace_cmd
>      ppc64_back_trace
> 
> ppc64_kdump_stack_frame() should pull the starting NIP/KSP values from the
> pt_regs structure in the per-cpu NT_PRSTATUS note, but it appears that it is not,
> leaving the registers at their initial values of zero.
> 
> This causes the failure later on when ppc64_back_trace_cmd() is called: it
> prints the "=> PC: 0 () FP: 0" debug message shown below, and ppc64_back_trace()
> then prints the "cannot find the stack info." debug message.
> 
> Without the dumpfile, I can't offer much else.  Can you verify the crash utility
> stack trail above, and if it is as I suspect, figure out why ppc64_kdump_stack_frame()
> is failing?  Or what other path it is taking?

Actually, if this is a compressed kdump, ppc64_kdump_stack_frame() will not be
called, and the register access is done inside ppc64_get_dumpfile_stack_frame().

The ppc64_get_dumpfile_stack_frame() function first grabs the registers from the pt_regs
structure in the per-cpu NT_PRSTATUS note, but then also checks the hard and soft IRQ
stacks, and the hardware interrupt stack, for known instances of kernel dump functions,
which would override the pt_regs contents.  If nothing is found on those stacks,
the registers from the NT_PRSTATUS note are used.
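
As a rough illustration of that selection order (a minimal sketch, not the
actual crash source): the structures, the toy symbol table and the function
names below are all hypothetical, and the list of "dump functions" is only an
assumption based on names that show up in the debug output. The stack pointer
derivation is also simplified; the real code has to follow the ppc64 frame
layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not crash's real types. */
struct start_regs {
    uint64_t nip;   /* starting instruction pointer */
    uint64_t ksp;   /* starting kernel stack pointer */
};

struct sym_range {  /* toy symbol table entry: [start, end) */
    const char *name;
    uint64_t start;
    uint64_t end;
};

/* Map a text address to a symbol name, or NULL if it is not kernel text. */
static const char *lookup_symbol(const struct sym_range *syms, size_t nsyms,
                                 uint64_t addr)
{
    for (size_t i = 0; i < nsyms; i++)
        if (addr >= syms[i].start && addr < syms[i].end)
            return syms[i].name;
    return NULL;
}

/* Assumed list of dump-path functions; illustrative only. */
static int is_dump_function(const char *name)
{
    static const char *const known[] = { "crash_kexec", "crash_save_cpu" };

    for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++)
        if (strcmp(name, known[i]) == 0)
            return 1;
    return 0;
}

/*
 * Start from the NT_PRSTATUS registers, then scan one stack for a return
 * address inside a dump function.  A hit overrides the note registers.
 * Returns 1 if overridden, 0 if the note registers were kept.
 * Simplification: the slot address itself is used as the stack pointer.
 */
static int pick_start_regs(const struct start_regs *note_regs,
                           const uint64_t *stack, size_t nwords,
                           uint64_t stack_base,
                           const struct sym_range *syms, size_t nsyms,
                           struct start_regs *out)
{
    assert(note_regs != NULL && out != NULL);

    *out = *note_regs;                      /* default: NT_PRSTATUS values */
    for (size_t i = 0; i < nwords; i++) {
        const char *name = lookup_symbol(syms, nsyms, stack[i]);

        if (name != NULL && is_dump_function(name)) {
            out->nip = stack[i];
            out->ksp = stack_base + i * sizeof(uint64_t);
            return 1;
        }
    }
    return 0;
}
```

In this sketch the function would be called once per candidate stack (IRQ
stacks and so on), stopping at the first one that returns 1.  If none of them
do, "out" still holds the NT_PRSTATUS values, which is the fallback described
above.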

Dave




> 
> > 
> > crash 7.0.9-2.ael7b
> > Copyright (C) 2002-2014  Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> > Copyright (C) 1999-2006  Hewlett-Packard Co
> > Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> > Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> > Copyright (C) 2005, 2011  NEC Corporation
> > Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions.  Enter "help copying" to see the conditions.
> > This program has absolutely no warranty.  Enter "help warranty" for
> > details.
> > 
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "powerpc64le-unknown-linux-gnu"...
> > 
> >       KERNEL: /usr/lib/debug/lib/modules/3.10.0-221.ael7b.ppc64le/vmlinux
> >     DUMPFILE: /var/crash/127.0.0.1-2015.01.15-22:19:14/vmcore  [PARTIAL DUMP]
> >         CPUS: 16
> >         DATE: Thu Jan 15 21:18:16 2015
> >       UPTIME: 17:53:43
> > LOAD AVERAGE: 213.58, 213.23, 212.70
> >        TASKS: 1383
> >     NODENAME: thymelp2.isst.aus.stglabs.ibm.com
> >      RELEASE: 3.10.0-221.ael7b.ppc64le
> >      VERSION: #1 SMP Wed Jan 7 09:27:09 EST 2015
> >      MACHINE: ppc64le  (3425 Mhz)
> >       MEMORY: 15 GB
> >        PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
> >          PID: 1970
> >      COMMAND: "cat"
> >         TASK: c0000003130874a0  [THREAD_INFO: c00000005069c000]
> >          CPU: 5
> >        STATE: TASK_RUNNING (PANIC)
> > 
> > crash> set debug 99
> > debug: 99
> > crash> bt
> > PID: 1970   TASK: c0000003130874a0  CPU: 5   COMMAND: "cat"
> > GETBUF(16384 -> 0)
> > <readmem: c00000005069c000, KVADDR, "stack contents", 16384, (ROE), 10a81570>
> > <read_diskdump: addr: c00000005069c000 paddr: 5069c000 cnt: 16384>
> > read_diskdump: paddr/pfn: 5069c000/5069 -> cache physical page: 50690000
> > c00000005069c018: do_no_restart_syscall
> > c00000005069e870: blk_throtl_bio+240
> > c00000005069e990: clone_endio
> > c00000005069ea00: generic_make_request_checks+836
> > c00000005069eab8: hardware_interrupt_common+128
> > c00000005069eac0: generic_make_request+36
> > c00000005069eb10: mempool_alloc_slab+36
> > c00000005069eb30: mempool_alloc+256
> > c00000005069eb50: mempool_alloc_slab+36
> > c00000005069ebc0: get_request+948
> > c00000005069ec00: __split_and_process_bio+1408
> > c00000005069ec20: autoremove_wake_function
> > c00000005069ec80: find_busiest_group+544
> > c00000005069edf0: load_balance+684
> > c00000005069ee10: blk_throtl_bio+240
> > c00000005069ee70: find_busiest_group+544
> > c00000005069eee0: dequeue_task_fair+968
> > c00000005069ef30: clone_endio
> > c00000005069ef50: get_page_from_freelist+1436
> > c00000005069f0a0: pSeries_cause_ipi_mux+112
> > c00000005069f0c0: smp_send_reschedule+164
> > c00000005069f0e0: default_wake_function+708
> > c00000005069f160: __wake_up_locked+116
> > c00000005069f1b0: ep_poll_callback+444
> > c00000005069f250: run_posix_cpu_timers+104
> > c00000005069f2c0: hvterm_raw_put_chars+64
> > c00000005069f2e0: hvc_console_print+336
> > c00000005069f3a8: initial_stab+2048
> > c00000005069f3b0: crash_save_cpu+252
> > c00000005069f488: cik_cp_resume+13476
> > c00000005069f490: dev_get_drvdata
> > c00000005069f580: default_machine_kexec+332
> > c00000005069f610: pSeries_machine_kexec+60
> > c00000005069f680: machine_kexec+56
> > c00000005069f6a0: crash_kexec+312
> > c00000005069f6f0: dev_attr_show+64
> > c00000005069f748: cik_cp_resume+13476
> > c00000005069f750: dev_get_drvdata
> > c00000005069f7f0: radeon_hwmon_show_temp+72
> > c00000005069f800: slb_miss_realmode+80
> > c00000005069f808: dev_get_drvdata
> > c00000005069f810: radeon_hwmon_show_temp+32
> > c00000005069f890: die+840
> > c00000005069f930: bad_page_fault+224
> > c00000005069f948: radeon_hwmon_show_temp+72
> > c00000005069f9a0: handle_page_fault+44
> > c00000005069fa00: dev_attr_show+64
> > c00000005069fa58: cik_cp_resume+13476
> > c00000005069fa60: dev_get_drvdata
> > c00000005069fb00: radeon_hwmon_show_temp+72
> > c00000005069fb10: slb_miss_realmode+80
> > c00000005069fb18: dev_get_drvdata
> > c00000005069fb20: radeon_hwmon_show_temp+32
> > c00000005069fb60: handle_mm_fault+1724
> > c00000005069fb80: sysfs_open_file
> > c00000005069fbd0: handle_page_fault+16
> > c00000005069fc90: alloc_pages_current+416
> > c00000005069fd00: dev_attr_show+64
> > c00000005069fd30: sysfs_read_file+220
> > c00000005069fde0: sys_read+304
> > c00000005069fe40: syscall_exit
> > [3fffd0d6fe88] back_trace:
> >         task: c0000003130874a0
> >        flags: 0
> >      instptr: 0
> >       stkptr: 0
> >         bptr: 0
> >    stackbase: c00000005069c000
> >     stacktop: c0000000506a0000
> >           tc: 1003c7b9fa8 (1970, c0000003130874a0)
> >           hp: 0
> >          ref: 0
> >     stackbuf: 10a81570
> >     textlist: 0
> >     frameptr: 0
> >  call_target: none
> >    eframe_ip: 0
> >        debug: 0
> >        radix: 0
> >      cpumask: 0
> >  => PC: 0 () FP: 0
> >   GETBUF(248 -> 1)
> >     GETBUF(1500 -> 2)
> > cannot find the stack info.
> >     FREEBUF(2)
> >   FREEBUF(1)
> > crash>
> > 
> > 
> > Is this a problem?
> > 
> > Thanks in advance!
> > 
> > 
> 