[Crash-utility] [RFC PATCH v2 0/4] Improve stack unwind on ppc64

lijiang lijiang at redhat.com
Mon Sep 4 07:55:38 UTC 2023


Hi, Aditya
Sorry for the late reply, and thank you for the update.

On Wed, Aug 9, 2023 at 4:38 AM <crash-utility-request at redhat.com> wrote:

> Date: Wed,  9 Aug 2023 02:03:17 +0530
> From: Aditya Gupta <adityag at linux.ibm.com>
> To: crash-utility at redhat.com
> Cc: Mahesh J Salgaonkar <mahesh at linux.ibm.com>, Sourabh Jain
>         <sourabhjain at linux.ibm.com>, Hari Bathini <hbathini at linux.ibm.com>
> Subject: [Crash-utility] [RFC PATCH v2 0/4] Improve stack unwind on
>         ppc64
> Message-ID: <20230808203321.241732-1-adityag at linux.ibm.com>
> Content-Type: text/plain; charset=UTF-8
>
> The Problem:
> ============
>
> Currently crash is unable to show function arguments and local variables,
> as gdb can do. And functionality for moving between frames ('up'/'down')
> is not working in crash.
>

That's true. At present we have to work out their values by hand from the
stack and registers, because they may be stored in either place. This is
not friendly to most kernel developers and debuggers.

Anyway, this is a good point. It would be even better if inline functions
could also be displayed.

>
> Crash has 'gdb passthroughs' for things gdb can do, but the gdb
> passthroughs 'bt', 'frame', 'info locals', 'up', 'down' are not working
> either, due to gdb not getting the register values from
> `crash_target::fetch_registers`, which then uses `machdep->get_cpu_reg`,
> which is not implemented for PPC64
>
> Proposed Solution:
> ==================
>
> Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
> This way, "gdb mode in crash" will support this feature for both ELF and
> kdump-compressed vmcore formats, while "gdb" would only have supported ELF
> format
>
> Implications on Architectures:
> ====================================
>
> No architecture other than PPC64 has been affected, other than in case of
> 'frame' command
>
>
BTW, can this feature also be implemented on other architectures such as
x86_64? Have you investigated that?
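
To make the question concrete, here is a rough sketch of the kind of
per-architecture register hook the cover letter describes. It is a
standalone mock-up, not crash code: struct demo_regs, the demo_* names and
the DEMO_REG_* numbers are all made up for illustration, and I am assuming
the hook returns non-zero on success and zero when a register is not
available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-CPU register snapshot; not crash's real layout. */
struct demo_regs {
        uint64_t gpr[32];
        uint64_t nip;           /* program counter */
        uint64_t lr;            /* link register */
};

#define DEMO_NR_CPUS    2
#define DEMO_REG_PC     64      /* illustrative register numbers only */
#define DEMO_REG_LR     67

static struct demo_regs demo_cpu_regs[DEMO_NR_CPUS];
static int demo_regs_valid[DEMO_NR_CPUS];   /* 1 if the dump had regs for this CPU */

/*
 * Copy one register of one CPU into 'value'.
 * Returns 1 on success, 0 if the register cannot be provided
 * (assumed convention for this sketch).
 */
static int demo_get_cpu_reg(int cpu, int regno, const char *name,
                            int size, void *value)
{
        const struct demo_regs *r;
        uint64_t v;

        (void)name;             /* unused in this sketch */
        assert(value != NULL);

        if (cpu < 0 || cpu >= DEMO_NR_CPUS || !demo_regs_valid[cpu])
                return 0;       /* no saved registers for this CPU */
        if (size != (int)sizeof(uint64_t))
                return 0;       /* only 64-bit registers are handled here */

        r = &demo_cpu_regs[cpu];
        if (regno >= 0 && regno < 32)
                v = r->gpr[regno];
        else if (regno == DEMO_REG_PC)
                v = r->nip;
        else if (regno == DEMO_REG_LR)
                v = r->lr;
        else
                return 0;       /* register not covered by this sketch */

        memcpy(value, &v, sizeof(v));
        return 1;
}
```

In this sketch a CPU without saved registers simply returns 0. That might
be one way for a frame to end up as `<unavailable> in ?? ()`, but I have
not verified that against the patches.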



> As mentioned in patch #2, since frame will not be prohibited, so it will
> print:
>
>         crash> frame
>         #0  <unavailable> in ?? ()
>
> Instead of before prohibited message:
>
>         crash> frame
>         crash: prohibited gdb command: frame
>
> On PPC64, the default mode ("crash mode") will not have ANY OTHER changes,
> other than 'frame' as mentioned above.
>
> Major change will be in 'gdb mode' on PPC64, that it will print the
> frames, and local variables, instead of failing with errors showing no
> frame, or showing that couldn't get PC
>
> Testing:
> ========
>
> Git tree with this patch series applied:
> https://github.com/adi-g15-ibm/crash/tree/stack-unwind-rfc2
>
> To test gdb passthroughs:
>
>         crash> set gdb on
>         gdb> thread 3 # or any other thread number to change context in gdb
>         gdb> bt
>         gdb> frame
>         gdb> up
>         gdb> down
>         gdb> info locals
>
>
I ran a simple test, shown below (kernel commit 99d99825fc07):

gdb> info threads
  Id   Target Id         Frame
  1    CPU 0             <unavailable> in ?? ()
  2    CPU 1
gdb> thread 2
[Switching to thread 2 (CPU 1)]
#0  0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>,
newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
69                      ppc_save_regs(newregs);
gdb> bt
#0  0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>,
newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
#1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:1064
#2  0xc00000000014e018 in panic (fmt=0xc000000001443d80 "sysrq triggered
crash\n") at kernel/panic.c:359
#3  0xc0000000009b8978 in sysrq_handle_crash (key=<optimized out>) at
drivers/tty/sysrq.c:155
#4  0xc0000000009b946c in __handle_sysrq (key=key at entry=99,
check_mask=check_mask at entry=false) at drivers/tty/sysrq.c:602
#5  0xc0000000009b9ce8 in write_sysrq_trigger (file=<optimized out>,
buf=<optimized out>, count=2, ppos=<optimized out>) at
drivers/tty/sysrq.c:1163
#6  0xc0000000006919fc in pde_write (ppos=<optimized out>, count=<optimized
out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000556fcc0) at
fs/proc/inode.c:340
#7  proc_reg_write (file=<optimized out>, buf=<optimized out>,
count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
#8  0xc0000000005b7cb8 in vfs_write (file=file at entry=0xc000000036fa5f00,
buf=buf at entry=0x10027835560 <error: Cannot access memory at address
0x10027835560>, count=count at entry=2, pos=pos at entry=0xc00000003dbd7de0) at
fs/read_write.c:582
#9  0xc0000000005b83a4 in ksys_write (fd=<optimized out>, buf=0x10027835560
<error: Cannot access memory at address 0x10027835560>, count=2) at
fs/read_write.c:637
#10 0xc000000000031454 in system_call_exception (regs=0xc00000003dbd7e80,
r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#11 0xc00000000000cedc in system_call_vectored_common () at
arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gdb> frame 7
#7  proc_reg_write (file=<optimized out>, buf=<optimized out>,
count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
352                     rv = pde_write(pde, file, buf, count, ppos);
gdb> info rv
gdb: gdb request failed: info rv
gdb>

It seems that the 'info locals' command is not working as expected. I
haven't investigated the details yet.


> Known Issues:
> =============
>
> 1. In gdb mode, 'info threads' might hang for few seconds, and print only 2
>    threads
>

Hmm, on my side it prints only 2 threads, and one of them is unavailable.
Can you dig into the details?


> 2. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected
>    from older kernels. This is a known issue due to register mismatch, and
>    its fix has been merged upstream:
>
> Commit:
> https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
>
> TODO:
> =====
>
> 1. Introduce automatic thread selection in gdb mode, to select the crashing
>    thread in gdb, eliminating the need to manually run "thread <id>" after
>    switching to gdb mode.
>
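
As a starting point for this TODO, the CPU-to-thread mapping could be as
simple as the sketch below. It assumes that gdb thread ids follow the
'info threads' listing from my test above (thread 1 is CPU 0, thread 2 is
CPU 1). demo_thread_for_cpu() is a hypothetical helper, not an existing
crash function.

```c
/*
 * Map a CPU number to the gdb thread id to select.
 * Assumption: thread ids start at 1 and follow CPU order,
 * as in the 'info threads' output above.
 * Returns -1 for an out-of-range CPU.
 */
static int demo_thread_for_cpu(int cpu, int nr_cpus)
{
        if (cpu < 0 || cpu >= nr_cpus)
                return -1;
        return cpu + 1;
}
```

With that, selecting the crashing thread would amount to issuing
"thread <id>" with the id computed for the panic CPU.
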
> Changelog:
> ==========
>
> RFC V2:
>   - removed patch implementing 'frame', 'up', 'down' in crash
>   - updated the cover letter by removing the mention of those commands
>     other than the respective gdb passthrough
>
>
In addition, get_dumpfile_regs() is not invoked in [patch 1], so I would
suggest moving it into [patch 2]. This is only from a quick glance; I
haven't looked at the patchset carefully yet.

Thanks.
Lianbo

> Aditya Gupta (4):
>   add generic get_dumpfile_regs to read registers
>   ppc64: fix gdb passthrough by implementing machdep->get_cpu_reg
>   remove 'frame' from prohibited commands list
>   make cpu context change transparent to crash/gdb
>
>  defs.h          | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
>  gdb-10.2.patch  |  28 +++++++++++
>  gdb_interface.c |   2 +-
>  kernel.c        |  33 +++++++++++++
>  ppc64.c         | 105 ++++++++++++++++++++++++++++++++++++++--
>  tools.c         |  12 +++--
>  6 files changed, 298 insertions(+), 7 deletions(-)
>
> --
> 2.41.0
>