[Crash-utility] [RFC PATCH v2 0/4] Improve stack unwind on ppc64
lijiang
lijiang at redhat.com
Mon Sep 4 07:55:38 UTC 2023
Hi, Aditya
Sorry for the late reply, and thank you for the update.
On Wed, Aug 9, 2023 at 4:38 AM <crash-utility-request at redhat.com> wrote:
> Date: Wed, 9 Aug 2023 02:03:17 +0530
> From: Aditya Gupta <adityag at linux.ibm.com>
> To: crash-utility at redhat.com
> Cc: Mahesh J Salgaonkar <mahesh at linux.ibm.com>, Sourabh Jain
> <sourabhjain at linux.ibm.com>, Hari Bathini <hbathini at linux.ibm.com>
> Subject: [Crash-utility] [RFC PATCH v2 0/4] Improve stack unwind on
> ppc64
> Message-ID: <20230808203321.241732-1-adityag at linux.ibm.com>
> Content-Type: text/plain; charset=UTF-8
>
> The Problem:
> ============
>
> Currently crash is unable to show function arguments and local variables, as
> gdb can do. And functionality for moving between frames ('up'/'down') is not
> working in crash.
>
That's true. Today we have to infer their values by hand, because they may be
stored either in registers or on the stack. This is not friendly to most kernel
developers and debuggers.
Anyway, this is a good point. It would be even better if inlined functions
could also be displayed.
> Crash has 'gdb passthroughs' for things gdb can do, but the gdb
> passthroughs
> 'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to
> gdb not getting the register values from `crash_target::fetch_registers`,
> which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64
>
> Proposed Solution:
> ==================
>
> Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
> This way, "gdb mode in crash" will support this feature for both ELF and
> kdump-compressed vmcore formats, while "gdb" would only have supported ELF
> format
>
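For readers following the thread, the dispatch described above can be sketched
roughly as follows. This is a minimal standalone illustration, not code from
the patch series: every identifier prefixed with sketch_/SKETCH_ is
hypothetical, the register numbers are placeholders, and the callback
signature is only an assumption modelled on the hook named in the cover
letter. It shows a register fetch that reports "unavailable" while the
per-architecture hook is unset, and returns a value once the hook is filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical register numbers, chosen only for this illustration. */
enum sketch_regnum {
    SKETCH_GPR0  = 0,
    SKETCH_GPR31 = 31,
    SKETCH_NIP   = 64,
    SKETCH_MSR   = 65
};

/* Hypothetical per-CPU register snapshot, as it might be read from a vmcore. */
struct sketch_regs {
    uint64_t gpr[32];
    uint64_t nip;
    uint64_t msr;
};

static struct sketch_regs sketch_saved[SKETCH_NR_CPUS];

/* Hypothetical per-architecture hook table; the hook is NULL when unimplemented. */
struct sketch_machdep {
    int (*get_cpu_reg)(int cpu, int regno, const char *name,
                       int size, void *value);
};

/* Assumed hook behaviour: copy one saved register, return 1 on success, 0 otherwise. */
int sketch_ppc64_get_cpu_reg(int cpu, int regno, const char *name,
                             int size, void *value)
{
    const struct sketch_regs *regs;
    uint64_t v;

    (void)name; /* the register name is not needed in this sketch */
    assert(value != NULL);

    if (cpu < 0 || cpu >= SKETCH_NR_CPUS || size != (int)sizeof(uint64_t))
        return 0;

    regs = &sketch_saved[cpu];
    if (regno >= SKETCH_GPR0 && regno <= SKETCH_GPR31)
        v = regs->gpr[regno];
    else if (regno == SKETCH_NIP)
        v = regs->nip;
    else if (regno == SKETCH_MSR)
        v = regs->msr;
    else
        return 0; /* register not covered by the snapshot */

    memcpy(value, &v, sizeof(v));
    return 1;
}

/* Stand-in for the debugger-side fetch: without a hook, nothing is available. */
int sketch_fetch_register(const struct sketch_machdep *md, int cpu,
                          int regno, uint64_t *out)
{
    if (md->get_cpu_reg == NULL)
        return 0;
    return md->get_cpu_reg(cpu, regno, NULL, (int)sizeof(*out), out);
}
```

With the hook left NULL, sketch_fetch_register() returns 0 for every register,
which mirrors the "<unavailable> in ?? ()" frame shown later in this mail.
Once the hook is assigned, the unwinder has a PC and stack registers to start
from.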
> Implications on Architectures:
> ====================================
>
> No architecture other than PPC64 has been affected, other than in case of
> 'frame' command
>
>
BTW: Can this feature be implemented on other architectures such as x86_64?
Have you investigated that?
> As mentioned in patch #2, since frame will not be prohibited, so it will
> print:
>
> crash> frame
> #0 <unavailable> in ?? ()
>
> Instead of before prohibited message:
>
> crash> frame
> crash: prohibited gdb command: frame
>
> On PPC64, the default mode ("crash mode") will not have ANY OTHER changes,
> other than 'frame' as mentioned above.
>
> Major change will be in 'gdb mode' on PPC64, that it will print the
> frames, and
> local variables, instead of failing with errors showing no frame, or
> showing
> that couldn't get PC
>
> Testing:
> ========
>
> Git tree with this patch series applied:
> https://github.com/adi-g15-ibm/crash/tree/stack-unwind-rfc2
>
> To test gdb passthroughs:
>
> crash> set gdb on
> gdb> thread 3 # or any other thread number to change context in gdb
> gdb> bt
> gdb> frame
> gdb> up
> gdb> down
> gdb> info locals
>
>
I did a simple test, as shown below (kernel commit: 99d99825fc07):
gdb> info threads
Id Target Id Frame
1 CPU 0 <unavailable> in ?? ()
2 CPU 1
gdb> thread 2
[Switching to thread 2 (CPU 1)]
#0 0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>,
newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
69 ppc_save_regs(newregs);
gdb> bt
#0 0xc0000000002843f8 in crash_setup_regs (oldregs=<optimized out>,
newregs=0xc00000003dbd7958) at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:1064
#2 0xc00000000014e018 in panic (fmt=0xc000000001443d80 "sysrq triggered
crash\n") at kernel/panic.c:359
#3 0xc0000000009b8978 in sysrq_handle_crash (key=<optimized out>) at
drivers/tty/sysrq.c:155
#4 0xc0000000009b946c in __handle_sysrq (key=key at entry=99,
check_mask=check_mask at entry=false) at drivers/tty/sysrq.c:602
#5 0xc0000000009b9ce8 in write_sysrq_trigger (file=<optimized out>,
buf=<optimized out>, count=2, ppos=<optimized out>) at
drivers/tty/sysrq.c:1163
#6 0xc0000000006919fc in pde_write (ppos=<optimized out>, count=<optimized
out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000556fcc0) at
fs/proc/inode.c:340
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>,
count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
#8 0xc0000000005b7cb8 in vfs_write (file=file at entry=0xc000000036fa5f00,
buf=buf at entry=0x10027835560 <error: Cannot access memory at address
0x10027835560>, count=count at entry=2, pos=pos at entry=0xc00000003dbd7de0) at
fs/read_write.c:582
#9 0xc0000000005b83a4 in ksys_write (fd=<optimized out>, buf=0x10027835560
<error: Cannot access memory at address 0x10027835560>, count=2) at
fs/read_write.c:637
#10 0xc000000000031454 in system_call_exception (regs=0xc00000003dbd7e80,
r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#11 0xc00000000000cedc in system_call_vectored_common () at
arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gdb> frame 7
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>,
count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
352 rv = pde_write(pde, file, buf, count, ppos);
gdb> info rv
gdb: gdb request failed: info rv
gdb>
It seems that the 'info locals' command is not working as expected. I haven't
investigated the details.
> Known Issues:
> =============
>
> 1. In gdb mode, 'info threads' might hang for few seconds, and print only 2
> threads
>
Hmm, on my side it prints only 2 threads, and one of them is unavailable.
Can you dig into the details?
> 2. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected
> from older kernels. This is a known issue due to register mismatch, and
> its fix has been merged upstream:
>
> Commit:
> https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
>
> TODO:
> =====
>
> 1. Introduce automatic thread selection in gdb mode, to select the crashing
> thread in gdb, eliminating the need to manually run "thread <id>" after
> switching to gdb mode.
>
> Changelog:
> ==========
>
> RFC V2:
> - removed patch implementing 'frame', 'up', 'down' in crash
> - updated the cover letter by removing the mention of those commands
> other
> than the respective gdb passthrough
>
>
In addition, get_dumpfile_regs() is not invoked in [patch 1], so I would
suggest moving it into [patch 2]. This is just from a quick glance; I haven't
looked at the patchset carefully yet.
Thanks.
Lianbo
> Aditya Gupta (4):
> add generic get_dumpfile_regs to read registers
> ppc64: fix gdb passthrough by implementing machdep->get_cpu_reg
> remove 'frame' from prohibited commands list
> make cpu context change transparent to crash/gdb
>
> defs.h | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
> gdb-10.2.patch | 28 +++++++++++
> gdb_interface.c | 2 +-
> kernel.c | 33 +++++++++++++
> ppc64.c | 105 ++++++++++++++++++++++++++++++++++++++--
> tools.c | 12 +++--
> 6 files changed, 298 insertions(+), 7 deletions(-)
>
> --
> 2.41.0
>