[Crash-utility] Re: [PATCH 0/2] Display local variables & function parameters from stack frames
Dave Anderson
anderson at redhat.com
Thu May 14 21:18:05 UTC 2009
> Dave,
> We have made some observations with x86_64 dumps and would like your
> opinion on them.
> On x86_64 dumps, for active processes we read the register contents
> from the ELF notes, and we found that they do not match the output of
> the bt command. The SP and IP values we get from ELF_NOTES break our
> stack unwinding, which uses information from the DWARF section, while
> the unwinding works, at least for the first stage, with the SP and IP
> obtained the bt way.
> The same issue shows up in gdb: gdb also breaks when unwinding is
> attempted on this dump.
>
> I wanted to know what you think about this and how we can proceed.
>
> 1. Is it reliable to parse through the stack frame looking for valid
> addresses, as is done in 'x86_64_get_dumpfile_stack_frame'? Is it the
> right/safe thing to do, and does any x86 ABI document describe it?
I do it that way because I've never wanted to depend upon the ELF prstatus
note section, because netdump/diskdump only has the panic cpu's info, and
kdump's sections can be difficult to match to a cpu if there has been any
cpu hot-plugging. And not to mention that there are several other dumpfile
formats supported. You can do it any way you'd like.
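
For illustration only, the general idea of walking the stack and looking
for words that fall inside kernel text can be sketched as below. This is a
minimal sketch and not the code in x86_64_get_dumpfile_stack_frame(): the
text bounds, the function names and the fixed-size word scan are all
assumptions made up for the example, and a real tool would presumably take
the bounds from the symbol table and validate each candidate further.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel text bounds, chosen only for this example. */
#define TEXT_START 0xffffffff80200000ULL
#define TEXT_END   0xffffffff80600000ULL

/* Return nonzero if the value lies inside the assumed text range. */
int looks_like_text(uint64_t v)
{
    return v >= TEXT_START && v < TEXT_END;
}

/*
 * Scan nwords stack words, lowest address first, and return the index
 * of the first word that looks like a text address, or -1 if none does.
 */
long find_first_text_word(const uint64_t *stack, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        if (looks_like_text(stack[i]))
            return (long)i;
    }
    return -1;
}
```

A range check alone can produce false positives from stale stack contents,
which is one reason to treat the result as a starting point only.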
> 2. It looks like we can't safely rely on ELF_NOTES. Is this a known
> issue with kexec dumping?
The "sp: ffff88020f471dc8 Breaks unwinding" issue that you're seeing
is a result of using "echo c > /proc/sysrq-trigger", or if panic() was
called, in which case crash_kexec() is called with a NULL pt_regs pointer.
When that's the case, a "fake" register set is hand-created in
crash_setup_regs(), which is what you are seeing. Check out the kernel
code in crash_setup_regs() -- it just reads the rsp as it was in that
function, and populates the IP with current_text_addr().
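
As a rough user-space model of that behaviour (not the kernel source), the
two cases can be sketched as below. The struct, the function name and the
here_sp/here_ip parameters are hypothetical stand-ins for pt_regs, the live
rsp and current_text_addr(); the copy in the non-NULL case is my
assumption about how a real exception frame gets used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for struct pt_regs: only the two fields of interest. */
struct mini_regs {
    uint64_t sp;
    uint64_t ip;
};

/*
 * oldregs != NULL: an exception frame exists, so copy it as-is.
 * oldregs == NULL: no frame exists, so record where execution is right
 * now; here_sp and here_ip stand in for the live stack pointer and the
 * current text address inside the capturing function.
 */
void model_setup_regs(struct mini_regs *newregs,
                      const struct mini_regs *oldregs,
                      uint64_t here_sp, uint64_t here_ip)
{
    assert(newregs != NULL);
    if (oldregs != NULL) {
        *newregs = *oldregs;
    } else {
        newregs->sp = here_sp;
        newregs->ip = here_ip;
    }
}
```

In the NULL case the saved sp/ip describe a point in the middle of the
crash path rather than a trap frame, which fits the values in the note
pointing into the crash_kexec frame.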
> 3. If parsing the stack frame is the right thing to do, can we modify
> the bt_cmd routines so that we can reuse some of them to repopulate our
> register contents, especially esp/eip?
Sorry -- I don't understand what you're asking.
Dave
>
> Scenario We are Facing
> ------------------------
> Register Content from ELF_NOTES: Matches with gdb output
>
> crash> local display
>
> IP: ffffffff80255d7b
> ax: 1
> bx: 0
> cx: 6237
> dx: 0
> sp: ffff88020f471dc8 <=== Breaks unwinding
> bp: 0
> si: 0
> di: ffffffff80596ec0
> cs: 10
> orig_ax: 8241000001b6
> flags: 46
> ip: ffffffff80255d7b
> r8: 0
> r9: ffff880028080c80
> r10: ffff880028080c80
> r11: d805926f0
> r12: 63
> r13: 0
> ------------------------
> crash> bt
> PID: 4814 TASK: ffff8802104397f0 CPU: 3 COMMAND: "bash"
> #0 [ffff88020f471cf0] machine_kexec at ffffffff8021db38
> #1 [ffff88020f471dc0] crash_kexec at ffffffff80255d9c
> #2 [ffff88020f471e80] __handle_sysrq at ffffffff80385756
> #3 [ffff88020f471ec0] write_sysrq_trigger at ffffffff802d291b
> #4 [ffff88020f471ed0] proc_reg_write at ffffffff802cca2d
> #5 [ffff88020f471f10] vfs_write at ffffffff8029125d
> #6 [ffff88020f471f40] sys_write at ffffffff802916e5
> #7 [ffff88020f471f80] system_call_fastpath at ffffffff8020be0b
> RIP: 000000311bcc4150 RSP: 00007fff976f40d0 RFLAGS: 00010202
> RAX: 0000000000000001 RBX: ffffffff8020be0b RCX: 00000000000003e4
> RDX: 0000000000000002 RSI: 00007f428f6ec000 RDI: 0000000000000001
> RBP: 0000000000000002 R8: 00000000ffffffff R9: 00007f428f6d86e0
> R10: 0000000000000072 R11: 0000000000000246 R12: 000000311bf4d760
> R13: 00007f428f6ec000 R14: 0000000000000002 R15: 000000008f6ec000
>
> Regards
> Sharyathi N