[Crash-utility] Re: [PATCH 0/2] Display local variables & function parameters from stack frames

Dave Anderson anderson at redhat.com
Thu May 14 21:18:05 UTC 2009


 
> Dave
>         We have some observations about x86_64 dumps and would like your
> opinion on them.
>         On x86_64 dumps, for active processes we read the register
> contents from the ELF notes, and we found that they don't match the
> output of the bt command. The SP and IP values we got from ELF_NOTES
> break our stack unwinding, which uses information from the DWARF
> section, whereas the unwinding works, at least for the first stage,
> with the SP and IP obtained the bt way.
>         gdb has a similar problem: it too breaks when unwinding is
> attempted on this dump.
> 
>         I wanted to know what you think about this and how we can proceed.
> 
> 1. Is it reliable to parse through the stack frame looking for a valid
> address, as is done in 'x86_64_get_dumpfile_stack_frame'? Is it the
> right/safe way to do it, and does any x86 ABI describe it?

I do it that way because I've never wanted to depend upon the ELF prstatus
note section: netdump/diskdump only has the panic cpu's info, and kdump's
sections can be difficult to match to a cpu if there has been any cpu
hot-plugging.  Not to mention that several other dumpfile formats are
supported.  You can do it any way you'd like.
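
For illustration only, the general idea of scanning a task's stack for a
plausible starting point might be sketched in Python as below. This is not
the code in x86_64_get_dumpfile_stack_frame(); the function name
find_start_frame(), the is_kernel_text predicate, the text range, and the
sample stack words are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple


def find_start_frame(
    stack_base: int,
    words: Sequence[int],
    is_kernel_text: Callable[[int], bool],
    word_size: int = 8,
) -> Optional[Tuple[int, int]]:
    """Return (stack_address, text_address) for the first stack word,
    scanning from the lowest address upward, that looks like a kernel
    text address. Return None if no word qualifies."""
    for index, value in enumerate(words):
        if is_kernel_text(value):
            return stack_base + index * word_size, value
    return None


# Hypothetical kernel text range, chosen only for this example.
TEXT_START = 0xFFFFFFFF80200000
TEXT_END = 0xFFFFFFFF80600000


def in_text(addr: int) -> bool:
    return TEXT_START <= addr < TEXT_END


# Made-up stack words; only the last one falls inside the text range.
sample_words = [0x0, 0x1, 0xFFFFFFFF8021DB38]
result = find_start_frame(0xFFFF88020F471CE0, sample_words, in_text)
```

A real scan would normally check candidates against the symbol table rather
than a bare address range, since stale return addresses left on the stack
can also look valid.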

> 2. It looks like we can't safely rely on ELF_NOTES. Is this a known
> issue with kexec dumping?

The "sp: ffff88020f471dc8 Breaks unwinding" issue that you're seeing
is a result of using "echo c > /proc/sysrq-trigger", or of panic() being
called; in either case crash_kexec() is called with a NULL pt_regs pointer.
When that's the case, a "fake" register set is hand-created in
crash_setup_regs(), which is what you are seeing.  Check out the kernel
code in crash_setup_regs() -- it just reads the rsp as it was in that
function, and populates the IP with current_text_addr().
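
As a rough cross-check, the values quoted below can be compared with a few
lines of Python. The variable names are made up for this example; the
numbers are copied from the "local display" and "bt" output. The sketch
only shows that the ELF-note sp and ip fall in the neighborhood of
crash_kexec(), which is consistent with their having been captured there.

```python
# Values from the ELF notes ("local display" output).
note_sp = 0xFFFF88020F471DC8
note_ip = 0xFFFFFFFF80255D7B

# Values from the bt output.
frame_crash_kexec = 0xFFFF88020F471DC0   # stack address printed for frame #1
frame_handle_sysrq = 0xFFFF88020F471E80  # stack address printed for frame #2
crash_kexec_text = 0xFFFFFFFF80255D9C    # text address printed for frame #1

# Does the note sp lie between the stack addresses of frames #1 and #2?
sp_between_frames = frame_crash_kexec <= note_sp < frame_handle_sysrq

# How far is the note ip from the crash_kexec text address that bt reports?
ip_delta = crash_kexec_text - note_ip

print(sp_between_frames)  # True
print(hex(ip_delta))      # 0x21
```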

> 3. If parsing the stack frame is the right thing to do, can we modify
> bt_cmd routines so as to reuse some of the routines for repopulating our
>   register contents, especially esp/eip?

Sorry -- I don't understand what you're asking.

Dave

> 
> Scenario We are Facing
> ------------------------
> Register Content from ELF_NOTES: Matches with gdb output
> 
> crash> local display
> 
>   IP:         ffffffff80255d7b
>   ax:         1
>   bx:         0
>   cx:         6237
>   dx:         0
>   sp:         ffff88020f471dc8 <=== Breaks unwinding
>   bp:         0
>   si:         0
>   di:         ffffffff80596ec0
>   cs:         10
>   oirg_ax:         8241000001b6
>   flags:         46
>   ip:         ffffffff80255d7b
>   r8:         0
>   r9:         ffff880028080c80
>   r10:         ffff880028080c80
>   r11:         d805926f0
>   r12:         63
>   r13:         0
> ------------------------
> crash> bt
> PID: 4814   TASK: ffff8802104397f0  CPU: 3   COMMAND: "bash"
>   #0 [ffff88020f471cf0] machine_kexec at ffffffff8021db38
>   #1 [ffff88020f471dc0] crash_kexec at ffffffff80255d9c
>   #2 [ffff88020f471e80] __handle_sysrq at ffffffff80385756
>   #3 [ffff88020f471ec0] write_sysrq_trigger at ffffffff802d291b
>   #4 [ffff88020f471ed0] proc_reg_write at ffffffff802cca2d
>   #5 [ffff88020f471f10] vfs_write at ffffffff8029125d
>   #6 [ffff88020f471f40] sys_write at ffffffff802916e5
>   #7 [ffff88020f471f80] system_call_fastpath at ffffffff8020be0b
>      RIP: 000000311bcc4150  RSP: 00007fff976f40d0  RFLAGS: 00010202
>      RAX: 0000000000000001  RBX: ffffffff8020be0b  RCX: 00000000000003e4
>      RDX: 0000000000000002  RSI: 00007f428f6ec000  RDI: 0000000000000001
>      RBP: 0000000000000002   R8: 00000000ffffffff   R9: 00007f428f6d86e0
>      R10: 0000000000000072  R11: 0000000000000246  R12: 000000311bf4d760
>      R13: 00007f428f6ec000  R14: 0000000000000002  R15: 000000008f6ec000
> 
> Regards
> Sharyathi N



