[Crash-utility] Re: [PATCH 0/3] Display local variables & function parameters from stack frames
Dave Anderson
anderson at redhat.com
Wed May 27 13:06:23 UTC 2009
----- "Sharyathi Nagesh" <sharyath at in.ibm.com> wrote:
> Dave
> Excuse me for overlooking this part of the code. I am attaching a fix
> for it; I hope this resolves the issue.
That one looks better...
> Dave, I have a few observations regarding the points you have raised.
> makedumpfile -c stripping the ELF Note for pt_regs:
> makedumpfile won't be saving much by stripping the ELF Notes of pt_regs
> information. It will be ~256 bytes * number of cpus, which is not much.
> We will discuss with the makedumpfile developers the possibility
> of retaining this ELF Note information.
>
> Regarding CONFIG_FRAME_POINTER
> We understand this is disabled by default so as to free one more
> register, bp, for general-purpose use.
> Ideally this information should have been saved in the DWARF section, so
> theoretically speaking we should be able to unwind the x86/x86_64 dump
> even without CONFIG_FRAME_POINTER. But somehow the stack unwinding is
> not as direct as it is on ppc64; we are revisiting this
> implementation.
>
> Regarding Exception Frame on the top of the stack frame
> As we understand it, if we have the pt_regs of the topmost stack we
> should be able to unwind to the next stack frame, even if the topmost
> stack frame is an exception frame, at least on ppc64. We are not sure
> about x86 and x86_64; we can look into that too.
I understand that with the topmost pt_regs you can then start the
backtrace OK. That's not what I'm referring to.
What I'm talking about is bumping into another exception frame
while unwinding from the topmost pt_regs. Or what happens
when the crash occurs while operating on an alternate kernel
stack.
Just take a simple example -- what happens when you actually enter
"alt-sysrq-c" on an x86_64, generating a keyboard interrupt and therefore
a transition to the per-cpu IRQ stack? By the time crash_kexec()
is called, you're already operating on the IRQ stack. Your
code will work its way back to the top of the per-cpu IRQ stack,
but then what does it do? Or suppose the task takes a page fault,
lays down an exception frame, and then later BUG()'s out while attempting
to handle the fault. Your code will start from the most recent
exception frame, but will bump into the page fault exception
during the unwind operation. Does your code properly recognize the new
exception frame (and the passage through assembly-language code
when that happens)?
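
To make the two scenarios above concrete, here is a toy model of the
problem. It is only a sketch: the Frame class, the frame kinds, and names
such as sysrq_handler, interrupted_function, fault_handler_step and
faulting_function are hypothetical illustrations, not crash's data
structures and not the code in the patch. It shows a naive unwinder
stopping at the end of the IRQ stack or at an embedded exception frame,
while one that recognizes those boundaries continues.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One entry in a toy call chain (hypothetical, not crash's structures)."""
    name: str
    kind: str = "call"              # "call", "exception", or "irq_stack_end"
    next: Optional["Frame"] = None  # the next-outer entry in the chain


def chain(*frames: Frame) -> Frame:
    """Link frames innermost-first and return the innermost one."""
    for inner, outer in zip(frames, frames[1:]):
        inner.next = outer
    return frames[0]


def naive_unwind(start: Frame) -> List[str]:
    """Follow ordinary call frames only; give up at anything else."""
    out = []
    frame = start
    while frame is not None and frame.kind == "call":
        out.append(frame.name)
        frame = frame.next
    return out


def aware_unwind(start: Frame) -> List[str]:
    """Recognize stack switches and exception frames and keep going."""
    out = []
    frame = start
    while frame is not None:
        if frame.kind == "irq_stack_end":
            # A real unwinder would pick up the saved process-stack pointer here.
            out.append("[end of IRQ stack -> process stack]")
        elif frame.kind == "exception":
            # A real unwinder would restart from the saved pt_regs here.
            out.append("[exception frame: " + frame.name + "]")
        else:
            out.append(frame.name)
        frame = frame.next
    return out


# alt-sysrq-c: crash_kexec() is reached while already on the IRQ stack.
irq_case = chain(
    Frame("crash_kexec"),
    Frame("sysrq_handler"),
    Frame("do_IRQ"),
    Frame("irq", kind="irq_stack_end"),
    Frame("interrupted_function"),
)

# Page fault, then a BUG() while handling it: unwinding from the newest
# exception frame runs into the older page fault exception frame.
fault_case = chain(
    Frame("fault_handler_step"),
    Frame("do_page_fault"),
    Frame("page_fault", kind="exception"),
    Frame("faulting_function"),
)

if __name__ == "__main__":
    for case in (irq_case, fault_case):
        print("naive:", naive_unwind(case))
        print("aware:", aware_unwind(case))
```

In this model the naive walk never reaches interrupted_function or
faulting_function, which is the part of the backtrace in question.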
All I'm saying is that basing your test results solely on instances where
panic() is called or "echo c ..." was entered covers only the most trivial
type of kernel crash, because there's no kernel exception frame
laid down until crash_kexec() gets called. That's a fairly rare
occurrence with respect to typical kernel crashes.
Dave