[Crash-utility] Re: [PATCH 0/3] Display local variables & function parameters from stack frames
Dave Anderson
anderson at redhat.com
Tue May 26 16:00:53 UTC 2009
----- "Sharyathi Nagesh" <sharyath at in.ibm.com> wrote:
> Dave
> Attaching the patches that we have developed till now.
> We have tried to accommodate your suggestions regarding Makefiles in
> these patches. We have tried to provide unwinding support in ppc64, x86_64
> and x86 architectures. The feature is tested on ppc64 with dumps taken through
> panic and echo c > /proc/sysrq-trigger. We have observed that many times
> the variable information is optimized out and accessing variable
> information is not possible (volatile variables are shown correctly),
> IMHO this behavior is similar to that shown by gdb, please share your thoughts
> on this. This implementation works reasonably on ppc64, we were able to unwind
> the stack and get variable information. But we are still facing problems with the x86
> and x86_64 architectures. Our generic implementation for stack unwinding using dwarf
> is not succeeding, since frame pointer is optimized out by default. IMHO stack
> unwinding as talked about in ABI's is not valid for the same reason. But if you
> enable CONFIG_FRAME_POINTER in kernel, then unwinding using bp will be possible.
>
> Please let me know your thoughts in general
Well, it's not like I didn't warn you several months ago when you decided
to undertake this task...
My general thoughts are still the same:
- Testing with just panic() and "echo c > /proc/sysrq-trigger" covers only
pretty exceptional ways of the kernel being shut down. Typically a "real"
crash is due to a BUG() or segmentation violation, or some scenario where
an exception frame is laid down on the stack -- and perhaps a switch is
made to another kernel stack entirely. Coming from panic() or "echo c",
the trace will stay on the kernel stack and no exception will be raised
(unless you're using a RHEL4 or earlier kernel that called BUG() if either
netdump or diskdump was in play). In any case, without an exception, you're
making it "easy" for the unwinder. Your code is going to have to deal with
exception frames on the stack and with potential stack-switches.
- These days you'd be hard-pressed to find a distribution kernel that is
compiled with CONFIG_FRAME_POINTER. And if it is, the handling can
be folded into the currently-existing backtracer. (In fact, the original
x86 backtrace code does have a separate code path for kernels that
were built without -fomit-frame-pointer.) But by the time x86_64 came
around, -fomit-frame-pointer was pretty much the default, and I don't think
I've ever seen an x86_64 dumpfile with frame-pointers, certainly not
from a distributor. So, yes it's nice you've got something that works
with that configuration, but it's pretty unrealistic.
- The problems you're running into getting local variables are not surprising.
And without a rock-solid backtrace as a prerequisite, it doesn't make any
sense to even try getting local variables.
- Your dependency on getting the starting register set from the ELF header
presumes just that -- i.e., that you've got them and that they are meaningful.
I'm not sure how/if you've worked around the "hand-carved" pt_regs created
when no exception frame has been laid down (i.e. when panic() or "echo c"
are used.) But the bigger picture is that there are multiple types of
dumpfiles, and while kdump ELF vmcores are prevalent, they are often only
the starting point. With the huge cumbersome memory sizes of modern machines,
it's more likely that the ELF vmcore seen by the secondary kdump kernel will
be run through "makedumpfile -c ..." into the compressed kdump format
prior to the dump ever being seen by whoever analyzes it. At Red Hat, the
support organization pretty much makes all customers use "makedumpfile -c ..."
by default. And of course with the compressed kdump format, there is no
register set as your starting point.
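
For reference, the frame-pointer case discussed above can be sketched roughly
as follows. This is only an illustration, not code from the crash sources or
from the posted patches: struct stack_image, read_word() and
walk_frame_pointers() are hypothetical names, and the sketch assumes the usual
x86_64 layout in a CONFIG_FRAME_POINTER kernel, where the word at rbp holds the
caller's saved rbp and the word at rbp+8 holds the return address. It does not
handle exception frames or stack switches, which is exactly the hard part.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one kernel stack copied out of a dumpfile. */
struct stack_image {
    uint64_t base;          /* virtual address of words[0] */
    size_t nwords;          /* number of 8-byte words in the image */
    const uint64_t *words;  /* raw stack contents */
};

/* Read one 8-byte word at vaddr. Returns 0 on success, or -1 if the
 * address is misaligned or lies outside this stack image. */
static int read_word(const struct stack_image *s, uint64_t vaddr, uint64_t *out)
{
    uint64_t idx;

    if (vaddr < s->base || (vaddr - s->base) % 8 != 0)
        return -1;
    idx = (vaddr - s->base) / 8;
    if (idx >= s->nwords)
        return -1;
    *out = s->words[idx];
    return 0;
}

/* Follow the saved-rbp chain starting at bp and store up to max return
 * addresses in ret[]. Returns the number of frames found. The walk stops
 * when a read leaves the stack image or the chain stops moving toward
 * higher addresses, either of which suggests an exception frame, a stack
 * switch, or a kernel built without frame pointers. */
static size_t walk_frame_pointers(const struct stack_image *s, uint64_t bp,
                                  uint64_t *ret, size_t max)
{
    size_t n = 0;

    while (n < max) {
        uint64_t next_bp, ret_addr;

        if (read_word(s, bp, &next_bp) != 0 ||
            read_word(s, bp + 8, &ret_addr) != 0)
            break;
        ret[n++] = ret_addr;
        if (next_bp <= bp)
            break;
        bp = next_bp;
    }
    return n;
}
```

A caller would fill in a stack_image from the dumpfile, pass the starting rbp,
and then resolve each returned address to a symbol. Without a trustworthy
starting rbp there is nothing for a walk like this to begin from.
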
That all being said, here's what I can do for you. I will take your new
functions that you've added to netdump.c, and their protos in defs.h,
and apply them to the next crash utility release. Since they are not being
used by anybody but your module at the moment, it's harmless to add them,
and then you will not have to patch the crash sources at all in your
subsequent module patch/posts.
Dave
>
> Here I am attaching 3 patches
> 1. Display-local-variables-and-function-parameters.patch
> Provides feature to print local variables and arguments in the current
> stack frame using dwarf information.
>
> 2. Provide-stack-unwinding-feature.patch
> Provides option to unwind the stack, using dwarf information, works
> on ppc64
>
> 3. unwind_x86_64.patch
> provides unwinding feature in x86_64 without using dwarf information
> but requires CONFIG_FRAME_POINTER to be enabled