[Crash-utility] Re: [PATCH 0/3] Display local variables & function parameters from stack frames

Dave Anderson anderson at redhat.com
Tue May 26 16:00:53 UTC 2009


----- "Sharyathi Nagesh" <sharyath at in.ibm.com> wrote:

> Dave
> 	Attaching the patches that we have developed so far.
>
> We have tried to accommodate your suggestions regarding Makefiles in
> these patches. We have tried to provide unwinding support on the ppc64, x86_64
> and x86 architectures. The feature is tested on ppc64 with dumps taken through
> panic and echo c > /proc/sysrq-trigger. We have observed that many times
> the variable information is optimized out and accessing variable
> information is not possible (volatile variables are shown correctly).
> IMHO this behavior is similar to that shown by gdb; please share your thoughts
> on this.  This implementation works reasonably on ppc64: we were able to unwind
> the stack and get variable information.  But we are still facing problems with the x86
> and x86_64 architectures. Our generic implementation for stack unwinding using dwarf
> is not succeeding, since the frame pointer is optimized out by default. IMHO stack
> unwinding as described in the ABIs is not valid for the same reason. But if you
> enable CONFIG_FRAME_POINTER in the kernel, then unwinding using bp will be possible.
>
> 	Please let me know your thoughts in general

Well, it's not like I didn't warn you several months ago when you decided
to undertake this task...

My general thoughts are still the same:

 - Testing with just panic() and "echo c > /proc/sysrq-trigger" covers only
   pretty much exceptional ways of the kernel being shut down.  Typically a "real"
   crash is due to a BUG() or segmentation violation, or some scenario where
   an exception frame is laid down on the stack -- and perhaps a switch is
   made to another kernel stack entirely.  Coming from panic() or "echo c",
   the trace will stay on the kernel stack and no exception will be raised.
   (unless you're using a RHEL4 or earlier kernel that called BUG() if either
   netdump/diskdump were in play.)  In any case, without an exception, you're
   making it "easy" for the unwinder.  Your code is going to have to deal with
   exception frames on the stack and with potential stack-switches.

 - These days you'd be hard-pressed to find a distribution kernel that is
   compiled with CONFIG_FRAME_POINTER.  And if it is, the handling can
   be folded into the currently-existing backtracer.  (In fact, the original
   x86 backtrace code does have a separate code path for kernels that
   were built without -fomit-frame-pointer.)  But by the time x86_64 came
   around, -fomit-frame-pointer was pretty much the default, and I don't think
   I've ever seen an x86_64 dumpfile with frame-pointers, certainly not
   from a distributor.  So, yes it's nice you've got something that works
   with that configuration, but it's pretty unrealistic.
   
 - The problems you're running into getting local variables are not surprising.
   And without a rock-solid backtrace as a prerequisite, it doesn't make any
   sense to even try getting local variables.

 - Your dependency on getting the starting register set from the ELF header
   presumes just that -- i.e., that you've got them and that they are meaningful.
   I'm not sure how/if you've worked around the "hand-carved" pt_regs created
   when no exception frame has been laid down (i.e. when panic() or "echo c"
   are used.)  But the bigger picture is that there are multiple types of
   dumpfiles, and while kdump ELF vmcores are prevalent, they are often only
   the starting point.  With the huge cumbersome memory sizes of modern machines,
   it's more likely that the ELF vmcore seen by the secondary kdump kernel will
   be run through "makedumpfile -c ..." into the compressed kdump format
   prior to the dump ever being seen by whoever analyzes it.  At Red Hat, the
   support organization pretty much makes all customers use "makedumpfile -c ..."
   by default.  And of course with the compressed kdump format, there is no
   register set as your starting point.
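
To illustrate the register-set dependency in the last point: a kdump ELF
vmcore carries its per-cpu registers in NT_PRSTATUS notes inside a PT_NOTE
segment, so a module could at least probe for one before relying on it.  What
follows is only a minimal sketch, not code from the patches or from crash
itself.  find_prstatus() and struct note_hdr are hypothetical names; the header
mirrors the Elf64_Nhdr layout, and 4-byte note padding and the "CORE" note
name are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same three fields as Elf64_Nhdr in <elf.h>; redefined here only so the
 * sketch stays self-contained. */
struct note_hdr {
    uint32_t n_namesz;   /* length of the name, including the NUL */
    uint32_t n_descsz;   /* length of the descriptor payload */
    uint32_t n_type;     /* note type */
};

#define SKETCH_NT_PRSTATUS 1u   /* numeric value of NT_PRSTATUS */

/* Round up to the 4-byte padding assumed between note fields. */
static size_t align4(size_t n) { return (n + 3u) & ~(size_t)3u; }

/*
 * Hypothetical helper: scan a PT_NOTE segment that has already been read
 * into memory.  Returns a pointer to the descriptor of the first
 * "CORE"/NT_PRSTATUS note and stores its length in *desc_len, or returns
 * NULL if no such note exists or the segment is malformed.
 */
const void *find_prstatus(const unsigned char *seg, size_t len,
                          size_t *desc_len)
{
    size_t off = 0;

    assert(seg != NULL && desc_len != NULL);

    while (off < len && len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t remaining, name_sz;
        const unsigned char *name, *desc;

        memcpy(&h, seg + off, sizeof h);
        remaining = len - off - sizeof h;

        /* Reject notes whose name or payload would run past the segment. */
        if (h.n_namesz > remaining)
            break;
        name_sz = align4(h.n_namesz);
        if (name_sz > remaining)
            break;
        remaining -= name_sz;
        if (h.n_descsz > remaining)
            break;

        name = seg + off + sizeof h;
        desc = name + name_sz;

        if (h.n_type == SKETCH_NT_PRSTATUS && h.n_namesz == 5 &&
            memcmp(name, "CORE", 5) == 0) {
            *desc_len = h.n_descsz;
            return desc;
        }

        /* Advance to the next note header. */
        off += sizeof h + name_sz + align4(h.n_descsz);
    }
    return NULL;
}
```

The NULL return is the case to plan for: with the compressed kdump format
there is no such note to start from, so the caller needs some other source
for its initial registers.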

That all being said, here's what I can do for you.  I will take your new
functions that you've added to netdump.c, and their protos in defs.h,
and apply them to the next crash utility release.  Since they are not being
used by anybody but your module at the moment, it's harmless to add them,
and then you will not have to patch the crash sources at all in your
subsequent module patch/posts.

Dave


> 
> Here I am attaching 3 patches
> 1. Display-local-variables-and-function-parameters.patch
> 	Provides feature to print local variables and arguments in the current 
> stack frame using dwarf information.
> 
> 2. Provide-stack-unwinding-feature.patch
> 	Provides option to unwind the stack, using dwarf information, works
> on ppc64
> 
> 3. unwind_x86_64.patch
> 	Provides unwinding feature on x86_64 without using dwarf information,
> but requires CONFIG_FRAME_POINTER to be enabled
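
For reference, frame-pointer unwinding of the kind patch 3 depends on amounts
to following the chain of saved rbp values, with each return address sitting
one word above the saved rbp.  The sketch below is a simplified illustration
under stated assumptions, not the code in the patch: struct stack_image,
read_word() and walk_frame_pointers() are hypothetical names, the stack is
taken to be a single contiguous copy out of the dump, and exception frames
and stack switches are not handled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one kernel stack copied out of a dump. */
struct stack_image {
    uint64_t base;           /* virtual address of words[0] */
    size_t nwords;           /* number of 8-byte words available */
    const uint64_t *words;   /* stack contents, lowest address first */
};

/* Read one aligned 8-byte word at addr; returns 0 if addr is off-stack. */
static int read_word(const struct stack_image *s, uint64_t addr,
                     uint64_t *out)
{
    uint64_t idx;

    if (addr < s->base || (addr - s->base) % 8 != 0)
        return 0;
    idx = (addr - s->base) / 8;
    if (idx >= s->nwords)
        return 0;
    *out = s->words[idx];
    return 1;
}

/*
 * Follow the saved-rbp chain starting at rbp and collect up to max return
 * addresses.  Assumes the x86_64 frame-pointer layout:
 *   [rbp]     = caller's saved rbp
 *   [rbp + 8] = return address into the caller
 * Returns the number of return addresses stored in ret_addrs.
 */
size_t walk_frame_pointers(const struct stack_image *s, uint64_t rbp,
                           uint64_t *ret_addrs, size_t max)
{
    size_t n = 0;

    assert(s != NULL && ret_addrs != NULL);

    while (n < max) {
        uint64_t saved_rbp, ret;

        if (!read_word(s, rbp, &saved_rbp) ||
            !read_word(s, rbp + 8, &ret))
            break;              /* chain left the stack image */

        ret_addrs[n++] = ret;

        /* The stack grows down, so a caller's frame must sit at a higher
         * address; anything else means the chain ended or is corrupt. */
        if (saved_rbp <= rbp)
            break;
        rbp = saved_rbp;
    }
    return n;
}
```

The walk stops as soon as the chain leaves the stack image or stops moving
upward, which is exactly where an exception frame or a switch to another
stack would need separate handling.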



