[Crash-utility] [BUG?] DWARF unwind is always off (and seems to be broken)

Dave Anderson anderson at redhat.com
Thu Mar 30 16:26:03 UTC 2017



----- Original Message -----
> 
> All,
> 
> 1. Crash version 7.0.8 (debian8) never actually sets DWARF_UNWIND due to 
> a bug in initialization order: first, `kernel_init` is called, which 
> checks whether the DWARF_UNWIND bit is set and, if not, sets 
> NO_DWARF_UNWIND.

Correct -- DWARF_UNWIND only gets set if the user wants it, so by default it
should never be set.  The check in kernel_init() exists to handle the case 
where DWARF_UNWIND was explicitly set by a "set unwind on" command
in a .crashrc file.
 
> Then `init_unwind_table` is called, which sets DWARF_UNWIND only if 
> NO_DWARF_UNWIND is not set. So the resulting `kt->flags` are 
> (DWARF_UNWIND_EH_FRAME | NO_DWARF_UNWIND) and 
> `x86_64_low_budge_back_trace_cmd` is always called despite the 
> DWARF tables being loaded.
 
Right.  x86_64_low_budge_back_trace_cmd() is the default backtrace function.
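
The ordering discussed above can be modelled with a minimal sketch. The bit
values, the struct, and the sketch_* function names are hypothetical stand-ins
for illustration only; the real definitions and logic live in the crash
sources and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; not the values used by crash itself. */
#define DWARF_UNWIND          (1UL << 0)
#define NO_DWARF_UNWIND       (1UL << 1)
#define DWARF_UNWIND_EH_FRAME (1UL << 2)

/* Hypothetical stand-in for the kernel table that holds kt->flags. */
struct sketch_kernel_table {
    unsigned long flags;
};

/* Models the check described for kernel_init(): unless the user already
 * asked for DWARF unwinding, mark it as disabled. */
static void sketch_kernel_init(struct sketch_kernel_table *kt)
{
    assert(kt != NULL);
    if (!(kt->flags & DWARF_UNWIND))
        kt->flags |= NO_DWARF_UNWIND;
}

/* Models the step described for init_unwind_table(): the tables are
 * loaded, but DWARF_UNWIND is set only if it was not disabled earlier. */
static void sketch_init_unwind_table(struct sketch_kernel_table *kt)
{
    assert(kt != NULL);
    kt->flags |= DWARF_UNWIND_EH_FRAME;
    if (!(kt->flags & NO_DWARF_UNWIND))
        kt->flags |= DWARF_UNWIND;
}

/* Runs both steps in the order described in the thread.
 * user_set_unwind_on models "set unwind on" in a .crashrc file. */
static unsigned long sketch_startup(int user_set_unwind_on)
{
    struct sketch_kernel_table kt = { 0 };

    if (user_set_unwind_on)
        kt.flags |= DWARF_UNWIND;
    sketch_kernel_init(&kt);
    sketch_init_unwind_table(&kt);
    return kt.flags;
}
```

Under these assumptions, sketch_startup(0) ends with
(DWARF_UNWIND_EH_FRAME | NO_DWARF_UNWIND), which matches the flags reported
in the quoted message, while sketch_startup(1) keeps DWARF_UNWIND set.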

The alternative x86_64_dwarf_back_trace_cmd() backtrace function was added 
by IBM back in the 4.0-3.8 timeframe in 2006.  It has not been maintained
much since then (if at all), so I don't know how well it works.
Personally, I've never used it.

> 2. When DWARF_UNWIND is set via gdb, the only thing the backtrace 
> shows is the address inside `__schedule` derived via 
> `x86_64_thread_return_init`.

You can enter "set unwind on" at runtime.

> This is due to two things:
> 
> First, `thread_return` points to the address after `callq __switch_to`, 
> and this address is not on the stack of the original thread because the 
> stacks have already been switched by then. Setting `thread_return` to the 
> address just before the stacks are swapped won't help either, because of 
> the second issue below.
> 
> Second, since the kernel lacks CFI instructions in the `context_switch` 
> macro, the stack state before the `rsp` swap differs from what is 
> described in the debug info by exactly two longs, because of 
> `pushfq; push %rbp`. Manually adjusting `frame.regs.rsp` at the entry 
> to `unwind` fixes this, yielding correct DWARF_UNWIND output from the 
> `bt` command.
> 
> 3. I'm asking for advice on how to fix that.
>
> My proposal is to switch `thread_return` to point to the code right 
> before the stacks are swapped, since the current implementation seems to 
> wrongly assume that the `thread_return` value will be on the stack of 
> the original thread. This, however, is not required to fix DWARF_UNWIND.
>
> Next, the stack should be adjusted right before we enter the 
> `x86_64_dwarf_back_trace_cmd` routine, based on the disassembly of the 
> `context_switch` macro.
> 
> Does this seem correct to you guys?
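
The stack adjustment proposed in the quoted message could look roughly like
the sketch below. The struct and function names are hypothetical, and the
direction of the adjustment is an assumption: the message only says that
`frame.regs.rsp` is off by two longs, so the sketch assumes the saved value
sits below what the debug info expects (the x86_64 stack grows downward)
and adds the size of the two pushed words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the register state handed to the unwinder. */
struct sketch_regs {
    uint64_t rsp;
};

/* pushfq and push %rbp each store one 8-byte word on x86_64. */
#define SKETCH_SWITCH_PUSHES 2

/* Assumption: the saved rsp was recorded after the two pushes, so it is
 * 2 * 8 bytes lower than the value the CFI describes. The direction
 * would need to be confirmed against the disassembly of the kernel in use. */
static void sketch_adjust_rsp(struct sketch_regs *regs)
{
    assert(regs != NULL);
    regs->rsp += SKETCH_SWITCH_PUSHES * sizeof(uint64_t);
}
```

With a saved rsp of 0x1000, this sketch would hand 0x1010 to the unwinder.
Keeping the adjustment in a separate helper that runs only on the DWARF path
would leave the default backtrace function untouched.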

As to advice, you're on your own, unless the original author -- or someone
else who actually still uses it -- wants to step up and resurrect support
of the functionality.  Feel free to file a patch.  All I ask is that any 
changes you make do not affect the default functionality.

Thanks,
  Dave



