[Crash-utility] [PATCH] Fix x86 initialization for {hard, soft}irq_ctx
Dave Anderson
anderson at redhat.com
Wed Feb 15 21:50:46 UTC 2017
Hi Ogawa,
I've checked in a patch for storing the correct 32-bit x86 hard/soft IRQ stack addresses:
https://github.com/crash-utility/crash/commit/5959759058e53035fb26b9f2b590acfcef14409d
Fix for Linux 3.15 and later 32-bit X86 kernels containing kernel
commit 198d208df4371734ac4728f69cb585c284d20a15, titled "x86: Keep
thread_info on thread stack in x86_32". Without the patch, incorrect
addresses of each per-cpu hardirq_stack and softirq_stack were saved
for usage by the "bt" command.
(hirofumi at mail.parknet.co.jp, anderson at redhat.com)
I don't have a vmcore that I can test whether the stack transition still works,
but at least it's got the correct addresses, and I believe I have the correct
values set up in restore_stack().
Also, there are also several other 32-bit x86 backtrace-related patches that I've
checked in the last few days for other issues that you may have run into:
commit 222e784fb2d16f28ab702a7f9a1889e0b18ca152
Author: Dave Anderson <anderson at redhat.com>
Date: Fri Feb 10 15:07:58 2017 -0500
Fix for 32-bit X86 kernels configured with CONFIG_RANDOMIZE_BASE.
Without the patch, an invalid kernel PAGE_OFFSET value is calculated
and as a result the session fails during session initialization just
after the patching of the gdb minimal_symbol values message, showing
the warning message "WARNING: cannot read linux_banner string",
followed by "crash: /vmlinux and /dev/crash do not match!". This
patch also adds a new "--machdep page_offset=<value>" option that
can be used if the CONFIG_PAGE_OFFSET value is not the default
address of 0xc0000000.
(anderson at redhat.com)
commit 69b577e423c1adb5f06d19bccc790063a500aac4
Author: Dave Anderson <anderson at redhat.com>
Date: Tue Feb 14 14:25:37 2017 -0500
Fix for the "bt" command on Linux 4.9 and later 32-bit X86 kernels
containing kernel commit 0100301bfdf56a2a370c7157b5ab0fbf9313e1cd,
subject "sched/x86: Rewrite the switch_to() code". Without the
patch, backtraces for inactive (sleeping) tasks fail with the message
"bt: invalid structure member offset: task_struct_thread_eip".
(anderson at redhat.com)
commit ec1a9b967da19acaa0ec6b6a904d16267ba6a409
Author: Dave Anderson <anderson at redhat.com>
Date: Tue Feb 14 15:23:01 2017 -0500
Fix for a "[-Wmisleading-indentation]" compiler warning and the
associated bug that is generated by lkcd_x86_trace.c when building
32-bit X86 with "make warn" with gcc-6.3.1.
(anderson at redhat.com)
commit 63dd57685e5e897ae343a213b58cb1e6ae4ed282
Author: Dave Anderson <anderson at redhat.com>
Date: Tue Feb 14 16:23:56 2017 -0500
Fix for an invalid "bt" warning on a 32-bit X86 idle (swapper) task.
Without the patch, the backtrace displays the "cannot resolve stack
trace" warning, dumps the backtrace, and then the text symbols:
crash> bt
PID: 0 TASK: f0962180 CPU: 6 COMMAND: "swapper/6"
bt: cannot resolve stack trace:
#0 [f095ff1c] __schedule at c0b6ef8d
#1 [f095ff58] schedule at c0b6f4a9
#2 [f095ff64] schedule_preempt_disabled at c0b6f728
#3 [f095ff6c] cpu_startup_entry at c04b0310
#4 [f095ff94] start_secondary at c04468c0
bt: text symbols on stack:
[f095ff1c] __schedule at c0b6ef8d
[f095ff58] schedule at c0b6f4ae
[f095ff64] schedule_preempt_disabled at c0b6f72d
[f095ff6c] cpu_startup_entry at c04b0315
[f095ff94] start_secondary at c04468c5
crash>
The backtrace shown is actually correct.
(anderson at redhat.com)
commit 8323490fd962ba1b6aa745deea2ec817d837862d
Author: Dave Anderson <anderson at redhat.com>
Date: Wed Feb 15 15:04:07 2017 -0500
Another fix for a similar "bt: cannot resolve stack trace" warning
on a 32-bit X86 idle/swapper task, but when running on cpu 0.
(anderson at redhat.com)
commit 345da6dc904cb88a3abfae36a990150a2036b95d
Author: Dave Anderson <anderson at redhat.com>
Date: Wed Feb 15 15:39:13 2017 -0500
Remove two one-time warning messages that are displayed when running
the "bt" command on Linux 4.2 and later 32-bit X86 kernels. Without
the patch, the first "bt" command that is executed will be preceded
by "bt: WARNING: "system_call" symbol does not exist", followed by
"bt: WARNING: neither "ret_from_sys_call" nor "syscall_badsys"
symbols exist".
(anderson at redhat.com)
It's been quite some time since I've done any 32-bit x86 maintenance,
and any help/time you can afford would be appreciated.
Thanks,
Dave
----- Original Message -----
> Dave Anderson <anderson at redhat.com> writes:
>
> >> > Please review the attached update to your patch, which adds support for
> >> > the irq_ctx-to-irq_stack transition.
> >> > - if ((hard_sp = per_cpu_symbol_search("per_cpu__hardirq_ctx"))) {
> >> > + if ((hard_sp = per_cpu_symbol_search("per_cpu__hardirq_ctx")) ||
> >> > + (hard_sp = per_cpu_symbol_search("per_cpu__hardirq_stack"))) {
> >> > if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
> >> > for (i = 0; i < NR_CPUS; i++) {
> >> > + ulong ptr;
> >> > +
> >>
> >> [...]
> >>
> >> Right, this will work, maybe, several stuffs. However hardirq_tasks[] is
> >> missing, so this will not be fully. For now, I don't have dump of other
> >> than v4.9 (stack format was changed), so I'm not sure which commands are
> >> not working (user of hardirq_tasks[]).
> >
> > I haven't looked at the recent kernel to check out how the linkage is done
> > without the temporary thread_info at the bottom of the hard/soft irq
> > stacks,
> > but looking at a live 4.9 system, it looks like the first word at the base
> > of
> > the hard and soft irq stack now contains a pointer into the stack of the
> > last caller? If that's true, then stkptr_to_task() could be used to
> > populate the hardirq_tasks[] and softirq_tasks[] -- but the initialization
> > would have to be delayed until near the end of task_init().
>
> Right. At least on v4.9, with quick check, looks like the x86 and x86_64
> has different (x86 is for hardirq/softirq, x86_64 is for
> hardirq/exception).
>
> linux/arch/x86/kernel/dumpstack_{32,64}.c is how get previous stack from
> sp.
>
> x86:
> in_hardirq_stack(stack, info)
> info->next_sp = (unsigned long *)*begin;
> in_softirq_stack(stack, info)
> info->next_sp = (unsigned long *)*begin;
>
> both is same as you said (at start of stack pages).
>
> x86_64
> in_exception_stack(stack, info)
> info->next_sp = (unsigned long *)regs->sp;
> in_irq_stack(stack, info)
> info->next_sp = (unsigned long *)*(end - 1);
>
> exception has pt_regs at top of stack (at end of stack pages)
> irq has previous sp at top of stack (at end of stack pages)
>
> But I'm still not checking when switched to this format, and how
> consolidate/coding those in crash.
> --
> OGAWA Hirofumi <hirofumi at mail.parknet.co.jp>
>
More information about the Crash-utility
mailing list