[Crash-utility] [PATCH] Fix for "bt" command prints a bogus warning

lijiang lijiang at redhat.com
Tue Feb 14 02:22:48 UTC 2023


On Tue, Feb 14, 2023 at 8:43 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab at nec.com>
wrote:

> On 2023/02/10 19:36, lijiang wrote:
> >> On Fri, Feb 10, 2023 at 2:43 PM HAGIO KAZUHITO(萩尾 一仁) <
> k-hagio-ab at nec.com <mailto:k-hagio-ab at nec.com>> wrote:
> >>
> >>     On 2023/02/09 21:35, lijiang wrote:> On Tue, Feb 7, 2023 at 6:08 PM
> lijiang <lijiang at redhat.com <mailto:lijiang at redhat.com> <mailto:
> lijiang at redhat.com <mailto:lijiang at redhat.com>>> wrote:
> >>     >
> >>     >         more important thing is how we determine the
> irq_eframe_link.
> >>     >
> >>     >     The following patch can work on upstream kernel vmcore and
> RHEL9 vmcore.
> >>     >     Maybe we can check the symbols asm_common_interrupt and
> asm_call_on_stack as below:
> >>     >
> >>     >     diff --git a/x86_64.c b/x86_64.c
> >>     >     index 7a5d6f050c89..62036f71f632 100644
> >>     >     --- a/x86_64.c
> >>     >     +++ b/x86_64.c
> >>     >     @@ -3938,6 +3938,11 @@ in_exception_stack:
> >>     >               if (irq_eframe) {
> >>     >                       bt->flags |= BT_EXCEPTION_FRAME;
> >>     >                       i = (irq_eframe -
> bt->stackbase)/sizeof(ulong);
> >>     >     + if (symbol_exists("asm_common_interrupt")) {
> >>     >     + i -= 1;
> >>     >     + up = (ulong *)(&bt->stackbuf[i*sizeof(ulong)]);
> >>     >     + bt->instptr = *up;
> >>     >     + }
> >>     >                       x86_64_print_stack_entry(bt, ofp, level, i,
> bt->instptr);
> >>     >                       bt->flags &= ~(ulonglong)BT_EXCEPTION_FRAME;
> >>     >                       cs =
> x86_64_exception_frame(EFRAME_PRINT|EFRAME_CS, 0,
> >>     >     @@ -6521,6 +6526,16 @@ x86_64_irq_eframe_link_init(void)
> >>     >        else
> >>     >        return;
> >>     >
> >>     >     + if (symbol_exists("asm_common_interrupt") &&
> !symbol_exists("asm_call_on_stack")) {
> >>     >     + machdep->machspec->irq_eframe_link =-32;
> >>     >     + return;
> >>     >     + }
> >>     >     +
> >>     >     + if (symbol_exists("asm_common_interrupt") &&
> symbol_exists("asm_call_on_stack")) {
> >>     >     + machdep->machspec->irq_eframe_link =-56;
> >>     >     + return;
> >>     >     + }
> >>     >     +
> >>     >        if (THIS_KERNEL_VERSION < LINUX(2,6,9))
> >>     >        return;
> >>     >
> >>     >
> >>     > Do you have any other comments about the above changes? Or still
> looking for a better solution, any thoughts? Kazu.
> >>
> >>     Thank you for the information.  I've looked for a way to follow
> >>     kernel changes, but it would be hard with the current unwinder.
> >>
> >>     So for now, let's go with this?
> >>
> >>
> >> Agree. I will post the v2 later with your signature together.
> >>
> >>     +       if (symbol_exists("asm_common_interrupt")) {
> >>     +               if (symbol_exists("asm_call_on_stack"))
> >>     +                       machdep->machspec->irq_eframe_link = -64;
> >>     +               else
> >>     +                       machdep->machspec->irq_eframe_link = -32;
> >>     +               return;
> >>     +       }
> >>
> >>     Probably the actual irq_eframe_link for your kernel is -64, I
> >>     think it's adjusted in x86_64_irq_eframe_link().  Please check
> >>
> >>
> >> You are right. I confirm that the irq_eframe_link will be adjusted in
> the x86_64_irq_eframe_link().
> >>
> >> But it should be better to set the irq_eframe_link to -64 in
> the x86_64_irq_eframe_link_init(), it doesn't need to be adjusted any more.
> >>
> >>     "help -m".  e.g. a 5.8 vmcore on hand:
> >>
> >>     crash-dev> bt
> >>     ...
> >>     crash-dev> help -m | grep irq_eframe
> >>                irq_eframe_link: -64
> >>
> >>
> >>     The following is the reason why the irq_eframe_link is -32 on
> >>     RHEL9.1 (5.14.0-162.6.1.el9_1.x86_64).
> >>
> >>
> >> Thank you for sharing the following process. Kazu.
> >>
> >> I get another way to calculate the value of irq eframe, the irq eframe
> is
> >> located at the stack address where the asm_common_interrupt symbol is
> stored
> >> plus 8.
> >>
> >> Maybe we can use the following formula to calculate the result, but it
> may need
> >> to walk through the whole raw stack data. At least, we do not need to
> get the
> >> eframe address from the ASM code. What do you think?
> >>
> >> BTW: Maybe we can do it in the future.
>
> That way might be a bit better, but anyways I think they are not
> very robust for kernel changes and various interrupts.
>
> I've been working on enhancing the usage of ORC data and currently
> it's to determine the frame size of functions, but probably we
> will also be able to find pt_regs more exactly with the data.
> Until it finishes, let's use this way.
>
>
Sounds like a good idea. Looking forward to implementing it in the next
release.

Thanks.
Lianbo


> Thanks,
> Kazu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20230214/a12bdff4/attachment-0001.htm>


More information about the Crash-utility mailing list