[Crash-utility] [PATCH] ppc64: fix crash load failure

lijiang lijiang at redhat.com
Mon Jun 12 06:48:08 UTC 2023


On Mon, Jun 12, 2023 at 9:03 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab at nec.com>
wrote:

> Hi Lianbo,
>
> sorry, my company was closed last Friday.
>
>
No worry, Kazu.

On 2023/06/08 20:46, lijiang wrote:
> > On Thu, Jun 8, 2023 at 5:56 PM lijiang <lijiang at redhat.com> wrote:
> >
> >> On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁) <
> k-hagio-ab at nec.com>
> >> wrote:
> >>
> >>> On 2023/06/08 13:31, Lianbo Jiang wrote:
> >>>> Crash tool will fail to load vmcore with the following error:
> >>>>
> >>>>    #crash vmlinux /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s
> >>>>     crash: invalid structure size: note_buf
> >>>>            FILE: diskdump.c  LINE: 121  FUNCTION: have_crash_notes()
> >>>>
> >>>>     [./crash] error trace: 101859ac => 10291798 => 10291450 =>
> 10266038
> >>>>
> >>>>       10266038: SIZE_verify+156
> >>>>       10291450: have_crash_notes+308
> >>>>       10291798: map_cpus_to_prstatus_kdump_cmprs+448
> >>>>       101859ac: task_init+11980
> >>>>
> >>>> The reason is that the note_buf is not intialized before using the
> >>>> SIZE(note_buf) in the have_crash_notes(). Let's initialize the
> variable
> >>>> note_buf in the ppc64_init() to fix this issue.
> >>>
> >>> Thanks for the patch.
> >>>
> >>> Most of the architectures already have it, but some still do not have,
> >>
> >>
> >> I don't have a 4 cpus s390x machine on hand, and am still trying to find
> >> it.
> >>
> >
> > I checked the crash code. For the s390x, it doesn't invoke the
> > map_cpus_to_prstatus{_kdump_cmprs}(),  so this patch should not affect
> the
> > s390x arch.
>
> My thought is that the original commit db8c030857b4 should have added
> the initialization of SIZE(note_buf) in task_init(), because there might
> be some architectures that do not initialize it and it's better to
> initialize it when it's used, e.g. like this:
>
> --- a/task.c
> +++ b/task.c
> @@ -675,6 +675,8 @@ task_init(void)
>                  tt->this_task = pid_to_task(active_pid);
>          }
>          else {
> +               if (INVALID_SIZE(note_buf))
> +                       STRUCT_SIZE_INIT(note_buf, "note_buf_t");
>                  if (KDUMP_DUMPFILE())
>                          map_cpus_to_prstatus();
>                  else if (ELF_NOTES_VALID() && DISKDUMP_DUMPFILE())
>
>
Ok, fine to me. But, generally it should be good to be initialized in the
machdep_init() because the note_buf is strongly related to  the
crash_notes, and the crash_notes contains machine specific information such
as cpu registers at the time of crash.


> To be exact, maybe we should check also whether the size is valid:
>
> -       crash_notes_exists = kernel_symbol_exists("crash_notes");
> +       crash_notes_exists = kernel_symbol_exists("crash_notes") &&
> +                               VALID_SIZE(note_buf);
>
> but if crash_notes is defined, note_buf_t also should be defined.  So I
> think the above VALID_SIZE(note_buf) is not necessary.
>
>
Yes, as I mentioned above, they are closely related.


>    /* Per cpu memory for storing cpu states in case of system crash. */
>    note_buf_t __percpu *crash_notes;
>
> What do you think?
>

Agree, I will post v2 later.

Thanks
Lianbo


>
> Thanks,
> Kazu
>
> >
> > Mikhail Zaslonko: Can you help to confirm this issue or do some tests?
> > Thank  you in advance.
> >
> > Lianbo
> >
> >
> >>
> >>>
> >>> e.g. is s390x ok?  It might be better to put it in task_init() because
> >>
> >>
> >> It was initialized in the machdep_init(POST_GDB), see the following
> code:
> >> ...
> >>                          read_in_kernel_config(IKCFG_INIT);
> >>                          kernel_init();
> >>                          machdep_init(POST_GDB); <---
> >>                          vm_init();
> >>                          machdep_init(POST_VM);
> >>                          module_init();
> >>                          help_init();
> >>                          task_init();
> >>                          vfs_init();
> >>                          net_init();
> >>                          dev_init();
> >>                          machdep_init(POST_INIT);
> >> ...
> >> I would suggest putting it at the end of the kernel_init(), so that the
> >> original initialization order will basically not be changed
> >>
> >>
> >>> it's used there?
> >>>
> >>
> >> Do you mean we need to move the existing initialization(all) to the
> >> task_init()?
> >>
> >>>
> >>> $ git grep 'STRUCT_SIZE_INIT(note_buf'
> >>> arm.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> arm64.c:        STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> mips.c:         STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> mips64.c:               STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> ppc.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> riscv64.c:              STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> x86.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> x86_64.c:               STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> xen_hyper.c:    XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t, "note_buf_t");
> >>>
> >>> Thanks,
> >>> Kazu
> >>>
> >>>>
> >>>> Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation fault caused
> >>> by failure of stopping CPUs")
> >>>> Signed-off-by: Lianbo Jiang <lijiang at redhat.com>
> >>>> ---
> >>>>    ppc64.c | 1 +
> >>>>    1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/ppc64.c b/ppc64.c
> >>>> index b95a621d8fe4..ff7f0fca3a95 100644
> >>>> --- a/ppc64.c
> >>>> +++ b/ppc64.c
> >>>> @@ -695,6 +695,7 @@ ppc64_init(int when)
> >>>>                }
> >>>>
> >>>>                ppc64_init_paca_info();
> >>>> +             STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>>>
> >>>>                if (!machdep->hz) {
> >>>>                        machdep->hz = HZ;
> >>
> >>
> >>
> >> On Thu, Jun 8, 2023 at 5:56 PM lijiang <lijiang at redhat.com
> >> <mailto:lijiang at redhat.com>> wrote:
> >>
> >>     On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁)
> >>     <k-hagio-ab at nec.com <mailto:k-hagio-ab at nec.com>> wrote:
> >>
> >>         On 2023/06/08 13:31, Lianbo Jiang wrote:
> >>         > Crash tool will fail to load vmcore with the following error:
> >>         >
> >>         >   #crash vmlinux
> >>         /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s
> >>         >    crash: invalid structure size: note_buf
> >>         >           FILE: diskdump.c  LINE: 121  FUNCTION:
> >>         have_crash_notes()
> >>         >
> >>         >    [./crash] error trace: 101859ac => 10291798 => 10291450
> >>         => 10266038
> >>         >
> >>         >      10266038: SIZE_verify+156
> >>         >      10291450: have_crash_notes+308
> >>         >      10291798: map_cpus_to_prstatus_kdump_cmprs+448
> >>         >      101859ac: task_init+11980
> >>         >
> >>         > The reason is that the note_buf is not intialized before
> >>         using the
> >>         > SIZE(note_buf) in the have_crash_notes(). Let's initialize
> >>         the variable
> >>         > note_buf in the ppc64_init() to fix this issue.
> >>
> >>         Thanks for the patch.
> >>
> >>         Most of the architectures already have it, but some still do
> >>         not have,
> >>
> >>
> >>     I don't have a 4 cpus s390x machine on hand, and am still trying
> >>     to find it.
> >>
> >>
> >> I checked the crash code. For the s390x, it doesn't invoke the
> >> map_cpus_to_prstatus{_kdump_cmprs}(),  so this patch should not affect
> >> the s390x arch.
> >>
> >> Mikhail Zaslonko: Can you help to confirm this issue or do some tests?
> >> Thank  you in advance.
> >>
> >> Lianbo
> >>
> >>
> >>         e.g. is s390x ok?  It might be better to put it in task_init()
> >>         because
> >>
> >>     It was initialized in the machdep_init(POST_GDB), see the
> >>     following code:
> >>     ...
> >>                             read_in_kernel_config(IKCFG_INIT);
> >>                             kernel_init();
> >>                             machdep_init(POST_GDB); <---
> >>                             vm_init();
> >>                             machdep_init(POST_VM);
> >>                             module_init();
> >>                             help_init();
> >>                             task_init();
> >>                             vfs_init();
> >>                             net_init();
> >>                             dev_init();
> >>                             machdep_init(POST_INIT);
> >>     ...
> >>     I would suggest putting it at the end of the kernel_init(), so
> >>     that the original initialization order will basically not be changed
> >>
> >>
> >>         it's used there?
> >>
> >>     Do you mean we need to move the existing initialization(all) to
> >>     the task_init()?
> >>
> >>
> >>         $ git grep 'STRUCT_SIZE_INIT(note_buf'
> >>         arm.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         arm64.c:        STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         mips.c:         STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         mips64.c:               STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >>         ppc.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         riscv64.c:              STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >>         x86.c:          STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         x86_64.c:               STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >>         xen_hyper.c:    XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t,
> >>         "note_buf_t");
> >>
> >>         Thanks,
> >>         Kazu
> >>
> >>         >
> >>         > Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation
> >>         fault caused by failure of stopping CPUs")
> >>         > Signed-off-by: Lianbo Jiang <lijiang at redhat.com
> >>         <mailto:lijiang at redhat.com>>
> >>         > ---
> >>         >   ppc64.c | 1 +
> >>         >   1 file changed, 1 insertion(+)
> >>         >
> >>         > diff --git a/ppc64.c b/ppc64.c
> >>         > index b95a621d8fe4..ff7f0fca3a95 100644
> >>         > --- a/ppc64.c
> >>         > +++ b/ppc64.c
> >>         > @@ -695,6 +695,7 @@ ppc64_init(int when)
> >>         >               }
> >>         >
> >>         >               ppc64_init_paca_info();
> >>         > +             STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>         >
> >>         >               if (!machdep->hz) {
> >>         >                       machdep->hz = HZ;
> >>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20230612/97abbc67/attachment-0001.htm>


More information about the Crash-utility mailing list