[Crash-utility] [PATCH] ppc64: fix crash load failure
lijiang
lijiang at redhat.com
Mon Jun 12 06:48:08 UTC 2023
On Mon, Jun 12, 2023 at 9:03 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab at nec.com>
wrote:
> Hi Lianbo,
>
> sorry, my company was closed last Friday.
>
>
No worry, Kazu.
On 2023/06/08 20:46, lijiang wrote:
> > On Thu, Jun 8, 2023 at 5:56 PM lijiang <lijiang at redhat.com> wrote:
> >
> >> On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁) <
> k-hagio-ab at nec.com>
> >> wrote:
> >>
> >>> On 2023/06/08 13:31, Lianbo Jiang wrote:
> >>>> Crash tool will fail to load vmcore with the following error:
> >>>>
> >>>> #crash vmlinux /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s
> >>>> crash: invalid structure size: note_buf
> >>>> FILE: diskdump.c LINE: 121 FUNCTION: have_crash_notes()
> >>>>
> >>>> [./crash] error trace: 101859ac => 10291798 => 10291450 =>
> 10266038
> >>>>
> >>>> 10266038: SIZE_verify+156
> >>>> 10291450: have_crash_notes+308
> >>>> 10291798: map_cpus_to_prstatus_kdump_cmprs+448
> >>>> 101859ac: task_init+11980
> >>>>
> >>>> The reason is that the note_buf is not intialized before using the
> >>>> SIZE(note_buf) in the have_crash_notes(). Let's initialize the
> variable
> >>>> note_buf in the ppc64_init() to fix this issue.
> >>>
> >>> Thanks for the patch.
> >>>
> >>> Most of the architectures already have it, but some still do not have,
> >>
> >>
> >> I don't have a 4 cpus s390x machine on hand, and am still trying to find
> >> it.
> >>
> >
> > I checked the crash code. For the s390x, it doesn't invoke the
> > map_cpus_to_prstatus{_kdump_cmprs}(), so this patch should not affect
> the
> > s390x arch.
>
> My thought is that the original commit db8c030857b4 should have added
> the initialization of SIZE(note_buf) in task_init(), because there might
> be some architectures that do not initialize it and it's better to
> initialize it when it's used, e.g. like this:
>
> --- a/task.c
> +++ b/task.c
> @@ -675,6 +675,8 @@ task_init(void)
> tt->this_task = pid_to_task(active_pid);
> }
> else {
> + if (INVALID_SIZE(note_buf))
> + STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> if (KDUMP_DUMPFILE())
> map_cpus_to_prstatus();
> else if (ELF_NOTES_VALID() && DISKDUMP_DUMPFILE())
>
>
Ok, fine to me. But, generally it should be good to be initialized in the
machdep_init() because the note_buf is strongly related to the
crash_notes, and the crash_notes contains machine specific information such
as cpu registers at the time of crash.
> To be exact, maybe we should check also whether the size is valid:
>
> - crash_notes_exists = kernel_symbol_exists("crash_notes");
> + crash_notes_exists = kernel_symbol_exists("crash_notes") &&
> + VALID_SIZE(note_buf);
>
> but if crash_notes is defined, note_buf_t also should be defined. So I
> think the above VALID_SIZE(note_buf) is not necessary.
>
>
Yes, as I mentioned above, they are closely related.
> /* Per cpu memory for storing cpu states in case of system crash. */
> note_buf_t __percpu *crash_notes;
>
> What do you think?
>
Agree, I will post v2 later.
Thanks
Lianbo
>
> Thanks,
> Kazu
>
> >
> > Mikhail Zaslonko: Can you help to confirm this issue or do some tests?
> > Thank you in advance.
> >
> > Lianbo
> >
> >
> >>
> >>>
> >>> e.g. is s390x ok? It might be better to put it in task_init() because
> >>
> >>
> >> It was initialized in the machdep_init(POST_GDB), see the following
> code:
> >> ...
> >> read_in_kernel_config(IKCFG_INIT);
> >> kernel_init();
> >> machdep_init(POST_GDB); <---
> >> vm_init();
> >> machdep_init(POST_VM);
> >> module_init();
> >> help_init();
> >> task_init();
> >> vfs_init();
> >> net_init();
> >> dev_init();
> >> machdep_init(POST_INIT);
> >> ...
> >> I would suggest putting it at the end of the kernel_init(), so that the
> >> original initialization order will basically not be changed
> >>
> >>
> >>> it's used there?
> >>>
> >>
> >> Do you mean we need to move the existing initialization(all) to the
> >> task_init()?
> >>
> >>>
> >>> $ git grep 'STRUCT_SIZE_INIT(note_buf'
> >>> arm.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> arm64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> mips.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> mips64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> ppc.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> riscv64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> x86.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> x86_64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>> xen_hyper.c: XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t, "note_buf_t");
> >>>
> >>> Thanks,
> >>> Kazu
> >>>
> >>>>
> >>>> Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation fault caused
> >>> by failure of stopping CPUs")
> >>>> Signed-off-by: Lianbo Jiang <lijiang at redhat.com>
> >>>> ---
> >>>> ppc64.c | 1 +
> >>>> 1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/ppc64.c b/ppc64.c
> >>>> index b95a621d8fe4..ff7f0fca3a95 100644
> >>>> --- a/ppc64.c
> >>>> +++ b/ppc64.c
> >>>> @@ -695,6 +695,7 @@ ppc64_init(int when)
> >>>> }
> >>>>
> >>>> ppc64_init_paca_info();
> >>>> + STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >>>>
> >>>> if (!machdep->hz) {
> >>>> machdep->hz = HZ;
> >>
> >>
> >>
> >> On Thu, Jun 8, 2023 at 5:56 PM lijiang <lijiang at redhat.com
> >> <mailto:lijiang at redhat.com>> wrote:
> >>
> >> On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁)
> >> <k-hagio-ab at nec.com <mailto:k-hagio-ab at nec.com>> wrote:
> >>
> >> On 2023/06/08 13:31, Lianbo Jiang wrote:
> >> > Crash tool will fail to load vmcore with the following error:
> >> >
> >> > #crash vmlinux
> >> /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s
> >> > crash: invalid structure size: note_buf
> >> > FILE: diskdump.c LINE: 121 FUNCTION:
> >> have_crash_notes()
> >> >
> >> > [./crash] error trace: 101859ac => 10291798 => 10291450
> >> => 10266038
> >> >
> >> > 10266038: SIZE_verify+156
> >> > 10291450: have_crash_notes+308
> >> > 10291798: map_cpus_to_prstatus_kdump_cmprs+448
> >> > 101859ac: task_init+11980
> >> >
> >> > The reason is that the note_buf is not intialized before
> >> using the
> >> > SIZE(note_buf) in the have_crash_notes(). Let's initialize
> >> the variable
> >> > note_buf in the ppc64_init() to fix this issue.
> >>
> >> Thanks for the patch.
> >>
> >> Most of the architectures already have it, but some still do
> >> not have,
> >>
> >>
> >> I don't have a 4 cpus s390x machine on hand, and am still trying
> >> to find it.
> >>
> >>
> >> I checked the crash code. For the s390x, it doesn't invoke the
> >> map_cpus_to_prstatus{_kdump_cmprs}(), so this patch should not affect
> >> the s390x arch.
> >>
> >> Mikhail Zaslonko: Can you help to confirm this issue or do some tests?
> >> Thank you in advance.
> >>
> >> Lianbo
> >>
> >>
> >> e.g. is s390x ok? It might be better to put it in task_init()
> >> because
> >>
> >> It was initialized in the machdep_init(POST_GDB), see the
> >> following code:
> >> ...
> >> read_in_kernel_config(IKCFG_INIT);
> >> kernel_init();
> >> machdep_init(POST_GDB); <---
> >> vm_init();
> >> machdep_init(POST_VM);
> >> module_init();
> >> help_init();
> >> task_init();
> >> vfs_init();
> >> net_init();
> >> dev_init();
> >> machdep_init(POST_INIT);
> >> ...
> >> I would suggest putting it at the end of the kernel_init(), so
> >> that the original initialization order will basically not be changed
> >>
> >>
> >> it's used there?
> >>
> >> Do you mean we need to move the existing initialization(all) to
> >> the task_init()?
> >>
> >>
> >> $ git grep 'STRUCT_SIZE_INIT(note_buf'
> >> arm.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> arm64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> mips.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> mips64.c: STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >> ppc.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> riscv64.c: STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >> x86.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> x86_64.c: STRUCT_SIZE_INIT(note_buf,
> "note_buf_t");
> >> xen_hyper.c: XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t,
> >> "note_buf_t");
> >>
> >> Thanks,
> >> Kazu
> >>
> >> >
> >> > Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation
> >> fault caused by failure of stopping CPUs")
> >> > Signed-off-by: Lianbo Jiang <lijiang at redhat.com
> >> <mailto:lijiang at redhat.com>>
> >> > ---
> >> > ppc64.c | 1 +
> >> > 1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/ppc64.c b/ppc64.c
> >> > index b95a621d8fe4..ff7f0fca3a95 100644
> >> > --- a/ppc64.c
> >> > +++ b/ppc64.c
> >> > @@ -695,6 +695,7 @@ ppc64_init(int when)
> >> > }
> >> >
> >> > ppc64_init_paca_info();
> >> > + STRUCT_SIZE_INIT(note_buf, "note_buf_t");
> >> >
> >> > if (!machdep->hz) {
> >> > machdep->hz = HZ;
> >>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20230612/97abbc67/attachment-0001.htm>
More information about the Crash-utility
mailing list