<div dir="ltr"><div dir="ltr">On Mon, Jun 12, 2023 at 9:03 AM HAGIO KAZUHITO(萩尾 一仁) <<a href="mailto:k-hagio-ab@nec.com">k-hagio-ab@nec.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Lianbo,<br>
<br>
sorry, my company was closed last Friday.<br>
<br></blockquote><div><br></div><div>No worry, Kazu.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On 2023/06/08 20:46, lijiang wrote:<br>
> On Thu, Jun 8, 2023 at 5:56 PM lijiang <<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a>> wrote:<br>
> <br>
>> On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁) <<a href="mailto:k-hagio-ab@nec.com" target="_blank">k-hagio-ab@nec.com</a>><br>
>> wrote:<br>
>><br>
>>> On 2023/06/08 13:31, Lianbo Jiang wrote:<br>
>>>> Crash tool will fail to load vmcore with the following error:<br>
>>>><br>
>>>> #crash vmlinux /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s<br>
>>>> crash: invalid structure size: note_buf<br>
>>>> FILE: diskdump.c LINE: 121 FUNCTION: have_crash_notes()<br>
>>>><br>
>>>> [./crash] error trace: 101859ac => 10291798 => 10291450 => 10266038<br>
>>>><br>
>>>> 10266038: SIZE_verify+156<br>
>>>> 10291450: have_crash_notes+308<br>
>>>> 10291798: map_cpus_to_prstatus_kdump_cmprs+448<br>
>>>> 101859ac: task_init+11980<br>
>>>><br>
>>>> The reason is that the note_buf is not intialized before using the<br>
>>>> SIZE(note_buf) in the have_crash_notes(). Let's initialize the variable<br>
>>>> note_buf in the ppc64_init() to fix this issue.<br>
>>><br>
>>> Thanks for the patch.<br>
>>><br>
>>> Most of the architectures already have it, but some still do not have,<br>
>><br>
>><br>
>> I don't have a 4 cpus s390x machine on hand, and am still trying to find<br>
>> it.<br>
>><br>
> <br>
> I checked the crash code. For the s390x, it doesn't invoke the<br>
> map_cpus_to_prstatus{_kdump_cmprs}(), so this patch should not affect the<br>
> s390x arch.<br>
<br>
My thought is that the original commit db8c030857b4 should have added <br>
the initialization of SIZE(note_buf) in task_init(), because there might <br>
be some architectures that do not initialize it and it's better to <br>
initialize it when it's used, e.g. like this:<br>
<br>
--- a/task.c<br>
+++ b/task.c<br>
@@ -675,6 +675,8 @@ task_init(void)<br>
tt->this_task = pid_to_task(active_pid);<br>
}<br>
else {<br>
+ if (INVALID_SIZE(note_buf))<br>
+ STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
if (KDUMP_DUMPFILE())<br>
map_cpus_to_prstatus();<br>
else if (ELF_NOTES_VALID() && DISKDUMP_DUMPFILE())<br>
<br></blockquote><div> </div><div>Ok, fine to me. But, generally it should be good to be initialized in the machdep_init() because the note_buf is strongly related to the crash_notes, and the crash_notes contains machine specific information such as cpu registers at the time of crash.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
To be exact, maybe we should check also whether the size is valid:<br>
<br>
- crash_notes_exists = kernel_symbol_exists("crash_notes");<br>
+ crash_notes_exists = kernel_symbol_exists("crash_notes") &&<br>
+ VALID_SIZE(note_buf);<br>
<br>
but if crash_notes is defined, note_buf_t also should be defined. So I <br>
think the above VALID_SIZE(note_buf) is not necessary.<br>
<br></blockquote><div> </div><div>Yes, as I mentioned above, they are closely related.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
/* Per cpu memory for storing cpu states in case of system crash. */<br>
note_buf_t __percpu *crash_notes;<br>
<br>
What do you think?<br></blockquote><div><br></div><div>Agree, I will post v2 later.</div><div><br></div><div>Thanks</div><div>Lianbo</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Thanks,<br>
Kazu<br>
<br>
> <br>
> Mikhail Zaslonko: Can you help to confirm this issue or do some tests?<br>
> Thank you in advance.<br>
> <br>
> Lianbo<br>
> <br>
> <br>
>><br>
>>><br>
>>> e.g. is s390x ok? It might be better to put it in task_init() because<br>
>><br>
>><br>
>> It was initialized in the machdep_init(POST_GDB), see the following code:<br>
>> ...<br>
>> read_in_kernel_config(IKCFG_INIT);<br>
>> kernel_init();<br>
>> machdep_init(POST_GDB); <---<br>
>> vm_init();<br>
>> machdep_init(POST_VM);<br>
>> module_init();<br>
>> help_init();<br>
>> task_init();<br>
>> vfs_init();<br>
>> net_init();<br>
>> dev_init();<br>
>> machdep_init(POST_INIT);<br>
>> ...<br>
>> I would suggest putting it at the end of the kernel_init(), so that the<br>
>> original initialization order will basically not be changed<br>
>><br>
>><br>
>>> it's used there?<br>
>>><br>
>><br>
>> Do you mean we need to move the existing initialization(all) to the<br>
>> task_init()?<br>
>><br>
>>><br>
>>> $ git grep 'STRUCT_SIZE_INIT(note_buf'<br>
>>> arm.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> arm64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> mips.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> mips64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> ppc.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> riscv64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> x86.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> x86_64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>> xen_hyper.c: XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t, "note_buf_t");<br>
>>><br>
>>> Thanks,<br>
>>> Kazu<br>
>>><br>
>>>><br>
>>>> Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation fault caused<br>
>>> by failure of stopping CPUs")<br>
>>>> Signed-off-by: Lianbo Jiang <<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a>><br>
>>>> ---<br>
>>>> ppc64.c | 1 +<br>
>>>> 1 file changed, 1 insertion(+)<br>
>>>><br>
>>>> diff --git a/ppc64.c b/ppc64.c<br>
>>>> index b95a621d8fe4..ff7f0fca3a95 100644<br>
>>>> --- a/ppc64.c<br>
>>>> +++ b/ppc64.c<br>
>>>> @@ -695,6 +695,7 @@ ppc64_init(int when)<br>
>>>> }<br>
>>>><br>
>>>> ppc64_init_paca_info();<br>
>>>> + STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>>>><br>
>>>> if (!machdep->hz) {<br>
>>>> machdep->hz = HZ;<br>
>><br>
>><br>
>><br>
>> On Thu, Jun 8, 2023 at 5:56 PM lijiang <<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a> <br>
>> <mailto:<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a>>> wrote:<br>
>><br>
>> On Thu, Jun 8, 2023 at 5:02 PM HAGIO KAZUHITO(萩尾 一仁)<br>
>> <<a href="mailto:k-hagio-ab@nec.com" target="_blank">k-hagio-ab@nec.com</a> <mailto:<a href="mailto:k-hagio-ab@nec.com" target="_blank">k-hagio-ab@nec.com</a>>> wrote:<br>
>><br>
>> On 2023/06/08 13:31, Lianbo Jiang wrote:<br>
>> > Crash tool will fail to load vmcore with the following error:<br>
>> ><br>
>> > #crash vmlinux<br>
>> /var/crash/127.0.0.1-2023-06-07-22\:03\:24/vmcore -s<br>
>> > crash: invalid structure size: note_buf<br>
>> > FILE: diskdump.c LINE: 121 FUNCTION:<br>
>> have_crash_notes()<br>
>> ><br>
>> > [./crash] error trace: 101859ac => 10291798 => 10291450<br>
>> => 10266038<br>
>> ><br>
>> > 10266038: SIZE_verify+156<br>
>> > 10291450: have_crash_notes+308<br>
>> > 10291798: map_cpus_to_prstatus_kdump_cmprs+448<br>
>> > 101859ac: task_init+11980<br>
>> ><br>
>> > The reason is that the note_buf is not intialized before<br>
>> using the<br>
>> > SIZE(note_buf) in the have_crash_notes(). Let's initialize<br>
>> the variable<br>
>> > note_buf in the ppc64_init() to fix this issue.<br>
>><br>
>> Thanks for the patch.<br>
>><br>
>> Most of the architectures already have it, but some still do<br>
>> not have,<br>
>><br>
>><br>
>> I don't have a 4 cpus s390x machine on hand, and am still trying<br>
>> to find it.<br>
>><br>
>><br>
>> I checked the crash code. For the s390x, it doesn't invoke the <br>
>> map_cpus_to_prstatus{_kdump_cmprs}(), so this patch should not affect <br>
>> the s390x arch.<br>
>><br>
>> Mikhail Zaslonko: Can you help to confirm this issue or do some tests? <br>
>> Thank you in advance.<br>
>><br>
>> Lianbo<br>
>><br>
>><br>
>> e.g. is s390x ok? It might be better to put it in task_init()<br>
>> because<br>
>><br>
>> It was initialized in the machdep_init(POST_GDB), see the<br>
>> following code:<br>
>> ...<br>
>> read_in_kernel_config(IKCFG_INIT);<br>
>> kernel_init();<br>
>> machdep_init(POST_GDB); <---<br>
>> vm_init();<br>
>> machdep_init(POST_VM);<br>
>> module_init();<br>
>> help_init();<br>
>> task_init();<br>
>> vfs_init();<br>
>> net_init();<br>
>> dev_init();<br>
>> machdep_init(POST_INIT);<br>
>> ...<br>
>> I would suggest putting it at the end of the kernel_init(), so<br>
>> that the original initialization order will basically not be changed<br>
>><br>
>><br>
>> it's used there?<br>
>><br>
>> Do you mean we need to move the existing initialization(all) to<br>
>> the task_init()?<br>
>><br>
>><br>
>> $ git grep 'STRUCT_SIZE_INIT(note_buf'<br>
>> arm.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> arm64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> mips.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> mips64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> ppc.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> riscv64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> x86.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> x86_64.c: STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> xen_hyper.c: XEN_HYPER_STRUCT_SIZE_INIT(note_buf_t,<br>
>> "note_buf_t");<br>
>><br>
>> Thanks,<br>
>> Kazu<br>
>><br>
>> ><br>
>> > Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation<br>
>> fault caused by failure of stopping CPUs")<br>
>> > Signed-off-by: Lianbo Jiang <<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a><br>
>> <mailto:<a href="mailto:lijiang@redhat.com" target="_blank">lijiang@redhat.com</a>>><br>
>> > ---<br>
>> > ppc64.c | 1 +<br>
>> > 1 file changed, 1 insertion(+)<br>
>> ><br>
>> > diff --git a/ppc64.c b/ppc64.c<br>
>> > index b95a621d8fe4..ff7f0fca3a95 100644<br>
>> > --- a/ppc64.c<br>
>> > +++ b/ppc64.c<br>
>> > @@ -695,6 +695,7 @@ ppc64_init(int when)<br>
>> > }<br>
>> ><br>
>> > ppc64_init_paca_info();<br>
>> > + STRUCT_SIZE_INIT(note_buf, "note_buf_t");<br>
>> ><br>
>> > if (!machdep->hz) {<br>
>> > machdep->hz = HZ;<br>
>></blockquote></div></div>