[Crash-utility] [External Mail]RE: zram decompress support for gcore/crash-utility

赵乾利 zhaoqianli at xiaomi.com
Mon Apr 6 11:10:06 UTC 2020


Hi Hatayama,

Please refer to the following for the exact kernel changes:

commit 34be98f4944f99076f049a6806fc5f5207a755d3
Author: Ard Biesheuvel <ard.biesheuvel at linaro.org>
Date:   Thu Jul 20 17:15:45 2017 +0100

    arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP

    For historical reasons, we leave the top 16 bytes of our task and IRQ
    stacks unused, a practice used to ensure that the SP can always be
    masked to find the base of the current stack (historically, where
    thread_info could be found).

    However, this is not necessary, as:

    * When an exception is taken from a task stack, we decrement the SP by
      S_FRAME_SIZE and stash the exception registers before we compare the
      SP against the task stack. In such cases, the SP must be at least
      S_FRAME_SIZE below the limit, and can be safely masked to determine
      whether the task stack is in use.

    * When transitioning to an IRQ stack, we'll place a dummy frame onto the
      IRQ stack before enabling asynchronous exceptions, or executing code
      we expect to trigger faults. Thus, if an exception is taken from the
      IRQ stack, the SP must be at least 16 bytes below the limit.

    * We no longer mask the SP to find the thread_info, which is now found
      via sp_el0. Note that historically, the offset was critical to ensure
      that cpu_switch_to() found the correct stack for new threads that
      hadn't yet executed ret_from_fork().

    Given that, this initial offset serves no purpose, and can be removed.
    This brings us in-line with other architectures (e.g. x86) which do not
    rely on this masking.

    Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
    [Mark: rebase, kill THREAD_START_SP, commit msg additions]
    Signed-off-by: Mark Rutland <mark.rutland at arm.com>
    Reviewed-by: Will Deacon <will.deacon at arm.com>
    Tested-by: Laura Abbott <labbott at redhat.com>
    Cc: Catalin Marinas <catalin.marinas at arm.com>
    Cc: James Morse <james.morse at arm.com>
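
For reference, the effect of this kernel change on where the saved user registers sit can be sketched as below. This is only an illustration under assumptions, not the gcore implementation: KVER(), stack_reserve(), pt_regs_addr() and shifted_slots() are hypothetical names, and the 8/16-byte reserve values and the 4.14 cut-off are the ones used in the patch discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical version encoding, in the spirit of a LINUX(a,b,c)-style macro. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Bytes left unused at the top of a kernel stack.
 * Assumption taken from the patch in this thread: 8 on ARM and 16 on ARM64
 * before 4.14, nothing from 4.14 on. */
unsigned stack_reserve(unsigned kver, int is_arm64)
{
    if (kver >= KVER(4, 14, 0))
        return 0;
    return is_arm64 ? 16 : 8;
}

/* The saved user registers (pt_regs) sit directly below the reserved area. */
uint64_t pt_regs_addr(uint64_t stack_top, unsigned reserve, uint64_t pt_regs_size)
{
    assert(stack_top >= reserve + pt_regs_size); /* guard against underflow */
    return stack_top - reserve - pt_regs_size;
}

/* If a stale reserve is still subtracted on a kernel that no longer has one,
 * the read starts too low; on ARM64 each register slot is 8 bytes, so the
 * register values appear shifted by this many slots. */
unsigned shifted_slots(unsigned stale_reserve)
{
    return stale_reserve / (unsigned)sizeof(uint64_t);
}
```

With a stale 16-byte reserve on ARM64, shifted_slots(16) gives 2. That is consistent with the register dumps quoted below, where the values shown as x2..x30 without the patch reappear as x0..x28 with it.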

________________________________________
From: d.hatayama at fujitsu.com <d.hatayama at fujitsu.com>
Sent: Monday, April 6, 2020 12:32
To: 赵乾利
Cc: crash-utility at redhat.com <crash-utility at redhat.com>
Subject: RE: [External Mail]RE: zram decompress support for gcore/crash-utility

Zhao,

> -----Original Message-----
> From: 赵乾利 <zhaoqianli at xiaomi.com>
> Sent: Wednesday, April 1, 2020 10:22 PM
> To: Hatayama, Daisuke/畑山 大輔 <d.hatayama at fujitsu.com>
> Cc: crash-utility at redhat.com <crash-utility at redhat.com>
> Subject: Re: [External Mail]RE: zram decompress support for gcore/crash-utility
>
> Hi Hatayama,
>
> I have just ported zram support into crash-utility. With this approach, gcore needs to call the zram decompress
> function (try_zram_decompress).
> Integrating zram decompression into readmem is a good suggestion; I'm working on it.
>
> As for 0002-gcore-ARM-ARM64-reserved-8-16-byte-in-the-top-of-sta.patch, it is a completely independent
> patch. Without it, the registers in the coredump are wrong (shifted out of place), so gdb cannot parse the complete
> call stack.

Thanks for your explanation. I will write this in the commit description.

Could you also tell me the exact commit in the Linux kernel that made the corresponding change?

> You can see the difference below:
> [Without patch]
> (gdb) bt
> #0  android::Mutex::lock (this=<optimized out>) at system/core/libutils/include/utils/Mutex.h:183
> #1  android::Looper::pollInner (this=0x704ad1c590 <epoll_wait(int, epoll_event*, int, int)>, timeoutMillis=1291145664)
> at system/core/libutils/Looper.cpp:243
> #2  0xbc5e696a00000018 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>
> (gdb) info reg
> x0             0xffffff801998bff0 -549326372880
> x1             0xffffffa6d3e83848 -382991845304
> x2             0x34 52
> x3             0x7fdb6d8f90 549142237072
> x4             0x10 16
> x5             0x74a5 29861
> x6             0x0 0
> x7             0x8 8
> x8             0x704e33a000 482348343296
> x9             0xbf815ee 200807918
> x10            0x16 22
> x11            0xebabf645f5e97f31 -1464806473440067791
> x12            0x1 1
> x13            0xc0 192
> x14            0x20daea4ae8 141111741160
> x15            0x115 277
> x16            0x1080400222a3e010 1189020679541088272
> x17            0x40 64
> x18            0x704a9fcd70 482288323952
> x19            0x704ad1c590 482291598736
> x20            0x704e01c000 482345074688
> x21            0x704cf551c0 482327482816
> x22            0x704cf55268 482327482984
> x23            0x74a5 29861
> x24            0x74a5 29861
> x25            0x704cf551c0 482327482816
> x26            0x7fffffff 2147483647
> x27            0x704d0aa020 482328879136
> x28            0x704cfc8840 482327955520
> x29            0x704407b8 1883506616
> x30            0x70435dc8 1883463112
> sp             0x7fdb6d90f0 0x7fdb6d90f0
> pc             0x704a9f80c0 0x704a9f80c0 <android::Looper::pollInner(int)+148>
> cpsr           0xdb6d8f50 -613576880
> fpsr           0x17 23
> fpcr           0x0 0
>
> [With patch]
> (gdb) bt
> #0  __epoll_pwait () at bionic/libc/arch-arm64/syscalls/__epoll_pwait.S:9
> #1  0x000000704a9f80c0 in android::Looper::pollInner (this=0x704cf551c0, timeoutMillis=29861) at
> system/core/libutils/Looper.cpp:237
> #2  0x000000704a9f7f90 in android::Looper::pollOnce (this=0x704cf551c0, timeoutMillis=29861, outFd=0x0,
> outEvents=0x0, outData=0x0) at system/core/libutils/Looper.cpp:205
> #3  0x000000704c4530f4 in android::Looper::pollOnce (this=0x34, timeoutMillis=-613576816) at
> system/core/libutils/include/utils/Looper.h:267
> #4  android::NativeMessageQueue::pollOnce (this=<optimized out>, env=0x704cf5db80, pollObj=<optimized out>,
> timeoutMillis=-613576816)
>     at frameworks/base/core/jni/android_os_MessageQueue.cpp:110
> #5  android::android_os_MessageQueue_nativePollOnce (env=0x704cf5db80, obj=<optimized out>, ptr=<optimized
> out>, timeoutMillis=-613576816)
>     at frameworks/base/core/jni/android_os_MessageQueue.cpp:191
> #6  0x0000000073749590 in ?? ()
>
> (gdb) info registers
> x0             0x34 52
> x1             0x7fdb6d8f90 549142237072
> x2             0x10 16
> x3             0x74a5 29861
> x4             0x0 0
> x5             0x8 8
> x6             0x704e33a000 482348343296
> x7             0xbf815ee 200807918
> x8             0x16 22
> x9             0xebabf645f5e97f31 -1464806473440067791
> x10            0x1 1
> x11            0xc0 192
> x12            0x20daea4ae8 141111741160
> x13            0x115 277
> x14            0x1080400222a3e010 1189020679541088272
> x15            0x40 64
> x16            0x704a9fcd70 482288323952
> x17            0x704ad1c590 482291598736
> x18            0x704e01c000 482345074688
> x19            0x704cf551c0 482327482816
> x20            0x704cf55268 482327482984
> x21            0x74a5 29861
> x22            0x74a5 29861
> x23            0x704cf551c0 482327482816
> x24            0x7fffffff 2147483647
> x25            0x704d0aa020 482328879136
> x26            0x704cfc8840 482327955520
> x27            0x704407b8 1883506616
> x28            0x70435dc8 1883463112
> x29            0x7fdb6d90f0 549142237424
> x30            0x704a9f80c0 482288304320
> sp             0x7fdb6d8f50 0x7fdb6d8f50
> pc             0x704ad5c098 0x704ad5c098 <__epoll_pwait+8>
> cpsr           0x60001000 1610616832
> fpsr           0x17 23
> fpcr           0x0 0
>
> -----Original Message-----
> From: d.hatayama at fujitsu.com <d.hatayama at fujitsu.com>
> Sent: April 1, 2020 17:47
> To: 赵乾利 <zhaoqianli at xiaomi.com>
> Cc: crash-utility at redhat.com <crash-utility at redhat.com>
> Subject: [External Mail]RE: zram decompress support for gcore/crash-utility
>
> Zhao,
>
> It looks to me that 0002-gcore-ARM-ARM64-reserved-8-16-byte-in-the-top-of-sta.patch is independent of the ZRAM
> support. Without the patch, how does gcore behave? Does it fail or succeed? If gcore fails, could you show me a log
> indicating how the gcore command fails?
>
> > -----Original Message-----
> > From: crash-utility-bounces at redhat.com
> > <crash-utility-bounces at redhat.com> On Behalf Of 赵乾利
> > Sent: Tuesday, March 31, 2020 9:49 AM
> > To: crash-utility at redhat.com <crash-utility at redhat.com>
> > Subject: [Crash-utility] zram decompress support for gcore/crash-utility
> >
> > Hello list,
> >
> > When I tried to use gcore to extract a coredump from a full dump, I ran into the issues below, which make the
> > coredump unusable.
> >
> > 1.     Zram is a very common feature: current Android systems support zram swap, but gcore/crash-utility does
> > not. Even though zram swap can be decoded from RAM, many page faults occur for this reason.
> >
> > I added a zram decompress feature to gcore, and I'm also considering whether to support zram in
> > crash-utility. For that feature I would have to add miniLZO to the codebase. I'm not sure whether that is
> > acceptable; please give me some advice.
> >
> > miniLZO: miniLZO is a very lightweight subset of the LZO library, distributed under the terms of the
> > GNU General Public License <http://www.oberhumer.com/opensource/gpl.html> (GPL v2+).
> >
> > http://www.oberhumer.com/opensource/lzo/
> >
> > This change is a bit big, so I attached it to this mail. If the attachment is not available, you can also see
> > the patches on GitHub:
> > https://github.com/zhaoqianli0202/crash-gcore/commits/upstream
> >
> > Please review.
> >
> > 2.     For historical reasons, the kernel reserved the top 8/16 bytes of its stacks, but as of kernel 4.14 this
> > reservation has been removed, so gcore needs to handle both cases.
> >
> > The kernel change is as follows:
> >
> > commit 34be98f4944f99076f049a6806fc5f5207a755d3
> > Author: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> > Date:   Thu Jul 20 17:15:45 2017 +0100
> >
> >     arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
> >
> >     For historical reasons, we leave the top 16 bytes of our task and IRQ
> >     stacks unused, a practice used to ensure that the SP can always be
> >     masked to find the base of the current stack (historically, where
> >     thread_info could be found).
> >
> > ====
> >
> > Patch for this issue:
> >
> > commit d1031df4617351a58b8edfb0121c306baaa34f9d
> > Author: zhaoqianli <zhaoqianli at xiaomi.com>
> > Date:   Mon Mar 30 12:07:02 2020 +0800
> >
> >     gcore: ARM/ARM64 reserved 8/16 byte in the top of stacks before 4.14,
> >     but this reservation was removed after 4.14
> >
> > Without this patch, gcore couldn't parse the full call stack on versions
> > after 4.14.
> >
> >
> >
> > diff --git a/gcore.c b/gcore.c
> > index f75701d..f6e1787 100644
> > --- a/gcore.c
> > +++ b/gcore.c
> > @@ -558,4 +558,16 @@ static void gcore_machdep_init(void)
> >
> >         if (!gcore_arch_vsyscall_has_vm_alwaysdump_flag())
> >                 gcore_machdep->vm_alwaysdump = 0x00000000;
> > +
> > +#if defined(ARM) || defined(ARM64)
> > +#ifdef ARM
> > +#define STACK_RESERVE_SIZE 8
> > +#else
> > +#define STACK_RESERVE_SIZE 16
> > +#endif
> > +       if (THIS_KERNEL_VERSION >= LINUX(4,14,0))
> > +               gcore_machdep->stack_reserve = 0;
> > +       else
> > +               gcore_machdep->stack_reserve = STACK_RESERVE_SIZE;
> > +#endif
> > }
> > diff --git a/libgcore/gcore_arm.c b/libgcore/gcore_arm.c
> > index 891d01e..c8aefdf 100644
> > --- a/libgcore/gcore_arm.c
> > +++ b/libgcore/gcore_arm.c
> > @@ -29,7 +29,7 @@ static int gpr_get(struct task_context *target,
> >
> >         BZERO(regs, sizeof(*regs));
> >
> > -       readmem(machdep->get_stacktop(target->task) - 8 - SIZE(pt_regs), KVADDR,
> > +       readmem(machdep->get_stacktop(target->task) - gcore_machdep->stack_reserve - SIZE(pt_regs), KVADDR,
> >                 regs, SIZE(pt_regs), "genregs_get: pt_regs",
> >                 gcore_verbose_error_handle());
> >
> > diff --git a/libgcore/gcore_arm64.c b/libgcore/gcore_arm64.c
> > index 3257389..ed3fdc8 100644
> > --- a/libgcore/gcore_arm64.c
> > +++ b/libgcore/gcore_arm64.c
> > @@ -28,7 +28,7 @@ static int gpr_get(struct task_context *target,
> >
> >         BZERO(regs, sizeof(*regs));
> >
> > -       readmem(machdep->get_stacktop(target->task) - 16 - SIZE(pt_regs), KVADDR,
> > +       readmem(machdep->get_stacktop(target->task) - gcore_machdep->stack_reserve - SIZE(pt_regs), KVADDR,
> >                 regs, sizeof(struct user_pt_regs), "gpr_get: user_pt_regs",
> >                 gcore_verbose_error_handle());
> >
> > @@ -124,7 +124,7 @@ static int compat_gpr_get(struct task_context *target,
> >         BZERO(&pt_regs, sizeof(pt_regs));
> >         BZERO(regs, sizeof(*regs));
> >
> > -       readmem(machdep->get_stacktop(target->task) - 16 - SIZE(pt_regs), KVADDR,
> > +       readmem(machdep->get_stacktop(target->task) - gcore_machdep->stack_reserve - SIZE(pt_regs), KVADDR,
> >                 &pt_regs, sizeof(struct pt_regs), "compat_gpr_get: pt_regs",
> >                 gcore_verbose_error_handle());
> >
> > diff --git a/libgcore/gcore_defs.h b/libgcore/gcore_defs.h
> > index b0f5603..f31036c 100644
> > --- a/libgcore/gcore_defs.h
> > +++ b/libgcore/gcore_defs.h
> > @@ -1177,6 +1177,7 @@ extern struct gcore_size_table gcore_size_table;
> > struct gcore_machdep_table
> > {
> >         ulong vm_alwaysdump;
> > +       uint8_t stack_reserve;
> > };
> >
> >
> > #/****** This e-mail and its attachments contain confidential
> > information from XIAOMI, which is intended only for the person or
> > entity whose address is listed above. Any use of the information
> > contained herein in any way (including, but not limited to, total or
> > partial disclosure, reproduction, or dissemination) by persons other
> > than the intended recipient(s) is prohibited. If you receive this
> > e-mail in error, please notify the sender by phone or email
> > immediately and delete it!******/#