[Crash-utility] Questions on multi-thread for crash

lijiang lijiang at redhat.com
Thu Feb 9 09:28:26 UTC 2023


On Wed, Feb 8, 2023 at 4:26 PM <crash-utility-request at redhat.com> wrote:

> Message: 1
> Date: Wed, 8 Feb 2023 12:34:32 +0800
> From: Tao Liu <ltao at redhat.com>
> To: crash-utility at redhat.com
> Subject: [Crash-utility] Questions on multi-thread for crash
> Message-ID: <Y+MmWBi+kq+ZSDqn at localhost.localdomain>
> Content-Type: text/plain; charset=iso-8859-1
>
> Hello,
>
> Recently I made an attempt to introduce a thread pool for crash utility, to
> optimize the performance of crash.
>
>
Good question, Tao.


> One obvious point that can benefit from multi-threading is
> memory.c:vm_init(). There are hundreds of MEMBER_OFFSET_INIT() calls that
> resolve symbols, and most of the symbols are independent of each other, so
> with careful arrangement they can be invoked in parallel. Doing so would
> shorten the wait when crash starts.
>
> The implementation is abstracted as the following:
>
> Before multi-threading:
>         MEMBER_OFFSET_INIT(task_struct_mm, "task_struct", "mm");
>         MEMBER_OFFSET_INIT(mm_struct_mmap, "mm_struct", "mmap");
>
> After multi-threading:
>         create_threadpool(&pool, 3);
>         ...
>         MEMBER_OFFSET_INIT_PARA(pool, task_struct_mm, "task_struct", "mm");
>         MEMBER_OFFSET_INIT_PARA(pool, mm_struct_mmap, "mm_struct", "mmap");
>         ...
>         wait_and_destroy_threadpool(pool);
>
> MEMBER_OFFSET_INIT_PARA just appends the task to the thread pool's work
> queue and continues; it is up to the pool to schedule a worker thread to do
> the symbol resolving work.
>
> However, after enabling multi-threading, I noticed that there are always
> random errors from gdb, from segfaults to broken stacks. It seems gdb is
> not thread safe at all...
>
> For example, one error is listed as follows:
>
>         Thread 10 "crash" received signal SIGSEGV, Segmentation fault.
>         [Switching to Thread 0x7fffc4f00640 (LWP 72950)]
>         c_yylex () at /sources/up-crash/gdb-10.2/gdb/c-exp.y:3250
>         3250   if (pstate->language ()->la_language != language_cplus
>         (gdb) bt
>         #0  c_yylex () at /sources/up-crash/gdb-10.2/gdb/c-exp.y:3250
>         #1  c_yyparse () at /sources/up-crash/gdb-10.2/gdb/c-exp.c.tmp:2092
>         #2  0x00000000006f62d7 in c_parse (par_state=<optimized out>)
>             at /sources/up-crash/gdb-10.2/gdb/c-exp.y:3414
>         #3  0x0000000000894eac in parse_exp_in_context (stringptr=0x7fffc4efeff8,
>             pc=<optimized out>, block=<optimized out>, comma=0, out_subexp=0x0,
>             tracker=0x7fffc4efef10, cstate=0x0, void_context_p=0) at parse.c:1122
>         #4  0x00000000008951d6 in parse_exp_1 (tracker=0x0, comma=0, block=0x0,
>             pc=0, stringptr=0x7fffc4efeff8) at parse.c:1031
>         #5  parse_expression (string=<optimized out>, string@entry=0x7fffc4eff140
>             "slab_s", tracker=tracker@entry=0x0) at parse.c:1166
>         #6  0x000000000092039a in gdb_get_datatype (req=0x7fffc4eff720)
>             at symtab.c:7239
>         #7  gdb_command_funnel_1 (req=0x7fffc4eff720) at symtab.c:7018
>         #8  0x00000000009206de in gdb_command_funnel (req=0x7fffc4eff720)
>             at symtab.c:6956
>         #9  0x00000000005ad137 in gdb_interface (req=0x7fffc4eff720)
>             at gdb_interface.c:409
>         #10 0x00000000005fe76c in datatype_info (name=0xab9700 "slab_s",
>             member=0xaba8d8 "list", dm=0x0) at symbols.c:5708
>         #11 0x0000000000517a85 in member_offset_init_slab_s_list_slab_s_list ()
>             at memory.c:659
>         #12 0x000000000068168f in group_routine (args=<optimized out>)
>             at thpool.c:81
>         #13 0x00007ffff7a48b17 in start_thread () from /lib64/libc.so.6
>         #14 0x00007ffff7acd6c0 in clone3 () from /lib64/libc.so.6
>         (gdb) p pstate
>         $1 = (parser_state *) 0x0
>
>         $ cat -n /sources/up-crash/gdb-10.2/gdb/c-exp.y
>         66  /* The state of the parser, used internally when we are parsing the
>         67     expression.  */
>         68
>         69  static struct parser_state *pstate = NULL;
>
> pstate is a global variable and is not thread safe, so its value must have
> been changed by another thread...
>
> Now the project has reached a dead end, because making gdb thread safe is
> an impossible mission for me. Is there any advice or suggestions? Thanks in
> advance!
>
>
Can you try loading only some of the symbols when crash initializes, and
then load and cache the remaining symbols on demand, the first time a crash
command needs them? This may introduce another issue, though: a crash
command might be slow the first time it runs.
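
As a rough illustration of the on-demand idea, here is a minimal sketch in C.
It is not crash's actual code: struct lazy_offset, OFFSET_UNRESOLVED,
slow_member_offset() and lazy_member_offset() are hypothetical names, and the
offset value is made up. The slow function only stands in for the gdb-backed
lookup that MEMBER_OFFSET_INIT() performs today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel meaning "not looked up yet". */
#define OFFSET_UNRESOLVED (-2L)

/* One member offset that is resolved lazily and then cached. */
struct lazy_offset {
    const char *type;    /* structure name, e.g. "task_struct" */
    const char *member;  /* member name, e.g. "mm" */
    long value;          /* cached result, or OFFSET_UNRESOLVED */
};

/* Counts how many times the slow path ran (for demonstration only). */
static int lookup_count;

/*
 * Stand-in for the expensive gdb-backed lookup. Hypothetical: the
 * returned offset is made up, and -1 means "member not found".
 */
static long slow_member_offset(const char *type, const char *member)
{
    lookup_count++;
    if (strcmp(type, "task_struct") == 0 && strcmp(member, "mm") == 0)
        return 0x50;
    return -1;
}

/*
 * Resolve the offset on first use and cache it. A "not found" result
 * (-1) is cached as well, so the slow path runs at most once per entry.
 */
static long lazy_member_offset(struct lazy_offset *lo)
{
    assert(lo != NULL);
    if (lo->value == OFFSET_UNRESOLVED)
        lo->value = slow_member_offset(lo->type, lo->member);
    return lo->value;
}
```

With something like this, initialization only has to set every entry to
OFFSET_UNRESOLVED, and the lookup cost moves to the first command that needs
the offset. Everything stays on a single thread, so gdb is never entered
concurrently.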

In addition, can you also try to filter out some old, obsolete symbols?
For example:

Some kernel symbols have been removed from the latest kernels. If the
current vmcore was generated by a recent kernel version, crash does not need
to check or search for these old kernel symbols when initializing;
otherwise, it still loads them. This may save some initialization time.
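
One possible shape for such filtering is sketched below in C. It assumes a
table that records the kernel version range in which each member exists.
The KVER() encoding, struct offset_desc, offset_table and the version
ranges are all made up for illustration; crash does not have such a table
as far as this sketch is concerned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding of a kernel version as one comparable integer. */
#define KVER(major, minor) ((major) * 1000 + (minor))
#define KVER_MAX KVER(999, 999)

/* Describes one member offset and the kernel versions where it exists. */
struct offset_desc {
    const char *type;    /* structure name */
    const char *member;  /* member name */
    int first_ver;       /* first kernel version that has the member */
    int last_ver;        /* last kernel version that has the member */
};

/* Illustrative entries only; the version ranges are made up. */
static const struct offset_desc offset_table[] = {
    { "task_struct", "mm",   KVER(2, 0), KVER_MAX   },
    { "slab_s",      "list", KVER(2, 2), KVER(2, 4) },
};

/* Returns nonzero if the lookup is worth doing for this kernel version. */
static int offset_applies(const struct offset_desc *d, int kver)
{
    assert(d != NULL);
    return kver >= d->first_ver && kver <= d->last_ver;
}

/* Counts the table entries that still need a lookup for this kernel. */
static size_t count_lookups_needed(int kver)
{
    size_t i, n = 0;

    for (i = 0; i < sizeof(offset_table) / sizeof(offset_table[0]); i++)
        if (offset_applies(&offset_table[i], kver))
            n++;
    return n;
}
```

An initialization loop could call offset_applies() before each lookup and
skip the entries that cannot exist in the kernel that produced the vmcore.
The cost is that the version ranges have to be maintained by hand.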

Thanks.
Lianbo


> Thanks!
> Tao Liu
>