[dm-devel] [PATCH] multipathd: avoid crash in uevent_cleanup()
lixiaokeng
lixiaokeng at huawei.com
Sun Feb 7 07:05:21 UTC 2021
On 2021/2/5 19:08, Martin Wilck wrote:
> On Thu, 2021-02-04 at 16:06 +0100, Martin Wilck wrote:
>> On Thu, 2021-02-04 at 09:40 +0800, lixiaokeng wrote:
>>>
>>>
>>> On 2021/2/3 21:57, Martin Wilck wrote:
>>>>> If we exit() before all the pthread_cancel calls in child() of 0.7.7,
>>>>> there is no crash.
>>>> What do you mean with "exit() before all pthread_cancel"? If this
>>>> happens on pthread_cancel(), and you don't call that function, this
>>>> would actually be expected.
>>>
>>> When running_state is DAEMON_SHUTDOWN, break out of the while loop and
>>> then call _exit(0). But it is not a great method.
>>
>> I wonder if it would be possible to figure out the LWP numbers (process
>> IDs) of the different threads before the crash occurs, and compare this
>> to the gdb output
>>
>> (gdb) info threads
>>   Id   Target Id     Frame
>> * 1    LWP 1997690   0x00007f59a0109647 in ?? ()
>>   2    LWP 1996840   0x00007f59a0531de7 in ?? ()
>>   3    LWP 1997692   0x00007f59a0109647 in ?? ()
>>   4    LWP 1996857   0x00007f59a020d169 in ?? ()
>>
>> ... to identify which thread crashed, and if it's always the same one.
>
> From the LWP numbers, threads 2 and 4 are probably TUR checkers
> (temporary threads). Thread 1 can't be easily identified. Could you
> provide the stack of thread 3? From that, we might be able to infer
> which thread crashed, because multipathd always starts its threads in
> the same sequence.
>
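On mapping LWP numbers to threads: the main thread's LWP is equal to the
process ID, so the thread with LWP 3092860 in the core below (the one in
child()) is the main thread. For the other threads, one option would be to
let every thread log its own LWP number when it starts. The following is only
a minimal sketch and not multipathd code. current_lwp() and
format_thread_id() are hypothetical helper names, and it assumes Linux, where
SYS_gettid is available:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the caller; this is the "LWP" number gdb prints. */
static pid_t current_lwp(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/*
 * Hypothetical helper: write "<name>: LWP <id>" into buf.
 * Returns the snprintf() result, i.e. the length the full string needs.
 */
static int format_thread_id(char *buf, size_t len, const char *name)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "%s: LWP %ld", name, (long)current_lwp());
}
```

Each thread start routine would call format_thread_id() with a name of its
own choosing and pass the result to the log. The LWP of the crashed thread
shown by gdb could then be looked up in the log.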
Here is another core stack (the core dump is attached):
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
Core was generated by `/sbin/multipathd -d -s'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fe17dd97456 in ?? ()
[Current thread is 1 (Thread 0x7fe17cc00700 (LWP 3093458))]
(gdb) bt
#0 0x00007fe17dd97456 in ?? ()
#1 0x0000006800007530 in ?? ()
#2 0x00007fe17cbffb2e in ?? ()
#3 0xffffffff00000053 in ?? ()
#4 0x0000000000002006 in ?? ()
#5 0x0000000000000000 in ?? ()
(gdb) info thread
  Id   Target Id                           Frame
* 1    Thread 0x7fe17cc00700 (LWP 3093458) 0x00007fe17dd97456 in ?? ()
  2    Thread 0x7fe17d421700 (LWP 3092869) 0x00007fe17da8a929 in __GI___poll (fds=fds@entry=0x0, nfds=nfds@entry=0, timeout=timeout@entry=10) at ../sysdeps/unix/sysv/linux/poll.c:29
  3    Thread 0x7fe17d4a7a80 (LWP 3092860) 0x00007fe17da96e27 in socket () at ../sysdeps/unix/syscall-template.S:78
  4    Thread 0x7fe17cc0e700 (LWP 3093459) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
  5    Thread 0x7fe17cbc1700 (LWP 3093460) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
  6    Thread 0x7fe17cb4c700 (LWP 3093461) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
  7    Thread 0x7fe17cb5e700 (LWP 3093462) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
  8    Thread 0x7fe17cb67700 (LWP 3093463) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
  9    Thread 0x7fe17cb70700 (LWP 3093464) 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fe17d421700 (LWP 3092869))]
#0 0x00007fe17da8a929 in __GI___poll (fds=fds@entry=0x0, nfds=nfds@entry=0, timeout=timeout@entry=10) at ../sysdeps/unix/sysv/linux/poll.c:29
29 return SYSCALL_CANCEL (poll, fds, nfds, timeout);
(gdb) bt
#0 0x00007fe17da8a929 in __GI___poll (fds=fds@entry=0x0, nfds=nfds@entry=0, timeout=timeout@entry=10) at ../sysdeps/unix/sysv/linux/poll.c:29
#1 0x00007fe17dcdcd91 in poll (__timeout=10, __nfds=0, __fds=0x0) at /usr/include/bits/poll2.h:46
#2 call_rcu_thread (arg=0x557ab760e210) at urcu-call-rcu-impl.h:383
#3 0x00007fe17dcbef4b in start_thread (arg=0x7fe17d421700) at pthread_create.c:486
#4 0x00007fe17da957ef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fe17d4a7a80 (LWP 3092860))]
#0 0x00007fe17da96e27 in socket () at ../sysdeps/unix/syscall-template.S:78
78 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0 0x00007fe17da96e27 in socket () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007fe17db95f34 in sd_pid_notify_with_fds (pid=0, unset_environment=0, state=0x557ab58af801 "ERRNO=0", fds=0x0, n_fds=0) at ../src/libsystemd/sd-daemon/sd-daemon.c:481
#2 0x0000557ab58a6ef7 in child (param=<optimized out>) at main.c:3140
#3 0x0000557ab589f503 in main (argc=<optimized out>, argv=0x7fffe51c3c08) at main.c:3325
(gdb) thread 4
[Switching to thread 4 (Thread 0x7fe17cc0e700 (LWP 3093459))]
#0 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
78 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007fe17dd97456 in ?? ()
#2 0x0000006900007530 in ?? ()
#3 0x00007fe17cc0db2e in ?? ()
#4 0xffffffff00000053 in ?? ()
#5 0x0000000000002006 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb) thread 5
[Switching to thread 5 (Thread 0x7fe17cbc1700 (LWP 3093460))]
#0 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
78 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0 0x00007fe17da8c507 in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007fe17dd97456 in ?? ()
#2 0x0000006a00007530 in ?? ()
#3 0x00007fe17cbc0b2e in ?? ()
#4 0xffffffff00000053 in ?? ()
#5 0x0000000000002006 in ?? ()
#6 0x0000000000000000 in ?? ()
Regards
Lixiaokeng
-------------- next part --------------
A non-text attachment was scrubbed...
Name: core.multipathd.0.5912fc2ca07945d8ab7d921ca6ff28ab.3092860.1612673516000000000000.lz4
Type: application/octet-stream
Size: 1533834 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20210207/efcf10be/attachment.obj>