[dm-devel] [PATCH] multipathd: avoid crash in uevent_cleanup()

lixiaokeng lixiaokeng at huawei.com
Mon Feb 8 10:49:30 UTC 2021



On 2021/2/8 17:50, Martin Wilck wrote:
> On Mon, 2021-02-08 at 15:41 +0800, lixiaokeng wrote:
>>
>> Hi Martin,
>>
>> There is a _cleanup_ in device_new_from_nulstr. If uevent_thr exit in
>> device_new_from_nulstr and some keys is not be append to sd_device,
>> the _cleanup_ will be called, which leads to multipathd crashes with
>> the stack.
>>
>> When I use your advice,
>>
>>
>> On 2021/1/26 16:34, Martin Wilck wrote:
>>>     int oldstate;
>>>
>>>     pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
>>>
>>>     udev_monitor_receive_device(...)
>>>
>>>     pthread_setcancelstate(oldstate, NULL);
>>>     pthread_testcancel();
>>
>> this coredump does not seem to appear anymore (several hours with
>> test scripts).
> 
> Thanks for your continued hard work on this, but I can't follow you. In
> this post:
> 
> https://listman.redhat.com/archives/dm-devel/2021-January/msg00396.html
> 
> you said that this advice did _not_ help. Please clarify.
> 

Hi Martin,
At that time, I did not know how the crash occurred in the systemd interface.
There were still some crashes with pthread_testcancel(), for example:
#0  0x0000ffffb6118f4c in aarch64_fallback_frame_state (context=0xffffb523f200, context=0xffffb523f200, fs=0xffffb523e700) at ./md-unwind-support.h:74
#1  uw_frame_state_for (context=context@entry=0xffffb523f200, fs=fs@entry=0xffffb523e700) at ../../../libgcc/unwind-dw2.c:1257
#2  0x0000ffffb6119ef4 in _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xffffb52403b0, context=context@entry=0xffffb523f200) at ../../../libgcc/unwind.inc:155
#3  0x0000ffffb611a284 in _Unwind_ForcedUnwind (exc=0xffffb52403b0, stop=stop@entry=0xffffb64846c0 <unwind_stop>, stop_argument=0xffffb523f630) at ../../../libgcc/unwind.inc:207
#4  0x0000ffffb6484860 in __GI___pthread_unwind (buf=<optimized out>) at unwind.c:121
#5  0x0000ffffb6482d08 in __do_cancel () at pthreadP.h:304
#6  __GI___pthread_testcancel () at pthread_testcancel.c:26
#7  0x0000ffffb5c528e8 in ?? ()

I thought these crashes might be related to the crash in the systemd interface.

However, after analyzing the coredump and discussing with the community, I
now think these may be independent issues, so I tested it again. The ?? and
_Unwind_XXX crashes still occur, but there is no longer any crash in
device_monitor_receive_device.
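
For reference, the pattern under test can be sketched as a standalone
program fragment. This is only a minimal illustration, not the multipathd
code: receive_device_stub() is a hypothetical stand-in for
udev_monitor_receive_device(), and worker() and run_cancel_demo() are names
made up for this example. It shows that a cancel request arriving during
the protected call is deferred until pthread_testcancel().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for udev_monitor_receive_device(): a call that
 * contains a cancellation point and must not be interrupted halfway. */
static int receive_device_stub(void)
{
    struct timespec ts = { 0, 1000000 };   /* 1 ms */

    nanosleep(&ts, NULL);   /* nanosleep() is a cancellation point */
    return 0;
}

static void *worker(void *arg)
{
    int *iterations = arg;

    for (;;) {
        int oldstate;

        /* Cancel requests that arrive from here on stay pending. */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);

        (void)receive_device_stub();
        (*iterations)++;

        /* Restore the previous state, then act on a pending request. */
        pthread_setcancelstate(oldstate, NULL);
        pthread_testcancel();
    }
    return NULL;   /* not reached */
}

/* Start the worker, cancel it, and join it.
 * Returns 0 if the worker ended through cancellation, 1 if it ended some
 * other way, -1 on a pthread error. */
int run_cancel_demo(int *iterations)
{
    pthread_t thr;
    void *res;
    struct timespec ts = { 0, 20000000 };  /* 20 ms */

    assert(iterations != NULL);
    *iterations = 0;

    if (pthread_create(&thr, NULL, worker, iterations) != 0)
        return -1;
    nanosleep(&ts, NULL);                  /* let the worker loop a bit */
    if (pthread_cancel(thr) != 0)
        return -1;
    if (pthread_join(thr, &res) != 0)
        return -1;
    return res == PTHREAD_CANCELED ? 0 : 1;
}
```

Build with -pthread. In this sketch the worker can only be cancelled in
pthread_testcancel(), never inside the stub, which matches the observation
that the crash in device_monitor_receive_device is gone. It does not
address the remaining _Unwind_XXX crashes, which start from
pthread_testcancel() itself.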

Regards,
Lixiaokeng




