[lvm-devel] stable-2.02 - libdaemon: fix the race of handling SIGTERM timeout
Zdenek Kabelac
zdenek.kabelac at gmail.com
Wed Sep 25 11:53:50 UTC 2019
On 25. 09. 19 at 9:48, wangjufeng wrote:
> From e97d3a14d070a85867ce39fc1749d19c2ac0b685 Mon Sep 17 00:00:00 2001
> From: wangjufeng <wangjufeng at huawei.com>
> Date: Wed, 25 Sep 2019 15:34:25 +0800
> Subject: [PATCH] libdaemon: fix the race of handling SIGTERM timeout
>
> When the lvmetad daemon runs without the "-t" option, system shutdown
> may take a long time.
>
> The error can be reproduced with the following steps:
>
> 1. Remove the "-t" option from the lvmetad start arguments.
>
> 2. Start a process p1 like this:
>
>    while :
>    do
>        time service lvm2-lvmetad stop
>        sleep 2
>        service lvm2-lvmetad start
>    done
>
> 3. Start a process p2 like this:
>
>    while :
>    do
>        lvs
>    done
>
> 4. Stop process p2 with Ctrl+C. Stopping lvm2-lvmetad then times out
>    very easily.
>
> This happens because, at system shutdown, lvmetad receives SIGTERM and
> _shutdown_requested is set to 1 asynchronously. If a client thread is
> still running, s.threads->next is not NULL. The condition
> (_shutdown_requested && !s.threads->next) may therefore not be
> satisfied before pselect is called, so the daemon calls pselect and
> blocks. If no new request or signal arrives, systemd has to send
> SIGKILL after the SIGTERM timeout, which usually takes 90 seconds.
>
> This patch calls _reap(s, 1) to finish all client threads after
> _shutdown_requested changes to 1. It also always passes a timeout
> value to pselect, so that the daemon does not block indefinitely if
> _shutdown_requested changes to 1 between the check of its value and
> the call to pselect.
>
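For readers following the thread, the window described in the quoted message (the flag changing between the check and the pselect call) and the timeout-based workaround can be sketched roughly as below. This is not the submitted patch and not the libdaemon code: the names shutdown_requested, on_sigterm and wait_for_client_or_shutdown are hypothetical stand-ins, and the one-second timeout is an assumed value chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

/* Stand-in for _shutdown_requested: set asynchronously by the handler. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigterm(int sig)
{
	(void) sig;
	shutdown_requested = 1;
}

/*
 * Hypothetical helper: wait for activity on fd, but never block longer
 * than one second (assumed value) in a single pselect call, so that a
 * SIGTERM landing between the flag check and pselect is noticed on the
 * next iteration instead of leaving the loop blocked forever.
 * Returns 1 if fd became readable, 0 if shutdown was requested.
 */
static int wait_for_client_or_shutdown(int fd)
{
	assert(fd >= 0 && fd < FD_SETSIZE);

	for (;;) {
		fd_set in;
		struct timespec tv = { .tv_sec = 1, .tv_nsec = 0 };

		if (shutdown_requested)
			return 0;

		FD_ZERO(&in);
		FD_SET(fd, &in);

		/* The signal may arrive right here; the timeout bounds the wait. */
		if (pselect(fd + 1, &in, NULL, NULL, &tv, NULL) > 0)
			return 1;
		/* Timeout, EINTR or error: loop and re-check the flag. */
	}
}
```

A caller would install on_sigterm for SIGTERM and call wait_for_client_or_shutdown() on the listening socket. The bounded timeout trades a small shutdown delay for simplicity. Another general option is to keep SIGTERM blocked outside pselect and unblock it only through pselect's sigmask argument, which is the purpose that argument serves in POSIX.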
Hi
Can you please check out the attached patch, which simply disallows accepting
new connections while 'shutdown' is requested?
Your example is in fact not the case we intended to support - if there *IS* a
running user, we shall serve it instead of killing lvmetad in the middle of
an lvm2 command.
So all served clients should be finished (and lvmetad should be 'quick' at
this serving, as all it does is stream some cached data to a socket).
But we likely should not accept any new connection when we know we are going
to 'shutdown' - which is what my proposed patch does - otherwise we are
in danger that a user running an 'endless' loop of lvs will prevent shutdown.
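As a rough illustration of the idea only (the attached patch.diff is not reproduced here, and the helper names watch_set and should_exit are hypothetical), the main loop could stop watching the listening socket once shutdown is requested, while still waiting for already-connected clients to finish:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper: fill `in` with the descriptors the main loop
 * should watch. Once shutdown is requested the listening socket is
 * left out, so no new client connection gets accepted.
 * Returns the highest watched fd, or -1 if nothing is watched.
 */
static int watch_set(fd_set *in, int listen_fd, bool shutdown_requested)
{
	assert(listen_fd >= 0 && listen_fd < FD_SETSIZE);

	FD_ZERO(in);
	if (shutdown_requested)
		return -1;

	FD_SET(listen_fd, in);
	return listen_fd;
}

/*
 * Mirrors the condition (_shutdown_requested && !s.threads->next):
 * leave the loop only when shutdown was requested and no client
 * thread is still being served.
 */
static bool should_exit(bool shutdown_requested, unsigned active_clients)
{
	return shutdown_requested && active_clients == 0;
}
```

With this shape, clients connected before SIGTERM are still served to completion, while an endless loop of new lvs invocations can no longer keep the daemon alive.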
The next issue is: WHY is your shutdown sequence using any lvm command when
the lvmetad service is being stopped? This looks like a logical issue in
the shutdown sequence.
Regards
Zdenek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.diff
Type: text/x-patch
Size: 808 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/lvm-devel/attachments/20190925/9c51da06/attachment.bin>