[dm-devel] [PATCH 1/7] multipath: fix tur checker locking

Benjamin Marzinski bmarzins at redhat.com
Thu Feb 8 19:27:41 UTC 2018


On Thu, Feb 08, 2018 at 06:19:32PM +0000, Bart Van Assche wrote:
> On Wed, 2018-02-07 at 16:49 -0600, Benjamin Marzinski wrote:
> > Commit 6e2423fd fixed a bug where the tur checker could cancel a
> > detached thread after it had exitted. However in fixing this, the new
> > code grabbed a mutex (to call condlog) while holding a spin_lock. To
> > deal with that, and to try to keep with the maixim "lock data, not
> > code", I've changed how the tur checker synchronizes with its thread.
> > 
> > Now, the tur checker creates joinable threads, and detaches them when
> > the thread is finished or has timed out. To track the state of the
> > threads, I've added a new variable to the checker context, ct->attached.
> > When a thread starts, attached is set to 1. When the thread finishes, it
> > saves the value of attached, and then zeros it out, while locked. If
> > attached was set, it detaches itself.
> > 
> > When the tur checker gives up on a thread, it also saves and decrements
> > ct->attached, while locked. At the same time it saves the value of
> > ct->thread.  If attached was set, it cancels the thread, and then
> > detaches it.
> > 
> > So the values that are protected by the spin lock are now ct->holders,
> > ct->thread, and ct->attached. There are cases where the tur checker just
> > wants to know if the thread is running. If so, its safe to simply read
> > ct->thread without locking.  Also, if it knows that the thread isn't
> > running, it can freely access all of these variables. I've left the
> > locking in-place in these cases to make the static analyzers happy.
> > However I have added comments stating when the locking isn't actually
> > necessary.
> 
> Hello Ben,
> 
> Have you considered to move the condlog() statements out of the spinlock
> section? I think that would lead to a much smaller and less contrived change
> than the patch you posted.

Yes. I thought it was better to keep the amount of code locked under a
spin_lock as small as possible. But it seems that I am alone in thinking
my trade-off was worth it.  If both you and Martin object, I'm fine with
redoing the patch to simply move the condlog(), since I don't believe
that either tur_check_async_timeout() or pthread_cancel() can block.

-Ben

> Thanks,
> 
> Bart.
> 
> 




More information about the dm-devel mailing list