[dm-devel] [PATCH v2] multipath -u: test socket connection in non-blocking mode

Martin Wilck mwilck at suse.com
Thu Apr 25 19:33:03 UTC 2019


On Wed, 2019-04-24 at 11:07 +0200, Martin Wilck wrote:
> Since commit d7188fcd "multipathd: start daemon after udev trigger",
> multipathd startup is delayed during boot until after "udev settle"
> terminates. But "multipath -u" is run by udev workers for storage
> devices, and attempts to connect to the multipathd socket. This causes
> a start job for multipathd to be scheduled by systemd, but that job
> won't be started until "udev settle" finishes. This is not a problem
> on systems with 129 or fewer storage units, because the connect() call
> of "multipath -u" will succeed anyway. But on larger systems, the
> listen backlog of the systemd socket can be exceeded, which causes
> connect() calls for the socket to block until multipathd starts up and
> begins calling accept(). This creates a deadlock situation, because
> "multipath -u" (called by udev workers) blocks, and thus "udev settle"
> doesn't finish, delaying multipathd startup. This situation then
> persists until either the workers or "udev settle" time out. In the
> former case, path devices might be misclassified as non-multipath
> devices by "multipath -u".
> 
> Fix this by using a non-blocking socket fd for connect() and
> interpreting the errno appropriately.
> 
> This patch reverts most of the changes from commit 8cdf6661
> "multipath: check on multipathd without starting it". Instead,
> "multipath -u" does access the socket and start multipathd again
> (which is what we want IMO), but it is now able to detect and handle
> the "full backlog" situation.
> 
> Signed-off-by: Martin Wilck <mwilck at suse.com>
> 
> V2:
> 
> Use same error reporting convention in __mpath_connect() as in
> mpath_connect() (Hannes Reinecke). We can't easily change the latter,
> because it's part of the "public" libmpathcmd API. 

FTR, our customer reported that this patch fixed his problem.

@Ben, I'd be grateful if you could try it (or have the user try it)
in your problem case as well.
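
For anyone following along without the patch at hand, here is a minimal
sketch of the technique described above: a non-blocking connect() on an
AF_UNIX stream socket, with the errno interpreted afterwards. This is
not the code from the patch. The function name probe_unix_socket(), the
enum, and the path argument are hypothetical and chosen for
illustration only. It relies on the Linux behaviour documented in
connect(2), where a non-blocking UNIX domain socket gets EAGAIN if the
connection cannot be completed immediately (e.g. the listen backlog is
full).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical result codes, not taken from multipath-tools. */
enum probe_result {
	PROBE_CONNECTED,	/* connect() succeeded, a listener exists */
	PROBE_BACKLOG_FULL,	/* EAGAIN: socket exists, nobody accept()s yet */
	PROBE_NO_LISTENER,	/* ECONNREFUSED or ENOENT */
	PROBE_ERROR,		/* anything else */
};

/* Probe a UNIX stream socket at a filesystem path without blocking. */
static enum probe_result probe_unix_socket(const char *path)
{
	struct sockaddr_un addr;
	enum probe_result res;
	int fd, flags;

	assert(path != NULL);
	if (strlen(path) >= sizeof(addr.sun_path))
		return PROBE_ERROR;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return PROBE_ERROR;

	/* Switch the fd to non-blocking mode before calling connect(). */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return PROBE_ERROR;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);	/* length was checked above */

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
		res = PROBE_CONNECTED;
	else if (errno == EAGAIN)
		res = PROBE_BACKLOG_FULL;	/* would have blocked */
	else if (errno == ECONNREFUSED || errno == ENOENT)
		res = PROBE_NO_LISTENER;
	else
		res = PROBE_ERROR;

	close(fd);
	return res;
}
```

In this sketch, a caller in the position of "multipath -u" would treat
PROBE_BACKLOG_FULL as "the daemon socket exists and a start job is
pending" and carry on, rather than waiting in connect() while
"udev settle" is still running.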

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)




