[linux-lvm] Re: [dm-devel] dmsetup hangs forever
yanfei.zhang at huawei.com
Fri Oct 27 08:00:57 UTC 2017
If the udevd daemon never timed out, I would agree that making dmsetup wait for udev to finalize any timeouts is a good idea.
But udevd does time out after 180 seconds and kills the event process (systemd-udevd: timeout: killing). In that situation I think a mandatory wait for udev to finalize is useless, because the udev event process has been killed and will never coordinate with dmsetup. So I think it is better to return an error to whoever called dmsetup than to let the process wait forever.
If a timeout mechanism is not added to dmsetup, which of these strategies would solve the issue better?
1. Guarantee that udev never times out. (But I think it is difficult to make sure every udev event finishes within 180 seconds in any abnormal situation.)
2. Modify the udev daemon so that, when a udev event times out, it still notifies dmsetup that the event is done.
3. Require whoever calls the dmsetup command to apply a timeout itself.
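To make option 3 concrete, here is a minimal sketch of a caller that enforces its own deadline on a command such as dmsetup. The helper name run_with_timeout and the 200-second value in the comment are illustrative assumptions, not part of dmsetup or lvm2; the only thing relied on is Python's standard subprocess.run(), which kills and reaps the child when its timeout expires.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and return (exit_code, timed_out).

    Hypothetical helper for illustration only. On timeout, subprocess.run()
    kills the child and waits for it before raising TimeoutExpired, so the
    child is already gone by the time the except branch runs.
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_s)
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        print(f"timed out after {timeout_s}s: {argv!r}", file=sys.stderr)
        return None, True


# Example call (assumed values; the deadline is chosen to be slightly longer
# than udevd's 180-second event timeout, and "example_dev" is a placeholder):
#   code, timed_out = run_with_timeout(
#       ["dmsetup", "remove", "example_dev"], 200)
```

This only stops the caller from blocking forever. It does not put the udev database back in sync, and the udev synchronization semaphore that dmsetup was waiting on is probably left behind, so the caller may still have to inspect and clean it up (for example with dmsetup udevcookies and dmsetup udevcomplete_all).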
From: Zdenek Kabelac [mailto:zkabelac at redhat.com]
Sent: 26 October 2017 16:39
To: Zhangyanfei (YF) <yanfei.zhang at huawei.com>; agk at redhat.com; christophe.varoqui at opensvc.com
Cc: dm-devel at redhat.com; guijianfeng <guijianfeng at huawei.com>; Fengtiantian <fengtiantian at huawei.com>; linux-lvm at redhat.com
Subject: Re: [dm-devel] dmsetup hangs forever
On 26.10.2017 at 10:07, Zhangyanfei (YF) wrote:
> I found an issue when using dmsetup in a situation where a udev event
> times out. dmsetup uses the dm_udev_wait function to synchronize with
> udev events. When dmsetup creates a new dm device and the underlying raw
> disk is abnormal (for example, an IP SAN disk with hung I/O requests),
> udevd may time out while handling the dm device's udev event, and it
> will then not notify dmsetup through the semaphore. Because
> dm_udev_wait uses semop to synchronize with udevd, if the udevd event
> times out, dmsetup hangs forever, even after the raw disk has recovered.
> I wonder if we could use semtimedop instead of semop to add a
> timeout to the dm_udev_wait function. If the udevd daemon times out while
> handling the dm event, dm_udev_wait could time out too, and dmsetup could return an error.
> This is my patch, based on lvm2-2.02.115-3:
Unfortunately, the same argument for why this can't really work still applies.
If dm starts to time out on its own, without coordination with udev, your system's logic will end up as one big mess.
So if dm were to handle the timeout, you would also need to provide a mechanism to correct the associated services around it.
The main point here is that it is mandatory for udev to be the one finalizing any timeouts, so that it stays in sync with the db content.
Moreover, if you start to time out, you typically mask some system failure. In the majority of cases I've ever seen, it has always been a bug from this category (a buggy udev rule or service). So it is always better to fix the bug than to keep it masked.
I'd like to see the semaphore go away, but that needs wider cooperation.