[linux-lvm] lvcreate hangs forever during snapshot creation when suspending volume
Zdenek Kabelac
zdenek.kabelac at gmail.com
Mon Aug 1 17:34:03 UTC 2022
On 01. 08. 22 at 19:29, Zdenek Kabelac wrote:
> On 30. 07. 22 at 18:33, Thomas Deutschmann wrote:
>> Hi,
>>
>> while trying to backup a Dell R7525 system running
>> Debian bookworm/testing using LVM snapshots I noticed that the system
>> will 'freeze' sometimes (not all the times) when creating the snapshot.
>> To recover from this, a power cycle is required.
>>
>> Is this a problem caused by LVM or a kernel issue?
>>
>> The command I run:
>>
>> /usr/sbin/lvcreate \
>> -vvvvv \
>> --size 100G \
>> --snapshot /dev/mapper/devDataStore1-volMachines \
>> --name volMachines_snap
>>
>> The last 4 lines:
>>> [Sat Jul 30 16:31:34 2022] debugfs: Directory 'dm-4' with parent 'block'
>>> already present!
>>> [Sat Jul 30 16:31:34 2022] debugfs: Directory 'dm-7' with parent 'block'
>>> already present!
>>> [Sat Jul 30 16:34:55 2022] INFO: task mariadbd:1607 blocked for more than
>>> 120 seconds.
>>> [Sat Jul 30 16:34:55 2022] Not tainted 5.18.0-2-amd64 #1 Debian 5.18.5-1
>>> [Sat Jul 30 16:34:55 2022] "echo 0 >
>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [Sat Jul 30 16:34:55 2022] task:mariadbd state:D stack: 0 pid:
>>> 1607 ppid: 1289 flags:0x00000000
>>> [Sat Jul 30 16:34:55 2022] Call Trace:
>>> [Sat Jul 30 16:34:55 2022] <TASK>
>>> [Sat Jul 30 16:34:55 2022] __schedule+0x30b/0x9e0
>>> [Sat Jul 30 16:34:55 2022] schedule+0x4e/0xb0
>>> [Sat Jul 30 16:34:55 2022] percpu_rwsem_wait+0x112/0x130
>>> [Sat Jul 30 16:34:55 2022] ? __percpu_rwsem_trylock.part.0+0x70/0x70
>>> [Sat Jul 30 16:34:55 2022] __percpu_down_read+0x5e/0x80
>>> [Sat Jul 30 16:34:55 2022] io_write+0x2e9/0x300
>>> [Sat Jul 30 16:34:55 2022] ? _raw_spin_lock+0x13/0x30
>>> [Sat Jul 30 16:34:55 2022] ? newidle_balance+0x26a/0x400
>>> [Sat Jul 30 16:34:55 2022] ? fget+0x7c/0xb0
>>> [Sat Jul 30 16:34:55 2022] io_issue_sqe+0x47c/0x2550
>>> [Sat Jul 30 16:34:55 2022] ? select_task_rq_fair+0x174/0x1240
>>> [Sat Jul 30 16:34:55 2022] ? hrtimer_try_to_cancel+0x78/0x110
>>> [Sat Jul 30 16:34:55 2022] io_submit_sqes+0x3ce/0x1aa0
>>> [Sat Jul 30 16:34:55 2022] ? _raw_spin_unlock_irqrestore+0x23/0x40
>>> [Sat Jul 30 16:34:55 2022] ? wake_up_q+0x4a/0x90
>>> [Sat Jul 30 16:34:55 2022] ? __do_sys_io_uring_enter+0x565/0xa60
>>> [Sat Jul 30 16:34:55 2022] __do_sys_io_uring_enter+0x565/0xa60
>>> [Sat Jul 30 16:34:55 2022] do_syscall_64+0x3b/0xc0
>>> [Sat Jul 30 16:34:55 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> [Sat Jul 30 16:34:55 2022] RIP: 0033:0x7f05b90229b9
>>> [Sat Jul 30 16:34:55 2022] RSP: 002b:00007eff8e9efa38 EFLAGS: 00000216
>>> ORIG_RAX: 00000000000001aa
>>> [Sat Jul 30 16:34:55 2022] RAX: ffffffffffffffda RBX: 0000561c424f1d18 RCX:
>>> 00007f05b90229b9
>>> [Sat Jul 30 16:34:55 2022] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
>>> 0000000000000009
>>> [Sat Jul 30 16:34:55 2022] RBP: 00007eff8e9efa90 R08: 0000000000000000 R09:
>>> 0000000000000008
>>> [Sat Jul 30 16:34:55 2022] R10: 0000000000000000 R11: 0000000000000216 R12:
>>> 0000561c42500938
>>> [Sat Jul 30 16:34:55 2022] R13: 00007f05b9821c00 R14: 0000561c425009e0 R15:
>>> 0000561c424f1d18
>>> [Sat Jul 30 16:34:55 2022] </TASK>
>>> [Sat Jul 30 16:34:55 2022] INFO: task mariadbd:9955 blocked for more than
>>> 120 seconds.
>>> [Sat Jul 30 16:34:55 2022] Not tainted 5.18.0-2-amd64 #1 Debian 5.18.5-1
>>> [Sat Jul 30 16:34:55 2022] "echo 0 >
>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [Sat Jul 30 16:34:55 2022] task:mariadbd state:D stack: 0 pid:
>>> 9955 ppid: 1289 flags:0x00000000
>>> [Sat Jul 30 16:34:55 2022] Call Trace:
>>> [Sat Jul 30 16:34:55 2022] <TASK>
>>> [...]
>>
>> The message "mariadbd:1607 blocked for more than 120 seconds" will repeat.
>>
>> MariaDB itself is running in a systemd-nspawn container. The container
>> storage is located on the volume for which snapshot creation will hang.
>
>
> Hi
>
>
> Lvm2 is *NOT* supported for use within containers!
>
> Running it there requires very specific 'system' modifications and is overall
> very problematic - so rule #1 is: always run lvm2 commands on the host machine.
>
>
> Now - you suggest you can reproduce this issue on your bare-metal
> hardware as well - in that case, run these 3 commands before 'lvcreate':
>
>
> # dmsetup table
> # dmsetup info -c
> # dmsetup ls --tree
>
> # lvcreate ....
>
> If it blocks, run them again:
> # dmsetup table
> # dmsetup info -c
> # dmsetup ls --tree
>
>
> Your 'lvcreate -vvvv' and 'dmesg' traces simply suggest the system is waiting
> for the 'fsfreeze' operation to complete - it's unclear why it cannot finish -
> maybe some problem with your 'raid' array??
>
> So far I do not see any bug on the lvm2 side - everything works there as
> expected - however it's unclear why your 'raid' is so slow.
>
> Note: you can always experiment without lvm2 in the picture -
> run 'fsfreeze --freeze|--unfreeze' yourself to see whether even
> that command is able to finish.
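That experiment can be sketched like this (the mount point is illustrative; the freeze is timed so you can see how long it actually takes):

```shell
# Hypothetical mount point of the volume being snapshotted:
MNT=${MNT:-/mnt/machines}

# Freeze the filesystem directly - if this command itself blocks,
# the stall is below lvm2, in the filesystem/storage stack.
t0=$(date +%s)
fsfreeze --freeze "$MNT" || echo "freeze failed (needs root and a mounted fs)"
t1=$(date +%s)
echo "freeze returned after $((t1 - t0))s"

# Thaw again so the filesystem accepts writes:
fsfreeze --unfreeze "$MNT" 2>/dev/null || true
```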
>
> Note2: if your system has lots of 'dirty' pages, the 'fsfreeze' operation on a
> filesystem may take a long time, since all 'dirty' pages need to be
> written to disk first.
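A quick way to see how much data a freeze would have to flush (this reads the standard /proc interface; syncing up front shortens the freeze window):

```shell
# Dirty/Writeback counters in kB - a large Dirty value means fsfreeze
# has that much data to push to disk before it can return:
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Flush dirty pages up front so the subsequent freeze is short:
sync
```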
>
>
I forgot to mention a typical problem with containers: a missing udevd,
which results in lvm2 waiting endlessly for 'cookie' confirmation.
This can usually be 'resolved' by issuing:
'dmsetup udevcomplete_all'
(you can see the in-flight operations with 'dmsetup udevcookies')
But that should not be the problem here if you are waiting on 'fsfreeze'...
Regards
Zdenek