[dm-devel] 5.6-rc1 lockdep warning during dm-thin shutdown
Theodore Y. Ts'o
tytso at mit.edu
Thu Feb 20 23:45:19 UTC 2020
Ping? It looks like I'm still seeing across the board failures for
generic/347 in 5.6-rc2 and it looks like it's pretty clearly a dm-thin
issue.
If this isn't going to be fixed soon, that's fine. I'm just going to
blacklist generic/347 in kvm-xfstests and gce-xfstests.
Thanks,
- Ted
On Thu, Feb 13, 2020 at 03:50:14PM -0500, Theodore Y. Ts'o wrote:
> I noticed this while running xfstests generic/347. The lockdep report is below:
>
> [ 3742.062744] run fstests generic/347 at 2020-02-13 13:18:03
> [ 3742.431761] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: acl,user_xattr,block_validity
> [ 3743.696630] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: acl,user_xattr,block_validity
> [ 3749.384483] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
> [ 3750.853156] device-mapper: thin: 253:4: switching pool to out-of-data-space (queue IO) mode
> [ 3812.123242] device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode
> [ 3812.138217] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 378 starting block 128192)
> [ 3812.150804] buffer_io_error: 708 callbacks suppressed
> [ 3812.150809] Buffer I/O error on device dm-5, logical block 128000
> [ 3812.162553] Buffer I/O error on device dm-5, logical block 128001
> [ 3812.169205] Buffer I/O error on device dm-5, logical block 128002
> [ 3812.175793] Buffer I/O error on device dm-5, logical block 128003
> [ 3812.183686] Buffer I/O error on device dm-5, logical block 128004
> [ 3812.191467] Buffer I/O error on device dm-5, logical block 128005
> [ 3812.199311] Buffer I/O error on device dm-5, logical block 128006
> [ 3812.207204] Buffer I/O error on device dm-5, logical block 128007
> [ 3812.214915] Buffer I/O error on device dm-5, logical block 128008
> [ 3812.222699] Buffer I/O error on device dm-5, logical block 128009
> [ 3812.240796] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 379 starting block 128448)
> [ 3812.264563] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 380 starting block 128704)
> [ 3812.287052] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 381 starting block 128960)
> [ 3812.308866] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 382 starting block 129216)
> [ 3812.330810] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 383 starting block 129472)
> [ 3812.352898] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 384 starting block 129728)
> [ 3812.373989] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 385 starting block 129984)
> [ 3812.395096] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 386 starting block 130240)
> [ 3812.416946] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 387 starting block 130496)
> [ 3813.736289] device-mapper: thin: 253:4: switching pool to write mode
> [ 3813.736304] device-mapper: thin: 253:4: growing the data device from 1000 to 1200 blocks
> [ 3813.848386] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
>
> [ 3814.349795] ======================================================
> [ 3814.356234] WARNING: possible circular locking dependency detected
> [ 3814.362533] 5.5.0-xfstests-07635-gc96dceeabf76 #1477 Not tainted
> [ 3814.368654] ------------------------------------------------------
> [ 3814.374947] dmsetup/10207 is trying to acquire lock:
> [ 3814.380079] ffffffff85489a78 (shrinker_rwsem){++++}, at: unregister_shrinker+0x22/0x80
> [ 3814.388115]
> but task is already holding lock:
> [ 3814.394061] ffff9c184c8c95d0 (&pmd->root_lock){++++}, at: dm_pool_metadata_close+0xba/0x120
> [ 3814.402530]
> which lock already depends on the new lock.
>
> [ 3814.411867]
> the existing dependency chain (in reverse order) is:
> [ 3814.419464]
> -> #3 (&pmd->root_lock){++++}:
> [ 3814.425321] lock_acquire+0x9a/0x160
> [ 3814.429545] down_read+0x41/0x200
> [ 3814.433542] dm_thin_find_block+0x2e/0x80
> [ 3814.438590] thin_map+0x172/0x270
> [ 3814.442548] __map_bio+0x42/0x210
> [ 3814.446501] __split_and_process_non_flush+0xd6/0x150
> [ 3814.452292] __split_and_process_bio+0xe0/0x380
> [ 3814.457588] dm_process_bio+0x90/0x2b0
> [ 3814.462838] dm_make_request+0x54/0x120
> [ 3814.467596] generic_make_request+0xd8/0x400
> [ 3814.472508] submit_bio+0x3c/0x140
> [ 3814.476587] submit_bh_wbc+0x16d/0x190
> [ 3814.480975] ext4_read_block_bitmap_nowait+0x41d/0x670
> [ 3814.486896] ext4_mb_init_cache+0x163/0x760
> [ 3814.491799] ext4_mb_init_group+0x19a/0x260
> [ 3814.496759] ext4_mb_good_group+0x16a/0x180
> [ 3814.501587] ext4_mb_regular_allocator+0x28f/0x400
> [ 3814.507036] ext4_mb_new_blocks+0x57e/0xbe0
> [ 3814.512016] ext4_ext_map_blocks+0xb32/0xf30
> [ 3814.516992] ext4_map_blocks+0xef/0x580
> [ 3814.521743] mpage_map_and_submit_extent+0xe7/0x510
> [ 3814.527259] ext4_writepages+0x895/0xb10
> [ 3814.531971] do_writepages+0x41/0xe0
> [ 3814.536199] __filemap_fdatawrite_range+0xd2/0x120
> [ 3814.541752] file_write_and_wait_range+0x71/0xc0
> [ 3814.547100] ext4_sync_file+0x130/0x4b0
> [ 3814.551580] do_fsync+0x38/0x70
> [ 3814.555367] __x64_sys_fsync+0x10/0x20
> [ 3814.559760] do_syscall_64+0x50/0x1f0
> [ 3814.564195] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.569882]
> -> #2 (&ei->i_data_sem){++++}:
> [ 3814.575580] lock_acquire+0x9a/0x160
> [ 3814.580109] down_write+0x40/0x110
> [ 3814.584145] ext4_truncate+0x3ce/0x510
> [ 3814.588529] ext4_evict_inode+0x29a/0x5c0
> [ 3814.593184] evict+0xd0/0x1a0
> [ 3814.596952] vfs_rmdir+0x167/0x190
> [ 3814.600990] do_rmdir+0x145/0x1b0
> [ 3814.605303] do_syscall_64+0x50/0x1f0
> [ 3814.609613] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.615301]
> -> #1 (jbd2_handle){++++}:
> [ 3814.620744] lock_acquire+0x9a/0x160
> [ 3814.625088] start_this_handle+0x226/0x680
> [ 3814.629824] jbd2__journal_start+0xf7/0x270
> [ 3814.634856] __ext4_journal_start_sb+0x85/0x1b0
> [ 3814.640051] ext4_release_dquot+0x68/0xc0
> [ 3814.644716] dqput.part.0+0x7f/0x210
> [ 3814.648931] __dquot_drop+0x69/0x90
> [ 3814.653495] ext4_clear_inode+0x35/0x90
> [ 3814.658107] ext4_evict_inode+0x6f/0x5c0
> [ 3814.662767] evict+0xd0/0x1a0
> [ 3814.666365] dispose_list+0x48/0x70
> [ 3814.673422] prune_icache_sb+0x54/0x80
> [ 3814.677808] super_cache_scan+0x124/0x1a0
> [ 3814.682454] do_shrink_slab+0x171/0x3a0
> [ 3814.687036] shrink_slab+0x22c/0x2b0
> [ 3814.691322] drop_slab_node+0x2e/0x50
> [ 3814.695628] drop_slab+0x34/0x60
> [ 3814.699511] drop_caches_sysctl_handler+0x70/0x80
> [ 3814.704859] proc_sys_call_handler+0x19d/0x1c0
> [ 3814.709944] vfs_write+0xde/0x1b0
> [ 3814.713911] ksys_write+0x68/0xe0
> [ 3814.718207] do_syscall_64+0x50/0x1f0
> [ 3814.722512] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.728895]
> -> #0 (shrinker_rwsem){++++}:
> [ 3814.734504] check_prev_add+0x8e/0x9f0
> [ 3814.738897] __lock_acquire+0xcdf/0xf80
> [ 3814.743431] lock_acquire+0x9a/0x160
> [ 3814.747651] down_write+0x40/0x110
> [ 3814.751806] unregister_shrinker+0x22/0x80
> [ 3814.756646] dm_bufio_client_destroy+0xbf/0x220
> [ 3814.761922] dm_block_manager_destroy+0x11/0x20
> [ 3814.767098] dm_pool_metadata_close+0x113/0x120
> [ 3814.772503] __pool_dec+0x96/0x120
> [ 3814.776547] pool_dtr+0x2d/0x70
> [ 3814.780574] dm_table_destroy+0x58/0x110
> [ 3814.785137] __dm_destroy+0x159/0x1b0
> [ 3814.789444] dev_remove+0xd3/0x110
> [ 3814.793482] ctl_ioctl+0x1ab/0x3b0
> [ 3814.797543] dm_ctl_ioctl+0xa/0x10
> [ 3814.801608] ksys_ioctl+0x82/0xc0
> [ 3814.805593] __x64_sys_ioctl+0x16/0x20
> [ 3814.809991] do_syscall_64+0x50/0x1f0
> [ 3814.814401] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.820226]
> other info that might help us debug this:
>
> [ 3814.828343] Chain exists of:
> shrinker_rwsem --> &ei->i_data_sem --> &pmd->root_lock
>
> [ 3814.839823] Possible unsafe locking scenario:
>
> [ 3814.845864] CPU0 CPU1
> [ 3814.850625] ---- ----
> [ 3814.855274] lock(&pmd->root_lock);
> [ 3814.858962] lock(&ei->i_data_sem);
> [ 3814.865215] lock(&pmd->root_lock);
> [ 3814.871686] lock(shrinker_rwsem);
> [ 3814.875377]
> *** DEADLOCK ***
>
> [ 3814.881412] 2 locks held by dmsetup/10207:
> [ 3814.885637] #0: ffffffff86fc05f0 (&dm_thin_pool_table.mutex){+.+.}, at: pool_dtr+0x1d/0x70
> [ 3814.894114] #1: ffff9c184c8c95d0 (&pmd->root_lock){++++}, at: dm_pool_metadata_close+0xba/0x120
> [ 3814.903068]
> stack backtrace:
> [ 3814.907544] CPU: 1 PID: 10207 Comm: dmsetup Not tainted 5.5.0-xfstests-07635-gc96dceeabf76 #1477
> [ 3814.916451] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [ 3814.925915] Call Trace:
> [ 3814.928485] dump_stack+0x71/0xa0
> [ 3814.931929] check_noncircular+0x162/0x180
> [ 3814.936144] check_prev_add+0x8e/0x9f0
> [ 3814.940010] __lock_acquire+0xcdf/0xf80
> [ 3814.943964] lock_acquire+0x9a/0x160
> [ 3814.947733] ? unregister_shrinker+0x22/0x80
> [ 3814.952123] down_write+0x40/0x110
> [ 3814.955650] ? unregister_shrinker+0x22/0x80
> [ 3814.960035] unregister_shrinker+0x22/0x80
> [ 3814.964248] dm_bufio_client_destroy+0xbf/0x220
> [ 3814.968898] dm_block_manager_destroy+0x11/0x20
> [ 3814.973551] dm_pool_metadata_close+0x113/0x120
> [ 3814.978307] __pool_dec+0x96/0x120
> [ 3814.981829] pool_dtr+0x2d/0x70
> [ 3814.985086] dm_table_destroy+0x58/0x110
> [ 3814.989156] __dm_destroy+0x159/0x1b0
> [ 3814.992939] dev_remove+0xd3/0x110
> [ 3814.996494] ? remove_all+0x30/0x30
> [ 3815.000136] ctl_ioctl+0x1ab/0x3b0
> [ 3815.003672] dm_ctl_ioctl+0xa/0x10
> [ 3815.007187] ksys_ioctl+0x82/0xc0
> [ 3815.010614] __x64_sys_ioctl+0x16/0x20
> [ 3815.014474] do_syscall_64+0x50/0x1f0
> [ 3815.018257] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3815.023440] RIP: 0033:0x7f2c9d82e427
> [ 3815.027131] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
> [ 3815.046014] RSP: 002b:00007ffee7570578 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
> [ 3815.053715] RAX: ffffffffffffffda RBX: 00007f2c9d90ca7f RCX: 00007f2c9d82e427
> [ 3815.061052] RDX: 00005625880b2b40 RSI: 00000000c138fd04 RDI: 0000000000000003
> [ 3815.068302] RBP: 00007ffee7570630 R08: 00007f2c9d95b310 R09: 00007ffee75703e0
> [ 3815.076018] R10: 00007f2c9d95a65a R11: 0000000000000202 R12: 00007f2c9d95a65a
> [ 3815.083398] R13: 00007f2c9d95a65a R14: 00007f2c9d95a65a R15: 00007f2c9d95a65a
More information about the dm-devel
mailing list