[dm-devel] dm cache metadata: set dirty on all cache blocks after a crash
Mike Snitzer
snitzer at redhat.com
Thu Aug 9 17:54:07 UTC 2018
On Thu, Aug 09 2018 at 12:22pm -0400,
Ilya Dryomov <idryomov at gmail.com> wrote:
> Quoting Documentation/device-mapper/cache.txt:
>
> The 'dirty' state for a cache block changes far too frequently for us
> to keep updating it on the fly. So we treat it as a hint. In normal
> operation it will be written when the dm device is suspended. If the
> system crashes all cache blocks will be assumed dirty when restarted.
>
> This got broken in commit f177940a8091 ("dm cache metadata: switch to
> using the new cursor api for loading metadata") in 4.9, which removed
> the code that consulted cmd->clean_when_opened (CLEAN_SHUTDOWN on-disk
> flag) when loading cache blocks. This results in data corruption on an
> unclean shutdown with dirty cache blocks on the fast device. After the
> crash those blocks are considered clean and may get evicted from the
> cache at any time. This can be demonstrated by doing a lot of reads
> to trigger individual evictions, but uncache is more predictable:
>
> ### Disable auto-activation in lvm.conf to be able to do uncache in
> ### time (i.e. see uncache doing flushing) when the fix is applied.
>
> # xfs_io -d -c 'pwrite -b 4M -S 0xaa 0 1G' /dev/vdb
> # vgcreate vg_cache /dev/vdb /dev/vdc
> # lvcreate -L 1G -n lv_slowdev vg_cache /dev/vdb
> # lvcreate -L 512M -n lv_cachedev vg_cache /dev/vdc
> # lvcreate -L 256M -n lv_metadev vg_cache /dev/vdc
> # lvconvert --type cache-pool --cachemode writeback vg_cache/lv_cachedev --poolmetadata vg_cache/lv_metadev
> # lvconvert --type cache vg_cache/lv_slowdev --cachepool vg_cache/lv_cachedev
> # xfs_io -d -c 'pwrite -b 4M -S 0xbb 0 512M' /dev/mapper/vg_cache-lv_slowdev
> # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2
> 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> # dmsetup status vg_cache-lv_slowdev
> 0 2097152 cache 8 27/65536 128 8192/8192 1 100 0 0 0 8192 7065 2 metadata2 writeback 2 migration_threshold 2048 smq 0 rw -
>                                                           ^^^^
> 7065 * 64k = 441M yet to be written to the slow device
> # echo b >/proc/sysrq-trigger
>
> # vgchange -ay vg_cache
> # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2
> 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> # lvconvert --uncache vg_cache/lv_slowdev
> Flushing 0 blocks for cache vg_cache/lv_slowdev.
> Logical volume "lv_cachedev" successfully removed
> Logical volume vg_cache/lv_slowdev is not cached.
> # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2
> 0fe00000: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................
> 0fe00010: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................
>
> This is the case with both v1 and v2 cache pool metadata formats.
>
> After applying this patch:
>
> # vgchange -ay vg_cache
> # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2
> 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> # lvconvert --uncache vg_cache/lv_slowdev
> Flushing 3724 blocks for cache vg_cache/lv_slowdev.
> ...
> Flushing 71 blocks for cache vg_cache/lv_slowdev.
> Logical volume "lv_cachedev" successfully removed
> Logical volume vg_cache/lv_slowdev is not cached.
> # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2
> 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
>
> Cc: stable at vger.kernel.org
> Fixes: f177940a8091 ("dm cache metadata: switch to using the new cursor api for loading metadata")
> Signed-off-by: Ilya Dryomov <idryomov at gmail.com>
I staged this earlier today for 4.19 inclusion, please see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.19&id=5b1fe7bec8a8d0cc547a22e7ddc2bd59acd67de4
Thanks so much for your thorough work on this. Very well done!
Mike
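
The rule quoted from cache.txt (the on-disk dirty bit is only a hint, and every cache block is assumed dirty after a crash) can be illustrated with a small model. This is a minimal sketch, not the kernel code: only the name clean_when_opened comes from the patch description, while Mapping, dirty_hint and load_mappings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Mapping:
    """Hypothetical in-memory view of one cache block mapping."""
    cblock: int       # block index on the fast (cache) device
    oblock: int       # block index on the slow (origin) device
    dirty_hint: bool  # dirty bit as last written to the metadata


def load_mappings(mappings: List[Mapping],
                  clean_when_opened: bool) -> List[Tuple[int, int, bool]]:
    """Return (cblock, oblock, dirty) triples as the cache should see them.

    The stored dirty bit is trusted only if the metadata was closed
    cleanly; otherwise every mapped block is treated as dirty so that
    it is written back before it can be evicted.
    """
    loaded = []
    for m in mappings:
        dirty = m.dirty_hint if clean_when_opened else True
        loaded.append((m.cblock, m.oblock, dirty))
    return loaded


if __name__ == "__main__":
    blocks = [Mapping(0, 10, False), Mapping(1, 11, True)]
    print(load_mappings(blocks, clean_when_opened=True))
    print(load_mappings(blocks, clean_when_opened=False))
```

In this model, skipping the clean_when_opened check is equivalent to always passing True, which matches the "Flushing 0 blocks" symptom in the reproducer above: stale hints are trusted and dirty data on the fast device is never written back.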