[dm-devel] [RFC PATCH] fstests: Check if a fs can survive random (emulated) power loss
Amir Goldstein
amir73il at gmail.com
Thu Mar 1 08:39:15 UTC 2018
On Thu, Mar 1, 2018 at 7:38 AM, Qu Wenruo <wqu at suse.com> wrote:
> This test case was originally designed to expose unexpected corruption
> in btrfs, where there are several reports of serious btrfs metadata
> corruption after power loss.
>
> The test case itself will trigger heavy fsstress for the fs, and use
> dm-flakey to emulate power loss by dropping all later writes.
So are you re-posting the test with dm-flakey, or converting it to
dm-log-writes?
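
For reference, the "dropping all later writes" behaviour in the dm-flakey
variant comes from a flakey table that carries the drop_writes feature. A
minimal sketch of building such a table line is below. The helper name
flakey_drop_table, the /dev/sdX path, and the 0/180 up/down intervals are
illustrative assumptions, not taken from the patch.

```shell
#!/bin/bash
# Build a dm-flakey table line that drops writes.
# Table layout, per the kernel's dm-flakey documentation:
#   <start> <len> flakey <dev> <offset> <up> <down> <nfeatures> <feature...>
# flakey_drop_table is a hypothetical helper, not part of fstests.
flakey_drop_table()
{
	local dev=$1      # backing block device, e.g. /dev/sdX (assumed)
	local sectors=$2  # device size in 512-byte sectors

	# up=0, down=180: with no "up" interval the device spends the
	# whole cycle in its unreliable state, so writes are dropped.
	echo "0 $sectors flakey $dev 0 0 180 1 drop_writes"
}

# Usage (needs root and a real device, so it is left commented out):
#   size=$(blockdev --getsz /dev/sdX)
#   dmsetup create flakey-test --table "$(flakey_drop_table /dev/sdX "$size")"
```

Reads still pass through with this table, which is why the fs can stay
mounted while writes are being discarded.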
>
> For btrfs, it should be completely fine, as long as superblock write
> (FUA write) finishes atomically, since with metadata CoW, superblock
> either points to old trees or new trees, the fs should be as atomic as
> superblock.
>
> For journal based filesystems, each metadata update should be journaled,
> so metadata operation is as atomic as journal updates.
>
> It does show that XFS is doing the best work among the tested
> filesystems (Btrfs, XFS, ext4), no kernel nor xfs_repair problem at all.
>
> For btrfs, although btrfs check doesn't report any problem, the kernel
> reports some data checksum errors, which is a little unexpected as data
> is CoWed by default, which should be as atomic as superblock.
> (Unfortunately, this is still not the exact problem I'm chasing.)
>
> For EXT4, the kernel is fine, but a later e2fsck reports problems, which
> may indicate there is still something to be improved.
>
> Signed-off-by: Qu Wenruo <wqu at suse.com>
> ---
> tests/generic/479 | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/generic/479.out | 2 +
> tests/generic/group | 1 +
> 3 files changed, 112 insertions(+)
> create mode 100755 tests/generic/479
> create mode 100644 tests/generic/479.out
>
> diff --git a/tests/generic/479 b/tests/generic/479
> new file mode 100755
> index 00000000..ab530231
> --- /dev/null
> +++ b/tests/generic/479
> @@ -0,0 +1,109 @@
> +#! /bin/bash
> +# FS QA Test 479
> +#
> +# Test if a filesystem can survive emulated powerloss.
> +#
> +# No matter what the solution a filesystem uses (journal or CoW),
> +# it should survive unexpected powerloss, without major metadata
> +# corruption.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2018 SuSE. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> + ps -e | grep fsstress > /dev/null 2>&1
> + while [ $? -eq 0 ]; do
> + $KILLALL_PROG -KILL fsstress > /dev/null 2>&1
> + wait > /dev/null 2>&1
> + ps -e | grep fsstress > /dev/null 2>&1
> + done
> + _unmount_flakey &> /dev/null
> + _cleanup_flakey
> + cd /
> + rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmflakey
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_dm_target flakey
> +_require_command "$KILLALL_PROG" "killall"
> +
> +runtime=$(($TIME_FACTOR * 15))
> +loops=$(($LOAD_FACTOR * 4))
> +
> +for i in $(seq -w $loops); do
> + echo "=== Loop $i: $(date) ===" >> $seqres.full
> +
> + _scratch_mkfs >/dev/null 2>&1
> + _init_flakey
> + _mount_flakey
> +
> + ($FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n 1000000 \
> + -p 100 >> $seqres.full &) > /dev/null 2>&1
> +
> + sleep $runtime
> +
> + # Here we only want to drop all writes; no need to unmount the fs
> + _load_flakey_table $FLAKEY_DROP_WRITES
> +
> + ps -e | grep fsstress > /dev/null 2>&1
> + while [ $? -eq 0 ]; do
> + $KILLALL_PROG -KILL fsstress > /dev/null 2>&1
> + wait > /dev/null 2>&1
> + ps -e | grep fsstress > /dev/null 2>&1
> + done
> +
> + _unmount_flakey
> + _cleanup_flakey
> +
> + # Mount the fs to do proper log replay for journal-based filesystems,
> + # so the later check won't report an annoying dirty log and only
> + # reports real problems.
> + _scratch_mount
> + _scratch_unmount
> +
> + _check_scratch_fs
> +done
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/479.out b/tests/generic/479.out
> new file mode 100644
> index 00000000..290f18b3
> --- /dev/null
> +++ b/tests/generic/479.out
> @@ -0,0 +1,2 @@
> +QA output created by 479
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index 1e808865..5ce3db1d 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -481,3 +481,4 @@
> 476 auto rw
> 477 auto quick exportfs
> 478 auto quick
> +479 auto
+ stress