[dm-devel] [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

Qu Wenruo quwenruo at cn.fujitsu.com
Tue Aug 23 08:30:29 UTC 2016


Hi Lukas,

Thanks for your patch, while I am a little concerned of it, even I'm a 
newbie to flakey code.

At 08/22/2016 10:53 PM, Lukas Herbolt wrote:
> Hi Qu,
>
> Sorry for the confusion. Reading the email again and the code it seems
> that the READS are really returned as -EIO if you set the drop_writes.
> I just tested it and you are right.
>
> If I was reading the fstest correctly the flakey is created as:
> ---
> flakey: 0 409600 flakey 8:64 0 0 180 1 drop_writes
> ---
>
> I believe the READs are dropped because it does not have any flags set.
>
> ---
>         if (bio_data_dir(bio) == READ) {
>             /* If flags were specified, only corrupt those that match. */
>             if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
>                 all_corrupt_bio_flags_match(bio, fc))
>                 goto map_bio;
>             else
>                 return -EIO;
>         }
> ---
>
> with conclusion of setting:
> ---
>                 /*
>                  * Flag this bio as submitted while down.
>                  */
>                 pb->bio_submitted = true;
> ---
>
> I have quick test patch ready, but it probably broke more thing than
> fixes so I will continue on it.
> Just in case you want to test it. Diff is done again 4.8-rc1
>
> --- a/drivers/md/dm-flakey.c
> +++ b/drivers/md/dm-flakey.c
> @@ -292,6 +292,11 @@ static int flakey_map(struct dm_target *ti,
> struct bio *bio)
>                  * Map reads as normal only if corrupt_bio_byte set.
>                  */
>                 if (bio_data_dir(bio) == READ) {
> +                        /* We should retunr all READS as ok in case
> of DROP WRITES flag is set. */
> +                       if (test_bit(DROP_WRITES, &fc->flags)) {
> +                               pb->bio_submitted = false;
> +                               goto map_bio;
> +                       }

According to my personal understanding, drop_writes should:
1) Drop any write bio silently
    Just as its name
2) For read
2.1) Read out data if the range doesn't include corrupt_bio_byte

2.2) Read out corrupted data if the range contains corrupt_bio_byte.

So it seems that 2.2) is not fulfilled.

While it solves the problem I reported, I'm still concerned if it 
matches the correct/designed behavior of flakey.

Thanks,
Qu
>                         /* If flags were specified, only corrupt those
> that match. */
>                         if (fc->corrupt_bio_byte &&
> (fc->corrupt_bio_rw == READ) &&
>                             all_corrupt_bio_flags_match(bio, fc))
>
>
>
> On Mon, Aug 22, 2016 at 10:05 AM, Lukas Herbolt <lherbolt at redhat.com> wrote:
>> Hello,
>>
>> There is patch from Mike. It's part of current pull request to 4.8-rc1
>> For more details check:
>>  - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
>>  - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html
>>
>> Lukas
>>
>> On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo <quwenruo at cn.fujitsu.com> wrote:
>>> Hi, Mike and btrfs and dm guys
>>>
>>> When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
>>> always fails. With the following dmesg:
>>> ---
>>> Buffer I/O error on dev dm-0, logical block 1310704, async page read
>>> Buffer I/O error on dev dm-0, logical block 16, async page read
>>> Buffer I/O error on dev dm-0, logical block 16, async page read
>>> ---
>>>
>>> And bisect leads to the following commits:
>>> ---
>>> commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
>>> Author: Mike Snitzer <snitzer at redhat.com>
>>> Date:   Fri Jul 29 13:19:55 2016 -0400
>>>
>>>     dm flakey: error READ bios during the down_interval
>>> ---
>>>
>>> While according to the document of dm-flakey, it says that when using
>>> drop_writes feature, read bios are not affected:
>>> ---
>>>   drop_writes:
>>>         All write I/O is silently ignored.
>>>         Read I/O is handled correctly.
>>> ---
>>>
>>> If I understand the word "correctly" correctly, it should means READ I/0 is
>>> handled without problem.
>>>
>>> However with this commit, it also corrupt the read bio, leading to the test
>>> failure.
>>>
>>>
>>> At least there are two fixes available here;
>>> 1) Fix fstest scripts
>>>    The related macro is "_flakey_drop_and_remount yes", which will
>>>    check the fs during the "drop_writes" time.
>>>
>>>    Currently, only btrfs/056 calls "_flakey_drop_and_remount" with
>>>    "yes". So other test cases are not affected.
>>>
>>>    However, even we move the fsck outside of the "drop_writes" range,
>>>    although test case can pass without problem, but we will still
>>>    get a dmesg error:
>>>    "Buffer I/O error on dev dm-0, logical block 1310704, async page read"
>>>
>>> 2) Revert to flakey behavior to allow READ bio
>>>    Then everything is back to the old good days.
>>>
>>> Not sure which one is correct for current use case, as I'm not familiar with
>>> dm codes.
>>>
>>> Any idea to fix dm-flaky and keep the READ bio behavior?
>>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>
>>>
>>>
>>> --
>>> dm-devel mailing list
>>> dm-devel at redhat.com
>>> https://www.redhat.com/mailman/listinfo/dm-devel
>>
>>
>>
>> --
>> Lukas Herbolt
>> RHCE, RH436, BSc, SSc
>> Senior Technical Support Engineer
>> Global Support Services (GSS)
>> Email:    lherbolt at redhat.com
>
>
>





More information about the dm-devel mailing list