[dm-devel] dm raid1: "mirror" target doesn't use all available legs on multiple failures
Heinz Mauelshagen
heinzm at redhat.com
Tue Oct 11 14:56:47 UTC 2016
On 10/10/2016 10:41 PM, Mike Snitzer wrote:
> On Mon, Oct 10 2016 at 12:48pm -0400,
> Heinz Mauelshagen <heinzm at redhat.com> wrote:
>
>> In case legs of a "mirror" target fail, any read will cause a new,
>> operational default leg to be selected and the read be resubmitted
>> to it. If that new default leg fails the read too, no other still
>> accessible legs are used to resubmit the read again thus failing
>> the io.
>>
>> Fix by allowing the read to get resubmitted until there's no
>> operational legs any more.
>>
>> Resolves: rhbz1383444
>>
>> Signed-off-by: Heinz Mauelshagen <heinzm at redhat.com>
> Nothing seems to be checking bio_record->details.bi_bdev anymore.
> (The one you've removed really seems like a complete hack to begin with)
The very point of removing "if (!bio_record->details.bi_bdev) { ..." is
to allow for resubmission of read bios to any other still operational leg
after an initial resubmission failed.
do_reads() will conditionally error the bio in case there are none or
no in sync ones available.
The "mirror" processing for read bios is:
1) bio gets remapped to the default mirror in mirror_map() if region is
in sync our queued to the worker for submission
2) mirror_end_io() spots read error and requeues to the worker
3) worker
- submits bio to new, selected default if respective region is in sync
or
- fails bio if it aims at a non synchronized region
4) if not already failed in do_reads(),
mirror_end_io() processes like in step 1 _but_
errors the read because of the "if (!bio_record->details.bi_bdev) {
..."
My patch changes step 4 by removing the conditional to resubmit it
again in case any other operational leg is available. If none's available,
mirror_end_io() errors the read bios or the do_reads error logic applies
in case of a resubmission.
Why do you think this to be a hack?
>
> Shouldn't this patch go further by removing the other 2 places that set
> bio_record->details.bi_bdev = NULL; ?
Yes, that's a good cleanup.
Let's first settle any still open issues like other unspotted side effects
and I'll do a patch v2 with it.
>
>> ---
>> drivers/md/dm-raid1.c | 11 -----------
>> 1 file changed, 11 deletions(-)
>>
>> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
>> index 7a6254d..dd31019 100644
>> --- a/drivers/md/dm-raid1.c
>> +++ b/drivers/md/dm-raid1.c
>> @@ -1266,16 +1266,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
>> goto out;
>>
>> if (unlikely(error)) {
>> - if (!bio_record->details.bi_bdev) {
>> - /*
>> - * There wasn't enough memory to record necessary
>> - * information for a retry or there was no other
>> - * mirror in-sync.
>> - */
>> - DMERR_LIMIT("Mirror read failed.");
>> - return -EIO;
>> - }
>> -
>> m = bio_record->m;
>>
>> DMERR("Mirror read failed from %s. Trying alternative device.",
>> @@ -1291,7 +1281,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
>> bd = &bio_record->details;
>>
>> dm_bio_restore(bd, bio);
>> - bio_record->details.bi_bdev = NULL;
>> bio->bi_error = 0;
>>
>> queue_bio(ms, bio, rw);
>> --
>> 2.7.4
>>
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
More information about the dm-devel
mailing list