[dm-devel] [PATCH] Don't lose writes if errors are not handled and log fails

Takahiro Yasui tyasui at redhat.com
Mon Feb 1 17:26:14 UTC 2010


> This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
> 
> Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
> LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
> dm-raid1.
> 
> It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
> flag to handle errors there.

Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1,
we don't need to backport it to RHEL 5.5. But if not, we need to.

> Don't lose writes if errors are not handled and log fails
> 
> If the log fails and errors are not handled by dmeventd, the writes
> are successfully finished without being actually written to the device.
> 
> This code path is taken:
> do_writes:
> 	bio_list_merge(&ms->failures, &sync);
> do_failures:
> 	if (!get_valid_mirror(ms)) (false)
> 	else if (errors_handled(ms)) (false)
> 	else bio_endio(bio, 0);
> 
> The logic in do_failures is based on presuming that the write was already
> tried --- if it succeeded at least on one leg and errors are not handled,
> it is reported as success.
> 
> However, bio can be added to the failures queue without being submitted, in
> do_writes.
> 
> This patch changes it so that bios are added to the failures list only if
> errors are handled --- then, they will be held with hold_bio() called from
> do_failures.

I agree that bios should be issued by do_write() when ms->log_failures is set,
but do we need to add bios to the failures queue? As you mentioned, the failures
queue should be used to bios which are already handled. Therefore, I think
bios are better to handled directly by hold_bio() instead of adding them to
the failures queue as bios for nosync regions are done.

-	if (unlikely(ms->log_failure)) {
-		spin_lock_irq(&ms->lock);
-		bio_list_merge(&ms->failures, &sync);
-		spin_unlock_irq(&ms->lock);
-		wakeup_mirrord(ms);
-	} else
-		while ((bio = bio_list_pop(&sync)))
+	while ((bio = bio_list_pop(&sync)))
+		if (unlikely(ms->log_failure) && errors_handled(ms))
+			hold_bio(ms, bio);
+		else
 			do_write(ms, bio);

The policy to treat bios when ms->log_failures == 1 is different, but
the above code is based on the following patch.

https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html

Thanks,
Taka




More information about the dm-devel mailing list