[dm-devel] DM-RAID1 data corruption
malahal at us.ibm.com
Wed Apr 15 03:12:10 UTC 2009
Mikulas Patocka [mpatocka at redhat.com] wrote:
> Hi
>
> because of a loose cable, overheating, insufficient power or so, and the
> condition is repaired), raid1 sees set bit in the dirty bitmap and starts
> copying data from disk 0 to disk 1.
>
> The result: write bio was ended as success, but the data was lost. For
> databases, this might have bad consequences - committed transactions being
> forgotten.
>
> -
>
> If the above scenario can't happen, pls. describe why.
IIRC, this is a known problem, always attributed to a "rare/small
window" of chance. :-(
> Delay all bios until the userspace code removes the failed mirror?
That is what the code does when a log device fails. We can use the same
approach.
> Or store the number of the default mirror in the log?
This is one way to do it but what about "corelog" mirrors?
Look at this patch:
http://permalink.gmane.org/gmane.linux.kernel.device-mapper.devel/4973
It essentially generates a uevent and waits for the user-level code to
act on it and send a message to unblock it.
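
To make the idea concrete, here is a rough userspace model of that
hold-and-unblock flow. It is only a sketch, not the code from the patch:
the class, the method names and the "unblock" message string are all
made up for illustration, and a plain list stands in for the uevent.

```python
from collections import deque


class MirrorSetModel:
    """Toy model of a mirror set that holds writes after a leg failure.

    All names here are hypothetical; this is not the dm-raid1 code.
    """

    def __init__(self):
        self.blocked = False   # True while waiting for userspace to act
        self.held = deque()    # writes deferred during the wait
        self.completed = []    # writes acknowledged to the caller
        self.events = []       # stand-in for uevents sent to userspace

    def leg_failed(self, leg):
        # Record an event for userspace and stop acknowledging writes.
        self.blocked = True
        self.events.append(("leg_failed", leg))

    def submit_write(self, bio):
        # While blocked, a write is held instead of being ended as success.
        if self.blocked:
            self.held.append(bio)
        else:
            self.completed.append(bio)

    def message(self, text):
        # Stand-in for a message from userspace; "unblock" is an assumed name.
        if text != "unblock":
            raise ValueError("unknown message: " + text)
        self.blocked = False
        # Release the held writes in the order they were submitted.
        while self.held:
            self.completed.append(self.held.popleft())
```

The point of the model is that a write submitted after the failure is
not reported as complete until userspace has dealt with the failed
mirror and sent the message, so it cannot be acknowledged and then lost
to a later resync.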