[dm-devel] [PATCH] dm: call remove_single_exception_chunk before commit_merge
Mikulas Patocka
mpatocka at redhat.com
Wed Mar 22 20:15:08 UTC 2023
Hi
I don't feel strong need to fix it.
You claim that if you corrupt the snapshot metadata, there is a small race
condition where the corruption may not be reported to the user.
If you corrupt the snapshot metadata, you obviously get corrupted result
of the merge. "garbage in - garbage out". I don't think that we need to
improve handling of corrupted metadata.
I wrote this code 13 years ago, then I forgot the details about it, and I
don't feel confident touching it unless there is some strong reason.
Mikulas
On Fri, 10 Mar 2023, Jiazi.Li wrote:
> Assume that the metadata of cow on the disk is corrupted after init
> for some reason:
> old chunk-id new chunk-id
> 0 2
> ...
> x ---> 0 y
> After starting merge, old chunk 0 will be updated twice, and old
> chunk x will not be updated.
> And dm-snap will print err log after merge new chunk 2 to old chunk 0:
>
> <3>[ 731.921642] (1)[4092:kworker/1:0]device-mapper: snapshots:
> Corruption detected: exception for block 0 is on disk but not in memory
> then set snap->merge_failed to true.
>
> If userspace use "sectors_allocated == metadata_sectors" to determine
> whether the merge is completed, there maybe the following race that
> makes the userspace unable to know merge fail event:
>
> kernel merge kworker userspace process
> merge_callback
> ->commit_merge
> get snapshot_status by ioctl
> ->remove_single_exception_chunk
> set merge_failed to true
> think merge has been completed,
> switch device to another target
>
> Could we call remove_single_exception_chunk first to solve this race?
>
> Signed-off-by: Jiazi.Li <jiazi.li at transsion.com>
> ---
> drivers/md/dm-snap.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
> index f766c21408f1..f658d05752f2 100644
> --- a/drivers/md/dm-snap.c
> +++ b/drivers/md/dm-snap.c
> @@ -1141,15 +1141,15 @@ static void merge_callback(int read_err, unsigned long write_err, void *context)
> goto shut;
> }
>
> + if (remove_single_exception_chunk(s) < 0)
> + goto shut;
> +
> if (s->store->type->commit_merge(s->store,
> s->num_merging_chunks) < 0) {
> DMERR("Write error in exception store: shutting down merge");
> goto shut;
> }
>
> - if (remove_single_exception_chunk(s) < 0)
> - goto shut;
> -
> snapshot_merge_next_chunks(s);
>
> return;
> --
> 2.17.1
>
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
>
More information about the dm-devel
mailing list