[dm-devel] dm-snap deadlock in pending_complete()
NeilBrown
neilb at suse.com
Thu Aug 13 01:23:16 UTC 2015
On Wed, 12 Aug 2015 10:16:17 -0400 (EDT) Mikulas Patocka
<mpatocka at redhat.com> wrote:
>
>
> On Wed, 12 Aug 2015, NeilBrown wrote:
>
> > Ahh - thanks for the explanation. That makes sense.
> > Presumably you need to wait for old reads that you don't have a read
> > returning old data after a write has confirmed the new data is written?
> > Is there some other reason?
> >
> > As soon as the new "proper" exception is installed, no new reads will
> > use the old address, so it should be safe to wait without holding the
> > lock.
> >
> > So what about this?
>
> This is wrong too - when you add the exception to the complete hash table
> and drop the lock, writes to the chunk in the origin target are can modify
> the origin volume. These writes race with pending reads to the snapshot
> (__chunk_is_tracked tests for those reads). If the I/O scheduler schedules
> the write to the origin before the read from the snapshot, the read from
> the snapshot returns corrupted data.
>
> There can be more problems with snapshot merging - if the chunk is on the
> complete hash table, the merging code assumes that it can be safely
> merged.
>
Thanks again.
I came up with the following which should address the origin-write
issue, but I'm not so sure about snapshot merging, and it is becoming a
lot less simple that I had hoped.
I'll see if I can come up with code for a more general solution that is
still localised to dm - along the lines of my bio_split suggestion.
Thanks,
NeilBrown
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index 7c82d3ccce87..7f9e1b0429c8 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1461,14 +1461,22 @@ static void pending_complete(struct dm_snap_pending_exception *pe, int success)
goto out;
}
- /* Check for conflicting reads */
- __check_for_conflicting_io(s, pe->e.old_chunk);
+ /* Add proper exception. Now all reads will be
+ * redirected, so no new reads can be started on
+ * 'old_chunk'.
+ */
+ dm_insert_exception(&s->complete, e);
+
+ /* Drain conflicting reads */
+ if (__chunk_is_tracked(s, pe->e.old_chunk)) {
+ up_write(&s->lock);
+ __check_for_conflicting_io(s, pe->e.old_chunk);
+ down_write(&s->lock);
+ }
/*
- * Add a proper exception, and remove the
- * in-flight exception from the list.
+ * And now we can remove the temporary exception.
*/
- dm_insert_exception(&s->complete, e);
out:
dm_remove_exception(&pe->e);
@@ -2089,17 +2097,17 @@ static int __origin_write(struct list_head *snapshots, sector_t sector,
*/
chunk = sector_to_chunk(snap->store, sector);
- /*
- * Check exception table to see if block
- * is already remapped in this snapshot
- * and trigger an exception if not.
- */
- e = dm_lookup_exception(&snap->complete, chunk);
- if (e)
- goto next_snapshot;
-
pe = __lookup_pending_exception(snap, chunk);
if (!pe) {
+ /*
+ * Check exception table to see if block
+ * is already remapped in this snapshot
+ * and trigger an exception if not.
+ */
+ e = dm_lookup_exception(&snap->complete, chunk);
+ if (e)
+ goto next_snapshot;
+
up_write(&snap->lock);
pe = alloc_pending_exception(snap);
down_write(&snap->lock);
More information about the dm-devel
mailing list