[dm-devel] [PATCH v3 00/10] dm: zoned block device support

Mike Snitzer snitzer at redhat.com
Fri May 26 02:12:12 UTC 2017


On Wed, May 17 2017 at  9:55pm -0400,
Damien Le Moal <Damien.LeMoal at wdc.com> wrote:

> Mike,
> 
> (resending using outlook as I am still having troubles reaching
> @redhat.com email domain with any other email client. My apologies
> if multiple copies of this email show up)
> 
> On 5/18/17 03:54, Mike Snitzer wrote:
> > On Tue, May 16 2017 at  4:03pm -0400,
> > Mike Snitzer <snitzer at redhat.com> wrote:
> >
> >> I see quite a few issues with this patchset (only gotten through patches
> >> 1 - 6).  I'll work through it in more detail and share my
> >> feedback/revisions tomorrow.  Mostly just cleanups, renames, etc.  But
> >> "the fun" is obviously once I get to the last patch.
> >
> > FYI, couldn't get to this like I planned.  And I'm taking some time off,
> > won't get back to this until next Tuesday (5/23).  To be clear, the
> > things I noticed in the preliminary patches were very benign, but do
> > need cleaning up.
> 
> Thank you for the review. Let me know the changes you would like to see
> and I will send an updated series.
> 
> > I have every intention of getting this reviewed and staged for 4.13.
> 
> That's great. Thanks.
> 
> > But would be useful to understand:
> > 1) who will be regression testing this target once it is merged?
> 
> Myself, Bart, and all other members of my team will be involved in
> maintaining and testing this. It is critical for us as an SMR disk
> vendor that those disks are supported correctly in Linux. So we will
> maintain and regression test all aspects of the zoned block device
> support constantly.
> 
> > 2) what is needed to test it? (I assume SMR drives?)
> 
> Yes, SMR drives, but not necessarily physical ones. We are working on
> adding ZBC support to the SCSI target (that is missing). With that, we
> are planning to create a tcm (or tcmu) driver to emulate a host-aware or
> host-managed disk for testing, with a regular disk or file as back-end
> storage. This was also requested by file system maintainers (BtrFS) to
> allow testing of zoned block device support even without physical SMR
> disks available.
> 
> Since zoned block device support spans the entire block I/O stack (from
> block layer API down to LLD, with device mapper and SCSI/libata in the
> middle) we are also starting to design new test cases for the newly
> released blktests infrastructure. This will allow automated testing,
> including device mapper targets that support zoned block devices.
> 
> Ideally, we will try to release everything for inclusion in 4.13,
> together with the device mapper support. But all the test parts may get
> spread over one or two release cycles. But again, the goal is to have a
> comprehensive automated test suite for zoned block device, similar to
> what is available for regular block devices.

Thanks for all that context, much appreciated.

I'd like to get your thoughts on replacing the first 3 patches with
something like the following patch (_not_ compile tested).

Basically I'm not interested in training DM for hypothetical zoned block
device configurations.  I only want the bare minimum that is needed to
support the dm-zoned target at this point.  The fact that dm-zoned is
"drive managed" makes for a much more narrow validation (AFAICT anyway).
I dropped the zone_sectors validation -- we can backfill it if you feel
it is important, but I wanted to float this simplified patch as is:

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5f5eae4..0e4407e 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -340,6 +340,30 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
 		return 1;
 	}
 
+	/*
+	 * If the target is mapped to zoned block device(s), check
+	 * that the zones are not partially mapped.
+	 */
+	if (bdev_zoned_model(bdev) != BLK_ZONED_NONE) {
+		unsigned int zone_sectors = bdev_zone_sectors(bdev);
+
+		if (start & (zone_sectors - 1)) {
+			DMWARN("%s: start=%llu not aligned to h/w zone size %u of %s",
+			       dm_device_name(ti->table->md),
+			       (unsigned long long)start,
+			       zone_sectors, bdevname(bdev, b));
+			return 1;
+		}
+
+		if (len & (zone_sectors - 1)) {
+			DMWARN("%s: len=%llu not aligned to h/w zone size %u of %s",
+			       dm_device_name(ti->table->md),
+			       (unsigned long long)len,
+			       zone_sectors, bdevname(bdev, b));
+			return 1;
+		}
+	}
+
 	return 0;
 }
 
@@ -456,6 +480,8 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
 		       q->limits.alignment_offset,
 		       (unsigned long long) start << SECTOR_SHIFT);
 
+	limits->zoned = blk_queue_zoned_model(q);
+
 	return 0;
 }
 
@@ -1346,6 +1372,36 @@ bool dm_table_has_no_data_devices(struct dm_table *table)
 	return true;
 }
 
+static int device_is_zoned_model(struct dm_target *ti, struct dm_dev *dev,
+				 sector_t start, sector_t len, void *data)
+{
+	struct request_queue *q = bdev_get_queue(dev->bdev);
+	enum blk_zoned_model zoned_model = *(enum blk_zoned_model *)data;
+
+	return q && blk_queue_zoned_model(q) == zoned_model;
+}
+
+static bool dm_table_supports_zoned_model(struct dm_table *t,
+					  enum blk_zoned_model zoned_model)
+{
+	struct dm_target *ti;
+	unsigned i;
+
+	for (i = 0; i < dm_table_get_num_targets(t); i++) {
+		ti = dm_table_get_target(t, i);
+
+		if (zoned_model == BLK_ZONED_HM &&
+		    !dm_target_supports_zoned_hm(ti->type))
+			return false;
+
+		if (!ti->type->iterate_devices ||
+		    !ti->type->iterate_devices(ti, device_is_zoned_model, &zoned_model))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Establish the new table's queue_limits and validate them.
  */
@@ -1355,6 +1411,7 @@ int dm_calculate_queue_limits(struct dm_table *table,
 	struct dm_target *ti;
 	struct queue_limits ti_limits;
 	unsigned i;
+	enum blk_zoned_model zoned_model = BLK_ZONED_NONE;
 
 	blk_set_stacking_limits(limits);
 
@@ -1372,6 +1429,14 @@ int dm_calculate_queue_limits(struct dm_table *table,
 		ti->type->iterate_devices(ti, dm_set_device_limits,
 					  &ti_limits);
 
+		if (zoned_model == BLK_ZONED_NONE && ti_limits.zoned != BLK_ZONED_NONE) {
+			/*
+			 * After stacking all limits, validate all devices
+			 * in table support this zoned model.
+			 */
+			zoned_model = ti_limits.zoned;
+		}
+
 		/* Set I/O hints portion of queue limits */
 		if (ti->type->io_hints)
 			ti->type->io_hints(ti, &ti_limits);
@@ -1398,6 +1463,19 @@ int dm_calculate_queue_limits(struct dm_table *table,
 			       (unsigned long long) ti->len);
 	}
 
+	/*
+	 * Verify that the zoned model, as determined before any .io_hints
+	 * override, is the same across all devices in the table.
+	 * - but if limits->zoned is not BLK_ZONED_NONE validate match for it
+	 */
+	if (limits->zoned != BLK_ZONED_NONE)
+		zoned_model = limits->zoned;
+	if (!dm_table_supports_zoned_model(table, zoned_model)) {
+		DMERR("%s: zoned model is inconsistent across all devices",
+		      dm_device_name(table->md));
+		return -EINVAL;
+	}
+
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index f4c639c..d13fcd2 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -237,6 +237,12 @@ typedef unsigned (*dm_num_write_bios_fn) (struct dm_target *ti, struct bio *bio)
 #define DM_TARGET_PASSES_INTEGRITY	0x00000020
 #define dm_target_passes_integrity(type) ((type)->features & DM_TARGET_PASSES_INTEGRITY)
 
+/*
+ * Indicates that a target supports host-managed zoned block devices.
+ */
+#define DM_TARGET_ZONED_HM		0x00000040
+#define dm_target_supports_zoned_hm(type) ((type)->features & DM_TARGET_ZONED_HM)
+
 struct dm_target {
 	struct dm_table *table;
 	struct target_type *type;
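
For illustration only, and not part of the patch: a minimal userspace sketch of the two checks the patch is aiming for, namely that a mapped area starts and ends on a zone boundary, and that every underlying device reports the same zoned model. The names zone_range_is_aligned(), all_devices_match_model() and enum zoned_model are hypothetical stand-ins, not kernel APIs. Like the patch, the alignment test uses a mask and therefore assumes the zone size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel types used in the patch. */
typedef uint64_t sector_t;

enum zoned_model { ZONED_NONE, ZONED_HA, ZONED_HM };

/*
 * True if the area described by start and len begins and ends on a zone
 * boundary. The mask trick only works when zone_sectors is a power of two,
 * so that assumption is asserted here.
 */
static bool zone_range_is_aligned(sector_t start, sector_t len,
				  unsigned int zone_sectors)
{
	sector_t mask;

	assert(zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0);
	mask = (sector_t)zone_sectors - 1;

	return (start & mask) == 0 && (len & mask) == 0;
}

/* True if every device in the array reports the expected zoned model. */
static bool all_devices_match_model(const enum zoned_model *models, size_t n,
				    enum zoned_model expected)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (models[i] != expected)
			return false;

	return true;
}
```

With 524288-sector (256 MiB) zones, for example, zone_range_is_aligned(8, 524288, 524288) is false because the start is not on a zone boundary, and a table mixing a host-managed device with a regular one fails all_devices_match_model() for ZONED_HM.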



