[dm-devel] [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space

Mike Snitzer snitzer at redhat.com
Thu Apr 14 15:18:28 UTC 2016


On Tue, Apr 12 2016 at  8:12pm -0400,
Darrick J. Wong <darrick.wong at oracle.com> wrote:

> On Tue, Apr 12, 2016 at 05:04:27PM -0400, Mike Snitzer wrote:
> > On Tue, Apr 12 2016 at  4:39pm -0400,
> > Darrick J. Wong <darrick.wong at oracle.com> wrote:
> > 
> > > On Tue, Apr 12, 2016 at 04:04:59PM -0400, Mike Snitzer wrote:
> > > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > > index 5a2c3ab..b34c07b 100644
> > > > --- a/fs/block_dev.c
> > > > +++ b/fs/block_dev.c
> > > > @@ -1801,17 +1801,13 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > > >  	struct request_queue *q = bdev_get_queue(bdev);
> > > >  	struct address_space *mapping;
> > > >  	loff_t end = start + len - 1;
> > > > -	loff_t bs_mask, isize;
> > > > +	loff_t isize;
> > > >  	int error;
> > > >  
> > > >  	/* We only support zero range and punch hole. */
> > > >  	if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> > > >  		return -EOPNOTSUPP;
> > > >  
> > > > -	/* We haven't a primitive for "ensure space exists" right now. */
> > > > -	if (!(mode & ~FALLOC_FL_KEEP_SIZE))
> > > > -		return -EOPNOTSUPP;
> > > > -
> > > >  	/* Only punch if the device can do zeroing discard. */
> > > >  	if ((mode & FALLOC_FL_PUNCH_HOLE) &&
> > > >  	    (!blk_queue_discard(q) || !q->limits.discard_zeroes_data))
> > > > @@ -1829,9 +1825,12 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > > >  			return -EINVAL;
> > > >  	}
> > > >  
> > > > -	/* Don't allow IO that isn't aligned to logical block size */
> > > > -	bs_mask = bdev_logical_block_size(bdev) - 1;
> > > > -	if ((start | len) & bs_mask)
> > > > +	/*
> > > > +	 * Don't allow IO that isn't aligned to minimum IO size (io_min)
> > > > +	 * - for normal device's io_min is usually logical block size
> > > > +	 * - but for more exotic devices (e.g. DM thinp) it may be larger
> > > > +	 */
> > > > +	if ((start | len) % bdev_io_min(bdev))
> 
> I started by noticing the 64-bit division.

Oops, yeah good point.  I did said my patch was untested (didn't mention
that it wasn't even compile tested.. RFC and all) ;)

> However, in researching alignment
> requirements for fallocate, I noticed that nothing says that we can return
> -EINVAL for unaligned offset/len for allocate or punch.  For file allocations
> ext4 and xfs simply enlarge the range so that the ends are aligned to the
> logical block size; for punch they both shrink the range to deallocate until
> the ends are aligned, and write zeroes to the partial blocks.
> 
> At least for user-visible fallocate we should do likewise, but for the internal
> blkdev_ helpers I think it makes more sense to check lbs alignment and let the
> lower level driver reject the IO if min_io alignment is a hard requirement.
> Documentation/block/queue-sysfs.txt says that the min_io is the smallest
> /preferred/ size.

Thinking about this all further.  Alignment on allocation isn't a big
deal for thinp.  If the extent requested isn't properly aligned we'll
still do the right thing (which is to round-up and allocate a block at
the beginning and/or end to fulfill the request).

As for discard, DM-thinp silently drops the discard of the beginning
and/or end that isn't aligned on a thinp blocksize boundary -- that is
DM thinp doesn't unmap the corresponding thinp block because the partial
block is still considered used.  But thinp still passes down the
appropriate discard for that subset of the still-mapped thinp block to
the underlying storage (if discard passdown in enabled on the DM
thin-pool).
 
> But, before that, I'll push out some new fallocate patches for -rc3.
> 
> > > >  		return -EINVAL;
> > > 
> > > Noted.  Will update the original patch.
> > 
> > BTW, I just noticed your "block: require write_same and discard requests
> > align to logical block size" -- doesn't look right.
> 
> What happens if we pass a request to thinp that isn't aligned to
> minimum_io_size?  Does it reject the command?

I hope I answered that above just now.

SO.. it seems we can avoid the mess of worrying about minimum_io_size
alignment (both for this fallocate interface and discard).




More information about the dm-devel mailing list