[Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure

Bob Peterson rpeterso at redhat.com
Tue Jul 30 15:42:53 UTC 2013


Hi,

----- Original Message -----
| On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
| > Hi,
| > 
| > This patch adds one line of code that deletes a block reservation
| > structure for the source directory in the event that the inode creation
| > operation fails. If the inode creation succeeds, the reservation will
| > be deleted anyway, since directory reservations are now only 1 block.
| > 
| Why would we want to do that? If the creation has failed then that gives
| us no information about whether further allocations are likely to be
| made for that directory,

It's hard to explain, but it has to do with keeping the in-memory
bitmaps as unfragmented as possible so that we don't slow down file
block allocations with tons of unnecessary reservation structures to
search through. Directory reservations are only a single block anyway,
and in the case where a new inode is created successfully, the block
reservation is deleted immediately afterward. The reason we do this is
to keep the bitmaps as tightly packed as possible so that file
allocations are given priority. Otherwise we spend a huge amount of
time rejecting perfectly free blocks because of stale reservations left
around for directories, since directories are cached rather than closed
the way files are.

For details, see:
http://git.kernel.org/cgit/linux/kernel/git/steve/gfs2-3.0-nmw.git/commit/fs/gfs2?id=af21ca8ed50f01c5278c5ded6dad6f05e8a5d2e4
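To make the cost concrete, here is a toy userspace model of the problem
(my sketch only; the structure and names below are made up for
illustration and are far simpler than the real rgrp reservation tree).
Every stale single-block reservation is one more free block the
allocator has to reject before it can allocate:

#include <stdbool.h>
#include <stdio.h>

#define NBLOCKS	64
#define NRES	8

static bool allocated[NBLOCKS];

struct toy_rs {			/* stand-in for a block reservation */
	int start, len;		/* len == 1 for directories */
	bool active;
};

static struct toy_rs res[NRES];

static bool reserved(int b)
{
	for (int i = 0; i < NRES; i++)
		if (res[i].active && b >= res[i].start &&
		    b < res[i].start + res[i].len)
			return true;
	return false;
}

/* Linear scan for a free block; each stale reservation is another
 * candidate we must reject before we can allocate. */
static int alloc_block(int *rejected)
{
	for (int b = 0; b < NBLOCKS; b++) {
		if (allocated[b])
			continue;
		if (reserved(b)) {
			(*rejected)++;
			continue;
		}
		allocated[b] = true;
		return b;
	}
	return -1;
}

int main(void)
{
	int rejected = 0;

	/* Four single-block directory reservations left over from
	 * failed creates, cluttering the front of the bitmap: */
	for (int i = 0; i < 4; i++)
		res[i] = (struct toy_rs){ .start = i, .len = 1,
					  .active = true };

	int b = alloc_block(&rejected);
	printf("got block %d after rejecting %d free blocks\n",
	       b, rejected);
	return 0;
}

Multiply that by thousands of cached directories and the scan cost adds
up quickly.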

In the unsuccessful case, however, today's code leaves the directory's
single-block reservation structure sitting in memory, fragmenting the
bitmap and adding clutter for the block allocator to wade through when
searching for free blocks, just as we had before the aforementioned
patch.
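The change itself is just the one line. Roughly speaking, in the
failure path of gfs2_create_inode() in fs/gfs2/inode.c (a sketch from
memory, assuming the gfs2_rs_delete() helper from the reservation
patches and that dip is the parent directory inode; the exact label and
context may differ):

 fail:
+	gfs2_rs_delete(dip); /* drop the dir's single-block reservation */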

It seems pointless to leave the reservation around speculatively in the
hope of future dinode allocations for that directory. That goes double
in the failure case, since a second attempt seems likely to fail for
the same reason this one did.

Regards,

Bob Peterson
Red Hat File Systems




More information about the Cluster-devel mailing list