[Cluster-devel] [GFS2 PATCH] gfs2: revert af21ca8ed50f01c5278c5ded6dad6f05e8a5d2e4

Fri Jun 1 08:41:06 UTC 2018

Hi,

On 30/05/18 20:18, Bob Peterson wrote:
> Hi,
>
> Patch af21ca8ed5 is defective and needs to be reverted. The idea
> was to only reserve single blocks for directories, with the thought
> that they should be packed together. That's all well and good for
> creating new directories, but directories have other reasons for
> getting reservations: for example, when files are added to them
> and they need their directory hash tables expanded. For
> those purposes, one block may not be enough and the reservation
> can run out of blocks, forcing us to do block allocations without
> a reservation. We can also get into situations where we cannot
> modify a directory when we should be able to.
>
> This patch reverts the patch.
>
> Signed-off-by: Bob Peterson <rpeterso at redhat.com>
Doing allocation without a reservation is ok though, since the 
reservations are only hints and what really protects us against running 
out of blocks is the block counts on the rgrp. I forget now what the 
reason was that we couldn't add reservations for directories of more 
than one block, but I had a feeling there was a reason that we did that 
at the time.

That said, having reservations result in directories which are less 
fragments is definitely a good thing. One issue is whether it makes 
sense in case of deep directory trees which each directory itself 
contains only other directories, and indeed, not a huge number (so that 
the directories are still stuffed files). In that case we'll land up 
with a lot of fragmentation if we are not careful and simply apply a 
reservation without some knowledge of the directory size/layout.

It would be good if we could add a feature to the allocator to allocate 
contiguous extents even if it means searching a bit harder, and then to 
use that for directory hash tables so that we can at least keep those to 
a single extent where possible. The downside is that we then land up 
leaving holes where the old hash table was, which are then not easy to 
reuse...

Steve.

> ---
>   fs/gfs2/rgrp.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
> index 8be47b81011a..d7a8ed8bb30c 100644
> --- a/fs/gfs2/rgrp.c
> +++ b/fs/gfs2/rgrp.c
> @@ -1529,14 +1529,9 @@ static void rg_mblk_search(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip,
>   	u32 extlen;
>   	u32 free_blocks = rgd_free(rgd, rs);
>   	int ret;
> -	struct inode *inode = &ip->i_inode;
>   
> -	if (S_ISDIR(inode->i_mode))
> -		extlen = 1;
> -	else {
> -		extlen = max_t(u32, atomic_read(&rs->rs_sizehint), ap->target);
> -		extlen = clamp(extlen, RGRP_RSRV_MINBLKS, free_blocks);
> -	}
> +	extlen = max_t(u32, atomic_read(&rs->rs_sizehint), ap->target);
> +	extlen = clamp(extlen, RGRP_RSRV_MINBLKS, free_blocks);
>   	if ((rgd->rd_free_clone < rgd->rd_reserved) || (free_blocks < extlen))
>   		return;
>   
>
>