[Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

Steven Whitehouse swhiteho at redhat.com
Fri Jun 1 08:34:05 UTC 2018


Hi,


On 30/05/18 20:24, Bob Peterson wrote:
> Hi,
>
> Before this patch, frunction gfs2_trans_add_meta called list_add
> to add a buffer to a transaction, tr_buf. Later, in the before_commit
> functions, it traversed the list in sequential order, which meant
> that they were processed in a sub-optimal order. For example, blocks
> could go out in 54321 order rather than 12345, causing media heads
> to bounce unnecessarily.
>
> This makes no difference for small IO operations, but large writes
> benefit greatly when they add lots of indirect blocks to a
> transaction, and those blocks are allocated in ascending order,
> as the block allocator tries to do.
>
> This patch changes it to list_add_tail so they are traversed (and
> therefore written back) in the same order as they are added to
> the transaction.
>
> In one of the more extreme examples, I did a test where I had
> 10 simultaneous instances of iozone, each writing 25GB of data,
> using one of our performance testing machines. The results are:
>
>                           Without the patch   With the patch
>                           -----------------   ------------------
> Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
> Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec
>
> These numbers are artificially inflated because I was also running
> with Andreas Gruenbacher's iomap-write patch set plus my rgrp
> sharing patch set in both these cases. Still, it shows a 9 percent
> performance boost for both children and parent throughput.
>
> Signed-off-by: Bob Peterson <rpeterso at redhat.com>
That makes sense assuming that a transaction will always add the blocks 
in ascending order. Writing into the journal will not be affected by 
this, since that will always be sequential anyway, so I assume that you 
are referring to AIL writeback here?

It might make more sense just to sort the blocks into order at journal 
flush time, since then we'll always be in order, no matter what order 
the blocks are added during the transactions,

Steve.

> ---
>   fs/gfs2/trans.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
> index c75cacaa349b..c4c00b90935c 100644
> --- a/fs/gfs2/trans.c
> +++ b/fs/gfs2/trans.c
> @@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
>   	gfs2_pin(sdp, bd->bd_bh);
>   	mh->__pad0 = cpu_to_be64(0);
>   	mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
> -	list_add(&bd->bd_list, &tr->tr_buf);
> +	list_add_tail(&bd->bd_list, &tr->tr_buf);
>   	tr->tr_num_buf_new++;
>   out_unlock:
>   	gfs2_log_unlock(sdp);
>




More information about the Cluster-devel mailing list