[Cluster-devel] [PATCH 2/2] gfs2: writeout truncated pages

Wed Jun 15 17:08:02 UTC 2016

Hi Ben,

Comments below.

----- Original Message -----
| When gfs2 attempts to write a page to a file that is being truncated,
| and notices that the page is completely outside of the file size, it
| tries to invalidate it.  However, this may require a transaction for
| journaled data files to revoke any buffers from the page on the active
| items list. Unfortunately, this can happen inside a log flush, where a
| transaction cannot be started. Also, gfs2 may need to be able to remove
| the buffer from the ail1 list before it can finish the log flush.
| 
| To deal with this, gfs2 now skips the check to see if the write is
| outside the file size, and simply writes it anyway. This situation can
| only occur when the truncate code still has the file locked exclusively,
| and hasn't marked this block as free in the metadata (which happens
| later in truc_dealloc).  After gfs2 writes this page out, the truncation
| code will shortly invalidate it and write out any revokes if necessary.
| 
| To do this, gfs2 now implements its own version of block_write_full_page
| without the check, and calls the newly exported __block_write_full_page.
| The check still exists in nobh_writepage, which is gfs2 calls in the
| non-journaled case. But in this case the buffers will never be in the
| journal. Thus, no revokes will need to be issued and the write can
| safely be skipped without causing any possible journal replay issues.
| 
| Signed-off-by: Benjamin Marzinski <bmarzins at redhat.com>
| ---
|  fs/gfs2/aops.c | 37 +++++++++++++++++++++++++++----------
|  1 file changed, 27 insertions(+), 10 deletions(-)
| 
| diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
| index 37b7bc1..d3a7301 100644
| --- a/fs/gfs2/aops.c
| +++ b/fs/gfs2/aops.c
| @@ -100,20 +100,11 @@ static int gfs2_writepage_common(struct page *page,
|  	struct inode *inode = page->mapping->host;
|  	struct gfs2_inode *ip = GFS2_I(inode);
|  	struct gfs2_sbd *sdp = GFS2_SB(inode);
| -	loff_t i_size = i_size_read(inode);
| -	pgoff_t end_index = i_size >> PAGE_SHIFT;
| -	unsigned offset;
|  
|  	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
|  		goto out;
|  	if (current->journal_info)
|  		goto redirty;
| -	/* Is the page fully outside i_size? (truncate in progress) */
| -	offset = i_size & (PAGE_SIZE-1);
| -	if (page->index > end_index || (page->index == end_index && !offset)) {
| -		page->mapping->a_ops->invalidatepage(page, 0, PAGE_SIZE);
| -		goto out;
| -	}

The concept in general looks good. Two things:

(1) Since this is writepage_common, does it still need to do the
check for the normal non-jdata writeback? Does it makes sense to
split this back up into two writepage functions again rather than
the common one? After all, with your patch, the function does almost
nothing anymore.

(2) Just a nit, but can you eliminate the ugly goto around the return?
In other words, rather than:

	if (current->journal_info)
		goto redirty;
	return 1;
redirty:

Just simply do:

	if (!current->journal_info)
		return 1;

Regards,

Bob Peterson
Red Hat File Systems