rpms/kernel/F-10 linux-2.6-ext4-extent-header-check-fix.patch, NONE, 1.1 linux-2.6-ext4-flush-on-close.patch, NONE, 1.1 kernel.spec, 1.1288, 1.1289

Chuck Ebbert cebbert at fedoraproject.org
Fri Mar 13 02:17:51 UTC 2009


Author: cebbert

Update of /cvs/pkgs/rpms/kernel/F-10
In directory cvs1.fedora.phx.redhat.com:/tmp/cvs-serv2314

Modified Files:
	kernel.spec 
Added Files:
	linux-2.6-ext4-extent-header-check-fix.patch 
	linux-2.6-ext4-flush-on-close.patch 
Log Message:
Copy ext4 fixes from rawhide:
  linux-2.6-ext4-extent-header-check-fix.patch
  linux-2.6-ext4-flush-on-close.patch

linux-2.6-ext4-extent-header-check-fix.patch:

--- NEW FILE linux-2.6-ext4-extent-header-check-fix.patch ---
This should resolve kernel.org bugzilla 12821

I've not actually crafted a workload to exercise this code; 
this is from inspection...

The ext4_ext_search_right() function is confusing; it uses a
"depth" variable which is 0 at the root and maximum at the leaves, 
but the on-disk metadata uses a "depth" (actually eh_depth) which
is opposite: maximum at the root, and 0 at the leaves.

The ext4_ext_check_header() function is given a depth and checks
the header against that depth; it expects the on-disk semantics,
but we are giving it the opposite in the while loop in this 
function.  We should be giving it the on-disk notion of "depth"
which we can get from (p_depth - depth) - and if you look, the last
(more commonly hit) call to ext4_ext_check_header() does just this.

Sending in the wrong depth results in (incorrect) messages
about corruption:

EXT4-fs error (device sdb1): ext4_ext_search_right: bad header
in inode #2621457: unexpected eh_depth - magic f30a, entries 340,
max 340(0), depth 1(2)
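
The two depth conventions described above can be sketched in a few lines of
userspace C.  This is only an illustrative model, not kernel code: the names
expected_eh_depth(), tree_depth and level are hypothetical, with tree_depth
standing in for path->p_depth and level for the function's local "depth".

```c
#include <assert.h>

/*
 * Illustrative userspace model, not kernel code.
 *
 * tree_depth stands in for path->p_depth (the eh_depth stored in the root).
 * level counts downward from the root: 0 at the root, tree_depth at the
 * leaves, like the local "depth" variable in ext4_ext_search_right().
 */
static int expected_eh_depth(int tree_depth, int level)
{
	assert(level >= 0 && level <= tree_depth);

	/* On-disk eh_depth runs the other way: tree_depth at the root, 0 at the leaves. */
	return tree_depth - level;
}
```

With tree_depth = 2, a block one level below the root should carry eh_depth 1,
and a leaf two levels down should carry eh_depth 0; passing the raw level (2)
for the leaf is the kind of mismatch that produces the message above.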

Reported-by: David Dindorp <ddi at dubex.dk>
Signed-off-by: Eric Sandeen <sandeen at redhat.com>
--

Index: linux-2.6/fs/ext4/extents.c
===================================================================
--- linux-2.6.orig/fs/ext4/extents.c
+++ linux-2.6/fs/ext4/extents.c
@@ -1122,7 +1122,8 @@ ext4_ext_search_right(struct inode *inod
 	struct ext4_extent_idx *ix;
 	struct ext4_extent *ex;
 	ext4_fsblk_t block;
-	int depth, ee_len;
+	int depth;	/* Note, NOT eh_depth; depth from top of tree */
+	int ee_len;
 
 	BUG_ON(path == NULL);
 	depth = path->p_depth;
@@ -1179,7 +1180,8 @@ got_index:
 		if (bh == NULL)
 			return -EIO;
 		eh = ext_block_hdr(bh);
-		if (ext4_ext_check_header(inode, eh, depth)) {
+		/* subtract from p_depth to get proper eh_depth */
+		if (ext4_ext_check_header(inode, eh, path->p_depth - depth)) {
 			put_bh(bh);
 			return -EIO;
 		}



linux-2.6-ext4-flush-on-close.patch:

--- NEW FILE linux-2.6-ext4-flush-on-close.patch ---
From: Theodore Ts'o <tytso at mit.edu>
Date: Thu, 26 Feb 2009 06:04:07 +0000 (-0500)
Subject: ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftytso%2Fext4.git;a=commitdiff_plain;h=3bf3342f394d72ed2ec7e77b5b39e1b50fad8284

ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl

Add an ioctl which forces all of the delay allocated blocks to be
allocated.  This also provides a function ext4_alloc_da_blocks() which
will be used by the following commits to force files to be fully
allocated to preserve application-expected ext3 behaviour.

XXX ERS: actual ioctl removed, not needed for our purposes at this time

Signed-off-by: "Theodore Ts'o" <tytso at mit.edu>
---


From: Theodore Ts'o <tytso at mit.edu>
Date: Tue, 24 Feb 2009 13:21:14 +0000 (-0500)
Subject: ext4: Automatically allocate delay allocated blocks on close
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftytso%2Fext4.git;a=commitdiff_plain;h=6645f8c3bc3cdaa7de4aaa3d34d40c2e8e5f09ae

ext4: Automatically allocate delay allocated blocks on close

When closing a file that had been previously truncated, force any
delay allocated blocks to be allocated so that if the filesystem
is mounted with data=ordered, the data blocks will be pushed out to
disk along with the journal commit.  Many application programs expect
this, so we do this to avoid zero length files if the system crashes
unexpectedly.

Signed-off-by: "Theodore Ts'o" <tytso at mit.edu>
---
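
The truncate-then-close behaviour described above can be modelled in
userspace as a small state machine.  This is a hedged sketch only: struct
model_inode and the model_*() helpers are hypothetical stand-ins for the
kernel's inode state, ext4_truncate(), ext4_alloc_da_blocks() and
ext4_release_file(), and the "flush" is just a counter.

```c
#include <assert.h>

#define STATE_DA_ALLOC_CLOSE 0x10u	/* mirrors EXT4_STATE_DA_ALLOC_CLOSE */

/* Hypothetical stand-in for the per-inode state the patch touches. */
struct model_inode {
	unsigned long size;		/* file size in bytes */
	unsigned int state;		/* state flags */
	unsigned int reserved_blocks;	/* delayed-allocation reservations */
	unsigned int flushes;		/* number of flushes started */
};

/* Truncating to zero length arms the allocate-on-close flag. */
static void model_truncate(struct model_inode *ino, unsigned long new_size)
{
	assert(ino);
	ino->size = new_size;
	if (ino->size == 0)
		ino->state |= STATE_DA_ALLOC_CLOSE;
}

/* A buffered write only reserves space; nothing is allocated yet. */
static void model_write(struct model_inode *ino, unsigned long bytes)
{
	assert(ino);
	ino->size += bytes;
	ino->reserved_blocks++;
}

/* Start a flush only if there is something reserved to allocate. */
static int model_alloc_da_blocks(struct model_inode *ino)
{
	assert(ino);
	if (!ino->reserved_blocks)
		return 0;
	ino->reserved_blocks = 0;
	ino->flushes++;
	return 0;
}

/* On close: allocate if the flag is armed, then disarm it. */
static void model_release(struct model_inode *ino)
{
	assert(ino);
	if (ino->state & STATE_DA_ALLOC_CLOSE) {
		model_alloc_da_blocks(ino);
		ino->state &= ~STATE_DA_ALLOC_CLOSE;
	}
}
```

In this model a truncate to zero followed by a write and a close starts one
flush, while a later write and close on the same inode (with no intervening
truncate) does not.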

From: Theodore Ts'o <tytso at mit.edu>
Date: Tue, 24 Feb 2009 04:05:27 +0000 (-0500)
Subject: ext4: Automatically allocate delay allocated blocks on rename
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftytso%2Fext4.git;a=commitdiff_plain;h=dbc85aa9f11d8c13c15527d43a3def8d7beffdc8

ext4: Automatically allocate delay allocated blocks on rename

When renaming a file such that a link to another inode is overwritten,
force any delay allocated blocks to be allocated so that if the
filesystem is mounted with data=ordered, the data blocks will be
pushed out to disk along with the journal commit.  Many application
programs expect this, so we do this to avoid zero length files if the
system crashes unexpectedly.

Signed-off-by: "Theodore Ts'o" <tytso at mit.edu>
---
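
For context, the application pattern that the rename change is meant to
protect looks roughly like the sketch below: write the new contents to a
temporary file, then rename it over the old one.  The function name
replace_file() and its parameters are hypothetical.  The do_fsync argument is
included because calling fsync() before rename() is the portable way to get
the data on disk; the patch only covers programs that skip that step.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>	/* rename() */
#include <unistd.h>

/*
 * Replace the file at "path" with "len" bytes of "data" by writing to
 * "tmp_path" and renaming it into place.  Returns 0 on success, -1 on
 * error.  Hypothetical helper for illustration; EINTR is not retried.
 */
static int replace_file(const char *path, const char *tmp_path,
			const char *data, size_t len, int do_fsync)
{
	ssize_t n;
	int fd;

	assert(path && tmp_path && data);

	fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	while (len > 0) {
		n = write(fd, data, len);
		if (n <= 0)
			goto fail;
		data += n;
		len -= (size_t)n;
	}

	/* Portable applications should flush the data before the rename. */
	if (do_fsync && fsync(fd) < 0)
		goto fail;

	if (close(fd) < 0) {
		unlink(tmp_path);
		return -1;
	}

	/* This rename overwrites the link to the old inode. */
	if (rename(tmp_path, path) < 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;

fail:
	close(fd);
	unlink(tmp_path);
	return -1;
}
```

Called with do_fsync = 0, this is the case the commit message describes: the
new data may still be delay allocated when the rename is journalled.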


Index: linux-2.6.28.noarch/fs/ext4/ext4.h
===================================================================
--- linux-2.6.28.noarch.orig/fs/ext4/ext4.h
+++ linux-2.6.28.noarch/fs/ext4/ext4.h
@@ -255,6 +255,7 @@ struct flex_groups {
 #define EXT4_STATE_NEW			0x00000002 /* inode is newly created */
 #define EXT4_STATE_XATTR		0x00000004 /* has in-inode xattrs */
 #define EXT4_STATE_NO_EXPAND		0x00000008 /* No space for expansion */
+#define EXT4_STATE_DA_ALLOC_CLOSE	0x00000010 /* Alloc DA blks on close */
 
 /* Used to pass group descriptor data when online resize is done */
 struct ext4_new_group_input {
@@ -1091,6 +1092,7 @@ extern int ext4_can_truncate(struct inod
 extern void ext4_truncate(struct inode *);
 extern void ext4_set_inode_flags(struct inode *);
 extern void ext4_get_inode_flags(struct ext4_inode_info *);
+extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_meta_trans_blocks(struct inode *, int nrblocks, int idxblocks);
Index: linux-2.6.28.noarch/fs/ext4/inode.c
===================================================================
--- linux-2.6.28.noarch.orig/fs/ext4/inode.c
+++ linux-2.6.28.noarch/fs/ext4/inode.c
@@ -2816,6 +2816,48 @@ out:
 	return;
 }
 
+/*
+ * Force all delayed allocation blocks to be allocated for a given inode.
+ */
+int ext4_alloc_da_blocks(struct inode *inode)
+{
+	if (!EXT4_I(inode)->i_reserved_data_blocks &&
+	    !EXT4_I(inode)->i_reserved_meta_blocks)
+		return 0;
+
+	/*
+	 * We do something simple for now.  The filemap_flush() will
+	 * also start triggering a write of the data blocks, which is
+	 * not strictly speaking necessary (and for users of
+	 * laptop_mode, not even desirable).  However, to do otherwise
+	 * would require replicating code paths in:
+	 * 
+	 * ext4_da_writepages() ->
+	 *    write_cache_pages() ---> (via passed in callback function)
+	 *        __mpage_da_writepage() -->
+	 *           mpage_add_bh_to_extent()
+	 *           mpage_da_map_blocks()
+	 *
+	 * The problem is that write_cache_pages(), located in
+	 * mm/page-writeback.c, marks pages clean in preparation for
+	 * doing I/O, which is not desirable if we're not planning on
+	 * doing I/O at all.
+	 *
+	 * We could call write_cache_pages(), and then redirty all of
+	 * the pages by calling redirty_page_for_writeback() but that
+	 * would be ugly in the extreme.  So instead we would need to
+	 * replicate parts of the code in the above functions,
+	 * simplifying them because we wouldn't actually intend to
+	 * write out the pages, but rather only collect contiguous
+	 * logical block extents, call the multi-block allocator, and
+	 * then update the buffer heads with the block allocations.
+	 * 
+	 * For now, though, we'll cheat by calling filemap_flush(),
+	 * which will map the blocks, and start the I/O, but not
+	 * actually wait for the I/O to complete.
+	 */
+	return filemap_flush(inode->i_mapping);
+}
 
 /*
  * bmap() is special.  It gets used by applications such as lilo and by
@@ -3838,6 +3880,9 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;
 
+	if (inode->i_size == 0)
+		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;
+
 	if (EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) {
 		ext4_ext_truncate(inode);
 		return;
Index: linux-2.6.28.noarch/fs/ext4/file.c
===================================================================
--- linux-2.6.28.noarch.orig/fs/ext4/file.c
+++ linux-2.6.28.noarch/fs/ext4/file.c
@@ -33,6 +33,10 @@
  */
 static int ext4_release_file(struct inode *inode, struct file *filp)
 {
+	if (EXT4_I(inode)->i_state & EXT4_STATE_DA_ALLOC_CLOSE) {
+		ext4_alloc_da_blocks(inode);
+		EXT4_I(inode)->i_state &= ~EXT4_STATE_DA_ALLOC_CLOSE;
+	}
 	/* if we are the last writer on the inode, drop the block reservation */
 	if ((filp->f_mode & FMODE_WRITE) &&
 			(atomic_read(&inode->i_writecount) == 1))
Index: linux-2.6.28.noarch/fs/ext4/namei.c
===================================================================
--- linux-2.6.28.noarch.orig/fs/ext4/namei.c
+++ linux-2.6.28.noarch/fs/ext4/namei.c
@@ -2311,7 +2311,7 @@ static int ext4_rename(struct inode *old
 	struct inode *old_inode, *new_inode;
 	struct buffer_head *old_bh, *new_bh, *dir_bh;
 	struct ext4_dir_entry_2 *old_de, *new_de;
-	int retval;
+	int retval, force_da_alloc = 0;
 
 	old_bh = new_bh = dir_bh = NULL;
 
@@ -2449,6 +2449,7 @@ static int ext4_rename(struct inode *old
 		ext4_mark_inode_dirty(handle, new_inode);
 		if (!new_inode->i_nlink)
 			ext4_orphan_add(handle, new_inode);
+		force_da_alloc = 1;
 	}
 	retval = 0;
 
@@ -2457,6 +2458,8 @@ end_rename:
 	brelse(old_bh);
 	brelse(new_bh);
 	ext4_journal_stop(handle);
+	if (retval == 0 && force_da_alloc)
+		ext4_alloc_da_blocks(old_inode);
 	return retval;
 }
 


Index: kernel.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/F-10/kernel.spec,v
retrieving revision 1.1288
retrieving revision 1.1289
diff -u -r1.1288 -r1.1289
--- kernel.spec	12 Mar 2009 04:48:55 -0000	1.1288
+++ kernel.spec	13 Mar 2009 02:17:19 -0000	1.1289
@@ -657,6 +657,9 @@
 # silence the ACPI blacklist code
 Patch2802: linux-2.6-silence-acpi-blacklist.patch
 
+Patch2910: linux-2.6-ext4-extent-header-check-fix.patch
+Patch2911: linux-2.6-ext4-flush-on-close.patch
+
 Patch9001: revert-fix-modules_install-via-nfs.patch
 
 %endif
@@ -1076,6 +1079,8 @@
 #
 
 # ext4
+ApplyPatch linux-2.6-ext4-extent-header-check-fix.patch
+ApplyPatch linux-2.6-ext4-flush-on-close.patch
 
 # xfs
 
@@ -1760,6 +1765,11 @@
 %kernel_variant_files -k vmlinux %{with_kdump} kdump
 
 %changelog
+* Fri Mar 13 2009 Chuck Ebbert <cebbert at redhat.com> 2.6.29-0.59.rc7.git5
+- Copy ext4 fixes from rawhide:
+  linux-2.6-ext4-extent-header-check-fix.patch
+  linux-2.6-ext4-flush-on-close.patch
+
 * Thu Mar 12 2009 Jarod Wilson <jarod at redhat.com> 2.6.29-0.58.rc7.git5
 - Updated lirc patch to kill a slew of compile warnings and
   make lirc_serial behave properly w/kfifos



