rpms/kernel/F-7 linux-2.6-vm-invalidate_mapping_pages-cond-resched.patch, NONE, 1.1 kernel-2.6.spec, 1.3173, 1.3174

Dave Jones (davej) fedora-extras-commits at redhat.com
Fri May 18 19:54:34 UTC 2007


Author: davej

Update of /cvs/pkgs/rpms/kernel/F-7
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv474

Modified Files:
	kernel-2.6.spec 
Added Files:
	linux-2.6-vm-invalidate_mapping_pages-cond-resched.patch 
Log Message:
* Fri May 18 2007 Dave Jones <davej at redhat.com>
- Re-add cond_resched to invalidate_mapping_pages()


linux-2.6-vm-invalidate_mapping_pages-cond-resched.patch:

--- NEW FILE linux-2.6-vm-invalidate_mapping_pages-cond-resched.patch ---
From davej  Wed May 16 14:45:37 2007
Return-path: <linux-kernel-owner+davej=40kernelslacker.org-S1761694AbXEPSn4 at vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on
	gelk.kernelslacker.org
X-Spam-Level: 
X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FORGED_RCVD_HELO
	autolearn=ham version=3.1.8
Envelope-to: davej at kernelslacker.org
Delivery-date: Wed, 16 May 2007 19:44:05 +0100
Received: from testure.choralone.org [194.9.77.134]
	by gelk.kernelslacker.org with IMAP (fetchmail-6.3.6)
	for <davej at localhost> (single-drop); Wed, 16 May 2007 14:45:37 -0400 (EDT)
Received: from vger.kernel.org ([209.132.176.167])
	by testure.choralone.org with esmtp (Exim 4.63)
	(envelope-from <linux-kernel-owner+davej=40kernelslacker.org-S1761694AbXEPSn4 at vger.kernel.org>)
	id 1HoOTl-0007I7-Bi
	for davej at kernelslacker.org; Wed, 16 May 2007 19:44:05 +0100
Received: (majordomo at vger.kernel.org) by vger.kernel.org via listexpand
	id S1761694AbXEPSn4 (ORCPT <rfc822;davej at kernelslacker.org>);
	Wed, 16 May 2007 14:43:56 -0400
Received: (majordomo at vger.kernel.org) by vger.kernel.org id S1756223AbXEPSnr
	(ORCPT <rfc822;linux-kernel-outgoing>);
	Wed, 16 May 2007 14:43:47 -0400
Received: from smtp2.linux-foundation.org ([207.189.120.14]:55012 "EHLO
	smtp2.linux-foundation.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1756132AbXEPSnq (ORCPT
	<rfc822;linux-kernel at vger.kernel.org>);
	Wed, 16 May 2007 14:43:46 -0400
Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6])
	by smtp2.linux-foundation.org (8.13.5.20060308/8.13.5/Debian-3ubuntu1.1) with ESMTP id l4GIh8vo006373
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 16 May 2007 11:43:45 -0700
Received: from akpm.corp.google.com (shell0.pdx.osdl.net [10.9.0.31])
	by shell0.pdx.osdl.net (8.13.1/8.11.6) with SMTP id l4GIf0QE007190;
	Wed, 16 May 2007 11:41:01 -0700
Date:	Wed, 16 May 2007 11:41:00 -0700
From:	Andrew Morton <akpm at linux-foundation.org>
To:	Bernd Schubert <bs at q-leap.de>
Cc:	"Michal Piotrowski" <michal.k.k.piotrowski at gmail.com>,
	"Bernd Schubert" <bschubert at q-leap.de>,
	linux-kernel at vger.kernel.org
Subject: Re: mkfs.ext2 triggered softlockup
Message-Id: <20070516114100.9cd642b8.akpm at linux-foundation.org>
In-Reply-To: <200705161901.09072.bs at q-leap.de>
References: <f2fcc1$otm$2 at sea.gmane.org>
	<6bffcb0e0705160949m7486705s1b2fc5bbe8a025df at mail.gmail.com>
	<200705161901.09072.bs at q-leap.de>
X-Mailer: Sylpheed version 2.2.7 (GTK+ 2.8.6; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-MIMEDefang-Filter: osdl$Revision: 1.177 $
X-Scanned-By: MIMEDefang 2.53 on 207.189.120.14
Sender:	linux-kernel-owner at vger.kernel.org
Precedence: bulk
X-Mailing-List:	linux-kernel at vger.kernel.org
Status: RO
Content-Length: 6833
Lines: 169

On Wed, 16 May 2007 19:01:08 +0200
Bernd Schubert <bs at q-leap.de> wrote:

> On Wednesday 16 May 2007 18:49:57 Michal Piotrowski wrote:
> > Hi Bernd,
> >
> > On 16/05/07, Bernd Schubert <bschubert at q-leap.de> wrote:
> > > Maybe you still remember my report about an mkfs.ext2 triggered ram disk
> > > corruption?
> > >
> > > http://lkml.org/lkml/2007/5/4/272
> > >
> > > Well, in principle I'm now doing the same stuff, only this time with
> > > another initrd, which mounts the root-fs over nfs.
> > >
> > > [ 1596.928552] BUG: soft lockup detected on CPU#2!
> > > [ 1596.933109]
> > > [ 1596.933110] Call Trace:
> > > [ 1596.933111]  <IRQ>  [<ffffffff8025167b>] softlockup_tick+0xd8/0xef
> > > [ 1596.933129]  [<ffffffff802329f8>] run_local_timers+0x13/0x15
> > > [ 1596.933132]  [<ffffffff80232a44>] update_process_times+0x4a/0x77
> > > [ 1596.933138]  [<ffffffff8021434b>] smp_local_timer_interrupt+0x34/0x54
> > > [ 1596.933143]  [<ffffffff802143cc>] smp_apic_timer_interrupt+0x61/0x78
> > > [ 1596.933147]  [<ffffffff8020a29b>] apic_timer_interrupt+0x6b/0x70
> > > [ 1596.933151]  <EOI>  [<ffffffff80299dff>] free_buffer_head+0x24/0x3e
> > > [ 1596.933162]  [<ffffffff80272a63>] kmem_cache_free+0x1f4/0x201
> > > [ 1596.933170]  [<ffffffff80299dff>] free_buffer_head+0x24/0x3e
> > > [ 1596.933175]  [<ffffffff80299ea1>] try_to_free_buffers+0x88/0x9f
> > > [ 1596.933181]  [<ffffffff802565a9>] try_to_release_page+0x39/0x40
> > > [ 1596.933188]  [<ffffffff8025b76d>] invalidate_mapping_pages+0x9d/0x121
> > > [ 1596.933196]  [<ffffffff8025b800>] invalidate_inode_pages+0xf/0x11
> > > [ 1596.933200]  [<ffffffff80299053>] invalidate_bdev+0x3b/0x3f
> > > [ 1596.933203]  [<ffffffff8029c9ee>] kill_bdev+0x13/0x29
> > > [ 1596.933208]  [<ffffffff8029d6e8>] __blkdev_put+0x62/0x141
> > > [ 1596.933213]  [<ffffffff8029db62>] blkdev_put+0xb/0xd
> > > [ 1596.933218]  [<ffffffff8029dbf7>] blkdev_close+0x2e/0x33
> > > [ 1596.933222]  [<ffffffff8027a3c3>] __fput+0xc3/0x172
> > > [ 1596.933228]  [<ffffffff8027a486>] fput+0x14/0x16
> > > [ 1596.933233]  [<ffffffff80278c4f>] filp_close+0x61/0x6d
> > > [ 1596.933238]  [<ffffffff80278ce7>] sys_close+0x8c/0xce
> > > [ 1596.933244]  [<ffffffff8020965e>] system_call+0x7e/0x83
> > > [ 1596.933250]
> >
> > Can you tell me which kernel version you are using?
> 
> Sorry, forgot that. I think 2.6.20.6 or 2.6.20.7 (I always rename them to .3;
> for some reason that's easier than changing our tftp-rembo config). The
> kernel is patched with Lustre patches; hmm, one of them also adds a read-only
> test to the block device layer.
> Probably I should test a vanilla kernel. Going to do that now...
> 

Don't bother - it'll happen here too.

I assume the disk is large, and that the machine has a lot of RAM?

Root cause: I suck.




From: Andrew Morton <akpm at linux-foundation.org>

invalidate_mapping_pages() can sometimes take a long time (millions of pages
to free).  Long enough for the softlockup detector to trigger.

We used to have a cond_resched() in there but I took it out because the
drop_caches code calls invalidate_mapping_pages() under inode_lock.

The patch adds a nasty flag and puts the cond_resched() back.

Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
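
The pattern described above (yield between batches unless the caller holds a
spinlock) can be sketched in userspace. This is only an illustration, not the
kernel code: process_in_batches(), process_all(), maybe_yield() and
yield_count are hypothetical names, and sched_yield() merely stands in for
cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical counter so a caller can observe how often we yielded. */
unsigned long yield_count;

/* Userspace stand-in (an assumption) for the kernel's cond_resched(). */
void maybe_yield(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Process nr_items in batches of batch_size. After each batch, offer the
 * CPU to other tasks unless the caller asked for atomic behaviour, e.g.
 * because it holds a spinlock and must not sleep.
 */
unsigned long process_in_batches(unsigned long nr_items,
				 unsigned long batch_size,
				 bool be_atomic)
{
	unsigned long done = 0;

	assert(batch_size > 0);
	while (done < nr_items) {
		unsigned long n = nr_items - done;

		if (n > batch_size)
			n = batch_size;
		done += n;	/* stands in for releasing one batch of pages */

		if (!be_atomic)
			maybe_yield();
	}
	return done;
}

/* Thin wrapper: the common, non-atomic entry point. */
unsigned long process_all(unsigned long nr_items, unsigned long batch_size)
{
	return process_in_batches(nr_items, batch_size, false);
}
```

As in the patch, ordinary callers go through the wrapper and get the yield
points, while a caller that cannot sleep passes be_atomic = true to the
double-underscore style variant and runs the loop to completion.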
---

 fs/drop_caches.c   |    2 +-
 include/linux/fs.h |    3 +++
 mm/truncate.c      |   38 +++++++++++++++++++++++---------------
 3 files changed, 27 insertions(+), 16 deletions(-)

diff -puN fs/drop_caches.c~invalidate_mapping_pages-add-cond_resched fs/drop_caches.c
--- a/fs/drop_caches.c~invalidate_mapping_pages-add-cond_resched
+++ a/fs/drop_caches.c
@@ -20,7 +20,7 @@ static void drop_pagecache_sb(struct sup
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		if (inode->i_state & (I_FREEING|I_WILL_FREE))
 			continue;
-		invalidate_mapping_pages(inode->i_mapping, 0, -1);
+		__invalidate_mapping_pages(inode->i_mapping, 0, -1, true);
 	}
 	spin_unlock(&inode_lock);
 }
diff -puN include/linux/fs.h~invalidate_mapping_pages-add-cond_resched include/linux/fs.h
--- a/include/linux/fs.h~invalidate_mapping_pages-add-cond_resched
+++ a/include/linux/fs.h
@@ -1583,6 +1583,9 @@ extern int __invalidate_device(struct bl
 extern int invalidate_partition(struct gendisk *, int);
 #endif
 extern int invalidate_inodes(struct super_block *);
+unsigned long __invalidate_mapping_pages(struct address_space *mapping,
+					pgoff_t start, pgoff_t end,
+					bool be_atomic);
 unsigned long invalidate_mapping_pages(struct address_space *mapping,
 					pgoff_t start, pgoff_t end);
 
diff -puN mm/truncate.c~invalidate_mapping_pages-add-cond_resched mm/truncate.c
--- a/mm/truncate.c~invalidate_mapping_pages-add-cond_resched
+++ a/mm/truncate.c
@@ -253,21 +253,8 @@ void truncate_inode_pages(struct address
 }
 EXPORT_SYMBOL(truncate_inode_pages);
 
-/**
- * invalidate_mapping_pages - Invalidate all the unlocked pages of one inode
- * @mapping: the address_space which holds the pages to invalidate
- * @start: the offset 'from' which to invalidate
- * @end: the offset 'to' which to invalidate (inclusive)
- *
- * This function only removes the unlocked pages, if you want to
- * remove all the pages of one inode, you must call truncate_inode_pages.
- *
- * invalidate_mapping_pages() will not block on IO activity. It will not
- * invalidate pages which are dirty, locked, under writeback or mapped into
- * pagetables.
- */
-unsigned long invalidate_mapping_pages(struct address_space *mapping,
-				pgoff_t start, pgoff_t end)
+unsigned long __invalidate_mapping_pages(struct address_space *mapping,
+				pgoff_t start, pgoff_t end, bool be_atomic)
 {
 	struct pagevec pvec;
 	pgoff_t next = start;
@@ -308,9 +295,30 @@ unlock:
 				break;
 		}
 		pagevec_release(&pvec);
+		if (likely(!be_atomic))
+			cond_resched();
 	}
 	return ret;
 }
+
+/**
+ * invalidate_mapping_pages - Invalidate all the unlocked pages of one inode
+ * @mapping: the address_space which holds the pages to invalidate
+ * @start: the offset 'from' which to invalidate
+ * @end: the offset 'to' which to invalidate (inclusive)
+ *
+ * This function only removes the unlocked pages, if you want to
+ * remove all the pages of one inode, you must call truncate_inode_pages.
+ *
+ * invalidate_mapping_pages() will not block on IO activity. It will not
+ * invalidate pages which are dirty, locked, under writeback or mapped into
+ * pagetables.
+ */
+unsigned long invalidate_mapping_pages(struct address_space *mapping,
+				pgoff_t start, pgoff_t end)
+{
+	return __invalidate_mapping_pages(mapping, start, end, false);
+}
 EXPORT_SYMBOL(invalidate_mapping_pages);
 
 /*
_




Index: kernel-2.6.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/F-7/kernel-2.6.spec,v
retrieving revision 1.3173
retrieving revision 1.3174
diff -u -r1.3173 -r1.3174
--- kernel-2.6.spec	18 May 2007 19:52:16 -0000	1.3173
+++ kernel-2.6.spec	18 May 2007 19:53:57 -0000	1.3174
@@ -597,6 +597,7 @@
 Patch1910: linux-2.6-unexport-symbols.patch
 
 # VM bits.
+Patch2000: linux-2.6-vm-invalidate_mapping_pages-cond-resched.patch
 Patch2001: linux-2.6-vm-silence-atomic-alloc-failures.patch
 
 # Tweak some defaults.
@@ -1362,6 +1363,8 @@
 #
 # VM related fixes.
 #
+# Re-add cond_resched to invalidate_mapping_pages()
+%patch2000 -p1
 # Silence GFP_ATOMIC failures.
 %patch2001 -p1
 
@@ -2404,6 +2407,9 @@
 
 %changelog
 * Fri May 18 2007 Dave Jones <davej at redhat.com>
+- Re-add cond_resched to invalidate_mapping_pages()
+
+* Fri May 18 2007 Dave Jones <davej at redhat.com>
 - Don't print warnings about MSI failures on e1000
 
 * Fri May 18 2007 Dave Jones <davej at redhat.com>



