2GB memory limit running fsck on a +6TB device

santi at usansolo.net
Tue Jun 10 15:34:35 UTC 2008



On Mon, 9 Jun 2008 17:33:20 -0400, Theodore Tso <tytso at mit.edu> wrote:
 
> If you are using e2fsprogs 1.40.10, there is another solution that may
> help.  Create an /etc/e2fsck.conf file with the following contents:
> 
> [scratch_files]
> 	directory = /var/cache/e2fsck

(..)

> This will cause e2fsck to store certain data structures which grow
> large with backup servers that have a vast number of hard-linked files
> in /var/cache/e2fsck instead of in memory.  This will slow down e2fsck
> by approximately 25%, but for large filesystems where you couldn't
> otherwise get e2fsck to complete because you're exhausting the 2GB VM
> per-process limitation for 32-bit systems, it should allow you to run
> through to completion.
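
A minimal sketch (Python, not part of e2fsprogs) of a sanity check for the
setup quoted above: it pulls the "directory" value out of the
[scratch_files] stanza and reports whether that directory exists and is
writable. The function names and the hand-rolled parser are hypothetical and
only handle simple "key = value" lines. That e2fsck needs the directory to
exist beforehand is an assumption here, not something stated in the quote.

```python
import os

# Same contents as the quoted /etc/e2fsck.conf fragment.
SAMPLE_CONF = """\
[scratch_files]
\tdirectory = /var/cache/e2fsck
"""

def scratch_dir_from_conf(text):
    """Return the 'directory' value of the [scratch_files] stanza, or None."""
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "scratch_files" and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "directory":
                return value.strip()
    return None

def scratch_dir_usable(path):
    """True if path is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)

if __name__ == "__main__":
    d = scratch_dir_from_conf(SAMPLE_CONF)
    state = "usable" if d and scratch_dir_usable(d) else "missing or not writable"
    print(d, state)
```

To check a real system, read the text from /etc/e2fsck.conf instead of
SAMPLE_CONF.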

I'm trying with fsck.ext3 v1.40.8, backported from Lenny's package to Etch,
instead of v1.40.10, because we have the same scenario on all our backup
servers running BackupPC and the package must be distributed to them. If
needed, we can test with the latest version ;-)

fsck.ext3 started 4 hours ago and is still in "Pass 1: Checking inodes,
blocks, and sizes". Is that normal, given that the filesystem has more than
113 million inodes?

I will send more info, as Ted requested in "Call for testers w/ using
BackupPC" [1], but for now this is the scenario:

- fsck.ext3 is using more than 2GB of memory and no swap. The server has 4GB
of physical RAM + 2GB of swap. This is the memory map from "pmap -d":

# pmap -d 7014
7014:   fsck.ext3 -y /dev/sda4
Address   Kbytes Mode  Offset           Device    Mapping
(..)
242fd000 1834768 rw--- 00000000242fd000 000:00000   [ anon ]
942c2000  582604 rw--- 00000000942c2000 000:00000   [ anon ]
(..)

All the output is available at: http://pastebin.com/f67115de2
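
As a rough check of the "more than 2GB" figure, the two anonymous mappings
shown above can be added up. This is only a sketch: the function name is
hypothetical, the elided pmap lines are not included, and the Kbytes column
is assumed to be in units of 1024 bytes.

```python
# The two [ anon ] lines copied from the pmap output above.
PMAP_LINES = """\
242fd000 1834768 rw--- 00000000242fd000 000:00000   [ anon ]
942c2000  582604 rw--- 00000000942c2000 000:00000   [ anon ]
"""

def anon_kbytes(pmap_text):
    """Sum the Kbytes column of lines whose mapping is '[ anon ]'."""
    total = 0
    for line in pmap_text.splitlines():
        fields = line.split()
        # Expected layout: Address Kbytes Mode Offset Device Mapping
        if (len(fields) >= 6 and fields[1].isdigit()
                and fields[5:] == ["[", "anon", "]"]):
            total += int(fields[1])
    return total

total_kb = anon_kbytes(PMAP_LINES)
print(f"{total_kb} KB = {total_kb / 1024**2:.2f} GiB")
# prints: 2417372 KB = 2.31 GiB
```

So these two mappings alone add up to roughly 2.3GB of anonymous memory.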


- The files in "/var/cache/e2fsck" appear to grow very slowly, I think about
300KB per hour. These are their current sizes:

# ls -lh /var/cache/e2fsck/
total 170M
-rw------- 1 root root 76M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-dirinfo-VkmFXP
-rw------- 1 root root 95M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-icount-YO08bu


- fsck is using 100% of one CPU (it's a dual-processor motherboard). strace
output is available at:

http://pastebin.com/f68389cce


- More info:
   * Kernel 2.6.25.4, i686 arch on a Debian Etch box.
   * Storage: 3ware 9550SXU-16ML, 5.91 TB in a RAID-5 with 14 x 500GB SATA
disks (ST3500630AS), 64kB stripe size (array is in optimal state)
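
For reference, the 5.91 TB figure is consistent with 14 disks in RAID-5 when
the size is counted in binary units. A small sketch of the arithmetic,
assuming each disk holds exactly 500 * 10^9 bytes (the real drives and the
controller may round slightly differently) and using a hypothetical helper
name:

```python
def raid5_usable_bytes(disks, disk_bytes):
    """RAID-5 spends one disk's worth of space on parity: n disks give n - 1."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return (disks - 1) * disk_bytes

usable = raid5_usable_bytes(14, 500 * 10**9)
print(f"{usable / 10**12:.2f} TB (decimal), {usable / 2**40:.2f} TiB (binary)")
# prints: 6.50 TB (decimal), 5.91 TiB (binary)
```

That also matches the "+6TB" in the subject when counted in decimal units.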


Thanks to all for the advice :-)

[1] http://www.redhat.com/archives/ext3-users/2007-April/msg00017.html

--
Santi Saez



