RFC: Tuning ext3

Tod Hagan tod at gust.sr.unh.edu
Thu May 17 20:16:44 UTC 2007


All,

I'm requesting comments from the expert readers of ext3-users on these
notes for tuning ext3 for performance.

Feedback pertinent to RHEL 5 would be most helpful. Because XFS isn't
supported under Red Hat Enterprise Linux, these items are an attempt to
match XFS performance with ext3.

These items were culled from a number of sources. Will they be effective
for achieving the performance goals? Is the risk assessment correct?
Are any too risky for consideration? Has anything been left out?

Thanks.



ext3 Performance Tuning

This document contains notes about optimizing ext3 filesystems for
storage nodes on Linux clusters used for scientific research.

Each of the goals and assumptions has been assigned an identifier in
square brackets which is referenced in the applicable tuning options.

Goals:

[BigSeqIO] Maximize sequential I/O performance with large files (> 1GB),
including simultaneous access to several large files.

[MaxStor] Maximize the amount of usable storage.

[BigFS] Allow large filesystems (> 8 TB).

[GenNFSPerf] Optimize for NFS and general performance.

Assumptions:

[NoHA] High availability isn't needed.

[IntegFS] The integrity of the filesystem is important -- don't lose
existing files.

[Recreate] Newly-created files are unimportant as they can be
re-created.

[UPS] Power to the hardware is guaranteed by a UPS.

[NoBigDir] Directories don't get large.

[NoSysFiles] The filesystems hold data and not system files.

Tuning:

A. Create using -O sparse_super (default) to save space on large
filesystems. [MaxStor]

B. Create using -T largefile4 (one inode per 4 megabytes) to avoid
wasting space on unused inodes. [MaxStor]

C. Create using -m 0 to reserve no blocks for the super-user.
[NoSysFiles][MaxStor]

D. Create using -E stride=N, where N is the RAID chunk size expressed in
filesystem blocks (chunk size / block size). [GenNFSPerf]

E. Use a kernel >= 2.6.19, which has the patches for extents and 48-bit
support (requires Ubuntu 7.04 feisty, Fedora Core 7, or a custom kernel),
to allow filesystems > 8TB on Intel/AMD chips. [BigFS]

F. Use an external journal on a separate high-RPM drive. [GenNFSPerf]

G. Use a large journal: mkfs -J size=8192 (size is in megabytes).
[GenNFSPerf]

H. Mount using -o orlov to use the Orlov block allocator (default,
requires 2.6 kernel). Minimizes seeks by clustering files together. No
risk. [GenNFSPerf]

I. Mount using -o noatime,nodiratime. No risk. [GenNFSPerf]

J. Mount using -o reservation to speed writes to multiple files in the
same directory. No risk. [BigSeqIO][GenNFSPerf]

K. Mount using -o data=writeback. Comparable to XFS and JFS: only
metadata is journaled and data writes are not ordered against it, so
files written shortly before a crash may contain stale data. Risky.
[Recreate][UPS][BigSeqIO][GenNFSPerf]

L. Mount using -o commit=N where N > 5 (N is in seconds; requires 2.6
kernel, default is 5). Lengthens the interval between journal commits,
so more recent data is at risk after a crash. Risky.
[Recreate][UPS][GenNFSPerf]

M. Mount using -o barrier=0 and enable write-back caching on the
controllers and drives. Possibly the riskiest option.
[Recreate][UPS][GenNFSPerf]
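
As a rough illustration of the arithmetic behind items B and D, here is a
minimal Python sketch. The 64 KiB chunk size, 4 KiB block size, 4 TiB
filesystem, 128-byte inode size, and the 8192 bytes-per-inode baseline are
all assumed example values, not figures from these notes; substitute the
real geometry of the array and check mke2fs.conf for the actual defaults.

```python
# Sketch only: every size below is a hypothetical example value.

KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def stride_blocks(chunk_kib, block_size=4096):
    """RAID chunk size in filesystem blocks (the value for -E stride=N)."""
    chunk_bytes = chunk_kib * KIB
    if chunk_bytes % block_size != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_size


def inode_count(fs_bytes, bytes_per_inode):
    """Approximate number of inodes created for a bytes-per-inode ratio."""
    return fs_bytes // bytes_per_inode


def inode_table_bytes(fs_bytes, bytes_per_inode, inode_size=128):
    """Approximate space taken by the inode tables (128-byte inodes assumed)."""
    return inode_count(fs_bytes, bytes_per_inode) * inode_size


fs_size = 4 * TIB                      # assumed filesystem size

stride = stride_blocks(chunk_kib=64)   # assumed 64 KiB RAID chunk
print("-E stride=%d" % stride)

# largefile4 means one inode per 4 MiB; 8192 is an assumed baseline ratio.
for label, ratio in (("largefile4", 4 * MIB), ("baseline 8192", 8192)):
    print("%-14s inodes=%d  inode tables=%d MiB" % (
        label,
        inode_count(fs_size, ratio),
        inode_table_bytes(fs_size, ratio) // MIB,
    ))
```

With these assumed values the stride works out to 16 blocks, and the
inode tables shrink from tens of gigabytes to about 128 MiB, which is
where the [MaxStor] saving in item B comes from.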



-- 
Tod Hagan
Information Technologist
AIRMAP/Climate Change Research Center
Institute for the Study of Earth, Oceans, and Space
University of New Hampshire
Durham, NH 03824
Phone: 603-862-3116




