a comparison of ext3, jfs, and xfs on hardware raid

Thu Jul 14 18:56:15 UTC 2005

On Thu, 2005-07-14 at 12:33 -0600, Andreas Dilger wrote:
> On Jul 13, 2005  17:12 -0700, Jeffrey W. Baker wrote:
> > Using bonnie++ with a 10GB fileset, in MB/s:
> > 
> >          ext3    jfs    xfs
> > Read     112     188    141
> > Write     97     157    167
> > Rewrite   51      71     60
> > 
> > These numbers were obtained using the mkfs defaults for all filesystems
> > and the deadline scheduler.  As you can see JFS is kicking butt on this
> > test.
> 
> One thing that is important for Lustre is performance of EAs.  See
> http://samba.org/~tridge/xattr_results/ for a comparison.  Lustre
> uses large inodes (-I 256 or larger) to store the EAs efficiently.

Is this important only for the metadata backend, or for OSTs as
well?

> > Next I used pgbench to test parallel random I/O.  pgbench has a
> > configurable number of clients and transactions per client, and can
> > change the size of its database.  I used a database of 100 million
> > tuples (scale factor 1000).  I timed 100,000 transactions on each
> > filesystem, with 10 and 100 clients per run.  Figures are in
> > transactions per second.
> > 
> >               ext3  jfs  xfs
> > 10 Clients      55   81   68
> > 100 Clients     61  100   64
> > 
> > Here XFS is not substantially faster but JFS continues to lead.  
> > 
> > JFS is roughly 60% faster than ext3 on pgbench and 40-70% faster on
> > bonnie++ linear I/O.
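
As a rough cross-check of the percentages quoted above, the ratios can be
recomputed from the two tables.  This is only a sketch: the numbers are
copied from the tables in this message, and the helper name
percent_faster is made up for illustration.

```python
# Figures copied from the tables in this message: MB/s for bonnie++,
# transactions per second for pgbench.
results = {
    "bonnie++ read":       {"ext3": 112, "jfs": 188, "xfs": 141},
    "bonnie++ write":      {"ext3": 97,  "jfs": 157, "xfs": 167},
    "bonnie++ rewrite":    {"ext3": 51,  "jfs": 71,  "xfs": 60},
    "pgbench 10 clients":  {"ext3": 55,  "jfs": 81,  "xfs": 68},
    "pgbench 100 clients": {"ext3": 61,  "jfs": 100, "xfs": 64},
}


def percent_faster(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return 100.0 * (new - base) / base


# Print how much faster JFS and XFS are than ext3 on each test.
for test, row in results.items():
    jfs = percent_faster(row["jfs"], row["ext3"])
    xfs = percent_faster(row["xfs"], row["ext3"])
    print(f"{test:20s} jfs {jfs:+6.1f}%   xfs {xfs:+6.1f}%")
```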
> 
> This is a bit surprising; I've never heard of JFS as a leader in many
> performance tests.  Is pgbench at all related to dbench?  The problem
> with dbench is that for cases where the filesystem does no IO at all
> it reports a best result.  In real life the data has to make it to
> disk at some point.

pgbench ships in PostgreSQL's contrib directory.  Believe me, the
filesystem does plenty of I/O.  It sustains roughly 600 iops for 15-20
minutes.  The "scale factor of 1000" means pgbench is using a database
with 100 million tuples, or about 16GB of data.  The entire run uses
only about 2 minutes of CPU time.
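
To put those figures in perspective, here is a back-of-the-envelope
calculation.  It is a sketch under stated assumptions: "16GB" is taken as
decimal gigabytes, the 600 iops rate is assumed constant for the whole
run, and the run is assumed to consist only of the 100,000 timed
transactions.  The variable and function names are illustrative.

```python
# Figures taken from this message.
tuples = 100_000_000        # scale factor 1000
data_bytes = 16e9           # "about 16GB", assumed decimal gigabytes
transactions = 100_000      # timed transactions per run
iops = 600                  # sustained I/O operations per second
run_minutes = (15, 20)      # observed run length, low and high
cpu_minutes = 2             # CPU time consumed by the whole run

# Average on-disk footprint per tuple (includes any index overhead).
bytes_per_tuple = data_bytes / tuples


def ios_per_transaction(minutes):
    """Average disk operations per transaction for a run of this length."""
    return iops * minutes * 60 / transactions


def cpu_share(minutes):
    """Fraction of wall-clock time spent on the CPU."""
    return cpu_minutes / minutes


print(f"bytes per tuple: {bytes_per_tuple:.0f}")
for m in run_minutes:
    print(f"{m} min run: {ios_per_transaction(m):.1f} I/Os per transaction, "
          f"{cpu_share(m):.0%} CPU")
```

Under these assumptions the run is waiting on the disks for most of its
wall-clock time, so the results should mostly reflect the filesystem and
the array rather than the CPU.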

> 
> See http://sudhaa.com/~benchmark/ext3/newtiobenchresults.ext3gold/newtiobench/newtiobench.html
> for a comparison of ext3, xfs, jfs in the mode that Lustre runs in
> (specifically column 7, 14, 18).
> 
> > Are there any tunables that I might want to adjust to get better
> > performance from ext3?
> 
> Try creating your ext3 filesystem with a larger journal, as Lustre does:
> 
> mkfs -J size=400 ...
> 
> size is in MB; 400 might be excessive for your setup.  I'd be interested
> in hearing where the "sweet spot" is for journal size.  The latest e2fsprogs
> use 128MB as the largest default size (up from 32MB) for large filesystems.

I intend to run many more benchmarks using various ext3 mount options,
and I'll make sure to vary the journal size as well.  However, it is my
impression that mballoc/delalloc/extents will mainly help workloads
like tarring and untarring a large archive.  For linear reads of one
giant file, will these mount options make any difference?
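
For the journal size sweep, something like the following could generate
the mkfs invocations.  It is a minimal sketch: it only prints the
commands and does not run them, the device name /dev/sdX1 is a
placeholder, and the sizes 64 and 256 are intermediate points I picked
between the 32MB, 128MB, and 400MB values mentioned in this thread.

```python
import shlex

# Journal sizes in MB.  32, 128 and 400 come from this thread;
# 64 and 256 are assumed intermediate points for the sweep.
JOURNAL_SIZES_MB = [32, 64, 128, 256, 400]

# Placeholder device; substitute the real block device before use.
DEVICE = "/dev/sdX1"


def mkfs_command(device, journal_mb):
    """Build an mke2fs command line for ext3 with the given journal size."""
    return ["mke2fs", "-j", "-J", f"size={journal_mb}", device]


# Print one command per journal size; nothing is executed here.
for size in JOURNAL_SIZES_MB:
    print(shlex.join(mkfs_command(DEVICE, size)))
```

Each filesystem would then be mounted and benchmarked with the same
bonnie++ and pgbench settings as above, so the journal size is the only
variable that changes between runs.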

Regards,
Jeffrey