EXT2 vs. EXT3: mount w/sync or fdatasync

Andreas Dilger adilger at clusterfs.com
Fri Mar 23 06:18:40 UTC 2007


On Mar 22, 2007  20:44 -0700, brian stone wrote:
> Machine A connects to machine B on a gigabit lan.  Machine A sends
> 1024 1MB chunks of data; 1 GB in total. Machine B, the server, reads
> in each 1MB chunk and writes it to a file.
> 
> NOTE: server and client are little test programs written in C.  
> 
> Machine B (Server) hardware:
> - Single (no raid) Seagate Cheetah 70G Ultra320 15K
> - Quad Opteron 870
> - 16G DDR400
> - Backplane: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 8)
> 
> Sync methods include:
> 1. mount with sync option
>   - tried sync,dirsync which added no additional overhead
> 2. use O_SYNC open() flag
> 3. use fdatasync() just before closing the file
>   - fsync() and fdatasync() produced the same results
> 
> 
> EXT2 tests
> ==========================================
> No sync                     12.3 seconds  (83 MB/Sec)
> mount=sync                  44.3 seconds  (23 MB/Sec)
> O_SYNC                      31.7 seconds  (32 MB/Sec)
> fdatasync()                 31.3 seconds  (32 MB/Sec)
> 
> 
> EXT3 tests
> ===========================================
> No sync data=writeback      14.5 seconds  (70 MB/Sec)
> No sync data=ordered        17 seconds    (60 MB/Sec)
> No sync data=journal        65 seconds    (15 MB/Sec)
> data=ordered O_SYNC         49 seconds    (20 MB/Sec)
> data=ordered,sync           52 seconds    (19 MB/Sec)
> data=ordered fdatasync()    45.5 seconds  (22 MB/Sec)
> data=journal O_SYNC         72.5 seconds  (14 MB/Sec)
> data=journal,sync           81 seconds    (12 MB/Sec)
> data=journal fdatasync()    60.5 seconds  (17 MB/Sec)
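
For reference, sync methods 2 and 3 quoted above (O_SYNC on open()
versus a single fdatasync() just before close()) amount to something
like the following.  This is only a minimal sketch, not the original
test program, which was not posted; the names write_all, write_chunks
and enum sync_mode are hypothetical and exist purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names, chosen for this sketch only. */
enum sync_mode { SYNC_NONE, SYNC_O_SYNC, SYNC_FDATASYNC };

/* write() may transfer fewer bytes than requested, so loop until done. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Write `count` copies of buf to `path` using the chosen sync method.
 * Returns 0 on success, -1 on any error.
 */
int write_chunks(const char *path, const char *buf, size_t len,
                 int count, enum sync_mode mode)
{
    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    int fd, i, rc = 0;

    if (mode == SYNC_O_SYNC)
        flags |= O_SYNC;        /* each write() returns only after the data is on disk */

    fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    for (i = 0; i < count && rc == 0; i++)
        rc = write_all(fd, buf, len);

    if (rc == 0 && mode == SYNC_FDATASYNC)
        rc = fdatasync(fd);     /* one flush of the file data, just before close */

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

With a 1MB buffer and count = 1024 this mirrors the 1 GB test
described above, e.g. write_chunks(path, buf, 1 << 20, 1024,
SYNC_FDATASYNC).  The "mount with sync option" case needs no code
change: it would be SYNC_NONE on a filesystem mounted with -o sync.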

If you are doing a large number of 1MB writes then I agree that
data=journal is probably not the way to go: every data block is
written twice, once to the journal and once to its final location,
so you can get at most 1/2 of the bandwidth of the disk (unless you
create the journal on a separate disk).  data=journal is good
for small writes and lots of transactions, like mail servers
that need lots of sync operations.

For large writes, I'd suggest you put the journal on a separate
device, and make it 1 or 2 GB (your server has plenty of RAM,
so that isn't a problem).  Are you using extended attributes (EAs),
like SELinux or similar?  If yes, then you should also format your
filesystem with large inodes (mke2fs -I 256).

You may also want to try out ext4dev with the mballoc and delalloc
patches from Alex Tomas, as this code has been optimized for
doing large power-of-two allocations in the filesystem.  They've
been posted to the ext4-devel lists a couple of times.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



