EXT2 vs. EXT3: mount w/sync or fdatasync
Andreas Dilger
adilger at clusterfs.com
Fri Mar 23 06:18:40 UTC 2007
On Mar 22, 2007 20:44 -0700, brian stone wrote:
> Machine A connects to machine B on a gigabit lan. Machine A sends
> 1024 1MB chunks of data; 1 GB in total. Machine B, the server, reads
> in the MB and writes it to a file.
>
> NOTE: server and client are little test programs written in C.
>
> Machine B (Server) hardware:
> - Single (no raid) Seagate Cheetah 70G Ultra320 15K
> - Quad Opteron 870
> - 16G DDR400
> - Backplane: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 8)
>
> Sync methods include:
> 1. mount with sync option
> - tried sync,dirsync which added no additional overhead
> 2. use O_SYNC open() flag
> 3. use fdatasync() just before closing the file
> - fsync() and fdatasync() produced the same results
>
>
> EXT2 tests
> ==========================================
> No sync        12.3 seconds  (83 MB/Sec)
> mount=sync     44.3 seconds  (23 MB/Sec)
> O_SYNC         31.7 seconds  (32 MB/Sec)
> fdatasync()    31.3 seconds  (32 MB/Sec)
>
>
> EXT3 tests
> ===========================================
> No sync data=writeback      14.5 seconds  (70 MB/Sec)
> No sync data=ordered        17 seconds    (60 MB/Sec)
> No sync data=journal        65 seconds    (15 MB/Sec)
> data=ordered O_SYNC         49 seconds    (20 MB/Sec)
> data=ordered,sync           52 seconds    (19 MB/Sec)
> data=ordered fdatasync()    45.5 seconds  (22 MB/Sec)
> data=journal O_SYNC         72.5 seconds  (14 MB/Sec)
> data=journal,sync           81 seconds    (12 MB/Sec)
> data=journal fdatasync()    60.5 seconds  (17 MB/Sec)
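
For reference, sync methods 2 and 3 from the list above might look
roughly like the sketch below. This is not the original test program
(which was not posted); the function name write_chunks, the sync_mode
enum, and the 'x' fill pattern standing in for data read from the
socket are all assumptions made for illustration. Method 1 (mount -o
sync) needs no code change, so it corresponds to SYNC_NONE here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names: not taken from the original test program. */
enum sync_mode { SYNC_NONE, SYNC_O_SYNC, SYNC_FDATASYNC };

/* Write `count` chunks of `chunk_size` bytes to `path`.
 * Returns the total number of bytes written, or -1 on error. */
static long long write_chunks(const char *path, size_t chunk_size,
                              int count, enum sync_mode mode)
{
    assert(chunk_size > 0 && count >= 0);

    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    if (mode == SYNC_O_SYNC)
        flags |= O_SYNC;            /* method 2: each write() is synchronous */

    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    char *buf = malloc(chunk_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    memset(buf, 'x', chunk_size);   /* stand-in for data from the network */

    long long total = 0;
    int failed = 0;
    for (int i = 0; i < count && !failed; i++) {
        size_t off = 0;
        while (off < chunk_size) {  /* write() may be short; loop until done */
            ssize_t n = write(fd, buf + off, chunk_size - off);
            if (n <= 0) {           /* treat an error or zero-byte write as failure */
                failed = 1;
                break;
            }
            off += (size_t)n;
        }
        total += (long long)off;
    }

    /* method 3: a single flush of the file data just before close() */
    if (!failed && mode == SYNC_FDATASYNC && fdatasync(fd) != 0)
        failed = 1;

    free(buf);
    if (close(fd) != 0)
        failed = 1;
    return failed ? -1 : total;
}
```

Calling write_chunks(path, 1 << 20, 1024, SYNC_FDATASYNC) would
approximate the 1 GB test described above, minus the network side.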

If you are doing a large number of 1MB writes then I agree that
data=journal is probably not the way to go. Every data block is
written twice, once to the journal and once to its final location,
so you can get at most 1/2 of the bandwidth of the disk (unless you
create the journal on a separate disk). data=journal is good
for small writes and lots of transactions, like mail servers
that need lots of sync operations.

For large writes, I'd suggest you put the journal on a separate
device, and make it 1 or 2 GB (your server has plenty of RAM,
so that isn't a problem). Are you using extended attributes (EAs),
for example with SELinux or similar? If yes, then you should also
format your filesystem with large inodes (mke2fs -I 256).

You may also want to try out ext4dev with the mballoc and delalloc
patches from Alex Tomas, as this code has been optimized for
doing large power-of-two allocations in the filesystem. The patches
have been posted to the ext4-devel lists a couple of times.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.