Ext3: Why data=journal is better than data=ordered when data needs to be read from and written to disk at the same time
jidong.xiao at gmail.com
Sun Mar 27 04:52:21 UTC 2011
On Sat, Mar 26, 2011 at 10:44 PM, Ted Ts'o <tytso at mit.edu> wrote:
> On Sat, Mar 26, 2011 at 08:25:23PM -0400, Jidong Xiao wrote:
>> But my question is: why can data=journal outperform data=ordered?
>> In data=journal mode, you have to write both the data and metadata
>> blocks into the journal, but in data=ordered mode, you only have
>> to write the metadata blocks into the journal. If, in certain
>> cases, the former mode can avoid seeks, then the same behavior should
>> apply to the latter mode. So it's really odd that the former mode can
>> outperform the latter mode.
> When executing an fsync() in data=ordered mode, you have to write the
> data blocks to their permanent locations on disk and wait for the data
> blocks to be written. This will generally require extra seeks. In
> data=journal mode, the data blocks can be written directly into the
> journal without needing to seek.
> Of course eventually the data and metadata blocks will need to be
> written to their permanent locations before the journal space can be
> reused. But for short bursty write patterns, the fsync() latency will
> be much smaller in data=journal mode.
Thank you Ted, it is really helpful!
So the difference is:
data=ordered mode: fsync() returns only after the metadata blocks
have been written to the journal and the data blocks have been
written to their permanent locations on disk.
data=journal mode: fsync() returns once the metadata and data have been
written to the journal. The journal is contiguous, so data=journal
mode needs no seeking; therefore, fsync() returns more quickly.
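
As a rough way to observe the fsync() latency discussed here, one could time fsync() after short bursts of writes and compare the numbers on an ext3 filesystem mounted with data=ordered and again with data=journal. This is only a minimal sketch, not something from Ted's reply: the function name time_fsync_bursts, the burst size, and the burst count are my own assumptions for illustration, and the directory passed in is assumed to live on the filesystem under test.

```python
import os
import tempfile
import time


def time_fsync_bursts(directory, bursts=5, burst_bytes=64 * 1024):
    """Write small bursts to a temporary file in `directory` and time fsync().

    Returns a list with one fsync() latency (in seconds) per burst.
    The name and default sizes are illustrative assumptions.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        payload = b"\0" * burst_bytes
        for _ in range(bursts):
            os.write(fd, payload)          # data lands in the page cache first
            start = time.perf_counter()
            os.fsync(fd)                   # blocks until the filesystem says it is durable
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Assumption: replace the temp directory with a path on the ext3 mount under test.
    for i, seconds in enumerate(time_fsync_bursts(tempfile.gettempdir())):
        print(f"burst {i}: fsync took {seconds * 1000:.3f} ms")
```

Running it once per mount option on the same disk would show whether the latency gap appears for a given workload; the absolute numbers depend heavily on the hardware and on what else the disk is doing.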
If we read from and write to the disk simultaneously, as in this test:
First, write data to the filesystem as quickly as possible:
dd if=/dev/zero of=largefile bs=16384 count=131072
While data was being written to the test filesystem, read 16MB of data
from the same filesystem on the same disk, timing the results:
Reading a 16MB file
time cat 16-meg-file > /dev/null
In this case, if we run the experiment once in data=journal mode and
once in data=ordered mode, then since write latency is much smaller in
data=journal mode, the disk can spend more of its time on the read
operation; hence, the read will also finish earlier than it does in
data=ordered mode. Am I understanding this correctly?
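
The experiment described above could be sketched roughly as follows: one thread writes sequentially (the role of the dd command) while the main thread times a read of a second file (the role of time cat). This is a hedged illustration only. The function name timed_read_during_write, the file names, and the default sizes are my own assumptions, and the sizes are far smaller than in the test above. Note also that the file being read is created just before the measurement, so it will probably be served from the page cache; a real comparison would need the read file to be cold (for example after a remount).

```python
import os
import tempfile
import threading
import time


def timed_read_during_write(directory, read_bytes=1 << 20,
                            write_bytes=4 << 20, block=16384):
    """Time a sequential read while another thread writes in `directory`.

    Returns (bytes_read, elapsed_seconds). All names and sizes are
    illustrative assumptions, not part of the original test.
    """
    read_path = os.path.join(directory, "read-test-file")
    write_path = os.path.join(directory, "write-test-file")

    # Prepare the file that will be read back during the write load.
    with open(read_path, "wb") as f:
        f.write(b"\0" * read_bytes)

    def writer():
        # Roughly analogous to: dd if=/dev/zero of=largefile bs=16384
        buf = b"\0" * block
        with open(write_path, "wb") as f:
            for _ in range(write_bytes // block):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())

    thread = threading.Thread(target=writer)
    try:
        thread.start()
        start = time.perf_counter()
        total = 0
        with open(read_path, "rb") as f:
            while True:
                chunk = f.read(block)
                if not chunk:
                    break
                total += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        thread.join()
        for path in (read_path, write_path):
            if os.path.exists(path):
                os.unlink(path)
    return total, elapsed


if __name__ == "__main__":
    # Assumption: use a fresh directory on the filesystem under test.
    workdir = tempfile.mkdtemp()
    nbytes, seconds = timed_read_during_write(workdir)
    print(f"read {nbytes} bytes in {seconds:.4f} s while writing")
    os.rmdir(workdir)
```

Comparing the elapsed read time across the two mount options, on the same disk and with a cold cache, is what would confirm or refute the reasoning above.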