mode data=journal in ext3. Is it safe to use?

Helge Hafting helge.hafting at hist.no
Fri Jun 18 09:41:23 UTC 2004


Oleg Drokin wrote:

>Hello!
>
>On Thu, Jun 17, 2004 at 10:27:17AM +0200, Petter Larsen wrote:
>  
>
>>>PL> Can anybody of you acknowledge or not if mode data=journal in ext3 is
>>>PL> safe to use in Linux kernel 2.6.x?
>>>PL> Wee need to have a very consistent and integrity for our filesystem, and
>>>PL> it would then be desired to journal both data and metadata.
>>>OLEG> Actually data=journal mode would gain you mostly zero extra consistency compared
>>>to data=ordered mode. (the only more consistency bit that you get is
>>>correct mtime on files that have their pages overwritten, I think).
>>>You have zero control over transaction boundaries in ext3, so you still need
>>>to design your applications in such a way that they have their own
>>>sort of transactions (if this is needed).
>>>      
>>>
>>So your conclusion is that data=journal mode is useless if you do not
>>want a correct mtime?
>>    
>>
>
>Well, yes.
>
>  
>
>>It would be a littles sense in developing the data=journal mode if this
>>is the only benefit, don't you think?
>>>From the Linux/Documentation/filesystems/ext3.txt
>>data=journal            All data are committed into the journal prior
>>                        to being written into the main file system.
>>data=ordered    (*)     All data are forced directly out to the main
>>file                    system prior to its metadata being committed to
>>                        the journal.
>>My problem is that ext3 in the latest kernel, 2.6.x and the latest
>>2.4.x, are not well documented around the web. Whitepapers and so are
>>pretty old. Much have changed I belive in ext3 since it was first
>>introduced by Dr. Tweedie. The first release was journaling both data
>>and metadata, se also the transcript from Dr. Tweedie from the Ottawa
>>Linux Symposium 20th July 2000.
>>http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html
>>There he says that they are journaling both metadata and data, but that
>>the design goal is not to do that. So can this be interpreted that mode
>>data=journal is only there for historic reasons?
>>    
>>
>
>May be so. Also fsync heavy loads on real disk devices with large journals
>tend to benefit from journaled data mode as well.
>
>  
>
>>>PL> Data integrity is much more important for us than speed.
>>>
>>>OLEG> It is not clear what sort of extra data integrity do you expect from data
>>>journaling mode and why do you think it is there.
>>>      
>>>
>>I would belive that the goal for such a mode data=journal would gain
>>extra data integrity because it also journals data. Why should it not? I
>>    
>>
>
>Well, actually I bet you do not care if the data goes through journal or not
>as long as it is not lost.
>In case of ordered journaling mode, data is written first before metadata
>updates, mostly the same happens with data journal mode, only with the latter
>case date is written into journal and if transaction was not committed, after
>a reboot it won't be copied to where it should be, same scenario in ordered
>journal mode will result in data getting where it should be, but due to
>lack of metadata updates, you won't see it. (this is in case of append,
>for overwrite it will be a little bit different, but still you have no
>control over how much of stuff will be overwritten).
>
>  
>
>>would belive that it makes sense to have these different modes so people
>>can choose the best mode for there applications.
>>    
>>
>
>True.
>
>  
>
>>>OLEG> Garbage in files should not happen in data ordered mode as data pages are
>>>written first before metadata updates are committed.
>>>      
>>>
>>Are you sure?
>>    
>>
>
>If you can reproduce a garbage in files in ordered journal mode, that would be a
>bug that should be fixed then.
>  
>
Hard to _produce_, but consider:
1. Write data to an existing file
2. Sync metadata
3. data is forced out because of ordered mode, a powerout crash happens
    in the middle of this. The file now has a block with a mix of new 
and old,
    it may even be unreadable due to a bad sector checksum.

With data journalling you either get the old data (because the crash 
happened
during a write to the journal) or new data (crash happened during data 
write,
the data is restored from the good copy in the journal.)

Helge Hafting





More information about the Ext3-users mailing list