[Ext2-devel] Re: Linux performance bug: fsync() for files with zero links

Victor Porton porton at ex-code.com
Tue Feb 28 20:57:51 UTC 2006

On 28-Feb-2006 Stephen C. Tweedie wrote:
> On Tue, 2006-02-28 at 17:30 +0100, Erik Mouw wrote:
>> > From man write(2):
>> > 
>> >        write  writes  up  to  count  bytes  to the file referenced by the file
>> >        descriptor fd from the buffer starting at buf.  POSIX requires  that  a
>> >        read()  which  can  be  proved  to  occur  after a write() has returned
>> >        returns the new data.  Note that not all file systems  are  POSIX  con-
>> >        forming.

Erik, Stephen Tweedie has already correctly answered your other concerns.

I will add a note about the semantics:

>> Again: the number of links of an inode is not a reason to break
>> established semantics.
> Correct.  And the semantics *will* change with this patch, but in a
> subtle way.
> Ext3 happens to guarantee that after fsync(), *all* metadata for a file
> --- including directory metadata --- are synchronised to disk.  So if
> you unlink an open file and then fsync() it, you are guaranteed that the
> unlink has been committed to disk.  This is not, strictly speaking, a
> behaviour required by POSIX; but it's still useful, and would be broken
> if we disabled fsync() for files with i_nlink==0.

OK, Stephen, you have pointed out where following my idea would
significantly change the semantics, which it should not do.

So fsync() (but not fdatasync()) should indeed have an effect on an inode
with zero links, but _only the first time_. Precisely:

1. A boolean flag "no_links_committed" should be associated with every fd
(to save a bit of memory, it could instead be implemented by, e.g., storing
-1 (minus one) rather than 0 as the count of links in the fd data structure).

2. When a file is unlinked and the number of links becomes zero,
no_links_committed should be put in the reset state (or zero should be
written as the count of links in the fd data structure).

3. When fsync() (but not fdatasync() which is simpler) is called on a file:
   - If the number of links is above 0 proceed as usual.
   - If the number of links is zero:
     * If no_links_committed is false, do the directory synchronization
       (as mentioned by Stephen) but no other synchronization, and
       then set no_links_committed to true (or set the number of links
       to -1 for a slightly more efficient implementation).
     * If no_links_committed is true, do nothing.

Victor Porton (porton at ex-code.com) - http://porton.ex-code.com
