[Ext2-devel] Re: Linux performance bug: fsync() for files with zero links
porton at ex-code.com
Tue Feb 28 20:57:51 UTC 2006
On 28-Feb-2006 Stephen C. Tweedie wrote:
> On Tue, 2006-02-28 at 17:30 +0100, Erik Mouw wrote:
>> > From man write(2):
>> > write writes up to count bytes to the file referenced by the file
>> > descriptor fd from the buffer starting at buf. POSIX requires that a
>> > read() which can be proved to occur after a write() has returned
>> > returns the new data. Note that not all file systems are POSIX
>> > conforming.
Erik, Stephen Tweedie has already correctly answered your other concerns.
I will add a note about the semantics:
>> Again: the number of links of an inode is not a reason to break
>> established semantics.
> Correct. And the semantics *will* change with this patch, but in a
> subtle way.
> Ext3 happens to guarantee that after fsync(), *all* metadata for a file
> --- including directory metadata --- are synchronised to disk. So if
> you unlink an open file and then fsync() it, you are guaranteed that the
> unlink has been committed to disk. This is not, strictly speaking, a
> behaviour required by POSIX; but it's still useful, and would be broken
> if we disabled fsync() for files with i_nlink==0.
OK, Stephen, you have pointed out where following my idea would
significantly change the semantics, which it should not do.
So fsync() (but not fdatasync()) should indeed have an effect on an inode
with zero links, but _only the first time_. Precisely:
1. Every fd should have an associated boolean flag "no_links_committed"
   (to save a bit of memory it could instead be implemented e.g. by storing
   -1 (minus one) rather than 0 as the count of links in the fd data
   structure).
2. When a file is unlinked and the number of links becomes zero,
   no_links_committed should be reset to false (or zero written as the
   count of links in the fd data structure).
3. When fsync() (but not fdatasync(), which is simpler) is called on a file:
   - If the number of links is above 0, proceed as usual.
   - If the number of links is zero:
     * If no_links_committed is false, do the directory synchronization
       (as mentioned by Stephen) but no other synchronization, and
       then set no_links_committed to true (or the number of links to -1
       for a slightly more efficient implementation).
     * If no_links_committed is true, do nothing.
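
The steps above could be sketched in user-space C roughly as follows. This
is only a model of the flag logic, not kernel code: struct sim_file,
enum sync_action, sim_unlink() and sim_fsync() are names made up for
illustration, and the actual disk I/O is replaced by a return value that
says what would have to be synchronised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state; this is not a real kernel structure. */
struct sim_file {
    int  nlink;               /* current number of links */
    bool no_links_committed;  /* true once the zero-link state was synced */
};

/* What a given fsync() call would have to do under the proposal. */
enum sync_action {
    SYNC_FULL,      /* links > 0: proceed as usual */
    SYNC_DIR_ONLY,  /* first fsync() after the last unlink */
    SYNC_NOTHING    /* any later fsync() on a zero-link file */
};

/* Step 2: drop one link; reset the flag when the count reaches zero. */
void sim_unlink(struct sim_file *f)
{
    assert(f != NULL && f->nlink > 0);
    if (--f->nlink == 0)
        f->no_links_committed = false;
}

/* Step 3: decide what fsync() does, updating the flag as a side effect. */
enum sync_action sim_fsync(struct sim_file *f)
{
    assert(f != NULL);
    if (f->nlink > 0)
        return SYNC_FULL;
    if (!f->no_links_committed) {
        /* Commit the unlink (directory metadata) exactly once. */
        f->no_links_committed = true;
        return SYNC_DIR_ONLY;
    }
    return SYNC_NOTHING;
}
```

With a file that starts with one link, calling sim_unlink() and then
sim_fsync() three times would give SYNC_DIR_ONLY once and SYNC_NOTHING for
the two later calls. The -1 link count variant would simply fold the flag
into nlink.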
Victor Porton (porton at ex-code.com) - http://porton.ex-code.com