Things to do this week instead of arguing about mixers

Eric Sandeen sandeen at redhat.com
Mon Apr 27 19:10:10 UTC 2009


Now that we have ext4 as the new default filesystem, it'd be nice if we
could get more applications to take advantage of some of its features.

One big feature that has already been brought up on the list[1] is file
preallocation, which allows an application to pre-allocate blocks it
knows that it will eventually write into, thereby making sure it won't
run out of space, and also generally getting a more efficient/contiguous
file layout.

Only a few applications take advantage of this so far, in part
because it's new.[2]  The Transmission BitTorrent client uses it,
but only if you tweak a config file in (IMHO) non-obvious ways.

So it'd be great to help more applications take advantage of this, and
evangelize the interface a bit, and maybe do it in a semi-organized
fashion.  First a bit of background on what this interface is:

Filesystems which can flag ranges of blocks as allocated but not
initialized can preallocate those blocks to a file very quickly, without
needing to write 0s or do any actual file data IO to perform the
allocation.  Reads from such blocks return 0s, because the filesystem
knows from the flag that they are uninitialized.

ext4 (as well as btrfs, ocfs2, and xfs) has an ->fallocate inode
operation which is the hook to this fast preallocation interface.  The
low-level interface is a syscall, but that is unlikely to be what we'd
like applications to use directly.  There are 2 other paths to the
interface:

posix_fallocate(3):
int posix_fallocate(int fd, off_t offset, off_t len);

This has existed for some time, but recent glibc will call the efficient
syscall if the underlying filesystem supports it.  It will fall back to
essentially writing 0s to the file if not (or if on older glibc), and
this may not be desired behavior; preallocating 10G could take a very
long time this way.
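
As a rough illustration of the call above, here is a minimal sketch.
prealloc_posix() is a made-up helper name, and the choice to report
errors on stderr is just an assumption for the example.  The one thing
worth noticing is that posix_fallocate() hands back the error number as
its return value instead of setting errno.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* prealloc_posix() is a made-up helper name, not part of any library.
 * Reserves the byte range [0, len) for fd.
 * Returns 0 on success, else an error number. */
int prealloc_posix(int fd, off_t len)
{
    int err;

    assert(len > 0);    /* posix_fallocate() rejects len == 0 with EINVAL */

    /* The error comes back as the return value; errno is not set. */
    err = posix_fallocate(fd, 0, len);
    if (err != 0)
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
    return err;
}
```

On success the file size is extended to at least len bytes, whether
glibc got there via the fast syscall or by writing 0s.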

fallocate(2):
int fallocate(int fd, int mode, off_t offset, off_t len);

This is directly wired to the syscall, so only succeeds on filesystems
that support it.  It also takes a FALLOC_FL_KEEP_SIZE mode argument,
which allows one to allocate blocks without updating the file size if
desired (blocks can then be allocated past EOF).  This call is only
wired up in very recent glibc, but it is available in F11.
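
To make the FALLOC_FL_KEEP_SIZE behavior concrete, here is a small
sketch.  reserve_past_eof() and show_size_and_blocks() are made-up
helper names, and the sketch assumes a glibc whose <fcntl.h> exposes
both fallocate() and the FALLOC_FL_* flags under _GNU_SOURCE.

```c
#define _GNU_SOURCE     /* fallocate() is a Linux/glibc extension */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* glibc before 2.18 needs <linux/falloc.h> for the flag */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* reserve_past_eof() is a made-up helper name.
 * Allocates blocks for [0, len) but leaves the file size alone.
 * Returns 0 on success, -1 with errno set on failure. */
int reserve_past_eof(int fd, off_t len)
{
    assert(len > 0);

    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len) != 0) {
        int saved = errno;      /* fprintf() may clobber errno */
        if (saved == EOPNOTSUPP)
            fprintf(stderr, "filesystem has no fast preallocation\n");
        errno = saved;
        return -1;
    }
    return 0;
}

/* Made-up helper to show the effect of the call above. */
void show_size_and_blocks(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == 0)
        printf("size=%lld blocks=%lld\n",
               (long long)st.st_size, (long long)st.st_blocks);
}
```

On a filesystem that supports it, calling show_size_and_blocks() after
reserve_past_eof() on an empty file should report a size of 0 with a
nonzero block count.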

So, tasks I see to be done, to get this started:

* Come up with some template autoconf magic to make it easy for
  apps to detect fallocate() at build time, and some example
  code on how to use it
  - Should it fall back to posix_fallocate if fallocate is absent?

* Decide on some consistent build-time, run-time, and
  configuration behavior when enabling this
  - should build time use posix_fallocate if only it is available?
  - config enabled == use fallocate whenever the fs supports it?
  - config enabled == fall back to posix_fallocate or not?
  - I'd be happy enough with exclusively using fallocate()

* Come up with a list of apps which could benefit:
  - all torrent clients?
  - rsync? (some patches have floated before)
  - rpm? (file installation and/or db files?)
  - databases?
  - file downloaders?
  - virt image tools?
  - ____ ?

* Work with Fedora package maintainers and/or upstream to get this
  hooked up where appropriate

* (Make a wiki page or a tracker bug to follow all this?)
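
To make the first two items a bit more concrete, here is one possible
shape for the example code; it is a sketch under assumptions, not a
settled policy.  It assumes configure.ac runs
AC_CHECK_FUNCS([fallocate posix_fallocate]), which defines
HAVE_FALLOCATE and HAVE_POSIX_FALLOCATE in config.h.  The function name
app_preallocate() and the prealloc_policy knob are made up, and whether
to fall back to posix_fallocate() at all is exactly the open question
above, so it is left to the caller here.

```c
#define _GNU_SOURCE
#ifdef HAVE_CONFIG_H
#include "config.h"             /* HAVE_* from AC_CHECK_FUNCS */
#else
#define HAVE_FALLOCATE 1        /* assumed, so the sketch builds standalone */
#define HAVE_POSIX_FALLOCATE 1
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Made-up run-time/config knob: may we fall back to the slow path? */
enum prealloc_policy { PREALLOC_FAST_ONLY, PREALLOC_ALLOW_SLOW };

/* app_preallocate() is a made-up name.
 * Returns 0 on success, else an error number. */
int app_preallocate(int fd, off_t len, enum prealloc_policy policy)
{
    assert(len > 0);
#ifdef HAVE_FALLOCATE
    if (fallocate(fd, 0, 0, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)
        return errno;           /* a real failure, e.g. ENOSPC */
#endif
#ifdef HAVE_POSIX_FALLOCATE
    if (policy == PREALLOC_ALLOW_SLOW)
        return posix_fallocate(fd, 0, len);     /* may write 0s */
#endif
    (void)policy;               /* unused if neither call is available */
    return EOPNOTSUPP;
}
```

With PREALLOC_FAST_ONLY an application simply skips preallocation when
the filesystem can't do it quickly, which matches the "exclusively
using fallocate()" option above.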

Whaddya say?  Anyone want to help with this?

Thanks,
-Eric

[1]https://www.redhat.com/archives/fedora-devel-list/2009-April/msg00110.html
[2]https://www.redhat.com/archives/fedora-devel-list/2009-April/msg02078.html



