[libvirt] [PATCH] Use posix_fallocate() to allocate disk space

Amit Shah amit.shah at redhat.com
Mon Mar 2 09:32:11 UTC 2009


On (Tue) Feb 24 2009 [11:58:31], Daniel P. Berrange wrote:
> On Tue, Feb 24, 2009 at 05:09:31PM +0530, Amit Shah wrote:
...
> > The best case to get a non-fragmented VM image is to have it allocated
> > completely at create-time with fallocate().
> 
> The main problem with this change is that it'll make it harder for
> us to provide incremental feedback. As per the comment in the code, 
> it is our intention to make the volume creation API run as a background
> job which provides feedback on progress of allocation, and the ability
> to cancel the job. Since posix_fallocate() is an all-or-nothing kind of
> API it wouldn't be very helpful. 
> 
> What sort of performance boost does this give you ?  Would we perhaps
> be able to get close to it by writing in bigger chunks than 4k, or 
> mmap'ing the file and then doing a memset across it ?

I have a program up at [1] that gives me the following data.

[1]
http://fedorapeople.org/gitweb?p=amitshah/public_git/alloc-perf.git;a=blob_plain;f=test-file-zero-alloc-speed.c;hb=HEAD

I compiled results for ext3, ext4, xfs and btrfs. I used the following
methods to allocate a file (1 GB in size) and zero it:

- posix_fallocate()
- mmap() and memset()
- write() in chunks of 4k and 8k
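
For readers who don't want to open the link, here is a minimal sketch of
what the three methods look like in C. This is not the test program at
[1]; the function names (alloc_fallocate, alloc_mmap, alloc_chunks) are
made up for illustration, and timing and error reporting are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Method 1: a single call. Returns 0 or an error number (errno is not set). */
static int alloc_fallocate(int fd, off_t len)
{
    return posix_fallocate(fd, 0, len);
}

/* Method 2: extend the file, map it, and memset the mapping.
 * fd must be open O_RDWR for a writable shared mapping. */
static int alloc_mmap(int fd, off_t len)
{
    if (ftruncate(fd, len) < 0)
        return errno;
    void *p = mmap(NULL, (size_t)len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return errno;
    memset(p, 0, (size_t)len);
    return munmap(p, (size_t)len) < 0 ? errno : 0;
}

/* Method 3: write zero-filled buffers of `chunk` bytes, starting at the
 * current file offset. Returns 0 or an errno value. */
static int alloc_chunks(int fd, off_t len, size_t chunk)
{
    assert(chunk > 0);
    char *buf = calloc(1, chunk);
    if (buf == NULL)
        return ENOMEM;

    int ret = 0;
    off_t done = 0;
    while (done < len) {
        size_t want = (len - done < (off_t)chunk) ? (size_t)(len - done) : chunk;
        ssize_t n = write(fd, buf, want);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {
            ret = (n < 0) ? errno : EIO;
            break;
        }
        done += n;
    }
    free(buf);
    return ret;
}
```

Only the chunked variant has a natural place to report progress (after
each write()); the other two return once the whole file is done.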

Results:

---
ext4:
posix-fallocate run time:
        (approx 0s)
mmap run time:
        (approx 13s)
4096-sized chunk run time:
        (approx 15s)
8192-sized chunk run time:
        (approx 18s)

$ sudo filefrag /mnt/ext4/*
/mnt/ext4/file-chunk4: 29 extents found
/mnt/ext4/file-chunk8: 20 extents found
/mnt/ext4/file-mmap: 38 extents found
/mnt/ext4/file-pf: 1 extent found

---
xfs:
posix-fallocate run time:
        (approx 0s)
mmap run time:
        (approx 14s)
4096-sized chunk run time:
        (approx 18s)
8192-sized chunk run time:
        (approx 19s)

$ sudo filefrag /mnt/xfs/*
/mnt/xfs/file-chunk4: 3 extents found
/mnt/xfs/file-chunk8: 4 extents found
/mnt/xfs/file-mmap: 2 extents found
/mnt/xfs/file-pf: 1 extent found

---
ext3:
posix-fallocate run time:
        (approx 18s)
mmap run time:
        (approx 20s)
4096-sized chunk run time:
        (approx 22s)
8192-sized chunk run time:
        (approx 24s)

$ sudo filefrag /mnt/ext3/*
/mnt/ext3/file-chunk4: 38 extents found, perfection would be 9 extents
/mnt/ext3/file-chunk8: 9 extents found
/mnt/ext3/file-mmap: 44 extents found, perfection would be 9 extents
/mnt/ext3/file-pf: 9 extents found

---
btrfs:
posix-fallocate run time:
        (approx 0s)
mmap run time:
        (approx 18s)
4096-sized chunk run time:
        (approx 17s)
8192-sized chunk run time:
        (approx 19s)

$ sudo filefrag /mnt/btrfs/*
FIBMAP: Invalid argument

---

I have detailed results up at

http://fedorapeople.org/gitweb?p=amitshah/public_git/alloc-perf.git;a=blob_plain;f=results.txt;hb=HEAD

The link to the git tree is

http://fedorapeople.org/gitweb?p=amitshah/public_git/alloc-perf.git


Clearly, extent-based file systems provide a very fast fallocate()
implementation: allocating a new, zeroed 1 GB file takes close to no
time on ext4, xfs and btrfs, against 13s or more for the other methods.
Since F11 is going to have ext4 by default, I strongly suggest we switch
to posix_fallocate() for Linux hosts. Progress feedback should not
matter on the newer file systems, as the allocation is really fast, and
we currently don't have a feedback implementation for non-extent-based
file systems anyway. It really won't be missed on newer hosts.

If, in spite of this, some feedback is needed on a non-extent-based file
system, a run-time probe for the underlying file system can be made, and
we could default to chunk-based allocation in that case.
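
One way such a probe could look, as a rough sketch only: instead of
identifying the file system by name, try the Linux-specific fallocate()
call, which fails with EOPNOTSUPP where the file system has no native
support, and fall back to chunked writes with progress reports in that
case. This is a substitute for probing the file system type, not
something the patch does. alloc_with_probe(), progress_fn and
record_progress() are hypothetical names, and fallocate() needs a glibc
and kernel new enough to provide it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the Linux-specific fallocate() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical progress callback: bytes done so far, total bytes. */
typedef void (*progress_fn)(off_t done, off_t total);

/* Example callback that just remembers the last value reported. */
static off_t last_reported;
static void record_progress(off_t done, off_t total)
{
    (void)total;
    last_reported = done;
}

/* Allocate `len` zeroed bytes from offset 0. Returns 0 or an errno value. */
static int alloc_with_probe(int fd, off_t len, progress_fn report)
{
    assert(len > 0);            /* fallocate() rejects a zero length */

    /* Probe: the raw call fails instead of emulating when unsupported. */
    if (fallocate(fd, 0, 0, len) == 0) {
        if (report)
            report(len, len);   /* done in one step */
        return 0;
    }
    if (errno != EOPNOTSUPP && errno != ENOSYS)
        return errno;           /* a real error, e.g. ENOSPC */

    /* Fallback: write 4k chunks of zeroes, reporting after each one. */
    char buf[4096];
    memset(buf, 0, sizeof(buf));
    off_t done = 0;
    while (done < len) {
        size_t want = (len - done < (off_t)sizeof(buf))
                          ? (size_t)(len - done) : sizeof(buf);
        ssize_t n = pwrite(fd, buf, want, done);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry */
        if (n <= 0)
            return (n < 0) ? errno : EIO;
        done += n;
        if (report)
            report(done, len);
    }
    return 0;
}
```

A caller would do something like alloc_with_probe(fd, size,
record_progress); the fallback loop is also where a cancel check could
go.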

For systems that do not implement posix_fallocate(), some
configure-magic is needed.
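
On the C side that could be as small as the sketch below. It assumes
the build system defines HAVE_POSIX_FALLOCATE (for example in a
generated config.h) when the function is available; zero_alloc() is a
hypothetical name, not an existing libvirt function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Allocate and zero `len` bytes starting at `offset`.
 * Returns 0 or an error number, like posix_fallocate() does. */
static int zero_alloc(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);
#ifdef HAVE_POSIX_FALLOCATE
    /* Assumed to be set by configure when the function exists. */
    return posix_fallocate(fd, offset, len);
#else
    /* Portable fallback: write zeroes in 4k chunks. */
    char buf[4096];
    memset(buf, 0, sizeof(buf));
    off_t done = 0;
    while (done < len) {
        size_t want = (len - done < (off_t)sizeof(buf))
                          ? (size_t)(len - done) : sizeof(buf);
        ssize_t n = pwrite(fd, buf, want, offset + done);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry */
        if (n <= 0)
            return (n < 0) ? errno : EIO;
        done += n;
    }
    return 0;
#endif
}
```

Callers then never need to know which branch was compiled in.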

Amit



