[libvirt] [PATCH v2 1/3] virfile: Refactor safezero, introduce virFileFdPosixFallocate

Daniel P. Berrange berrange at redhat.com
Fri Aug 22 15:18:34 UTC 2014


On Fri, Aug 22, 2014 at 04:15:14PM +0100, Daniel P. Berrange wrote:
> On Fri, Aug 22, 2014 at 10:56:47AM -0400, John Ferlan wrote:
> > 
> > 
> > On 08/22/2014 10:46 AM, Daniel P. Berrange wrote:
> > > On Mon, Aug 11, 2014 at 04:30:19PM -0400, John Ferlan wrote:
> > >> Currently the safezero() function uses build conditionals to choose either
> > >> the posix_fallocate() or mmap() with a fallback to safewrite() in order to
> > >> preallocate a file.
> > >>
> > >> This patch will modify the logic in order to allow fallbacks in the
> > >> event that posix_fallocate() or the ftruncate() and mmap() doesn't work
> > >> properly. The fallback will be to use the slow safewrite of zero filled
> > >> buffers to the file.
> > > 
> > > Have you actually encountered posix_fallocate() failing in the
> > > real world?  It is supposed to automatically fall back to the
> > > equivalent of writing zeros if the filesystem / kernel does not
> > > support it, so we should not have to do runtime fallback ourselves.
> > > The existence of this fallback is the main distinction between
> > > posix_fallocate() and the fallocate() system call.
> > > 
> > 
> > It wasn't so much a "failure" as "unexpected results" - the key being
> > that the resulting created (or resized) file was not sized as expected.
> > 
> > For an NFS target the results are not what was expected.  I've left some
> > history in the prior set of patches with the following probably having
> > the most details:
> > 
> > http://www.redhat.com/archives/libvir-list/2014-August/msg00367.html
> 
> So, IIUC, the bug happens when the rsize mount option to NFS is not 4k.
> 
> strace'ing libvirtd on an NFS volume in this case shows:
> 
> open("/var/lib/libvirt/images/lettuce/foo", O_RDWR|O_CREAT|O_EXCL, 0600) = 24
> fstat(24, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> ftruncate(24, 1073741824)               = 0
> fallocate(24, 0, 0, 1073741824)         = -1 EOPNOTSUPP (Operation not supported)
> fallocate(24, 0, 0, 1073741824)         = -1 EOPNOTSUPP (Operation not supported)
> fstat(24, {st_mode=S_IFREG|0600, st_size=1073741824, ...}) = 0
> fstatfs(24, {f_type="NFS_SUPER_MAGIC", f_bsize=1048576, f_blocks=118342, f_bfree=71002, f_bavail=65632, f_files=7678560, f_ffree=5495931, f_fsid={0, 0}, f_namelen=255, f_frsize=1048576}) = 0
> pread(24, "\0", 1, 1048575)             = 1
> pwrite(24, "\0", 1, 1048575)            = 1
> pread(24, "\0", 1, 2097151)             = 1
> pwrite(24, "\0", 1, 2097151)            = 1
> pread(24, "\0", 1, 3145727)             = 1
> 
> 
> So we can see glibc here trying fallocate() and then falling back to
> writing zeros. Since the volume does not come out at the right size
> this seems to show a bug in glibc.
> 
> So I think we really ought to report that bug to glibc to be fixed
> there rather than working around it in libvirt, as there are many
> more applications besides libvirt that will be impacted by this
> bug.

Oops, meant to include the stack trace to show where the pread/pwrite
calls are coming from:

(gdb) bt
#0  pread64 () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f55a29f9c5e in internal_fallocate (fd=fd@entry=24, offset=1048575, len=1072693248)
    at ../sysdeps/posix/posix_fallocate.c:78
#2  0x00007f55a29f9cc7 in posix_fallocate (fd=fd@entry=24, offset=<optimized out>, len=<optimized out>)
    at ../sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c:62
#3  0x00007f55a6071026 in safezero (fd=fd@entry=24, offset=<optimized out>, len=<optimized out>) at util/virfile.c:1031
#4  0x00007f55916258c2 in createRawFile (inputvol=0x0, vol=0x7f5570008280, fd=24) at storage/storage_backend.c:389
#5  virStorageBackendCreateRaw (conn=<optimized out>, pool=<optimized out>, vol=0x7f5570008280, inputvol=0x0, 
    flags=<optimized out>) at storage/storage_backend.c:450
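
For illustration, the access pattern in the strace and backtrace above
is consistent with an emulation loop that touches one byte in each
f_bsize-sized block. The following is only a rough sketch of that
pattern under stated assumptions, not glibc's actual source: the names
emulate_fallocate() and first_touch_offset() are hypothetical, and the
block size is passed in as a parameter instead of being read from
fstatfs().

```c
/* Ask for pread()/pwrite() declarations when building in strict ISO mode. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: offset of the first byte the loop touches.
 * Chosen so that the final touched byte is the last byte of the range. */
static off_t first_touch_offset(off_t offset, off_t len, off_t block)
{
    assert(len > 0 && block > 0);
    return offset + (len - 1) % block;
}

/* Hypothetical sketch of a userspace fallocate emulation.
 * 'block' stands in for the f_bsize value reported by fstatfs().
 * Returns 0 on success or an errno value on failure. */
static int emulate_fallocate(int fd, off_t offset, off_t len, off_t block)
{
    struct stat st;
    off_t pos, end;

    if (offset < 0 || len <= 0 || block <= 0)
        return EINVAL;
    if (fstat(fd, &st) < 0)
        return errno;

    end = offset + len;
    for (pos = first_touch_offset(offset, len, block); pos < end; pos += block) {
        if (pos < st.st_size) {
            unsigned char c;
            ssize_t n = pread(fd, &c, 1, pos);

            if (n < 0)
                return errno;
            /* A non-zero byte means real data is already stored here,
             * so leave this block untouched. */
            if (n == 1 && c != 0)
                continue;
        }
        /* Write a single zero byte so the filesystem allocates the block. */
        if (pwrite(fd, "", 1, pos) != 1)
            return errno;
    }
    return 0;
}
```

With offset 0, len 1073741824 and a block of 1048576, the first touched
offset works out to 1048575 and later ones follow at 1048576-byte
strides, which matches the pread()/pwrite() offsets in the strace.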


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



