[Linux-cluster] GFS create file performance

Jeff Sturm jeff.sturm at eprize.com
Thu Mar 18 16:59:55 UTC 2010


> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Steven Whitehouse
> Sent: Wednesday, March 17, 2010 11:45 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] GFS create file performance
> 
> The create case can potentially cause a lot of other I/O to occur.
> Adding a directory entry usually takes only a short time, because there
> is already space available in the directory.  If, however, the directory
> has become full in some sense and needs to be expanded, there can be I/O
> to allocate a directory leaf block and/or hash table blocks and/or
> indirect blocks.

Steven, thanks for the detailed description--in hindsight this makes perfect sense.  Filesystems clearly do a lot of magic behind the scenes that normal users (like us) don't often consider.
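
For what it's worth, a minimal create-loop sketch like the one below (the mount point, directory and file count are placeholders, not our actual test harness) should be enough to expose the per-create cost we've been discussing:

/* Minimal sketch, not our real harness: time COUNT empty-file creates
 * in one directory.  TESTDIR and COUNT are made-up placeholders, and
 * TESTDIR is assumed to exist already. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define TESTDIR "/mnt/gfs/createtest"   /* hypothetical GFS mount */
#define COUNT   10000

int main(void)
{
    char path[256];
    struct timespec t0, t1;
    double secs;
    int i, fd;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < COUNT; i++) {
        snprintf(path, sizeof(path), TESTDIR "/f%06d", i);
        fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd < 0) {
            perror("open");
            exit(1);
        }
        close(fd);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d creates in %.2f s (%.0f us per create)\n",
           COUNT, secs, secs * 1e6 / COUNT);
    return 0;
}

(Build with something like gcc -O2 createtest.c -o createtest -lrt; the
-lrt is for clock_gettime on older glibc.)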

> This is in addition to the block for the inode itself, and if selinux or
> acls are in use, additional blocks may be allocated to contain their
> xattrs as well.

Good to know.  We'll likely disable selinux for a future round of testing.

We're well aware that we're starting to push the envelope of GFS and clustered filesystems.  Armed with this understanding, we might try to revamp our session storage so we don't need to create as many files.  One possibility is appending new information to existing files instead of creating a new file for each session; a rough sketch of what I have in mind is below.
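
Roughly, sessions would hash into a modest number of shared bucket files opened with O_APPEND, so the common path allocates no new inode or directory entry.  The paths, bucket scheme and record format here are invented for illustration, not our actual code:

/* Sketch only, assuming sessions can be hashed into a modest number of
 * bucket files; paths and record format are made up. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int append_session_record(const char *bucket_path, const char *record)
{
    /* O_APPEND positions each write at end-of-file, so once the bucket
     * file exists there is no new inode or directory entry to allocate
     * per session. */
    int fd = open(bucket_path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, record, strlen(record));
    close(fd);
    return (n == (ssize_t)strlen(record)) ? 0 : -1;
}

int main(void)
{
    /* Hypothetical bucket: all sessions hashing to 0042 share one file. */
    return append_session_record("/mnt/gfs/sessions/bucket-0042",
                                 "sess=abc123 ts=1268931595 data=...\n") ? 1 : 0;
}

The trade-off is obvious enough: far fewer creates, but concurrent appends from multiple nodes may contend for the lock on the shared bucket file, so we'd still have to measure how that behaves under GFS before committing to it.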

The high performance of mainstream filesystems like ext3 (without sync) can delude us into thinking file creation is cheap, or essentially free.  In the end we're still limited by I/O.

We can also take a more critical look at our block storage itself.  Either Fibre Channel or solid-state storage should help reduce latency.  High-capacity drives are not required for our application, but latency and throughput are critical.

(I'm aware, too, that there are non-filesystem approaches to our problem, but if we abandon the cluster fs too quickly I won't have learned what its true limits are.)

-Jeff
