optimising filesystem for many small files

Jon Burgess jburgess777 at googlemail.com
Sun Oct 18 15:07:37 UTC 2009

On Sun, 2009-10-18 at 18:44 +0530, Viji V Nair wrote:
> On Sun, Oct 18, 2009 at 5:11 PM, Matija Nalis <mnalis-ml at voyager.hr> wrote:
> > On Sun, Oct 18, 2009 at 03:01:46PM +0530, Viji V Nair wrote:
> >> The application which we are using are modified versions of mapnik and
> >> tilecache, these are single threaded so we are running 4 process at a
> >
> > How does it scale if you reduce the number of processes - especially if you
> > run just one of those ? As this is just a single disk, 4 simultaneous
> > readers/writers would probably *totally* kill it with seeks.
> >
> > I suspect it might even run faster with just 1 process than with 4 of
> > them...
> with one process it is giving me 6 seconds

That seems a little slow. Have you looked into optimising your mapnik
setup? The mapnik-users list or IRC channel is a good place to ask[1].

For comparison, the OpenStreetMap tile server typically renders an 8x8
block of 64 tiles in about 1 second, although the time varies greatly
depending on the amount of data within the tiles.

> >
> >> time. We can say only four images are created at a single point of
> >> time. Some times a single image is taking around 20 sec to create. I
> >
> > is that 20 secs just the write time for an precomputed file of 10k ?
> > Or does it also include reading and processing and writing ?
> this include processing and writing
> >
> >> can see lots of system resources are free, memory, processors etc
> >> (these are 4G, 2 x 5420 XEON)

4GB may be a little small. Have you checked whether the IO reading your
data sources is the bottleneck?

> > If you can modify hardware setup, RAID10 (better with many smaller disks
> > than with fewer bigger ones) should help *very* much. Flash-disk-thingies of
> > appropriate size are even better option (as the seek issues are few orders
> > of magnitude smaller problem). Also probably more RAM (unless you full
> > dataset is much smaller than 2 GB, which I doubt).
> >
> > On the other hand, have you tried testing some other filesystems ?
> > I've had much better performance with lots of small files of XFS (but that
> > was on big RAID5, so YMMV), for example.
> >
> > --
> > Opinions above are GNU-copylefted.
> >
> I have not tried XFS, but tried reiserfs. I could not see a large
> difference when compared with mkfs.ext4 -T small. I could see that
> reiser is giving better performance on overwrite, not on new writes.
> some times we overwrite existing image with new ones.
> Now the total files are 50Million, soon (with in an year) it will grow
> to 1 Billion. I know that we should move ahead with the hardware
> upgrades, also files system access is a large concern for us. There
> images are accessed over the internet and expecting a 100 million
> visits every month. For each user we need to transfer at least 3Mb of
> data.

Serving 3MB is about 1000 tiles (assuming roughly 3kB per tile). This is
a total of 100M * 1000 = 1e11 tiles/month or about 40,000 requests per
second. If every request needed an IO from a hard disk managing 100 IOPS
then you would need about 400 disks. Having a decent amount of RAM
should dramatically cut the number of requests reaching the disks.
Alternatively you might be able to do this all with just a few SSDs. The
Intel X25-E is rated at >35,000 IOPS for random 4kB reads[2].
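
As a rough sanity check of the arithmetic above, here is a minimal
Python sketch. The 30-day month and the cache_hit_ratio parameter are
assumptions made for illustration, not measured values; the other
figures are the ones quoted in the paragraph above.

```python
import math

# Assumption: a 30-day month, used only to turn monthly totals into a rate.
SECONDS_PER_MONTH = 30 * 24 * 3600


def requests_per_second(visits_per_month, tiles_per_visit):
    """Average tile requests per second for a given monthly load."""
    return visits_per_month * tiles_per_visit / SECONDS_PER_MONTH


def disks_needed(req_per_sec, iops_per_disk, cache_hit_ratio=0.0):
    """Devices needed if every cache miss costs exactly one disk IO.

    cache_hit_ratio is a hypothetical knob (0.0 = nothing served from
    RAM); the thread does not give a real hit ratio.
    """
    misses_per_sec = req_per_sec * (1.0 - cache_hit_ratio)
    return math.ceil(misses_per_sec / iops_per_disk)


if __name__ == "__main__":
    rate = requests_per_second(100_000_000, 1000)
    print("requests/s:        %.0f" % rate)
    print("100 IOPS disks:    %d" % disks_needed(rate, 100))
    print("35,000 IOPS SSDs:  %d" % disks_needed(rate, 35000))
```

This gives an average rate only; peak traffic would be higher, and a
real deployment would want headroom on top of whatever number comes out.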

I can give you some performance numbers about the OSM server for
comparison: At last count the OSM tile server had 568M tiles cached
using about 500GB of disk space[3]. The hardware is described on the
wiki[4]. It regularly serves 500+ tiles per second @ 50Mbps[5]. This is
about 40 million HTTP requests per day and several TB of traffic per


1: http://trac.mapnik.org/
2: http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-product-brief.pdf
3: http://wiki.openstreetmap.org/wiki/Tile_Disk_Usage
4: http://wiki.openstreetmap.org/wiki/Servers/yevaud
5: http://munin.openstreetmap.org/openstreetmap/yevaud.openstreetmap.html
