[Linux-cluster] GFS Journal Size

Fri Nov 6 09:45:58 UTC 2009

Hi,

On Thu, 2009-11-05 at 15:30 -0700, Andrew A. Neuschwander wrote:
> Is there a good way to determine what size journals are needed on a gfs? Is there any way to tell if
> a journal gets full? I have a very large file system (20TB) shared by three hosts with 48GB RAM
> each. With the default journal size, applications would become unresponsive under heavy sequential
> writes. I increased the journal size to 1GB (from the default 128M) and this alleviated the problem.
> 
> Searching hasn't turned up any discussions on journal size.
> 
> Thanks,
> -Andrew

It depends a lot upon the workload, and also upon the hardware, so it's
tricky to give any hard and fast answers. We don't currently have any
easy way to tell if the journal is getting full, although with the
tracepoints built into upstream/fedora kernels it should be possible to
get this information indirectly.
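
As a minimal sketch of the tracepoint route, the helper below switches on
one trace event by writing "1" to its enable file. The tracing root
(commonly /sys/kernel/debug/tracing) and the event name gfs2_log_blocks
are assumptions here, not something stated above; check what your kernel
lists under events/gfs2 before relying on either.

```python
import os


def enable_trace_event(tracing_root, subsystem, event):
    """Enable one trace event by writing "1" to its 'enable' file.

    tracing_root, subsystem and event are supplied by the caller; nothing
    here verifies that the kernel actually provides the event.
    """
    enable_path = os.path.join(tracing_root, "events", subsystem, event, "enable")
    with open(enable_path, "w") as f:
        f.write("1\n")
    return enable_path


# Hypothetical usage (needs root and a mounted debugfs):
# enable_trace_event("/sys/kernel/debug/tracing", "gfs2", "gfs2_log_blocks")
```

The resulting events would then have to be read back from the trace
buffer and interpreted by hand, which is why this only gives the
information indirectly.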

Unless you have journaled data mode turned on, only metadata is
journaled, so it is the amount of metadata being modified that
determines how quickly the journal fills up. Streaming writes create a
fair amount of metadata (assuming the files are not preallocated) in
the form of indirect blocks.
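
To get a feel for how much indirect-block metadata a streaming write
generates, here is a rough back-of-the-envelope sketch. The 4096-byte
block size and the figure of 500 pointers per indirect block are
illustrative assumptions, not exact on-disk values, and the top level of
pointers is assumed to live in the inode itself. Allocation bitmap
updates are ignored, so treat the result as a lower bound.

```python
def indirect_blocks(file_bytes, block_size=4096, ptrs_per_block=500):
    """Estimate the number of indirect blocks needed to map a file.

    Assumes a simple pointer tree whose top level fits in the inode,
    which is not counted. block_size and ptrs_per_block are illustrative
    assumptions.
    """
    # Number of data blocks, rounded up.
    level = (file_bytes + block_size - 1) // block_size
    total = 0
    # Add a level of indirect blocks until the pointers fit in the inode.
    while level > ptrs_per_block:
        level = (level + ptrs_per_block - 1) // ptrs_per_block
        total += level
    return total


GIB = 1024 ** 3
blocks_per_gib = indirect_blocks(GIB)
metadata_bytes_per_gib = blocks_per_gib * 4096
```

Under these assumptions each GiB written costs a few hundred indirect
blocks, i.e. on the order of a couple of MiB of journaled metadata.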

The journaled blocks are pinned in memory until they are written to the
journal. This means that with a larger journal, you can potentially take
up a lot of memory which would otherwise be used for the running of
applications and/or caching data. As a result, it's not a good idea to
have a journal that is too large a percentage of physical memory.
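
For the numbers in the question, a quick calculation shows the scale
involved. It assumes, as a worst case, that pinned blocks could occupy up
to the full size of one node's journal; the real figure depends on the
workload.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def journal_fraction_of_ram(journal_bytes, ram_bytes):
    """Return journal size as a fraction of physical memory."""
    return journal_bytes / ram_bytes


# Figures from the question: 48GB RAM per host, 128M default, 1GB enlarged.
default_frac = journal_fraction_of_ram(128 * MIB, 48 * GIB)
enlarged_frac = journal_fraction_of_ram(1 * GIB, 48 * GIB)

print(f"default journal:  {default_frac:.2%} of RAM")
print(f"enlarged journal: {enlarged_frac:.2%} of RAM")
```

Even the enlarged journal is only a few percent of memory on these hosts,
so the concern applies mainly to much larger journals or smaller machines.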

There are actually two limits to consider with respect to journal size.
The first is the number of blocks which can be put in the journal before
the journal is flushed, and the second (probably what you are coming up
against) is the requirement that all the journaled blocks must be
written back "in place" before a segment of the journal can be freed. It
is also possible to adjust the first of these items with the sysfs
incore_log_blocks setting. I should warn you though that this particular
setting is rather a crude way to make adjustments and at some future
point we intend to replace that with a better method.
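
A minimal sketch of reading and changing that setting follows. The
directory layout (<sysfs root>/<fs name>/tune/incore_log_blocks, with a
root such as /sys/fs/gfs2) is an assumption, and "mycluster:myfs" is a
hypothetical file system name; look at what your kernel actually exposes
before using it.

```python
import os


def tune_path(sysfs_root, fs_name, tunable="incore_log_blocks"):
    """Build the path to a tunable file (layout is an assumption)."""
    return os.path.join(sysfs_root, fs_name, "tune", tunable)


def read_tunable(path):
    """Read an integer tunable from a sysfs-style file."""
    with open(path) as f:
        return int(f.read().strip())


def write_tunable(path, value):
    """Write an integer tunable; on a real system this needs root."""
    with open(path, "w") as f:
        f.write(f"{int(value)}\n")


# Hypothetical usage:
# p = tune_path("/sys/fs/gfs2", "mycluster:myfs")
# old = read_tunable(p)
# write_tunable(p, old * 2)
```

Changes made this way would normally not persist across a remount, so
record the original value before experimenting.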

Does that answer your question?

Steve.