[Linux-cluster] I/O scheduler and performance
wcheng at redhat.com
Wed Jul 5 01:05:08 UTC 2006
On Wed, 2006-07-05 at 00:16 +0200, Ramon van Alteren wrote:
> I wondered if it mattered what default I/O scheduler I choose in my
> kernel setup for gfs performance.
> I've been looking in /sys/block but can't find any place to set the I/O
> scheduler for the devices I run gfs on, and looking at the docs it seems
> that gfs uses the VFS layer in the Linux kernel for its reading & writing.
> If I'm correct that should mean that the scheduler influences
> performance, right? If so, which one would be beneficial for gfs performance
> in general (if such a statement can be made), and which scheduler would be
> beneficial for a workload that consists mostly of writes and fairly few
> reads (90% vs 10%)?
The I/O scheduler does influence GFS performance, and the general rules of
Linux I/O and filesystem tuning apply to GFS. For example, if you have
lots of random writes all over the disk partitions, try to avoid
schedulers that attempt to merge and sort requests. In practice, I've
never found a single I/O scheduler that outperforms all the others across
all types of I/O workloads, even in mostly-write or mostly-read cases.
Performance depends heavily on the individual workload (random
writes, sequential writes, file size, directory setup), the system
configuration (memory size, disk array type, etc.), and sometimes the
cluster and disk layout. You have to experiment with, or
benchmark, your own workload before you can be sure of the choice.
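
On the question of where the scheduler can be set: on 2.6 kernels that
support runtime switching, each block device exposes its scheduler in
/sys/block/<device>/queue/scheduler, with the active one shown in square
brackets, and the elevator= boot parameter selects the default. The
Python sketch below reads and parses that file so the setting can be
recorded alongside each benchmark run. The function names and the example
device name are only illustrative, and the file may be absent on kernels
or device types that do not support switching.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Parse text such as 'noop anticipatory deadline [cfq]'.

    Returns (active, available): the bracketed entry is the active
    scheduler; active is None if no entry is bracketed.
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device, sysfs_root="/sys/block"):
    """Return (active, available) for a block device name such as 'sda'."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


def set_scheduler(device, name, sysfs_root="/sys/block"):
    """Select a scheduler at runtime by writing its name (needs root)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    path.write_text(name)
```

For example, read_scheduler("sda") on a machine whose file contains
"noop anticipatory deadline [cfq]" would report cfq as active. Switching
with set_scheduler() between benchmark runs lets you compare schedulers
on the same workload without rebooting.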
On the other hand, be aware of the cluster-filesystem nature of GFS:
if you access the same file (or directory) from
different nodes, inter-node locking and sync issues must be
considered. For example, if you do frequent writes on one node and
interleave them with immediate reads of the same file on another
node, you may see performance drop significantly. This is because the
writing node has to obtain an exclusive lock, write the file, and sync the
changes to disk before another node can read them. Compare this
with a single-node filesystem, where no inter-node locking (and hence no
network latency) is involved and the read can get its data from the memory
cache without any actual disk I/O.
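
One rough way to see this effect on your own cluster is to time a
write-plus-sync on one node and a read of the same file on another, then
repeat both on a single node for comparison. The Python sketch below only
measures wall-clock time with ordinary file calls. It does not inspect GFS
locks, and the helper names are made up for illustration.

```python
import os
import time


def timed_write_sync(path, payload):
    """Write payload to path, fsync it, and return the elapsed seconds."""
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        # os.write may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Force the data to stable storage before stopping the clock.
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def timed_read(path):
    """Read the whole file and return (data, elapsed seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return data, time.perf_counter() - start
```

Call timed_write_sync() in a loop on one node and timed_read() on another
against the same file on the GFS mount, then run both loops on one node.
The gap between the two sets of timings gives a feel for the inter-node
cost on your particular setup.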