[Linux-cluster] High availability mail server
Gordan Bobic
gordan at bobich.net
Mon Oct 26 09:05:23 UTC 2009
Samer ITS wrote:
> Dear Madi,
>
> I have 2 years of working with sun cluster 3.x
>
> So the concept is there, so I want to know how linux clustering is work for
> mail system
> Because I still planning for the project.
>
> The cluster is needed for Performance & high availability.
>
> So I need your advice in the planning & the best requirement for best
> performance (such servers)
> Already I have SAN storage (EMC Clariion - CX300)
Unfortunately, the way Maildir behaves (lots of files in few
directories) puts it at odds with your intention of using clustering to
gain performance through parallel operation on SAN-backed storage with
a cluster file system (be it GFS1, GFS2 or OCFS2).
You'll be better off with a NAS, but even so, you'll find that one
machine with local storage will still outperform a pair of clustered
machines running in parallel under heavy load, because the clustered
machines will spend most of their time bouncing locks around. You may
find that configuring them as fail-over, with an ext3 volume on the SAN
that gets mounted _ONLY_ on the machine that's currently running the
service, works faster.
The problem is that most people overlook why clustered file systems are
so slow, given the apparently low ping times to the SAN and between
machines on gigabit ethernet (or something faster). The generally
erroneous assumption is that given that the ping time is typically <
0.1ms, this is negligible compared to the 4-8ms access time of
mechanical disks. The problem is that 4-8ms is the wrong figure to be
comparing to - if the machine is really hitting the disk for every data
fetch, it is going to grind to a halt (think heavy swapping sort of
performance). Most of the working data set is expected to be in caches
most of the time, which is accessible in < 40ns (when all the latencies
between the CPU, MCH and RAM are accounted for).
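
To put those figures side by side, here is a rough back-of-envelope
sketch. It only uses the numbers quoted above (taking 0.1ms and 40ns at
their upper bounds); the variable names are mine and real hardware will
differ.

```python
# Figures quoted in the post, all converted to nanoseconds.
PING_NS = 100_000        # ~0.1 ms network round trip (upper bound)
DISK_NS_MIN = 4_000_000  # 4 ms mechanical disk access (low end of 4-8 ms)
RAM_NS = 40              # cached data access (upper bound)


def ratio(a_ns, b_ns):
    """How many times longer a_ns is than b_ns."""
    return a_ns / b_ns


# Against the disk, the round trip looks negligible...
ping_vs_disk = ratio(PING_NS, DISK_NS_MIN)
# ...but against cached data, which is the realistic baseline, it is huge.
ping_vs_ram = ratio(PING_NS, RAM_NS)

print(f"round trip / disk access : {ping_vs_disk}")
print(f"round trip / cache access: {ping_vs_ram}")
```

So the round trip is about 2.5% of a disk access, but roughly 2,500
times the cost of a cached access.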
The cluster file system pays that ~0.1ms network round trip, rather
than the 40ns cache access, for every access where a lock isn't cached
locally (and if both machines are accessing the same data set randomly,
the locks aren't going to be locally held most of the time).
This may well be fine when you are dealing with large-ish files and your
workload is arranged in such a way that accesses to a particular data
subtree typically come from only one node at a time, but for cases such
as a large Maildir being randomly accessed from multiple nodes, you'll
find the performance will tend to fall off a cliff pretty quickly as the
number of users and concurrent accesses starts to increase.
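
A crude way to see how quickly this degrades is to model the mean
access time as a function of how often the lock is already held
locally. This is only a sketch under a simplifying assumption of mine:
every access without a local lock costs exactly one network round trip
on top of the cached access. Real lock traffic may well cost more than
that. The function name and defaults are hypothetical.

```python
def mean_access_ns(local_lock_fraction, ram_ns=40, round_trip_ns=100_000):
    """Mean access time in ns for a given fraction of locally held locks.

    Assumed model: a cached access always costs ram_ns, and an access
    whose lock is not held locally adds one round_trip_ns on top.
    """
    if not 0.0 <= local_lock_fraction <= 1.0:
        raise ValueError("local_lock_fraction must be between 0 and 1")
    remote_fraction = 1.0 - local_lock_fraction
    return ram_ns + remote_fraction * round_trip_ns


# Walk from "all locks local" down to "no locks local".
for fraction in (1.0, 0.99, 0.9, 0.5, 0.0):
    print(f"{fraction:>4}: {mean_access_ns(fraction):>9.0f} ns")
```

Even in this optimistic model, having just 1 lock in 100 held elsewhere
dominates the average, because a single round trip costs as much as
thousands of cached accesses.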
The only way you are likely to overcome this is by logically
partitioning your data sets.
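
As one illustration of what logical partitioning could look like, the
sketch below pins each mailbox to a single node by hashing the user
name, so that any given Maildir is only ever touched from one machine.
The node names and the function are made up for the example, and this
is just one possible scheme, not a description of any particular mail
server's feature.

```python
import hashlib

# Hypothetical node names, purely for illustration.
NODES = ["mail1", "mail2"]


def node_for_user(user, nodes=NODES):
    """Pick the one node that should serve this user's mailbox.

    A stable hash (not Python's per-process randomised hash()) is used
    so that every machine computes the same answer for the same user.
    """
    digest = hashlib.sha256(user.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


for name in ("alice", "bob", "carol"):
    print(name, "->", node_for_user(name))
```

Note that simple modulo hashing like this reshuffles most users when
the number of nodes changes, and for fail-over the surviving node still
has to take over the failed node's partition.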
Gordan