[Linux-cluster] GFS performance problem

FM dist-list at LEXUM.UMontreal.CA
Thu Oct 26 14:57:56 UTC 2006


Thanks again :)

I read an interesting Q/A in the cluster project FAQ:
#  Why is GFS slow doing things like 'ls -lr *' whereas 'ls -r *' is fast?
Mostly due to design constraints. An ls -r * can simply traverse the
directory structures, which is very fast. An ls -lr * has to traverse
the directory, but also has to stat each file to get more details for
the ls. That means it has to acquire and release a cluster lock on each
file, which can be slow. We've tried to address these problems with the
new GFS2 file system.

Because of the way rsync works, it must stat every file and so has to
acquire a lock for each one.
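
To see how much the per-file stat costs on our volume, something like the
sketch below could time a name-only walk against a walk that stats every
file. The helper name "timed" and the path in the usage comment are my own
assumptions, not something taken from the FAQ.

```shell
# Sketch only: print how many whole seconds a command takes.
# "timed" is a hypothetical helper name.
timed() {
    start=$(date +%s)
    # Discard the command's own output so only the elapsed time is printed.
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Usage (the mount point is an assumption, adjust to the real one):
#   echo "names only: $(timed ls -R /mnt/gfs/webdata)s"
#   echo "with stat:  $(timed ls -lR /mnt/gfs/webdata)s"
```

The resolution is only one second, which should be enough for a tree with
hundreds of thousands of files.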

Our setup is:
*) all GFS filesystems are LUNs on the SAN (RAID 5)
*) 6 web servers write to the same log files.
*) One of these 6 servers writes into the web data folder (hundreds
of thousands of HTML files); all the others are readers. It is in this
folder that we are using rsync.

So what about this:
1 GFS with lock_dlm for our web log files (where all servers are writing).
1 GFS with lock_nolock for the web data folder, where all servers
read but only one writes.

Sound good or .... ;-) ?

Thanks again

Riaan van Niekerk wrote:
>
>
> FM wrote:
>> Thank you for the answers.
>>
>> Another bottleneck I could have is the way I connect to the SAN:
>> device-mapper multipath instead of the official HP module.
>>
>> I do not know whether performance would improve with the official
>> module.
>>
>
> There is no SecurePath for RHEL 4 (assuming that is what you mean by
> "official module"). That leaves you with
> a) device-mapper-multipath
> b) the Emulex or Qlogic HBA-based failover.
>
> I have heard mostly good things about (a) even though I don't have a
> lot of production experience with it myself. It seems the trend is to
> move away from vendor/array and HBA-based multipathing towards
> OS-based multipathing, be it Windows or Linux.
>
>>
>>
>>
>> Riaan van Niekerk wrote:
>>>
>>>
>>> FM wrote:
>>>> Hello,
>>>>
>>>> Here is my setup:
>>>> Red Hat Enterprise Linux 4 (Update 3).
>>>> 5 web servers connected to a 600 GB GFS (noatime) on a SAN.
>>>> On the GFS: all web site roots and httpd log files.
>>>> All web servers are writing to the same log. No problem here.
>>>>
>>>> The problem is with rsync. When writing to the GFS it is very, very slow!
>>>> (40% slower when we are lucky :) )
>>>>
>>>> Lots of questions:
>>>> *) Could it help if I had 1 GFS for the heavy writes to the log
>>>> and another GFS for the websites' files?
>>>> *) Except for noatime, are there other techniques to speed up GFS?
>>>
>>> a) we have increased the number of GFS locks cached to bring them
>>> closer in line with the number of locks in use:
>>> echo "200000" > /proc/cluster/lock_dlm/drop_count
>>> (not sure how much of a performance increase this got us)
>>>
>>> b) We have disabled quotas
>>> gfs_tool settune /mnt/san quota_account 0
>>> We saw a 3 - 5 increase in performance
>>>
>>> c) if you are using lots of small files, look into section 5.8 in
>>> the GFS manual, Data Journaling. However,
>>> - I don't know how to change this for existing data on a GFS
>>> - I have asked on the mailing list, and no-one seems to be using it.
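
The two settings in (a) and (b) could be wrapped up roughly as below. The
path, mount point and values are the ones quoted above; the function names
and the two environment variables are hypothetical, and this is only a
sketch, not a tested tuning script.

```shell
# Sketch only. Defaults are the path and mount point quoted in this thread;
# override them through the (hypothetical) environment variables if needed.
DROP_COUNT_FILE="${DROP_COUNT_FILE:-/proc/cluster/lock_dlm/drop_count}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/san}"

set_drop_count() {
    # $1 = new value. Write it, then print what the file now holds
    # so the change can be checked.
    echo "$1" > "$DROP_COUNT_FILE" && cat "$DROP_COUNT_FILE"
}

disable_quota_accounting() {
    # Same command as in (b), with the mount point made configurable.
    gfs_tool settune "$MOUNT_POINT" quota_account 0
}

# Usage (as root, on each node):
#   set_drop_count 200000
#   disable_quota_accounting
```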
>>>
>>>> *) Can several web servers access an ext3 FS (read-only) when only
>>>> one other server has RW access to it?
>>>
>>> no. the RO server will get confused when things change from under
>>> it, since it is not expecting things to change
>>>
>>>> *) Are there any options to tune rsync when using GFS?
>>>> *) We are using DLM as the locking system. All servers are connected
>>>> with Gigabit Ethernet (RJ45). Does DLM use the network to manage the
>>>> locks? And if that is the case, could my problem come from network
>>>> latency?
>>>
>>> DLM is using the network, yes. Not sure about latency. We use Gb
>>> Ethernet with RJ45/CAT5+ and have not had any problems related to
>>> DLM and the network (that we are aware of). As it was explained to
>>> me by Red Hat Support, DLM is extremely efficient, being able to
>>> master/distribute thousands of locks per second between nodes.
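
For a rough idea of the latency between two nodes, the average round-trip
time can be pulled out of the ping summary line, as in the sketch below.
The helper name "avg_rtt" and the host name "node2" are hypothetical, and
ping only measures ICMP round trips, so this is an indication at best and
says nothing directly about DLM traffic.

```shell
# Sketch only: read ping output on stdin and print the average RTT (ms).
# Relies on the usual summary line, e.g.
#   rtt min/avg/max/mdev = 0.123/0.456/0.789/0.012 ms
avg_rtt() {
    # Splitting that line on "/" puts the average in the fifth field.
    awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Usage ("node2" is a placeholder for another cluster node):
#   ping -c 20 -q node2 | avg_rtt
```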
>>>
>>>>
>>>> As I said, lots of questions here :)
>>>>
>>>
>>> and some answers. Wish I had answers to all your questions.
>>>
>>> greetings
>>> Riaan
>>>
>>> -- 
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster



