[Linux-cluster] GFS Performance Problems (RHEL5)
Paul Risenhoover
prisenhoover at sampledigital.com
Tue Nov 27 23:54:19 UTC 2007
Yes and No.
I've been running a RHEL 4.x server connected to a VTrak M500i with
750GB disks for the last year, and it's run beautifully. I have had no
performance problems with a 5TB volume (the disk array wasn't fully loaded).
In an effort to increase storage, I just purchased a VTrak 610 with 1TB
disks and prepped it exactly like the other (except with RHEL5). The
ultimate goal is to have two servers in an active/passive configuration
serving Samba.
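
For the curious, the rgmanager piece of that would be a cluster.conf
service along these lines (a rough sketch only; the node names, virtual
IP, and share details are placeholders, not my actual config):

    <rm>
      <failoverdomains>
        <failoverdomain name="samba-domain" ordered="1" restricted="1">
          <failoverdomainnode name="node1" priority="1"/>
          <failoverdomainnode name="node2" priority="2"/>
        </failoverdomain>
      </failoverdomains>
      <service name="samba-svc" domain="samba-domain" autostart="1">
        <ip address="10.0.0.50" monitor_link="1"/>
        <smb name="promise-share" workgroup="WORKGROUP"/>
      </service>
    </rm>

With only one node in the ordered domain active at a time, the virtual
IP and Samba fail over together to the passive node if the active one
dies.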
Would you be willing to share your discoveries?
Paul
James Chamberlain wrote:
> Hi Paul,
>
> I'm guessing from the information you give below that you're using a
> Promise VTrak M500i with 1 TB disks? Can you confirm this? I had
> uneven experience with that platform, which led me to abandon it; but
> I did make one or two discoveries along the way which may be useful if
> they are applicable to your setup. Can you share a little more about
> your hardware and setup?
>
> Regards,
>
> James Chamberlain
>
> On Tue, 27 Nov 2007, Paul Risenhoover wrote:
>
>>
>> Sorry about this mis-send.
>>
>> I'm guessing my problem has to do with this:
>>
>> https://www.redhat.com/archives/linux-cluster/2007-October/msg00332.html
>>
>> BTW: My file system is 13TB.
>>
>> I found this article that talks about tuning the glock_purge setting:
>> http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
>>
>> But it seems to require a special kernel module that I don't have :(.
>> Anybody know where I can get it?
>>
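>> From the readme, glock_purge is a percentage of unused glocks that
>> gfs_scand should try to trim on each pass, and once the patched
>> module is in place it would presumably be set like any other
>> tunable, something like this (the 50 is just a guess at a starting
>> point):
>>
>> gfs_tool settune /mnt/promise glock_purge 50
>>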
>> Paul
>>
>> Paul Risenhoover wrote:
>>> Hi All,
>>>
>>> I am experiencing some substantial performance problems on my RHEL 5
>>> server running GFS. The specific symptom is that the file system
>>> will occasionally hang for anywhere from 5 to 45 seconds. When this
>>> happens, it stalls every process that attempts to access the file
>>> system (e.g., "ls -l"), such that even a ctrl-break can't stop it.
>>>
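>>> The stalled processes are presumably sitting in uninterruptible (D)
>>> state, which would explain why they can't be killed. One way to see
>>> where they're blocked:
>>>
>>> ps -eo state,pid,wchan:32,comm | grep '^D'
>>>
>>> or, with sysrq enabled, dump every task's stack to the kernel log:
>>>
>>> echo t > /proc/sysrq-trigger
>>>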
>>> It also appears that gfs_scand is working extremely hard. It runs
>>> at 7-10% CPU almost constantly. I did some research on this and
>>> discovered a discussion about cluster locking in relation to
>>> directories with large numbers of files, and believe it might be
>>> related. I've got some directories with 5000+ files. However, I get
>>> the stalling behavior even when nothing is accessing those
>>> particular directories.
>>>
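>>> While one of these stalls is in progress, the lock counts can be
>>> watched with:
>>>
>>> gfs_tool counters /mnt/promise
>>>
>>> which should show whether the number of glocks just keeps climbing.
>>>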
>>> I also tried tuning some of the parameters:
>>>
>>> gfs_tool settune /mnt/promise demote_secs 10
>>> gfs_tool settune /mnt/promise scand_secs 2
>>> gfs_tool settune /mnt/promise reclaim_limit 1000
>>>
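>>> (The current values of all the tunables can be listed with
>>> "gfs_tool gettune /mnt/promise" for comparison.)
>>>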
>>> But these changes don't appear to have done much. Does anybody have
>>> some thoughts on how I might resolve this?
>>>
>>> Paul
>>>