[Linux-cluster] Problem in clvmd/dlm_recoverd

Tom Lanyon tom at netspot.com.au
Wed Nov 26 10:29:28 UTC 2008


On 20/11/2008, at 8:37 AM, Brandon Young wrote:

> Could you please share the parameters you tuned, and perhaps a brief  
> explanation of your thinking?  I have hideously slow backups, too,  
> and haven't been successful in improving it through tuning.
>
> --
> Brandon
>
>
> On Wed, Nov 19, 2008 at 3:56 PM, Tom Lanyon <tom at netspot.com.au>  
> wrote:
>> After tuning some GFS parameters yesterday, last night's backup ran  
>> without a hitch! :)


I changed settings to try to reduce the number of locks held on the
backup server. My assumption was that, because the application servers
were still trying to access the GFS mountpoint, they had to wait until
the locks held on the backup server were demoted, and that this was
causing the applications to hang.

I tuned glock_purge (it was disabled; I set it to 50%) and reduced
demote_secs to 100. I also turned on fast statfs (statfs_fast) and
increased the number of statfs_slots to try to improve backup
performance (in case the backup does a lot of statfs calls).
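
For reference, here is a minimal sketch of how tunables like these could be
applied with `gfs_tool settune`. It only builds and prints the command lines;
it does not run them. The mountpoint `/mnt/gfs` and the statfs_slots value of
128 are hypothetical placeholders (the actual values were not stated above),
and the tunable names, in particular `statfs_fast`, are assumed to match what
`gfs_tool gettune <mountpoint>` reports on your system, so check there first.

```python
import shlex

# Values taken from the message above unless marked hypothetical.
# Names are assumed to match `gfs_tool gettune <mountpoint>` output.
TUNABLES = {
    "glock_purge": 50,    # was disabled; set to 50 (percent)
    "demote_secs": 100,   # reduced to 100
    "statfs_fast": 1,     # turn on fast statfs
    "statfs_slots": 128,  # HYPOTHETICAL: the message only says "increased"
}


def settune_commands(mountpoint, tunables):
    """Build one `gfs_tool settune` argv list per tunable (nothing is executed)."""
    return [
        ["gfs_tool", "settune", mountpoint, name, str(value)]
        for name, value in tunables.items()
    ]


if __name__ == "__main__":
    # "/mnt/gfs" is a hypothetical mountpoint; substitute your own.
    for argv in settune_commands("/mnt/gfs", TUNABLES):
        print(shlex.join(argv))
```

As far as I know, values set this way do not survive a remount, so the
printed commands would need to be reapplied after each mount (for example
from an init script).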

However, after further testing we encountered more instability issues.  
Performance was greatly improved, but we were still finding storage  
locking up on multiple cluster nodes.

This seems to be more of an issue with GNBD (on which we're running
GFS) than with GFS itself. When the lockup happens, all machines
continue to run, but any command that references the GNBD export hangs
(e.g. lvm, df, mount).

I haven't been able to confirm this since GNBD isn't giving me any  
errors, but we're still investigating.

Regards,
Tom



