[Linux-cluster] Slowness above 500 RRDs

Mon Apr 23 21:17:18 UTC 2007

On Sat, Apr 21, 2007 at 01:59:52PM +0200, Ferenc Wagner wrote:
> I modified librrd to use flock instead if the LIBRRD_USE_FLOCK
> environmental variable is defined.  It makes this case evenly slow
> (with SHRINK_CACHE_MAX et al. left at their defaults):
> 
> filecount=500
>   iteration=0 elapsed time=10.335737 s
>   iteration=1 elapsed time=10.426273 s
>   iteration=2 elapsed time=10.634286 s
>   iteration=3 elapsed time=10.342333 s
>   iteration=4 elapsed time=10.458371 s
> total elapsed time=52.197 s
> 
> (The same for 501 files).  strace -T reveals an irregular pattern here:
> 
> filecount=5
> flock(3, LOCK_EX)                       = 0 <0.000116>
> flock(3, LOCK_EX)                       = 0 <0.000261>
> flock(3, LOCK_EX)                       = 0 <0.000113>
> flock(3, LOCK_EX)                       = 0 <0.037657>
> flock(3, LOCK_EX)                       = 0 <0.000093>
>   iteration=0 elapsed time=0.04244 s
> flock(3, LOCK_EX)                       = 0 <0.000094>
> flock(3, LOCK_EX)                       = 0 <0.037314>
> flock(3, LOCK_EX)                       = 0 <0.000105>
> flock(3, LOCK_EX)                       = 0 <0.038323>
> flock(3, LOCK_EX)                       = 0 <0.000087>
>   iteration=1 elapsed time=0.079957 s
> [...]

I suspect that another node has the fs mounted, which makes some of the
dlm resource-directory lookups remote; those are the calls that take
longer.  To verify, run the same test with only one node mounting the fs;
every lock should then be uniformly quick.  There is no way to change this
behavior.
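
To separate the quick calls from the slow ones in the `strace -T` output
quoted above, something like the following could be used.  It is only a
sketch: the function names and the 1 ms threshold are assumptions chosen
for illustration, not values taken from the dlm or from rrdtool.

```python
import re

# Matches a successful flock() line from `strace -T`, with an optional
# mail-quote ">" prefix, and captures the trailing <seconds> field.
TIMING = re.compile(r"^(?:>\s*)?flock\(.*\)\s*=\s*0\s*<([0-9.]+)>\s*$")


def flock_times(lines):
    """Return the duration in seconds of every successful flock() call."""
    times = []
    for line in lines:
        match = TIMING.match(line.strip())
        if match:
            times.append(float(match.group(1)))
    return times


def split_fast_slow(times, threshold=0.001):
    """Partition durations around an assumed threshold (1 ms by default)."""
    fast = [t for t in times if t < threshold]
    slow = [t for t in times if t >= threshold]
    return fast, slow


if __name__ == "__main__":
    # First iteration of the filecount=5 trace quoted above.
    sample = [
        "flock(3, LOCK_EX)                       = 0 <0.000116>",
        "flock(3, LOCK_EX)                       = 0 <0.000261>",
        "flock(3, LOCK_EX)                       = 0 <0.000113>",
        "flock(3, LOCK_EX)                       = 0 <0.037657>",
        "flock(3, LOCK_EX)                       = 0 <0.000093>",
    ]
    fast, slow = split_fast_slow(flock_times(sample))
    print("fast calls:", len(fast), "slow calls:", len(slow))
```

If the slow group disappears when only one node has the fs mounted, that
would be consistent with the remote-lookup explanation.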

> So I've got the impression that flock is slower than fcntl locking.
> The difference probably erodes with greater number of files, both
> being equally slow.  I must be missing something, it's just the
> opposite of what you suggest.

I suspect that the faster fcntl results below 1000 files (shown in your
other email) are a result of the cache of recently used dlm locks that you
were adjusting.  Once the number of files/locks exceeds SHRINK_CACHE_MAX,
the cache is useless and you see plocks a bit slower than flocks (although
I'm surprised; I expected plocks to be slower than you've shown).

I'm going to write a quick test to simulate what you're doing here to
verify these suspicions.
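
A rough sketch of what such a test might look like is below.  It is not
rrdtool's locking code and not the test referred to above; the file names,
function names and default counts are all assumptions.  It takes and
releases an exclusive lock on each of N files, once with flock and once
with fcntl (POSIX) locks, and reports the time per iteration.

```python
import fcntl
import os
import tempfile
import time


def make_files(directory, count):
    """Create `count` small files to lock (names are arbitrary)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, "lock%04d" % i)
        with open(path, "wb") as f:
            f.write(b"x")
        paths.append(path)
    return paths


def time_locks(paths, use_flock):
    """Open, lock, unlock and close every file once; return elapsed seconds."""
    start = time.monotonic()
    for path in paths:
        with open(path, "r+b") as f:
            if use_flock:
                fcntl.flock(f, fcntl.LOCK_EX)   # BSD-style flock
                fcntl.flock(f, fcntl.LOCK_UN)
            else:
                fcntl.lockf(f, fcntl.LOCK_EX)   # POSIX lock (plock) on whole file
                fcntl.lockf(f, fcntl.LOCK_UN)
    return time.monotonic() - start


def run(directory, filecount=500, iterations=5):
    """Return per-iteration timings for both lock types."""
    paths = make_files(directory, filecount)
    results = {}
    for name, use_flock in (("flock", True), ("fcntl", False)):
        results[name] = [time_locks(paths, use_flock) for _ in range(iterations)]
    return results


if __name__ == "__main__":
    # A temporary directory is used here only so the sketch runs anywhere;
    # for a meaningful result, point `directory` at the GFS mount instead.
    with tempfile.TemporaryDirectory() as directory:
        for name, times in run(directory, filecount=50, iterations=2).items():
            for i, elapsed in enumerate(times):
                print("%s iteration=%d elapsed time=%f s" % (name, i, elapsed))
```

Running it with filecount on either side of SHRINK_CACHE_MAX, first with
one node mounted and then with two, should show whether the cache and the
remote lookups account for the timings.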

> Also, what's that new infrastructure?  Do you mean GFS2?  I read it
> was not production quality yet, so I didn't mean to try it.  But again
> you may have got something else in your head...

GFS1 and GFS2 both run on the new openais-based cluster infrastructure
(in the cluster-2.00.00 release, and in the RHEL5 and HEAD cvs branches).

Dave