[Linux-cluster] Slowness above 500 RRDs

Tue Jun 12 15:43:24 UTC 2007

David Teigland <teigland at redhat.com> writes:

> On Tue, Jun 12, 2007 at 04:01:04PM +0200, Ferenc Wagner wrote:
>
>> with -l0:
>> 
>> filecount=500
>>   iteration=0 elapsed time=5.966146 s
>>   iteration=1 elapsed time=0.582058 s
>>   iteration=2 elapsed time=0.528272 s
>>   iteration=3 elapsed time=0.936438 s
>>   iteration=4 elapsed time=0.528147 s
>> total elapsed time=8.541061 s
>>
>> Looks like the bottleneck isn't the explicit locking (be it plock
>> or flock), but something else, like the built-in GFS locking.
>
> I'm guessing that these were run with a single node in the cluster?
> The second set of numbers (with -l0) wouldn't make much sense
> otherwise.

Yes, you guessed right.  For some reason I found it a good idea to
reveal this at the end only.  (Sorry.)

> In the end I expect that flocks are still going to be the fastest
> for you.

They really seem to be faster, but since the [fp]locking time is
negligible, it doesn't buy much.

> I think if you add nodes to the cluster, the -l0 numbers will go up
> quite a bit.

Let's see.  Mounting on one more node (and switching on the third):

# cman_tool services
type             level name     id       state       
fence            0     default  00010001 none        
[1 2 3]
dlm              1     clvmd    00020001 none        
[1 2 3]
dlm              1     test     000a0001 none        
[1 2]
gfs              2     test     00090001 none        
[1 2]

$ <commands to run the test>
filecount=500
  iteration=0 elapsed time=5.030971 s
  iteration=1 elapsed time=0.657682 s
  iteration=2 elapsed time=0.798228 s
  iteration=3 elapsed time=0.65742 s
  iteration=4 elapsed time=0.776301 s
total elapsed time=7.920602 s

Somewhat slower, yes.  But still pretty fast.

Mounting on the third node:

# cman_tool services
type             level name     id       state       
fence            0     default  00010001 none        
[1 2 3]
dlm              1     clvmd    00020001 none        
[1 2 3]
dlm              1     test     000a0001 none        
[1 2 3]
gfs              2     test     00090001 none        
[1 2 3]

$ <commands to run the test>
filecount=500
  iteration=0 elapsed time=0.822107 s
  iteration=1 elapsed time=0.656789 s
  iteration=2 elapsed time=0.657798 s
  iteration=3 elapsed time=0.881496 s
  iteration=4 elapsed time=0.659481 s
total elapsed time=3.677671 s

It's much the same...
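
To put rough numbers on "somewhat slower" and "much the same", here is a small sketch that averages the warm iterations (1 to 4) of the three runs quoted in this mail. Iteration 0 is left out because it is dominated by the cold start in two of the runs. The dictionary keys and the helper name warm_mean are my own labels for illustration, not part of the test program.

```python
# Per-iteration times in seconds, copied from the runs quoted in this mail.
# Key 1 is the single-node -l0 run; keys 2 and 3 are the runs with the
# filesystem mounted on two and three nodes (labels are illustrative).
runs = {
    1: [5.966146, 0.582058, 0.528272, 0.936438, 0.528147],
    2: [5.030971, 0.657682, 0.798228, 0.65742, 0.776301],
    3: [0.822107, 0.656789, 0.657798, 0.881496, 0.659481],
}

def warm_mean(times):
    """Mean of all iterations except the first (cold) one."""
    warm = times[1:]
    return sum(warm) / len(warm)

for nodes, times in sorted(runs.items()):
    print(f"nodes={nodes} total={sum(times):.6f} s "
          f"warm mean={warm_mean(times):.6f} s")
```

The warm means come out at roughly 0.64 s, 0.72 s and 0.71 s for one, two and three nodes, so the drop in total time on the third run is entirely due to the fast first iteration.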

>> Again, the above tests were done with a single node switched on,
>> and I'm not sure whether the results carry over to the real cluster
>> setup, will test it soon.
>
> Ah, yep.  When you add nodes the plocks will become much slower.
> Again, I think you'll have better luck with flocks.

I didn't get anywhere with flocks.  At least strace didn't lead me
anywhere.  It showed me that the flock() calls are indeed faster than
the fcntl64() calls, but neither took a significant percentage of the
full run time.  I don't claim to fully understand the strace output,
but the raw numbers above certainly reflect reality.
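
For anyone wanting to repeat the comparison without strace, here is a minimal sketch of how the time spent inside the lock calls could be separated from the rest of a pass over the files. It is not my actual test program: the helper names, the file names and the one-byte write standing in for an RRD update are all assumptions made for illustration. It uses Python's fcntl.flock() for flocks and fcntl.lockf() for plocks (the latter goes through the fcntl() locking interface).

```python
import fcntl
import os
import tempfile
import time

def flock_lock(f):
    fcntl.flock(f, fcntl.LOCK_EX)      # BSD-style flock()

def flock_unlock(f):
    fcntl.flock(f, fcntl.LOCK_UN)

def plock_lock(f):
    fcntl.lockf(f, fcntl.LOCK_EX)      # POSIX lock via fcntl()

def plock_unlock(f):
    fcntl.lockf(f, fcntl.LOCK_UN)

def timed_pass(paths, lock, unlock):
    """Touch every file once; return (seconds in lock calls, total seconds)."""
    lock_time = 0.0
    start = time.perf_counter()
    for path in paths:
        with open(path, "r+b") as f:
            t0 = time.perf_counter()
            lock(f)
            lock_time += time.perf_counter() - t0
            f.write(b"x")              # hypothetical stand-in for the RRD update
            f.flush()
            t0 = time.perf_counter()
            unlock(f)
            lock_time += time.perf_counter() - t0
    return lock_time, time.perf_counter() - start

def make_files(directory, count):
    """Create `count` small files and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"file{i}.rrd")
        with open(path, "wb") as f:
            f.write(b"\0" * 1024)
        paths.append(path)
    return paths

if __name__ == "__main__":
    # A local temporary directory; create the files on the GFS mount
    # instead to measure the cluster filesystem.
    with tempfile.TemporaryDirectory() as d:
        paths = make_files(d, 500)
        for name, lock, unlock in (("flock", flock_lock, flock_unlock),
                                   ("plock", plock_lock, plock_unlock)):
            lock_time, total = timed_pass(paths, lock, unlock)
            print(f"{name}: lock calls {lock_time:.6f} s "
                  f"of {total:.6f} s total")
```

If the lock share stays small on the GFS mount as well, that would agree with what strace suggested: the time goes somewhere other than the explicit locking.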

>> Also, I can send the results of the scenario suggested by you, if
>> it's still relevant.  In short: the locks are always mastered on
>> node A only, but the performance is poor nevertheless.
>
> Poor even in the first step when you're just mounting on nodeA?

Yes, as detailed in the other mail.
-- 
Thanks,
Feri.