[Linux-cluster] Slowness above 500 RRDs

Ferenc Wagner wferi at niif.hu
Tue Jun 12 15:06:56 UTC 2007


Hi David,

Here is the older mail that I wrote but never sent.  Meanwhile, I'm
bringing in other nodes to continue the tests from my previous mail.

David Teigland <teigland at redhat.com> writes:

> Make sure that drop_count is zero again, now it's in sysfs:
>   echo 0 > /sys/fs/gfs/<foo>:<bar>/lock_module/drop_count

Ok, I set it after mount, on nodeA only.

> Then run some tests:
> - mount on nodeA
> - run the test on nodeA
filecount=500
  iteration=0 elapsed time=7.392987 s
  iteration=1 elapsed time=4.27927 s
  iteration=2 elapsed time=5.262367 s
  iteration=3 elapsed time=5.265202 s
  iteration=4 elapsed time=5.269652 s
total elapsed time=27.469478 s
> - count locks on nodeA
>   (cat /sys/kernel/debug/dlm/<bar> | grep Master | wc -l)
nodeA: 1031
> - mount on nodeB (don't do anything on this node)
> - run the test again on nodeA
filecount=500
  iteration=0 elapsed time=4.288288 s
  iteration=1 elapsed time=5.282437 s
  iteration=2 elapsed time=5.244141 s
  iteration=3 elapsed time=5.268136 s
  iteration=4 elapsed time=5.261129 s
total elapsed time=25.344131 s
> - count locks on nodeA and nodeB (see above)
nodeA: 1030
nodeB:   20
> - mount on nodeC (don't do anything on nodes B or C)
> - run the test again on nodeA
filecount=500
  iteration=0 elapsed time=4.307917 s
  iteration=1 elapsed time=5.298193 s
  iteration=2 elapsed time=5.295678 s
  iteration=3 elapsed time=5.336343 s
  iteration=4 elapsed time=5.308529 s
total elapsed time=25.54666 s
> - count locks on nodes A, B and C (see above)
nodeA: 1030
nodeB:   20
nodeC:   21
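
To cross-check runs like the three above, here is a minimal shell sketch
that sums the per-iteration times and prints them next to the total the
test itself reported.  It assumes the output format shown above
("iteration=N elapsed time=X s" and "total elapsed time=X s");
summarize_run is a made-up helper name, not part of the test program.

```shell
# Sum the per-iteration times in test output read from stdin and print
# them next to the total that the test itself reported.
# summarize_run is a hypothetical helper name used only for illustration.
summarize_run() {
    awk -F'elapsed time=' '
        /^ *iteration=/ { sum += $2 + 0; n++ }   # "$2 + 0" drops the trailing " s"
        /^total /       { reported = $2 + 0 }
        END {
            if (n > 0)
                printf "iterations=%d sum=%.6f mean=%.6f reported=%.6f\n", n, sum, sum / n, reported
        }'
}
```

Usage, assuming the output of one run was saved to a file (run1.txt is
a placeholder name): summarize_run < run1.txt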

> We're basically trying to produce the best-case performance from one node,
> nodeA.  That means making sure that nodeA is mastering all locks and doing
> maximum caching.  That's why it's important that we not do anything at all
> that accesses the fs on nodes B or C, or do any extra mounts/unmounts.

Yes, this sounds very reasonable.  But it looks like nodeA feels obliged
to communicate its locking activity around the cluster.  What confuses
me is that it emits multicast packets even when it's the only member.
With other members present, it passes tokens around the cluster, which
makes more sense, though it still seems unnecessary, as nodeA is the
lock master (if I get the lock master concept right).
-- 
Thanks,
Feri.



