[Linux-cluster] Finding the bottleneck between SAN and GFS2
Daniel Dehennin
daniel.dehennin at baby-gnu.org
Tue Jun 30 19:37:27 UTC 2015
Hello,
We are experiencing slow VMs on our OpenNebula architecture:
- two Dell PowerEdge M620
+ Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
+ 96GB RAM
+ 2x 146GB SAS drives
- 2TB SAN LUN to store qcow2 images with GFS2 over cLVM
We ran some tests, installing Linux OSes in parallel, and did not find
any performance issues.
For the past three weeks, 17 users have been running ±60 VMs and everything has become slow.
The SAN administrator complains about very high IO/s, so we limited each
VM to 80 IO/s in the libvirt configuration:
#+begin_src xml
<total_iops_sec>80</total_iops_sec>
#+end_src
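For what it's worth, the same cap can be checked or changed on a running
VM with virsh; a quick sketch (the domain name "one-42" and target device
"vda" below are placeholders, not our real names):

#+begin_src sh
# Show the current I/O tuning of a disk on a running domain
virsh blkdeviotune one-42 vda

# Apply the 80 IO/s cap at runtime without editing the domain XML
virsh blkdeviotune one-42 vda --total-iops-sec 80 --live
#+end_src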
But it did not get better.
Today I ran some benchmarks to try to find out what is happening.
Checking plocks/s
=================
I started with ping_pong[1] to see how many locks per second the GFS2
filesystem can sustain.
I used it as described on the Samba wiki[2]; here are the results:
- starting “ping_pong /var/lib/one/datastores/test_plock 3” on the first
  node displays around 4k plocks/s
- then starting “ping_pong /var/lib/one/datastores/test_plock 3” on the
  second node displays around 2k plocks/s on each node
For the single-node case, I was expecting a much higher rate; the wiki
mentions 500k to 1M locks/s.
Do my numbers look strange?
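One thing that may be worth checking (assuming plocks go through
dlm_controld on this setup, as I believe they do with GFS2) is whether a
posix lock rate limit is configured in the cluster stack; a rough sketch,
the exact file depending on the distribution:

#+begin_src sh
# Does dlm_controld enforce a plock rate limit? (0 means unlimited)
grep -i plock /etc/dlm/dlm.conf 2>/dev/null

# Dump dlm_controld's debug buffer and look at plock activity
dlm_tool dump | grep -i plock
#+end_src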
Checking fileio
===============
I used “sysbench --test=fileio” to test inside a VM and outside (on a
bare metal node), with the test files in cache and with the caches dropped.
The short result is that bare-metal access to GFS2 without any cache
is terribly slow: around 1.5Mb/s and 99 requests/s.
Is there a way to find out if the problem comes from my
GFS2/corosync/pacemaker configuration or from the SAN?
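One way I can think of to separate the two layers would be to read
directly from the shared logical volume, bypassing GFS2 entirely; a rough
sketch (the device path below is only a placeholder for our cLVM volume):

#+begin_src sh
# Direct (uncached) reads from the shared LV, bypassing GFS2.
# If this is also slow, the SAN/multipath side is suspect;
# if it is fast, the problem is more likely in the GFS2/DLM layer.
dd if=/dev/mapper/vg_san-lv_datastore of=/dev/null bs=16k count=10000 iflag=direct
#+end_src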
Regards.
The full sysbench results follow.
In the VM, qemu disk cache disabled, total_iops_sec = 0
-------------------------------------------------------
I also tried with the IO limit enabled, but the difference is minimal:
- the requests/s drop to ±80
- the throughput is around 1.2Mb/s
root@vm:~# sysbench --num-threads=16 --test=fileio --file-total-size=9G --file-test-mode=rndrw prepare
sysbench 0.4.12: multi-threaded system evaluation benchmark
128 files, 73728Kb each, 9216Mb total
Creating files for the test...
root@vm:~# sysbench --num-threads=16 --test=fileio --file-total-size=9G --file-test-mode=rndrw run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 16
Extra file open flags: 0
128 files, 72Mb each
9Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.
Operations performed: 6034 Read, 4019 Write, 12808 Other = 22861 Total
Read 94.281Mb Written 62.797Mb Total transferred 157.08Mb (1.4318Mb/sec)
91.64 Requests/sec executed
Test execution summary:
total time: 109.7050s
total number of events: 10053
total time taken by event execution: 464.7600
per-request statistics:
min: 0.01ms
avg: 46.23ms
max: 11488.59ms
approx. 95 percentile: 125.81ms
Threads fairness:
events (avg/stddev): 628.3125/59.81
execution time (avg/stddev): 29.0475/6.34
On the bare metal node, with the caches dropped
-----------------------------------------------
After creating the 128 files, I dropped the caches to get “from the SAN” results.
root@nebula1:/var/lib/one/datastores/bench# sysbench --num-threads=16 --test=fileio --file-total-size=9G --file-test-mode=rndrw prepare
sysbench 0.4.12: multi-threaded system evaluation benchmark
128 files, 73728Kb each, 9216Mb total
Creating files for the test...
# DROP CACHES
root@nebula1: echo 3 > /proc/sys/vm/drop_caches
root@nebula1:/var/lib/one/datastores/bench# sysbench --num-threads=16 --test=fileio --file-total-size=9G --file-test-mode=rndrw run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 16
Extra file open flags: 0
128 files, 72Mb each
9Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.
Operations performed: 6013 Read, 3999 Write, 12800 Other = 22812 Total
Read 93.953Mb Written 62.484Mb Total transferred 156.44Mb (1.5465Mb/sec)
98.98 Requests/sec executed
Test execution summary:
total time: 101.1559s
total number of events: 10012
total time taken by event execution: 1109.0862
per-request statistics:
min: 0.01ms
avg: 110.78ms
max: 13098.27ms
approx. 95 percentile: 164.52ms
Threads fairness:
events (avg/stddev): 625.7500/114.50
execution time (avg/stddev): 69.3179/6.54
On the bare metal node, with the test files loaded in the cache
---------------------------------------------------------------
I ran md5sum on all the files to let the kernel cache them.
# Load files in cache
root@nebula1:/var/lib/one/datastores/bench# md5sum test*
root@nebula1:/var/lib/one/datastores/bench# sysbench --num-threads=16 --test=fileio --file-total-size=9G --file-test-mode=rndrw run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 16
Extra file open flags: 0
128 files, 72Mb each
9Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.
Operations performed: 6069 Read, 4061 Write, 12813 Other = 22943 Total
Read 94.828Mb Written 63.453Mb Total transferred 158.28Mb (54.896Mb/sec)
3513.36 Requests/sec executed
Test execution summary:
total time: 2.8833s
total number of events: 10130
total time taken by event execution: 16.3824
per-request statistics:
min: 0.01ms
avg: 1.62ms
max: 760.53ms
approx. 95 percentile: 5.51ms
Threads fairness:
events (avg/stddev): 633.1250/146.90
execution time (avg/stddev): 1.0239/0.33
Footnotes:
[1] https://git.samba.org/?p=ctdb.git;a=blob;f=utils/ping_pong/ping_pong.c
[2] https://wiki.samba.org/index.php/Ping_pong
--
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF