[Linux-cluster] DDraid benchmarks

Daniel Phillips phillips at redhat.com
Tue Mar 29 15:16:49 UTC 2005


Good morning,

Here are the promised benchmarks for the ddraid cluster raid 3.5.

Test configuration:

   - 1 GHz PIII, 1 GB memory
   - 5 scsi disks in a hotplug scsi backplane, 39 MB/sec each
   - ddraid server running on same machine
   - AF_UNIX network connection to server from dm target (sketched below)
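
For illustration, here is a minimal userspace sketch of the AF_UNIX 
connection style.  The socket path is made up for the example, and the 
real dm target does its socket work from kernel space, so this is not 
the actual ddraid code:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    struct sockaddr_un addr;
    int sock = socket(AF_UNIX, SOCK_STREAM, 0);  /* local stream socket */

    if (sock < 0) {
        perror("socket");
        return 1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* hypothetical server socket path, not ddraid's actual one */
    strncpy(addr.sun_path, "/tmp/ddraid-server", sizeof(addr.sun_path) - 1);

    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        close(sock);
        return 1;
    }

    /* ... exchange synchronization messages with the server ... */
    close(sock);
    return 0;
}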

Two tests from opposite ends of the disk load spectrum:

   1) untar linux-2.6.11.3 to root dir
      - Many small, nonlinear transfers

   2) cp the linux-2.6.11.3 tarball to root dir
      - A few large, linear transfers

Unmount after each test.  Unmounting flushes the page cache, and that 
flush accounts for some or all of the actual disk IO.  So for each 
test, two timings are shown: one for the test itself and one for the 
unmount.

To see where the ddraid overheads lie, each ddraid test was run with 
the four combinations of parity calculation (calc) and network 
synchronization (sync) enabled or disabled, as indicated.  Some tests 
were run with both a 3 disk (ddraid order 1) and a 5 disk (ddraid 
order 2) array.  Some of the 5 disk tests failed to complete (a timer 
BUG, not yet investigated).

DDraid synchronization includes not only network message traffic, but 
also recording the region-by-region dirty state persistently to disk 
before completing each write IO.  Parity calculations are performed on 
both write and read (the latter to double-check for errors).
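
Roughly, the parity side of this amounts to XORing the data chunks of 
each stripe.  The sketch below is illustrative only (not the actual 
ddraid code), and the chunk size is an arbitrary example value:

#include <stdint.h>
#include <string.h>

#define CHUNK_BYTES 4096    /* example chunk size, not ddraid's actual value */

/* Parity chunk = XOR of all data chunks in the stripe (computed on write). */
static void compute_parity(uint8_t *data[], int ndata,
                           uint8_t parity[CHUNK_BYTES])
{
    memset(parity, 0, CHUNK_BYTES);
    for (int d = 0; d < ndata; d++)
        for (int i = 0; i < CHUNK_BYTES; i++)
            parity[i] ^= data[d][i];
}

/* On read, recompute the parity and compare it with the stored parity
 * chunk; a mismatch means the stripe has an error. */
static int parity_ok(uint8_t *data[], int ndata,
                     const uint8_t stored_parity[CHUNK_BYTES])
{
    uint8_t check[CHUNK_BYTES];

    compute_parity(data, ndata, check);
    return memcmp(check, stored_parity, CHUNK_BYTES) == 0;
}

int main(void)
{
    static uint8_t d0[CHUNK_BYTES], d1[CHUNK_BYTES], parity[CHUNK_BYTES];
    uint8_t *stripe[] = { d0, d1 };   /* order 1: two data chunks per stripe */

    d0[0] = 0x5a; d1[0] = 0xa5;       /* arbitrary test data */
    compute_parity(stripe, 2, parity);
    return parity_ok(stripe, 2, parity) ? 0 : 1;
}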


First, the fragmented load:

untar linux-2.6.11.3
====================

raw scsi disk
process: real     48.994s user     45.526s sys       3.063s
umount:  real      3.084s user      0.002s sys       0.429s

raw scsi disk (again)
process: real     48.603s user     45.380s sys       2.976s
umount:  real      3.145s user      0.005s sys       0.421s

ddraid order 1, no calc, no sync
process: real     49.942s user     46.328s sys       3.028s
umount:  real      2.034s user      0.005s sys       0.626s

ddraid order 1, calc, no sync
process: real     50.864s user     46.221s sys       3.195s
umount:  real      1.839s user      0.006s sys       1.099s

ddraid order 1, calc, sync
process: real     50.979s user     46.382s sys       3.222s
umount:  real      1.895s user      0.002s sys       0.531s

ddraid order 2, no calc, no sync
process: real     49.532s user     45.837s sys       3.145s
umount:  real      1.318s user      0.004s sys       0.718s

ddraid order 2, calc, no sync
process: real     49.742s user     45.527s sys       3.135s
umount:  real      1.625s user      0.004s sys       1.054s

ddraid order 2, calc, sync: <oops>


Interpretation: Fragmented IO to the ddraid array runs in about the 
same time as IO to a single, raw disk, regardless of whether the 
persistent dirty log is running, and regardless of whether parity 
calculations are enabled.  Unmount time is consistently faster for 
ddraid than for the raw disk, giving ddraid a slight edge over the raw 
disk overall.

I had hoped to show considerably more improvement over the raw disk 
here, but there are per-transfer overheads that prevent the array from 
running at full speed.  I suspect these overheads are not in the ddraid 
code, but we shall see.  The main point is: ddraid is not slower than a 
raw disk for this demanding load.

Next, the linear load:

cp /zoo/linux-2.6.11.3.tar.bz2 /x
=================================

raw scsi disk
process: real      0.258s user      0.008s sys       0.236s
umount:  real      1.019s user      0.003s sys       0.032s

raw scsi disk (again)
process: real      0.264s user      0.013s sys       0.237s
umount:  real      1.053s user      0.005s sys       0.029s

raw scsi disk (again)
process: real      0.267s user      0.018s sys       0.233s
umount:  real      1.019s user      0.006s sys       0.028s

ddraid order 1, calc, no sync
process: real      0.267s user      0.007s sys       0.243s
umount:  real      0.568s user      0.006s sys       0.250s

ddraid order 1, no calc, sync
process: real      0.267s user      0.011s sys       0.240s
umount:  real      0.608s user      0.002s sys       0.032s

ddraid order 1, calc, sync
process: real      0.265s user      0.008s sys       0.239s
umount:  real      0.596s user      0.004s sys       0.042s

ddraid order 2, no calc, no sync
process: real      0.266s user      0.013s sys       0.234s
umount:  real      0.381s user      0.004s sys       0.049s

ddraid order 2, calc, no sync
process: real      0.269s user      0.010s sys       0.239s
umount:  real      0.392s user      0.004s sys       0.201s

ddraid order 2, no calc, sync: <oops>

ddraid order 2, calc, sync: <oops>

Interpretation: DDraid really shines on the linear load, beating the 
raw disk by 41% with a 3 disk array and 62% with a 5 disk array (the 
gain shows up in the unmount times, where the actual disk IO happens).

Bottom line: Even with the overhead of running a persistent dirty log 
and doing parity calculations, ddraid is not slower than a single raw 
disk on worst case fragmented loads, and is dramatically faster on best 
case linear loads.  In my opinion, this performance is good enough for 
production use.  Besides the performance advantage, there is of course 
the redundancy, which is actually the main point.  (Note, however, that 
there are some bugs and miscellaneous deficiencies remaining to be 
ironed out.)

There is still a lot of room for further optimization.  It is likely 
that ddraid performance can eventually approach the aggregate 
throughput of the array, less the parity disk.  For example, an order 
2 array has four data disks plus one parity disk, so linear throughput 
could in principle approach four times that of a single disk.

Regards,

Daniel



