[dm-devel] Thin-provisioned LVs throughput

Moshe Lazarov Moshe.Lazarov at axxana.com
Thu Jul 27 23:02:05 UTC 2017


Hi,
I am Moshe Lazarov, a researcher at Axxana (https://www.axxana.com). We have developed a Black Box for zero data loss in asynchronous replication systems.
We are considering integrating LVM2 and its thin provisioning and snapshot mechanisms into our product.
Lately I have been running several experiments with LVs and snapshots, using LVM2 (2.02.98) and Linux kernel 4.9.11.
The platforms on which I run the tests are a VM (on a Dell server carrying Intel Xeon E5-2620 CPUs with 2 SAS HDDs) and Axxana's Black Box ("BBX"), a unique motherboard carrying an Intel i7-3517UE CPU and 2 SAS SSDs.
Both platforms run the same scripts and tester application for creating the destinations (LVs and snapshots) and writing to them. The VG (which includes all the LVs and snapshots) is striped (64KB stripe size) over the 2 drives.
In the graph below one can see the throughput of writing 1GB (each time a different, random 1GB is written) to each of the following destinations on the VM or on the BBX (a sketch of how these destinations are created follows the list):
SlowLV - A linear LV.
T0 1st - 1st write to a thinly-provisioned LV. Since chunk allocation is triggered by this access, the reduced throughput is expected.
T0 - Consecutive writes to T0 (the thinly-provisioned LV).
T0S0 1st - 1st write to a thinly-provisioned snapshot of T0 (the thinly-provisioned LV above). Since chunk allocation is triggered by this access, the reduced throughput is expected.
T0S0 - Consecutive writes to T0S0 (the snapshot of the thinly-provisioned LV).
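
For reference, here is a minimal sketch of how such destinations can be created with LVM2 (the VG/LV names match the device legend further down; the sizes, stripe size and chunk size are placeholders, not necessarily the ones used in these tests):

#!/usr/bin/env python3
"""Sketch only: create a linear LV, a striped thin pool, a thin LV (T0)
and a thin snapshot of it (T0S0). Needs root and an existing VG named "vg".
Sizes and chunk/stripe values below are placeholders."""
import subprocess

def lvm(*args):
    # Run an LVM command and raise if it fails.
    subprocess.run(args, check=True)

# SlowLV: a plain linear LV used as the baseline destination.
lvm("lvcreate", "-L", "10G", "-n", "slowlv", "vg")

# Thin pool striped over the 2 PVs (64KB stripe size; 64KB pool chunk size as
# an example). The chunk is what gets allocated on a first write to a region,
# which is why the "1st" writes above are slower.
lvm("lvcreate", "-L", "50G", "-i", "2", "-I", "64k", "-c", "64k",
    "-T", "vg/poolThinDataLV")

# T0: a thin LV backed by the pool; T0S0: a thin snapshot of T0.
lvm("lvcreate", "-V", "10G", "-T", "vg/poolThinDataLV", "-n", "t0")
lvm("lvcreate", "-s", "-n", "t0s0", "vg/t0")
# Depending on the LVM2 version, the snapshot may need explicit activation.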

[Graph (attached as image004.png, http://listman.redhat.com/archives/dm-devel/attachments/20170727/3708ba49/attachment.png): write throughput to each destination on the VM and on the BBX]

IMHO, the throughput degradation between writes to the SlowLV and writes to T0 or T0S0 is extremely high; while the BBX writes to the linear LV at ~740MBps, writes to the other destinations are limited to ~400MBps (almost 50% degradation). On the VM there is at least ~17% degradation.
Since both platforms run the same code and reach roughly the same write throughput, it seems that the software/kernel is the bottleneck preventing higher throughput.
What is your opinion about that?
In addition, I have added the output of iostat captured while writing to T0 (the thin-provisioned LV).
The following legend applies:
vg-slowlv                  (252:0)
vg-ramlv1                  (252:1)
vg-ramlv2                  (252:2)
vg-poolThinDataLV_tmeta    (252:3)
vg-poolThinDataLV_tdata    (252:4)
vg-poolThinDataLV-tpool    (252:5)
vg-poolThinDataLV          (252:6)
vg-t0                      (252:7)
vg-t0s0                    (252:8)

1GB of data was written during 3 iostat sampling windows (one sample per second). Look at the 2nd one-second window: data was written to dm-4 (poolThinDataLV_tdata) and dm-5 (poolThinDataLV-tpool) at 557MBps, while it was written to dm-7 (t0) at 820MBps.
During the 1st one-second window, the behavior was different; data was written to dm-4, dm-5 and dm-7 at the same throughput (an average of 204MBps over that window).
In other cases, data is written to all 3 devices at the same speed during all three one-second windows.
What could be the reason for the behavior in the 2nd window?
Is data really written twice to dm-4 and dm-5 (once to each), or is it the same write reported at two levels of the device-mapper stack?
Can throughput be improved by increasing the request size (i.e., writing in larger requests; if so, how)?
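
Regarding the last question: the dm devices in the iostat output below show avgrq-sz = 8 sectors, i.e. ~4KB per request. What I mean by "larger requests" is something like the following sketch (not our actual tester; the device path and sizes are made up), which uses O_DIRECT so each write() call submits REQUEST_SIZE bytes without going through the page cache:

#!/usr/bin/env python3
"""Sketch: write 1GB to a block device with O_DIRECT and a configurable
request size. The path and sizes here are illustrative only."""
import mmap
import os

DEV = "/dev/vg/t0"               # hypothetical path to the thin LV
REQUEST_SIZE = 1024 * 1024       # 1MB per write() instead of 4KB
TOTAL_SIZE = 1024 * 1024 * 1024  # 1GB overall

# O_DIRECT needs the buffer aligned to the logical block size; an anonymous
# mmap is page-aligned, which is sufficient.
buf = mmap.mmap(-1, REQUEST_SIZE)
buf.write(os.urandom(REQUEST_SIZE))

fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT)
try:
    for _ in range(TOTAL_SIZE // REQUEST_SIZE):
        os.write(fd, buf)        # one REQUEST_SIZE write() per iteration
finally:
    os.close(fd)

(A similar effect can be checked quickly with dd using bs=1M and oflag=direct.)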

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            0.26       0   30.08       0       0   69.67

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda      0       0       19    0       0.07   0       8         0.04      2.11   2.11     0        0.21   0.4
sdb      0       24785   14    1312    0.05   96.31   148.84    5.69      4.17   0.29     4.21     0.44   58.4
sdc      0       24772   1     1312    0      96.31   150.23    5.52      4.07   0        4.08     0.44   58.4
dm-0     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-1     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-2     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-3     0       0       15    0       0.06   0       8         0         0.27   0.27     0        0.27   0.4
dm-4     0       0       0     52224   0      204     8         1253.31   23.21  0        23.21    0.01   58.4
dm-5     0       0       0     52224   0      204     8         1253.35   23.21  0        23.21    0.01   58.4
dm-7     0       0       0     52225   0      204     8         1255.08   23.25  0        23.25    0.01   58.4
dm-8     0       0       0     0       0      0       0         0         0      0        0        0      0


avg-cpu:   %user   %nice %system %iowait  %steal   %idle
               0       0   57.37    3.68       0   38.95

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda      0       0       1     0       0      0       8         0         0      0        0        0      0
sdb      0       18031   20    53270   0.08   284.15  10.92     52.74     0.99   3.4      0.99     0.02   99.6
sdc      0       18044   27    53264   0.11   284.12  10.92     32.19     0.61   0.59     0.61     0.02   99.2
dm-0     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-1     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-2     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-3     0       0       47    0       0.18   0       8         0.08      1.79   1.79     0        0.6    2.8
dm-4     0       0       0     142612  0      557.08  8         1505.75   10.85  0        10.85    0.01   100
dm-5     0       0       0     142613  0      557.08  8         1545.26   10.85  0        10.85    0.01   100
dm-7     0       0       0     209919  0      820     8         35341.8   87.93  0        87.93    0      100
dm-8     0       0       0     0       0      0       0         0         0      0        0        0      0


avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            0.51       0   30.61    9.95       0   58.93

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda      0       8       1291  4       5.83   0.05    9.3       1.62      1.25   1.25     2        0.12   15.2
sdb      0       0       0     33674   0      131.54  8         9.09      0.27   0        0.27     0.01   43.2
sdc      0       0       0     33680   0      131.56  8         7         0.21   0        0.21     0.01   43.6
dm-0     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-1     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-2     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-3     0       0       0     0       0      0       0         0         0      0        0        0      0
dm-4     0       0       0     67308   0      262.92  8         15.96     0.24   0        0.24     0.01   43.6
dm-5     0       0       0     67307   0      262.92  8         16.2      0.24   0        0.24     0.01   43.6
dm-7     0       0       0     0       0      0       0         14886.74  0      0        0        0      43.6
dm-8     0       0       0     0       0      0       0         0         0      0        0        0      0



I hope the information is clear.
I would appreciate your response to the questions I've raised above.

Thanks a lot,
-Moshe
----------------------------------------
Moshe Lazarov
Axxana

C:    +1-669-213-9752
F:    +972-74-7887878
www.axxana.com

