[Linux-cluster] GFS2 DLM problem on NVMes

성백재 bj.sung at sk.com
Mon Nov 20 04:23:35 UTC 2017


Hello, List.

We are developing a storage system built on 10 NVMe drives (our current test setup).
With MD RAID10 + CLVM/GFS2 shared across four hosts, we reach up to 22 GB/s on reads.
However, we have run into a GFS2 DLM problem: each host frequently logs “dlm: gfs2: send_repeat_remove” kernel messages, and I/O throughput becomes unstable and drops.
I found a GFS2 commit message about the “send_repeat_remove” function:
(https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?id=96006ea6d4eea73466e90ef353bf34e507724e77)
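To put a number on "frequently", the messages could be counted on each host. The following is only a minimal sketch: it assumes the kernel log is readable through the `dmesg` command (which may need root) and that every relevant line contains the string "send_repeat_remove". The helper names `count_marker` and `read_kernel_log` are made up for illustration.

```python
import subprocess

# Substring taken from the kernel message quoted above.
MARKER = "send_repeat_remove"


def count_marker(lines, marker=MARKER):
    """Count the log lines that contain the marker substring."""
    return sum(1 for line in lines if marker in line)


def read_kernel_log():
    """Return the kernel ring buffer as a list of lines (assumes dmesg is available)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(f"{MARKER} messages: {count_marker(read_kernel_log())}")
```

Running this on each of the four hosts at intervals would show whether the message rate rises when throughput drops.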

Here is some information about the test environment:
The four hosts share the 10 NVMe drives, and each host runs CLVM/GFS2 on top of cluster MD RAID1 + MD RAID0.
The GFS2 file system holds 2,000 directories, each with 1,900 media files (3 MB on average).
Each host runs 20 NGINX threads, and each thread reads media files at random, on demand.
The Linux kernel version is 4.11.8.
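For a sense of scale, the figures above can be combined into a rough estimate. This is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 1,000,000 MB) and that the 22 GB/s figure is the aggregate across all four hosts. The function name `workload_summary` is just for illustration.

```python
# Figures taken from the description above.
DIRECTORIES = 2_000
FILES_PER_DIRECTORY = 1_900
AVERAGE_FILE_MB = 3        # average file size, decimal megabytes
HOSTS = 4
THREADS_PER_HOST = 20
PEAK_READ_GB_PER_S = 22    # assumed to be the aggregate over all hosts


def workload_summary():
    """Derive total file count, data volume, and per-host figures."""
    total_files = DIRECTORIES * FILES_PER_DIRECTORY
    total_tb = total_files * AVERAGE_FILE_MB / 1_000_000
    return {
        "total_files": total_files,
        "total_tb": total_tb,
        "reader_threads": HOSTS * THREADS_PER_HOST,
        "peak_gb_per_s_per_host": PEAK_READ_GB_PER_S / HOSTS,
    }


if __name__ == "__main__":
    for name, value in workload_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the file system holds about 3.8 million files (roughly 11.4 TB), read at random by 80 threads in total.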

Can you offer any suggestions or directions for solving this problem?
Thank you in advance :)

Best regards,
/Jay Sung

Jay Sung (Baegjae), Manager | Software Defined Storage Lab | SK Telecom Co., LTD.
bj.sung at sk.com | mobile: +82-10-2087-5637