[Cluster-devel] [GFS2 PATCH v1 0/2] Improve throughput through rgrp sharing
Bob Peterson
rpeterso at redhat.com
Wed Apr 18 16:58:36 UTC 2018
This is a preliminary patch set, but the results are promising. I've been
testing it pretty hard in a variety of circumstances, and it seems to be
fairly solid, so I thought I'd get it out here for people to review.
On 17 January 2018, I posted an experimental patch set meant to improve
intra-node resource group sharing, titled "GFS2: Rework rgrp glock
congestion functions for intra-node". It reduced rgrp contention by
simply steering contending processes to different rgrps. In RHEL6 we
used "try locks", which accomplished basically the same thing.
Steve Whitehouse suggested a better approach: to actually share the same
rgrps within a node. This patch set implements Steve's suggestion.
The first patch introduces a new glock locking mode called EXSH, meaning
exclusively shared within one node. To all other nodes (and to DLM) the
glock looks and acts like it is held EX. But to the node that has it
locked, it may be shared among processes like an SH lock.
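
As a rough illustration of the semantics described above (not the code in
the patch), the following Python sketch models how an existing holder might
look to a new requester. The helper name can_share and the rule that only
EXSH requests from the holding node may join an EXSH holder are assumptions
made for illustration; the real rules live in the glock state machine and DLM.

```python
# Toy model of the EXSH idea: cluster-wide it behaves like EX,
# but processes on the holding node may share it like SH.
# can_share is a hypothetical helper, not part of GFS2.

def can_share(held_mode, requested_mode, same_node):
    """Return True if a new request can be granted alongside an existing holder."""
    if held_mode == "SH":
        # Ordinary shared lock: any node may add another SH holder.
        return requested_mode == "SH"
    if held_mode == "EXSH":
        # Assumption: only EXSH requests from the holding node may join.
        return same_node and requested_mode == "EXSH"
    # EX (or any other mode in this toy model) excludes everyone.
    return False
```

In this model a second process on the same node is granted EXSH alongside
the first, while a request from any other node is refused exactly as it
would be for EX.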
The second patch adds hooks to the rgrp code to use the new glock locking
mode. A new rwsem, rd_sem, ensures exclusive use of the rgrp when it is
needed. Whenever an rgrp is added to a transaction, its rwsem is taken and
the rgrp is queued to the transaction. When the transaction ends, the rwsems
of all rgrps queued to that transaction are unlocked.
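
The lock-at-add, unlock-at-end pattern described above can be sketched in
user space as follows. This is only a model under stated assumptions: the
Rgrp and Transaction classes are hypothetical, threading.Lock stands in for
the write side of rd_sem, and skipping an rgrp that is already queued is a
choice made for this sketch, not something the patch description states.

```python
import threading

class Rgrp:
    """Hypothetical stand-in for a resource group."""
    def __init__(self, name):
        self.name = name
        # threading.Lock stands in for the write side of rd_sem.
        self.rd_sem = threading.Lock()

class Transaction:
    """Hypothetical transaction that remembers which rgrps it has locked."""
    def __init__(self):
        self.rgrps = []

    def add_rgrp(self, rgrp):
        # Assumption: take the lock only the first time this rgrp is added.
        if rgrp not in self.rgrps:
            rgrp.rd_sem.acquire()
            self.rgrps.append(rgrp)

    def end(self):
        # Unlock every rgrp that was queued, then forget them.
        for rgrp in self.rgrps:
            rgrp.rd_sem.release()
        self.rgrps.clear()
```

Holding the lock until the transaction ends means a process does not need to
track each rgrp's lock separately; ending the transaction releases them all.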
Preliminary performance testing using iozone looks very promising.
With 16 simultaneous writers, the aggregate throughput seen by the parent
is about 6 times higher with the patch. Even with 4 writers, it roughly
doubles:
                                                7.5 kernel       Patched kernel
                                                --------------   --------------
Children see throughput for  1 initial writers  525062.81 kB/s   527972.50 kB/s
Parent sees throughput for   1 initial writers  525049.74 kB/s   527971.69 kB/s
Children see throughput for  2 initial writers  612600.62 kB/s   603398.75 kB/s
Parent sees throughput for   2 initial writers  600944.08 kB/s   603140.65 kB/s
Children see throughput for  4 initial writers  596730.64 kB/s   694901.31 kB/s
Parent sees throughput for   4 initial writers  232777.32 kB/s   472287.19 kB/s
Children see throughput for  6 initial writers  574034.05 kB/s   739531.62 kB/s
Parent sees throughput for   6 initial writers  160751.73 kB/s   515363.98 kB/s
Children see throughput for  8 initial writers  644463.33 kB/s   727810.48 kB/s
Parent sees throughput for   8 initial writers  155939.49 kB/s   559100.85 kB/s
Children see throughput for 10 initial writers  613880.30 kB/s   736029.91 kB/s
Parent sees throughput for  10 initial writers  174366.86 kB/s   663429.43 kB/s
Children see throughput for 12 initial writers  610206.54 kB/s   744490.04 kB/s
Parent sees throughput for  12 initial writers  150910.72 kB/s   682414.33 kB/s
Children see throughput for 14 initial writers  625055.97 kB/s   804518.57 kB/s
Parent sees throughput for  14 initial writers  129122.67 kB/s   781340.39 kB/s
Children see throughput for 16 initial writers  627972.96 kB/s   794149.06 kB/s
Parent sees throughput for  16 initial writers  124565.02 kB/s   764981.28 kB/s
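
For anyone who wants to check the "6 times" and "doubled" figures, the
ratios can be recomputed from the "Parent sees throughput" rows. The snippet
below is just that arithmetic; the dictionary and function names are made up
for this example.

```python
# "Parent sees throughput" values in kB/s, copied from the table above.
# Each entry is (7.5 kernel, patched kernel).
parent_kbps = {
    4: (232777.32, 472287.19),
    16: (124565.02, 764981.28),
}

def speedup(writers):
    """Ratio of patched to unpatched parent throughput."""
    base, patched = parent_kbps[writers]
    return patched / base

for writers in sorted(parent_kbps):
    print(f"{writers:2d} writers: {speedup(writers):.2f}x")
```

The ratios come out at roughly 2.0x for 4 writers and 6.1x for 16 writers.
The "Children see" rows improve far less, so the gain is mostly in the
aggregate figure reported by the parent.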
There are still some fairness/parallelism issues. It's not perfect.
But when multiple processes are sharing the same resource, I'm not sure
how much better we can do without separating them into their own rgrps.
The statistics indicate this is well worth pursuing.
---
Bob Peterson (2):
  GFS2: Introduce EXSH (exclusively shared on one node)
  GFS2: Take advantage of new EXSH glock mode for rgrps
fs/gfs2/bmap.c | 2 +-
fs/gfs2/dir.c | 2 +-
fs/gfs2/glock.c | 12 +++++++-
fs/gfs2/glock.h | 16 +++++++---
fs/gfs2/glops.c | 3 +-
fs/gfs2/incore.h | 12 +++++---
fs/gfs2/inode.c | 4 +--
fs/gfs2/lock_dlm.c | 5 +++-
fs/gfs2/rgrp.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++----
fs/gfs2/rgrp.h | 5 ++++
fs/gfs2/super.c | 2 +-
fs/gfs2/trace_gfs2.h | 2 ++
fs/gfs2/trans.c | 16 ++++++++++
fs/gfs2/xattr.c | 6 ++--
14 files changed, 147 insertions(+), 24 deletions(-)
--
2.14.3