[Cluster-devel] [GFS2 PATCH v1 0/2] Improve throughput through rgrp sharing

Bob Peterson rpeterso at redhat.com
Wed Apr 18 16:58:36 UTC 2018


This is a preliminary patch set, but the results are promising. I've been
testing it pretty hard in a variety of circumstances, and it seems to be
fairly solid, so I thought I'd get it out here for people to review.

On 17 January 2018, I posted an experimental patch set meant to improve
intra-node resource group sharing, titled "GFS2: Rework rgrp glock
congestion functions for intra-node". It reduced rgrp contention by
simply steering contending processes to different rgrps. In RHEL6 we
used "try locks", which accomplished basically the same thing.

Steve Whitehouse suggested a better approach: to actually share the same
rgrps within a node. This patch set implements Steve's suggestion.

The first patch introduces a new glock locking mode called EXSH, meaning
exclusively shared within one node. To all other nodes (and to DLM) the
glock looks and acts like it is held EX. But to the node that has it
locked, it may be shared among processes like an SH lock.
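
As a rough illustration of the semantics described above (not the code in
the patch), the two views of an EXSH glock could be modelled like this.
The enum and function names are hypothetical, and treating EXSH as
incompatible with a local SH holder is an assumption made for the sketch:

```c
#include <stdbool.h>

/* Hypothetical mode names for illustration; not the patch's identifiers. */
enum lock_mode { MODE_SH, MODE_EX, MODE_EXSH };

/* Mode that other nodes (and DLM) see: EXSH is presented as EX. */
enum lock_mode cluster_view(enum lock_mode m)
{
	return m == MODE_EXSH ? MODE_EX : m;
}

/* May two holders on the same node hold the glock at the same time? */
bool local_holders_compatible(enum lock_mode a, enum lock_mode b)
{
	if (a == MODE_SH && b == MODE_SH)
		return true;	/* ordinary shared holders */
	if (a == MODE_EXSH && b == MODE_EXSH)
		return true;	/* shared among processes on this node */
	return false;		/* EX, or mixed modes (assumption) */
}
```

Keeping the cluster-wide view separate from the local compatibility check
is what lets the node share the lock internally without DLM knowing.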

The second patch adds hooks to the rgrp code to use the new glock locking
mode. A new rwsem, rd_sem, ensures exclusive use of the rgrp when it is
needed. Whenever an rgrp is added to a transaction, its rwsem is taken and
the rgrp is queued to the transaction. When the transaction ends, the
rwsems of all rgrps queued to that transaction are unlocked.
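
The lock lifetime described above can be sketched in userspace as follows.
This is a single-threaded model under stated assumptions, not the patch
itself: the struct layouts, function names, and MAX_TRANS_RGRPS are
hypothetical, a plain flag stands in for rd_sem, and taking the rwsem in
write mode (down_write/up_write) is an assumption:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRANS_RGRPS 8	/* arbitrary limit for the sketch */

/* Stand-in for a resource group; rd_sem_held models the rd_sem rwsem. */
struct rgrp {
	bool rd_sem_held;
};

/* Stand-in for a transaction with its list of queued rgrps. */
struct trans {
	struct rgrp *rgrps[MAX_TRANS_RGRPS];
	size_t nr_rgrps;
};

/*
 * Lock the rgrp and queue it to the transaction. Adding the same rgrp
 * twice is a no-op. Returns false if the list is full or the lock is
 * held by someone else (a real rwsem would block instead).
 */
bool trans_add_rgrp(struct trans *tr, struct rgrp *rgd)
{
	for (size_t i = 0; i < tr->nr_rgrps; i++)
		if (tr->rgrps[i] == rgd)
			return true;	/* already locked by this transaction */
	if (tr->nr_rgrps == MAX_TRANS_RGRPS || rgd->rd_sem_held)
		return false;
	rgd->rd_sem_held = true;	/* models down_write(&rgd->rd_sem) */
	tr->rgrps[tr->nr_rgrps++] = rgd;
	return true;
}

/* End the transaction: unlock every rgrp that was queued to it. */
void trans_end(struct trans *tr)
{
	for (size_t i = 0; i < tr->nr_rgrps; i++)
		tr->rgrps[i]->rd_sem_held = false;	/* models up_write() */
	tr->nr_rgrps = 0;
}
```

Tying the unlock to the end of the transaction means a process never has
to track individual rgrp locks itself; whatever it touched is released in
one place.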

Preliminary performance testing using iozone looks very promising.
With 16 simultaneous writers, the throughput seen by the parent process
is about 6 times higher with the patch. Even with 4 writers, it doubles:

                                               7.5 kernel      Patched kernel
                                               --------------  --------------
Children see throughput for  1 initial writers 525062.81 kB/s  527972.50 kB/s
Parent sees throughput for  1 initial writers  525049.74 kB/s  527971.69 kB/s

Children see throughput for  2 initial writers 612600.62 kB/s  603398.75 kB/s
Parent sees throughput for  2 initial writers  600944.08 kB/s  603140.65 kB/s

Children see throughput for  4 initial writers 596730.64 kB/s  694901.31 kB/s
Parent sees throughput for  4 initial writers  232777.32 kB/s  472287.19 kB/s

Children see throughput for  6 initial writers 574034.05 kB/s  739531.62 kB/s
Parent sees throughput for  6 initial writers  160751.73 kB/s  515363.98 kB/s

Children see throughput for  8 initial writers 644463.33 kB/s  727810.48 kB/s
Parent sees throughput for  8 initial writers  155939.49 kB/s  559100.85 kB/s

Children see throughput for 10 initial writers 613880.30 kB/s  736029.91 kB/s
Parent sees throughput for 10 initial writers  174366.86 kB/s  663429.43 kB/s

Children see throughput for 12 initial writers 610206.54 kB/s  744490.04 kB/s
Parent sees throughput for 12 initial writers  150910.72 kB/s  682414.33 kB/s

Children see throughput for 14 initial writers 625055.97 kB/s  804518.57 kB/s
Parent sees throughput for 14 initial writers  129122.67 kB/s  781340.39 kB/s

Children see throughput for 16 initial writers 627972.96 kB/s  794149.06 kB/s
Parent sees throughput for 16 initial writers  124565.02 kB/s  764981.28 kB/s

There are still some fairness/parallelism issues. It's not perfect.
But when multiple processes are sharing the same resource, I'm not sure
how much better we can do without separating them to their own rgrps.
The statistics indicate this is well worth pursuing.
---
Bob Peterson (2):
  GFS2: Introduce EXSH (exclusively shared on one node)
  GFS2: Take advantage of new EXSH glock mode for rgrps

 fs/gfs2/bmap.c       |  2 +-
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/glock.c      | 12 +++++++-
 fs/gfs2/glock.h      | 16 +++++++---
 fs/gfs2/glops.c      |  3 +-
 fs/gfs2/incore.h     | 12 +++++---
 fs/gfs2/inode.c      |  4 +--
 fs/gfs2/lock_dlm.c   |  5 +++-
 fs/gfs2/rgrp.c       | 84 ++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/rgrp.h       |  5 ++++
 fs/gfs2/super.c      |  2 +-
 fs/gfs2/trace_gfs2.h |  2 ++
 fs/gfs2/trans.c      | 16 ++++++++++
 fs/gfs2/xattr.c      |  6 ++--
 14 files changed, 147 insertions(+), 24 deletions(-)

-- 
2.14.3



