[dm-devel] Shell Scripts or Arbitrary Priority Callouts?

John A. Sullivan III jsullivan at opensourcedevel.com
Tue Mar 24 12:21:45 UTC 2009


On Tue, 2009-03-24 at 13:57 +0200, Pasi Kärkkäinen wrote:
> On Tue, Mar 24, 2009 at 07:02:41AM -0400, John A. Sullivan III wrote:
> > > > <snip>
> > > > I'm trying to spend a little time on this today and am really feeling my
> > > > ignorance on the way iSCSI works :(  It looks like linux-iscsi supports
> > > > MC/S but has not been in active development and will not even compile on
> > > > my 2.6.27 kernel.
> > > > 
> > > > To simplify matters, I did put each SAN interface on a separate network.
> > > > Thus, all the different sessions.  If I place them all on the same
> > > > network and use the iface parameters of open-iscsi, does that eliminate
> > > > the out-of-order problem and allow me to achieve the performance
> > > > scalability I'm seeking from dm-multipath in multibus mode? Thanks -
> > > > John
> > > 
> > > If you use the ifaces feature of open-iscsi, you still get separate sessions.
> > > 
> > > open-iscsi just does not support MC/s :(
> > > 
> > > I think core-iscsi does support MC/s.. 
> > > 
> > > Then again, you should play with the different multipath settings and
> > > tweak how often IOs are split to different paths etc.; maybe that helps.
> > > 
> > > -- Pasi
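
(Just for the archives, since the ifaces feature came up: binding sessions to
individual NICs looks roughly like the below. The IQN, portal address and
interface names are only placeholders.)

  # one iface record per NIC; each login through an iface is its own session
  iscsiadm -m iface -I iface.eth2 --op=new
  iscsiadm -m iface -I iface.eth2 --op=update -n iface.net_ifacename -v eth2
  iscsiadm -m iface -I iface.eth3 --op=new
  iscsiadm -m iface -I iface.eth3 --op=update -n iface.net_ifacename -v eth3

  # discover through the bound ifaces, then log in once per iface
  iscsiadm -m discovery -t sendtargets -p 192.168.1.10 -I iface.eth2 -I iface.eth3
  iscsiadm -m node -T iqn.2009-03.com.example:san1 -I iface.eth2 --login
  iscsiadm -m node -T iqn.2009-03.com.example:san1 -I iface.eth3 --login
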
> > <snip>
> > I think we're pretty much at the end of our options here, but I'll document
> > what I've found thus far for closure.
> > 
> > Indeed, there seems to be no way around the session problem.  Core-iscsi
> > does seem to support MC/s but has not been updated in years.  It did not
> > compile with my 2.6.27 kernel and, given that others seem to have had
> > the same problem, I did not spend a lot of time troubleshooting it.
> > 
> 
> The core-iscsi developer seems to be actively developing at least the
> new iSCSI target (LIO target).. I think he has been testing it with
> core-iscsi, so maybe there's a newer version somewhere? 
> 
> > We did play with the multipath rr_min_io settings, and smaller always
> > seemed to be better until we got into very large numbers of sessions.  We
> > were testing on a dual quad-core AMD Shanghai 2378 system with 32 GB
> > RAM, a quad-port Intel e1000 card and two on-board nvidia forcedeth
> > ports, using disktest with 4K blocks to mimic the file system, doing
> > sequential reads (and some sequential writes).
> > 
> 
> Nice hardware. Btw are you using jumbo frames or flow control for iSCSI
> traffic? 
> 
> > With a single thread, there was no difference at all - only about 12.79
> > MB/s no matter what we did.  With 10 threads and only two interfaces,
> > there was only a slight difference between rr=1 (81.2 MB/s), rr=10 (78.87)
> > and rr=100 (80).
> > 
> > However, when we opened to three and four interfaces, there was a huge
> > jump for rr=1 (100.4, 105.95) versus rr=10 (80.5, 80.75) and rr=100
> > (74.3, 77.6).
> > 
> > At 100 threads on three or four ports, the best performance shifted to
> > rr=10 (327 MB/s, 335) rather than rr=1 (291.7, 290.1) or rr=100 (216.3).
> > At 400 threads, rr=100 started to overtake rr=10 slightly.
> > 
> > This was using all e1000 interfaces.  Our first four-port test included
> > one of the on-board ports, and performance was dramatically worse than
> > with three e1000 ports.  Subsequent testing, tweaking the forcedeth
> > parameters away from their defaults, yielded no improvement.
> > 
> > After solving the I/O scheduler problem, dm RAID0 behaved better.  It
> > still did not give us anywhere near a fourfold increase (four disks on
> > four separate ports), only a marginal improvement (14.3 MB/s), using an
> > 8 KB chunk (c=8) chosen to fit into a jumbo packet, match the zvol block
> > size on the back end, and equal two 4K file system blocks.  It did,
> > however, give the best balance of performance, being just slightly
> > slower than rr=1 at 10 threads and slightly slower than rr=10 at 100
> > threads, though not scaling as well to 400 threads.
> > 
> 
> When you used dm RAID0 you didn't have any multipath configuration, right? 
Correct, although we also did test successfully with multipath in
failover mode combined with RAID0.
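
For reference, that failover test was nothing exotic - the multipath side
boiled down to something like the fragment below (the wwid and alias are
placeholders), with the dm RAID0 stripe then built over the resulting
/dev/mapper devices:

  # /etc/multipath.conf - active/passive (failover) instead of multibus
  defaults {
          path_grouping_policy    failover
  }
  multipaths {
          multipath {
                  wwid    <wwid of the iscsi lun>
                  alias   san0
          }
  }
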
> 
> What kind of stripe size and other settings did you have for RAID0?
Chunk size was 8KB with four disks.  
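
In dm terms that is a striped table with a 16-sector (8 KB) chunk across the
four devices, roughly like the below (device names are placeholders; it
assumes all four devices are the same size and that size is a multiple of the
chunk):

  SIZE=$(blockdev --getsz /dev/mapper/san0)   # size of one path device, in 512-byte sectors
  echo "0 $((SIZE * 4)) striped 4 16 \
    /dev/mapper/san0 0 /dev/mapper/san1 0 \
    /dev/mapper/san2 0 /dev/mapper/san3 0" | dmsetup create raid0test
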
> 
> What kind of performance do you get using just a single iscsi session (and
> thus just a single path), no multipathing, no DM RAID0? Just a filesystem
> directly on top of the iscsi /dev/sd? device.
Miserable - roughly the same 12 MB/s.
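
(For anyone who wants to sanity-check a single stream without disktest, a
plain sequential read tells much the same story; the device name here is just
a placeholder:

  dd if=/dev/sdb of=/dev/null bs=4k count=262144 iflag=direct

That reads roughly 1 GB in 4K chunks, bypassing the page cache.)
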
> 
> > Thus, collective throughput is acceptable but individual throughput is
> > still awful.
> > 
> 
> Sounds like there's some other problem if individual throughput is bad? Or did
> you mean performance with a single disktest IO thread is bad, but using multiple
> disktest threads it's good.. that would make more sense :) 
Yes, the latter.  A single thread (I assume mimicking a single disk
operation, e.g., copying a large file) is miserable - much slower than
local disk despite the availability of huge bandwidth.  We only start
utilizing the bandwidth when we multiply concurrent disk activity into
the hundreds of threads.
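
For completeness, the multibus side of those numbers is really just two knobs
in multipath.conf (blacklist and device sections omitted; rr_min_io 10 was
the value that did best for us around 100 threads):

  defaults {
          path_grouping_policy    multibus
          rr_min_io               10
  }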

I am guessing the single-thread performance problem is an open-iscsi
issue, but I was hoping multipath would help us work around it by
utilizing multiple sessions per disk operation.  I suppose that is where
we run into the command ordering problem, unless there is something else
afoot.  Thanks - John
<snip>
-- 
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan at opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society




