[dm-devel] Shell Scripts or Arbitrary Priority Callouts?

John A. Sullivan III jsullivan at opensourcedevel.com
Sun Mar 29 21:22:21 UTC 2009


On Sun, 2009-03-29 at 21:09 +0300, Pasi Kärkkäinen wrote:
> On Fri, Mar 27, 2009 at 02:28:45PM -0400, John A. Sullivan III wrote:
> > On Fri, 2009-03-27 at 03:03 -0400, John A. Sullivan III wrote:
> > > On Wed, 2009-03-25 at 12:21 -0400, John A. Sullivan III wrote:
> > > > On Wed, 2009-03-25 at 17:52 +0200, Pasi Kärkkäinen wrote:
> > > > > On Tue, Mar 24, 2009 at 11:41:00PM -0400, John A. Sullivan III wrote:
> > > > > > > > Latency seems to be our key.  If I can add only 20 microseconds of
> > > > > > > > latency from initiator and target each, that would be roughly 200
> > > > > > > > microseconds.  That would almost triple the throughput from what we are
> > > > > > > > currently seeing.
> > > > > > > > 
> > > > > > > 
> > > > > > > Indeed :) 
> > > > > > > 
> > > > > > > > Unfortunately, I'm a bit ignorant of tweaking networks on opensolaris.
> > > > > > > > I can certainly learn but am I headed in the right direction or is this
> > > > > > > > direction of investigation misguided? Thanks - John
> > > > > > > > 
> > > > > > > 
> > > > > > > Low latency is the key for good (iSCSI) SAN performance, as it directly
> > > > > > > gives you more (possible) IOPS. 
> > > > > > > 
> > > > > > > The other option is to configure the software/settings so that there are
> > > > > > > multiple outstanding I/Os in flight; then you're not limited by the latency (so much).
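
With the Linux open-iscsi initiator, the "multiple outstanding I/Os" knobs are
mostly the per-session queue-depth settings in /etc/iscsid.conf, plus however
many I/Os the benchmark or application itself keeps in flight.  The values
below are only illustrative, not a recommendation:

    # /etc/iscsid.conf - outstanding commands per session and per LUN
    node.session.cmds_max = 128
    node.session.queue_depth = 32
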
> > > > > > > 
> > > > > > > -- Pasi
> > > > > > <snip>
> > > > > > Ross has been of enormous help offline.  Indeed, disabling jumbo packets
> > > > > > produced an almost 50% increase in single-threaded throughput.  We are
> > > > > > pretty well set although still a bit disappointed in the latency we are
> > > > > > seeing in OpenSolaris and have escalated to the vendor about addressing
> > > > > > it.
> > > > > > 
> > > > > 
> > > > > Ok. That's a pretty big increase. Did you figure out why that happens? 
> > > > Greater latency with jumbo packets.
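
For anyone else trying the same comparison: on the Linux initiator side,
dropping jumbo frames is just a matter of resetting the MTU on the SAN-facing
interfaces (eth2/eth3 below are only placeholder names), and the target and
switch ports have to match:

    # revert the SAN-facing interfaces to the standard 1500-byte MTU
    ip link set dev eth2 mtu 1500
    ip link set dev eth3 mtu 1500
    # confirm the new MTU
    ip link show dev eth2
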
> > > > > 
> > > > > > The one piece which is still a mystery is why using four targets on
> > > > > > four separate interfaces striped with mdadm RAID0 does not produce an
> > > > > > aggregate of slightly less than four times the IOPS of a single target
> > > > > > on a single interface.  This would not seem to be the out-of-order SCSI
> > > > > > command problem of multipath.  One of life's great mysteries yet to be
> > > > > > revealed.  Thanks again, all - John
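
For anyone wanting to reproduce that setup, the stripe was nothing exotic -
just mdadm RAID0 across the four iSCSI block devices.  Roughly like this, with
the device names and chunk size only as examples:

    # /dev/sdb..sde are the four iSCSI-attached LUNs, one per target/interface
    mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde
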
> > > > > 
> > > > > Hmm.. maybe the out-of-order problem happens at the target? It gets IO
> > > > > requests to nearby offsets from 4 different sessions and there's some kind
> > > > > of locking or so going on? 
> > > > Ross pointed out a flaw in my test methodology.  By running one I/O at a
> > > > time, it was literally doing that - not one full RAID0 I/O but one disk
> > > > I/O apparently.  He said to truly test it, I would need to run as many
> > > > concurrent I/Os as there were disks in the array.  Thanks - John
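
In disktest terms the difference is just the -K (concurrent threads) argument;
something along these lines, with the flag spellings from memory and the
device name only as an example:

    # one outstanding I/O at a time - effectively exercising one member disk
    disktest -B 4k -K 1 /dev/md0
    # as many outstanding I/Os as there are member disks - what Ross suggested
    disktest -B 4k -K 4 /dev/md0
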
> > > > ><snip>
> > > Argh!!! This turned out to be alarmingly untrue.  This time, we were
> > > doing some light testing on a different server with two bonded
> > > interfaces in a single bridge (KVM environment) going to the same SAN we
> > > used in our four-port test.
> > > 
> > > For kicks and to prove to ourselves that RAID0 scaled with multiple I/O
> > > as opposed to limiting the test to only single I/O, we tried some actual
> > > file transfers to the SAN mounted in sync mode.  We found concurrently
> > > transferring two identical files to the RAID0 array composed of two
> > > iSCSI attached drives was 57% slower than concurrently transferring the
> > > files to the drives separately. In other words, copying file1 and file2
> > > concurrently to RAID0 took 57% longer than concurrently copying file1 to
> > > disk1 and file2 to disk2.
> > > 
> > > We then took a slightly different approach and used disktest.  We ran two
> > > concurrent sessions with -K1.  In one case, we ran both sessions against the
> > > 2-disk RAID0 array.  The performance was again significantly less than
> > > running the two concurrent tests against two separate iSCSI disks.  Just
> > > to be clear, these were the same disks that composed the array, just not
> > > grouped in the array.
> > > 
> > > Even more alarmingly, we did the same test using multipath multibus,
> > > i.e., two concurrent disktest sessions with -K1 (both reads and writes, all
> > > sequential with 4K block sizes).  The first session completely starved
> > > the second.  The first one continued at only slightly reduced speed
> > > while the second one (kicked off just as fast as we could hit the enter
> > > key) received only roughly 50 IOPS.  Yes, that's fifty.
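
To be clear about what "two concurrent sessions" means here: the second
disktest was simply started right after the first, both pointed at the same
multipathed device.  The effect is the same as backgrounding them from one
shell (device name illustrative, flags again from memory):

    disktest -B 4k -K 1 /dev/mapper/mpath0 &
    disktest -B 4k -K 1 /dev/mapper/mpath0 &
    wait
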
> > > 
> > > Frightening but I thought I had better pass along such extreme results
> > > to the multipath team.  Thanks - John
> > HOLD THE PRESSES - This turned out to be a DIFFERENT problem.  Argh!
> > That's what I get for being a management type out of my depth doing
> > engineering until we hire our engineering staff!
> > 
> > As mentioned, these tests were run on a different, lighter-duty system.
> > When we ran the same tests on the larger server with four dedicated SAN
> > ports, RAID0 scaled nicely, showing little degradation between one
> > thread and four concurrent threads, i.e., our test file transfers took
> > almost the same time when a single user ran them as when four
> > users ran them concurrently.
> > 
> > The problem with our other system was that the RAID (and probably
> > multipath) was backfiring because the iSCSI connection was buckling
> > under any appreciable load, since the Ethernet interfaces are bridged.
> > 
> > These are much lighter-duty systems and we bought them from the same
> > vendor as the SAN but with only the two onboard Ethernet ports.  Being
> > ignorant, we looked to them for design guidance (and they were excellent
> > in all other regards) and were not cautioned about sharing these
> > interfaces.  Because these are light duty, we intentionally broke the
> > cardinal rule and did not use a dedicated SAN network for them.  That's not
> > so much the problem.  However, because they are running KVM, the
> > interfaces are bridged (actually bonded and bridged using balance-tlb, as
> > balance-alb breaks with bridging in its current implementation - but
> > bonding is not the issue).  Under any appreciable load, the iSCSI
> > connections time out.  We've tried varying the NOP timeout values but with
> > no success.  We do not have the time to test rigorously but assume this is
> > why throughput did not scale at all; disktest with -K10 achieved the same
> > throughput as disktest with -K1.  Oh well, the price of tuition.
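
For the record, the NOP values in question are the open-iscsi NOP-Out
keepalive settings in /etc/iscsid.conf (they can also be changed per node with
iscsiadm).  The numbers below are only the sort of thing we tried, not a
recommendation:

    # how often to send a NOP-Out ping and how long to wait for the reply
    node.conn[0].timeo.noop_out_interval = 10
    node.conn[0].timeo.noop_out_timeout = 30
    # how long I/O is queued after a session drops before being failed upward
    node.session.timeo.replacement_timeout = 120
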
> 
> Uhm, so there was virtualization in the mix.. I didn't realize that earlier..
> 
> Did you benchmark from the host or from the guest? 
> 
> So yeah.. the RAID-setup is working now, if I understood you correctly.. 
> but the multipath setup is still problematic? 
<snip>
I wish I had more time to pursue this but we lost an unexpected five
weeks on this on a three-month project.  It would seem the RAID0 is
scaling, although we haven't absolutely verified it.  Multipath does as
well.  We did find wide variations in how well it scales based upon the
combination of rr_min_io and the maximum number of concurrent transactions.
As I believe would be expected, a lower rr_min_io (as in 1) performed much
better under few concurrent threads (e.g., 10) but started to lose its
edge under heavier concurrent threads (e.g., 100).  Thus my ignorant
guess is that for low-density virtualization the former works better,
and for high density (such as our desktop setup) a higher value works
better.  Of course, even those definitions vary widely depending on the
nature of the guests.

RAID0 was more even, being just slightly slower than rr=1 under light
loads and just slightly slower than rr=10 under moderate and heavy
loads.
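
For anyone tuning the same thing, rr_min_io lives in /etc/multipath.conf; the
fragment below only illustrates the two settings we compared, not our
production configuration:

    defaults {
            path_grouping_policy    multibus
            # rr_min_io 1  - switch paths after every I/O; best around 10 threads
            # rr_min_io 10 - stay on each path longer; held up better around 100
            rr_min_io               10
    }

The running maps need to be reloaded (e.g., multipath -F followed by
multipath, or a multipathd reconfigure) before a new value takes effect.
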

Tests were run from the virtualization host.  Thanks for all your help -
John
-- 
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan at opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society




