Fedora SMP dual core, dual AMD 64 processor system

Bryan J. Smith b.j.smith at ieee.org
Fri Sep 23 02:14:48 UTC 2005


Bill Broadley <bill at cse.ucdavis.edu> wrote:
> Here are my results:
> Version  @version@      ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> fileserver      16G           250195  39 75476  19          141364  23 328.1   1

I'm confused: which numbers are which (for software RAID, for
hardware RAID, etc.)?

And what was the _exact_ command you ran?  The invocation and
its options have a lot to do with the results you get.
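
For example, a run along these lines (just a sketch; the target
directory, size and user are assumptions, not necessarily what
you used) behaves very differently depending on whether the file
size exceeds RAM:

    bonnie++ -d /raid/testdir -s 16g -n 0 -u nobody

If the -s size isn't at least twice your RAM, a lot of the
"throughput" is really just the page cache.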

> If you want to convince me of the "obvious" performance
> superiority of hardware RAID-5 please post performance
> numbers.

The next time I roll out a DL365 or DL585 for a client, I
will.

> The above numbers are from a dual opteron + pci-e system
> with 8 SATA drives, linux, software RAID, and a Areca
> controller.  With this configuration I managed 250MB/sec
> writes, and 140MB/sec or so reads.  

Which is which?  And how did you thread the benchmark to
simulate multiple clients?  The more clients and I/O queues, the
better many hardware RAID solutions do.

Furthermore, I'm kinda scratching my head over how you could
get higher RAID-5 write performance than read performance.  That
would seem rather impossible.

_Unless_ the buffer cache is not being flushed.  Then the low
latency of synchronous DRAM writes vastly improves on the
latency of a DRAM read (regardless of whether it is synchronous
or not).
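
A quick sanity check (a rough sketch; the mount point and sizes
are assumptions) is to time the write together with a flush of
the buffer cache:

    time dd if=/dev/zero of=/raid/flushtest bs=1M count=16384
    time sync

If the sync takes a significant fraction of the dd time, a big
chunk of that "write throughput" was still sitting in RAM.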

> I've not read anything like this, instead all of my data
> indicates just the opposite.  Can you provide data to the
> contrary?

No, because I don't have a 3Ware system in front of me other
than an older dual-P3/ServerWorks system.  I also deploy RAID-10
more than RAID-5.  I will get you numbers.

But I'm still trying to figure out which numbers are which.
Which was the software RAID?  Which was the hardware RAID?  And
what was your exact command (options and all)?

Did you run multiple clients/operations?  The more independent
operations you throw at a hardware RAID controller, the deeper
its queues get, and the bigger the difference it makes.
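
A crude way to emulate that (a sketch only; the paths and the
client count are assumptions) is to run several writers at once
and add up the results:

    for i in 1 2 3 4; do
        dd if=/dev/zero of=/raid/client$i bs=1M count=4096 &
    done
    wait

Hardware controllers tend to show their queueing advantage under
that kind of concurrent load, not under a single sequential
stream.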

> I never used the word bliss or mentioned LVM.

Okay, fair enough.

> Right, 3ware can do it, but there's a custom interface
> specific to 3ware.

Yes, and it's well known -- has been for 6+ years.

> My point is that software raid standardizes all the
> tools, is flexible, and doesn't make you learn a set
> of vendor specific tools.

First off, there is no "learning curve" to 3Ware's tools. 
Secondly, 3Ware does have some interfaces into SMART and
other capabilities now.  Granted, they are not yet supported
by 3Ware itself, but the work is happening.  3Ware sends many
standard kernel messages, and they do a lot of GPL work.

But given that the hardware already sends these events to both
the kernel _and_ its own management software, and also keeps an
NVRAM log of _all_ events (regardless of the OS), that's a nice
option IMHO.
 
> This lets you migrate RAIDs between linux boxes without
> hunting for another RAID controller of the same brand.

I've heard this argument again and again.  When someone can
demonstrate 6+ years of Linux MD compatibility, I will
believe them.  So far, I haven't seen it myself.  MD has
changed several times.

> Monitor then without custom binaries,

3Ware sends standard kernel messages.  You _can_ trap those
just as you would for any other disk controller, with the usual
syslog approach.

Don't knock 3Ware because they give you an additional option.

> serial connections,

???  Are you thinking of external subsystems  ???

From the standpoint of 3Ware vs. software RAID, it's the same
difference: the driver/controller is local, so the local kernel
sees _all_ messages.

> or web interfaces.

3Ware offers both CLI and web interfaces.

> You don't have to rebuild a kernel with the right drivers,

Considering 3Ware's GPL drivers ship in the _stock_ kernel, I
_rarely_ run into this.  At most, I do a make, copy the module,
run depmod -a, and create a new initrd.  Done.
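
For the record, the whole procedure is roughly this (a sketch,
assuming a 2.6 kernel tree with the 3w-9xxx driver built as a
module and a Fedora-style mkinitrd):

    make modules
    cp drivers/scsi/3w-9xxx.ko /lib/modules/`uname -r`/kernel/drivers/scsi/
    depmod -a
    mkinitrd /boot/initrd-`uname -r`.img `uname -r`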

> and download binaries from the vendor website.

*NO*BINARY*DRIVERS*
100% GPL drivers/source.
3Ware isn't "FRAID."

You only need the CLI/3DM _if_ you want their integrated
monitoring.  You _can_ trap syslog messages from the kernel
driver like anything else.
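
A minimal example (assuming the stock 3w-xxxx/3w-9xxx driver and
a standard syslog setup; adjust the log path for your distro):

    grep -i '3w-' /var/log/messages
    tail -f /var/log/messages | grep --line-buffered -i '3w-'

The first catches past controller/array events, the second
watches for new ones, and neither needs a vendor binary.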

> Note avoiding the posting of numbers #1.

The studies are out there, my friend.  But the next time I
have a modern Opteron with a 3Ware card, I'll send them to
you.

> Please post them, preferably on a file big enough to make
> sure that the file is coming through the RAID controller
> and not the file cache.

I'm not sure that's what you did.  From your benchmark, where
the RAID-5 write performance was better than the RAID-5 read
performance, I can only assume you benefited from the fact that
SDRAM write latency is better than SDRAM read latency.

Otherwise, to disk, it's _worse_.

> There's 250MB/sec posted above.  The I/O interconnect is
> 8.0 GB/sec why would it be useless?

Because you're making _redundant_ trips.

If you have hardware RAID, you go directly from memory to
I/O.
If you have software RAID, you push _all_ data from memory to
CPU first.

> So?  The CPU<->CPU interconnect is 8.0GB/sec.  The I/O bus
> from the cpu is 8.0GB/sec, and pci-e is 2GB/sec (4x),

Actually it's 1GBps (4x) one-way.
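
Each PCI-E lane signals at 2.5Gbps, which after 8b/10b encoding
is roughly 250MBps per direction, so x4 is about 1GBps each way,
x8 about 2GBps and x16 about 4GBps.  The 2/4/8GB/sec figures you
quoted count both directions at once.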

> 4GB/sec (8x) or 8GB/sec (16x).  250-500MB/sec creates
> this huge problem exactly why?

You're pushing from memory to CPU first, instead of directly
to memory mapped I/O.  You're basically turning your direct
memory access (DMA) disk transfer into one long, huge
programmed I/O (PIO) disk transfer.
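
As a back-of-the-envelope illustration (rough numbers, not a
measurement): on an 8-drive software RAID-5 doing 250MB/sec of
user writes, the CPU has to read that 250MB/sec back out of
memory to XOR it, generate roughly 250/7 = ~36MB/sec of parity,
and then all ~286MB/sec goes back across the interconnect to the
drives.  With hardware RAID, the host pushes the 250MB/sec
across the bus once and the card does the XOR and writes the
parity itself.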

If your CPU/interconnect is doing _nothing_ else, your CPU
might be able to handle it, no issue.  That's why generic
mainboards are commonly used for dedicated storage systems.

But when you're doing additional, user/service I/O, you don't
want to tie up your interconnect with that additional
overhead.  It takes away from what your system _could_ be
doing in actually servicing user requests.

> Notice avoid the posting of numbers #2.

Again, there are some great studies out there.
I will get you numbers when I have a system to play with.

> Yes, I have many fileserver in production using it.

And it probably works fine.
But you're taking throughput away from services.

> I'm not learning this stuff from reading, I'm learning it
> from doing. "read up on things" is not making a
> particularly strong case for your point of view.

There are several case studies out there from many
organizations where user/services are pushing a lot of I/O.

If your system isn't servicing much I/O, and the disks are just
for local processing, then the impact is smaller.  But when your
storage processing contends for the same interconnect as
user/service processing (as on a file server), every redundant
transfer caused by software RAID takes away from the bandwidth
you could be serving users with.

> Actually I do know quite a bit on this subject.  Please
> post numbers to support your conclusions.

I will.

Please post your _exact_ setup, including commands,
clients/processes, etc...

> More hand waving.  So you are right because you built too
> many file and database servers?

Can you at least understand what I mean by the fact that
software RAID-5 turns your disk transfer into a PIO transfer,
instead of a DMA transfer?

And do you understand why we use network ASICs instead of
general-purpose CPUs in networking equipment?  Same difference:
the PC interconnect is not designed for throwing raw data
around.  A PC is not a dedicated I/O processor and interconnect.

> Bring me up to date, show me actual performance numbers
> representing this advantage you attribute to these
> hardware raid setups your discussing.

> Quite a few, mainly clusters, webservers, mailservers, and
> fileserver's.  I thought this was a technical discussion
> and not a pissing contest.

I just don't understand how you could not see that any
additional overhead you place on the interconnect, by doing a
programmed I/O transfer just for storage, takes away from the
capacity available for user/service transfers.

If you have web servers and mail servers, which are more about
CPU processing, you probably don't have much user/service I/O
contending with the software RAID PIO.

But on database servers and NFS fileservers, it clearly makes a
massive dent.

> So avoid all this handwaving and claims of superiority
> please provide a reference or actual performance numbers.

Will do, personally, the next time I have an opportunity to
benchmark.

But I'll need your _exact_ commands, how you emulated
multiple operations, etc...



-- 
Bryan J. Smith                | Sent from Yahoo Mail
mailto:b.j.smith at ieee.org     |  (please excuse any
http://thebs413.blogspot.com/ |   missing headers)



