Opteron Vs. Athlon X2

Bill Broadley bill at cse.ucdavis.edu
Fri Dec 9 04:45:35 UTC 2005


On Thu, Dec 08, 2005 at 06:56:01PM -0800, Bryan J. Smith wrote:
> But, FYI, the Areca 1200 series uses the Intel IOP332 which
> actually uses a PCI-X to PCIe bridge internally, with the
> hardware on the PCI-X side
> (http://developer.intel.com/design/iio/iop332.htm).  So I'm
> sure they are getting no where near that.  But it's quite
> enough to handle any stream of data, regardless.

That does not necessarily mean it's only 1.0GB/sec (total for reads and
writes).  Since it's inside the controller and is of a known
configuration (the bus has exactly 2 loads and is extremely short), it
could easily be faster.  I know the early bridges used to put AGP-native
GPUs on PCIe cards ran the AGP side at double speed to provide 4GB/sec
instead of 2.  After all, why would Intel design for an x8 slot if they
only needed an x4?

In either case, with only 16 disks, even 1.0 GB/sec isn't going to be
the limit.

> Because you're not writing several, additional and redundant
> copies to the disk in the case of RAID-1 or 10, or pushing
> every single bit from memory to CPU back to memory just to
> calculate parity.  It may seem like 200-300MBps doesn't make
> a dent on I/O or CPU interconnect that is in the GBps, but
> when you're moving enough redundant copies around for just
> mirroring or XORs, it does detract from what else your system
> could be doing.

Well, the memory system is 6.4GB/sec and the HyperTransport connections
are 8.0GB/sec.  Say you're writing 100MB to a 5-disk RAID-5.  In the
hardware RAID case:
100MB is read from RAM -> the CPU copies it to the I/O space of the controller ->
the controller calculates the RAID-5 checksums -> 125MB is written to the disks.

In the software RAID case:
100MB is read from RAM -> the CPU checksums and copies 125MB to the controller ->
the controller writes 125MB to the disks.

So yes, your 4GB/sec I/O bus (2GB/sec per direction) sees another
25MB/sec on the write side, so it's 1.25% more busy.  Is that a big deal?

Say it was a mirror: it's 5% more busy.  So?
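
(For the record, the write amplification behind those numbers, as quick
shell arithmetic and assuming the 5-disk RAID-5 above, i.e. 4 data
disks + 1 parity:)

    echo "RAID-5 (4+1): $((100 * 5 / 4)) MB written per 100 MB of data (25 MB of parity)"
    echo "mirror:       $((100 * 2)) MB written per 100 MB of data (100 MB of extra copies)"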

I just checked my fileserver: it can compute RAID-5 checksums at
7.5GB/sec.  So yes, one CPU would be slightly more busy, just a few
percent.
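
(If you want the number for your own box: the kernel benchmarks its
RAID-5/XOR checksum routines at boot and logs the result, so something
like this should dig it out, though the exact message text varies by
kernel version:)

    dmesg | grep -iE 'raid5|xor'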

> I agree it's getting to the point that hardware RAID is less
> and less of a factor, as the I/O and CPU-memory interconnects
> can cope much better.  But before PCI-X (let alone PCIe or
> Opteron/HyperTransport) and multi-GBps memory channels it was
> pretty much a major factor.

I never noticed it, and I've done many comparisons between hardware and
software RAID.  Even a small fraction of an Opteron or P4 is a
substantially larger resource than the controllers, which even today are
only 32-bit and a few hundred MHz.

> I know, that's why I want to see more PCIe hardware RAID. 
> The Areca's are very nice, since Intel offers the IOP332 (and
> IOP333) with a built-in PCI-X to PCIe bridge (although it
> would be nice if they were PCIe native), which they use . 

I'd like native as well, aesthetically.  It won't necessarily make a
performance difference.  Video cards deal with higher bandwidths than
disks, and the only difference I noticed in the benchmarks and reviews
is that the native PCIe cards draw a few watts less than the
PCIe -> bridge -> AGP designs.  The 6600GT, for instance, requires a
power connector for AGP and none for PCIe.

> Supposedly the 3Ware/AMCC 9550SX PowerPC controller is also
> capable of PCIe, like 

Strange, they don't mention anything like that on the 9550SX page.  In
any case, the more the merrier; more competition is a good thing for
the consumer.

> Any Opteron solution with the AMD8131 has 2 PCI-X channels. 
> The HP DL585 has 4, the Sun Sunfire v40z has 6 (4 slots at
> the full 133MHz PCI-X.  There is also the AMD8132 which is
> PCI-X 2.0 (266MHz), although it's adoption is surely to be
> limited by the growing adoption of PCIe.

Indeed, I'm puzzled by Sun's move to PCI-X 2.0, especially when it
seems like the market has already moved to PCIe.  HP, IBM, Tyan,
Supermicro, Myrinet and related vendors are all supporting PCIe.

It's great for consumers: relatively cheaply you can get TWO 8GB/sec
I/O slots on a motherboard.  Sure, you could buy an HP quad or a Sun
quad... but I certainly can't afford one for home.

> Once PCIe x4 and x8 cards become commonplace (which is almost
> true), there will be little need for PCI-X other than legacy
> support.  I'm still hoping to see a low-cost PCIe x4 SATA
> RAID card, 2 or 4 channel, because the current entry point
> seems to be $400+.

If you want cheap, I'd switch to software RAID.  I've seen PCIe
2-channel controllers for $60 or so.  Or just get a new motherboard;
getting 8 SATA ports onboard is fairly easy on a $100-$150 board.
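
(For example, with hypothetical device names and the md tools
installed, turning 8 onboard ports into one array is a one-liner:)

    mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[b-i]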

> Most of the reviews I've seen have not done so.  At the most
> the CPU utilization and RAID-5 performance is where software
> RAID can't keep up, because they are trying to push down

My experience is exactly the opposite: I've been shocked at how many
hardware RAIDs couldn't manage a sustained 50MB/sec for writes.
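
(A crude way to check that kind of sustained write number, with the
path and size just illustrative:)

    # write 2GB and include the final sync in the timing
    time sh -c 'dd if=/dev/zero of=/mnt/raid/testfile bs=1M count=2048 && sync'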

> > The linux drivers seem fine, I've not played with the
> > user-space tools, so far just the web interface.
> 
> How is the web interface?

Nice.  It allows creating/destroying RAIDs, shows temperatures, and
lets you blink a drive's activity light to find it.  It supports at
least RAID 0, 1, 5, and 6; I didn't really check for 10.  Both JBOD and
passthrough work well ;-).

> What do you have to install on the host to get to it?

Nothing, just connect a network cable to the RAID card.

> You mean you're using the Areca for software RAID?  You're
> throwing away all it's XScale power?

Heh, all what, 333MHz of 32-bit power roaring inside my RAID card?  I
want it to be managed just like all my other RAIDs, I want to be able
to migrate it to any of my other servers, and I want the monitoring to
be as easy as a crontab entry:
diff /var/mdstat /proc/mdstat
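
A sketch of how that hangs together (the snapshot path and schedule are
illustrative; cron mails any non-empty output to the crontab's owner):

    # save a known-good snapshot once, while the arrays are healthy:
    cp /proc/mdstat /var/mdstat
    # then in root's crontab, compare hourly:
    0 * * * * diff /var/mdstat /proc/mdstat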

That way I get an email if there is any change.  Sure, I could install
the Areca tools, use hardware RAID, and set up Areca-specific
documentation, a disaster recovery plan, monitoring, and management,
and then forever tie that data and those disks to a specific type of
controller.

Of course, then I'd have to do the same for my, er, I dunno, 5 other
kinds of hardware RAID cards.

Or I could treat it just like every other software RAID I manage.  In
the case of a disaster I can easily troubleshoot with any hardware that
has enough SATA ports.
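
(Roughly: plug the disks into whatever box is handy and, assuming mdadm
is installed, the usual recipe applies:)

    # see which md arrays live on the attached disks, then assemble them
    mdadm --examine --scan | tee -a /etc/mdadm.conf
    mdadm --assemble --scan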

> If so, why don't you just use a cheaper card with 4 or 8 SATA
> channels?  I noticed HighPoint now has one for PCIe, and
> Broadcom should as well (or at least shortly).

At the time, for 16 ports the Areca seemed easiest (a single card);
I'm certainly watching for cheaper solutions.

> It would probably be better performing because the XScale
> won't be in the way.  Same deal on PCI-64/X when anyone uses
> 3Ware with software RAID, you'd be far better off going with

I've benchmarked many 3ware hardware RAIDs that were slower than
software RAID on the same hardware.  I've not benchmarked the new
9550SX, though.  If they bring out a PCIe version maybe I'll get the
chance.

> a Broadcom RAIDCore if you're doing software RAID.  The
> Broadcom doesn't have an ASIC/microcontroller between the bus
> and channels.

I don't see any reason why the XScale should slow anything down; all it
has to do is copy data from PCIe to the SATA interface.  Certainly the
rate at which it can do that should be higher than the rate at which it
can do RAID-5 calculations.

> E.g., the most common reason I see people say they went 3Ware
> was for hot-swap support.  But hot-swap doesn't work well
> _unless_ you use its hardware RAID, hiding the individual
> disks from the OS behind it's hardware ASIC.  That seems to
> be a repeat issue.

Agreed, but I'm not hot-swapping.  I just have 3 five-disk RAIDs and one
global spare, so during a failure I rebuild onto the 16th disk and then
do the swaps during the next maintenance window.  Even 3ware's hot-swap
isn't perfect; I've seen disks get confused enough to hang the
controller.
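
(The rebuild itself is just a matter of handing the 16th disk to the
degraded array; the device names here are hypothetical:)

    mdadm /dev/md0 --add /dev/sdp    # rebuild onto the spare starts automatically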

Not that 3ware doesn't do it better than most of the non-RAID SATA
controllers and their drivers.

> If you mean the RAID-5 support, 3Ware was _stupid_ to update
> the 6000 series to support RAID-5.  It was _only_ designed
> for RAID-0, 1 and 10 -- hence why the new 7000 series was
> quickly introduced.  But they did it to appease users, not
> smart IMHO.

Heh, I got bitten by that one: after a fair amount of load testing I
put them into production use and lost a filesystem.  3ware claimed a
fix would work; a month later, the same thing.  I've heard of various
other BIOS/driver bugs since then.  Software RAID, on the other hand,
seems pretty bulletproof: it's widely tested and very robust.  I
regularly have 400-500 day uptimes on busy production servers.

> > Of course 3ware has gotten much better since then.
> 
> I don't think they've ever been "bad," but they've done some
> _stupid_ (technically speaking ;-) things at times.  Even I

I consider selling an expensive RAID card that is completely broken and
writes corrupt data to the drives bad.  I trusted them and was
betrayed.  Certainly hindsight is 20/20; I shouldn't have done that.

> _never_ adopt a new 3Ware firmware release until I've seen it
> "well received."  E.g., the 3Ware 9.2 firmware had a bug that
> killed write performance if you had more than 1 array.

Caution is warranted; more cautious still (IMO) is to use software
RAID.  Hopefully 3ware can read/write a block without errors, and the
rest I trust to software RAID.  I also find the mdadm functionality
quite desirable compared to most hardware RAID interfaces.  Even if I
found the exact functional equivalent, it would still be specific to
that controller.
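
(For instance, with an illustrative device name, the same two commands
report the state of any md array no matter which controller sits
underneath:)

    cat /proc/mdstat
    mdadm --detail /dev/md0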

-- 
Bill Broadley
Computational Science and Engineering
UC Davis



