Intel Woodcrest Crash under heavy load with FC5 and MySql

Ed K. ed at hp.uab.edu
Thu Sep 7 01:53:51 UTC 2006


On Thu, 7 Sep 2006, Albert Graham wrote:

> Hello all (especially the very technical),
>
> I have been experiencing hardware lockups and crashes under Linux (Fedora
> Core 5, latest kernel version 2.6.17-1.2174_FC5smp). The crashes occur under
> what appears to be very heavy disk access, possibly with multiple concurrent
> accesses (i.e. multiple threads).
>
> I experience crashes using MySQL (MySQL-server-4.1.21-0.glibc23), the latest
> 4.1 stable release. In this case we also have multiple threads generating a
> database of approx. 13-30 GB in size over a period of about 18 hours.
>
> I have also experienced crashes using rsync for local_disk to local_disk
> copies; this creates multiple threads (unlike a simple cp command, which is
> a single thread).
>
> The servers are 10 x:
>
> Woodcrest 5160 3GHz (Dual Core + Dual Xeon) (1333 FSB)
> Supermicro servers 
> http://www.supermicro.com/products/system/1U/6015/SYS-6015P-8R.cfm
> Motherboard 
> http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBP-8.cfm 
> (BIOS 1.1c)
> 16 GB FB-DIMM RAM 667MHz - Approved and personally tested by Supermicro USA
> 3ware 9550SX-4
> 4x500GB SATA Seagate Drives/16MB cache.
>
>
> HINTS
> ====
>
> The crashes ONLY happen if we enable all 4 Cores in the BIOS (Dual core = 
> enabled)
>
> Our tests run 100% perfectly if we disable the second core of each Xeon
> (i.e. one core from each Xeon)!
>
> My questions
> =========
>
> Are there any "known" problems with Dual Core Xeons under load - e.g. 
> microcode issues ? kernel bugs ?
>
> From the kernel's perspective, is there any difference in operating code
> (i.e. ignoring any superficial stuff like /proc/cpuinfo) for Dual Core
> Xeons?
>
> I assumed that Dual Core would use the exact same code as the SMP kernel.
> Is this correct? I'm told it's not.
>
> Are there any specific patches for Dual Core? (I did notice in RH AS 4 a
> change log that stated something like "improved scheduling for Dual Core".)
>
> Things I've tried
> ===========
> I have tried most combinations of BIOS settings (e.g. ACPI disabled in BIOS)
> and kernel parameters (acpi=off noacpi noapic etc.), all of which make no
> difference - the machines all crash unless I disable Dual Core.
>
> I've had extensive contact with Supermicro, 3ware and now Intel - all of
> whom are blaming each other.
>
> I've also recompiled the FC5 source RPM with exact same results.
>
> I'm told that AMD had a similar problem with one of their dual cores, but 
> this was fixed long ago and I assume that fix was specific to AMD chips and 
> would not apply to Intel due to differences in architecture.
>
> Any suggestions for helping me solve these crash problems would be very
> much appreciated.
>
>
> Thanks in advance.
>
> Albert.
>
>
>
> BIOS Output on boot:
>
> Phoenix TrustedCore(tm) Server
> Copyright 1985-2005 Phoenix Technologies Ltd.
> All Rights Reserved
>
> Supermicro X7DBP-8/X7DBP-I BIOS Rev 1.1b
>
> CPU = 2 Processors Detected, Cores per Processor = 2
> Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
> Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
> DRAM Type : DDR2-667, FSB at 1333MHz
> 16384M System RAM Passed
> 4096 KB L2 Cache
> System BIOS shadowed
> Video BIOS shadowed
>
> I will post some crash traces from our serial console server as a reply to 
> this message shortly.
>
>
>
>

.... 'no' to all your questions, but in my experience:

I bought a computer with a SuperMicro OEM motherboard, the H8DCE
http://www.supermicro.com/Aplus/motherboard/Opteron/nForce/H8DCE.cfm

and I had nothing but the same trouble you describe, with this
motherboard and a 3ware 9500S-8.
Using the onboard SATA was not a problem - only the 3ware.

I sent the card back to 3ware, upgraded the BIOS on the motherboard, and
updated the firmware in the drives - all with no luck.

Head over to https://www.3ware.com/
those guys are on top of things!

Then I put a Tyan S2850G2N in the server: no problems!
http://www.tyan.com/PRODUCTS/html/tomcatk8s.html

I question the OEM line of SuperMicro.

ed




