Kernel 2.6.9-55 issues

George Magklaras georgios at biotek.uio.no
Fri May 11 08:26:41 UTC 2007


Troy, what is your disk subsystem on the x2200? At what point does it fail 
to boot? Does it reach the bootloader and at least start the kernel? Also, 
could you run 'lspci' and 'lsmod' under your good kernel and post the 
output?
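
Something along these lines would collect what I am asking for in one file. It is only a sketch: the output file name is my own choice, and on some systems lspci and lsmod live in /sbin and may not be in a normal user's PATH.

```shell
# Run this while booted into the known-good 2.6.9-42 kernel.
# The output file name is only an example.
out="diag-$(uname -r).txt"

{
  echo "== uname -r =="
  uname -r
  echo "== lspci =="
  lspci 2>&1 || echo "(lspci failed; it may live in /sbin)"
  echo "== lsmod =="
  lsmod 2>&1 || echo "(lsmod failed; it may live in /sbin)"
} > "$out"

echo "wrote $out"
```

The resulting file can be pasted into a reply as-is.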


##The following is a guess##
I don't have that kind of Sun kit, but there are plenty of reports 
of stability problems with AMD-based chipsets. Also, FYI, there is a 
kernel panic report for that kernel here:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=239484

This bug report concerns the Error Detection And Correction (EDAC) 
modules (hence the lsmod request). The messages come from the edac kernel 
module thinking that there is something wrong with the bus or the memory. 
On your x2200, the system probably panics (any messages on the console 
during the boot failure?), as there is an option that makes the kernel 
panic when it detects EDAC parity errors. On your x4100s, which are 
able to boot but give the EDAC messages, do an lsmod and grep -i 
for edac. The people on that bug report seem to point to a 'noedac' boot 
option, but I am not sure.
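
The lsmod check could look like the sketch below. The function name edac_modules is made up for illustration; it just filters lsmod-style output for module names that mention edac, such as the k8_edac that shows up in your logs.

```shell
# Hypothetical helper: reads lsmod-style output on stdin and prints
# the names of modules that mention "edac" (case-insensitive).
edac_modules() {
  grep -i edac | awk '{ print $1 }'
}

# Usage on a live system:
#   lsmod | edac_modules
```

No output means no edac module is currently loaded.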

On the x4100s that produce the edac messages, see whether /etc/modprobe.conf 
contains any references to the edac modules; if it does, you could try 
removing them to see if that makes a difference.
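
To find those references, a sketch like this prints the matching lines with their line numbers so they can be commented out by hand. The function name is hypothetical, and the path defaults to /etc/modprobe.conf as mentioned above.

```shell
# Hypothetical helper: list lines that mention "edac" in a modprobe
# config file, with line numbers. $1 defaults to /etc/modprobe.conf.
conf_edac_refs() {
  grep -n -i edac "${1:-/etc/modprobe.conf}"
}

# Usage on a live system:
#   conf_edac_refs
```

Keep a copy of the file before editing it, so the change is easy to revert.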

GM


Troy Knabe wrote:
> I upgraded from 2.6.9-42 to 2.6.9-55 kernel over the weekend.  I have had issues with 3 servers.  1 server wouldn't boot (x2200 amd 148 proc).  And two x4100's with 2 - Dual Core AMD Opteron(tm) Processor 285.  The two x4100's are spewing these errors, but if I reboot them with the old 2.6.9-42 kernel then I don't get any of them.  Anyone else experiencing issues with the new kernel?
>  
> thanks
> -Troy
>  
> May  9 16:25:43 hostname kernel: EDAC k8 MC0: general bus error: participating processor(local node response), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
> May  9 16:25:43 hostname kernel: MC0: CE page 0xc, offset 0x108, grain 8, syndrome 0x4b39, row 0, channel 1, label "": k8_edac
> May  9 16:25:43 hostname kernel: MC0: CE - no information available: k8_edac Error Overflow set
> May  9 16:25:43 hostname kernel: EDAC k8 MC0: extended error code: ECC chipkill x4 error
> May  9 16:25:44 hostname kernel: EDAC k8 MC0: general bus error: participating processor(local node origin), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
> May  9 16:25:44 hostname kernel: MC0: CE page 0x1f1, offset 0x0, grain 8, syndrome 0x28d8, row 3, channel 1, label "": k8_edac
> May  9 16:25:44 hostname kernel: MC0: CE - no information available: k8_edac Error Overflow set
> May  9 16:25:45 hostname kernel: EDAC k8 MC0: extended error code: ECC chipkill x4 error
> May  9 16:25:46 hostname kernel: EDAC k8 MC0: general bus error: participating processor(local node origin), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
> May  9 16:25:46 hostname kernel: MC0: CE page 0x1f1, offset 0x0, grain 8, syndrome 0x28d8, row 3, channel 1, label "": k8_edac
> May  9 16:25:46 hostname kernel: MC0: CE - no information available: k8_edac Error Overflow set
> May  9 16:25:46 hostname kernel: EDAC k8 MC0: extended error code: ECC chipkill x4 error
> May  9 16:25:47 hostname kernel: EDAC k8 MC0: general bus error: participating processor(local node origin), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
> May  9 16:25:47 hostname kernel: MC0: CE page 0x138, offset 0xac0, grain 8, syndrome 0xeeff, row 0, channel 1, label "": k8_edac
> May  9 16:25:47 hostname kernel: MC0: CE - no information available: k8_edac Error Overflow set
> May  9 16:25:47 hostname kernel: EDAC k8 MC0: extended error code: ECC chipkill x4 error
>  

-- 
George Magklaras

Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/

EMBnet Norway:	http://www.no.embnet.org/




