RHEL3 U5's kernel is unstable for 8-way SMP server?

Shelton, Darren Darren.Shelton at acs-inc.com
Tue May 16 21:12:19 UTC 2006


I am running RHEL 2.1 and RHEL 4 on some Compaq 8-way machines. Mine are 740's,
and the only thing I really found to be necessary was to run the bigmem
kernel, since my machines had more than 32 gigs of memory in them. Other than
that, I have not had to pass any special kernel arguments to get the machines
to run stably. In fact, until a recent hardware issue, my 2.1 server had nearly
3 years of uptime.


darren
-----Original Message-----
From: redhat-list-bounces at redhat.com [mailto:redhat-list-bounces at redhat.com]
On Behalf Of jOe
Sent: Tuesday, May 16, 2006 3:40 PM
To: redhat-list at redhat.com
Subject: Re: RHEL3 U5's kernel is unstable for 8-way SMP server?

I've searched Red Hat's Bugzilla and googled again. Would adding nomce or
mce=off to the boot parameter list solve the problem?

Because the system is in production: has anyone ever used this approach on an
8-way system?
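
For anyone who tries this, a minimal sketch of how one might confirm after a reboot that the option actually reached the kernel. The helper name mce_disable_options is hypothetical, not part of any tool. As far as I know, nomce is the spelling used by 32-bit x86 kernels and mce=off the x86-64 one; treat that as an assumption and check the documentation for your kernel. Either way, the option only stops the kernel from acting on machine checks and would not fix an underlying hardware fault.

```python
# Hypothetical helper (not from this thread): report which machine-check
# disabling options appear on a kernel command line string.
MCE_DISABLE_OPTIONS = ("nomce", "mce=off")


def mce_disable_options(cmdline):
    """Return the MCE-disabling options present in a kernel command line."""
    # Split on whitespace so that only whole parameters match,
    # not substrings of other parameters.
    tokens = cmdline.split()
    return [opt for opt in MCE_DISABLE_OPTIONS if opt in tokens]


# Example with a made-up command line (the root= value is an assumption).
example = "ro root=LABEL=/ nomce"
print(mce_disable_options(example))
```

On a running system the string to pass in can be read from /proc/cmdline.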

Thanks,

On 5/17/06, jOe <smartjoe at gmail.com> wrote:
>
> Hello all,
>
> I ran into a "system crash" problem yesterday after a processor replacement.
> The server is an HP DL760g2 8-way server, originally equipped with
> 4x 2.7GHz Xeon CPUs, which were replaced with 8x 3.0GHz Xeon CPUs
> yesterday.
> The upgrade service was provided by the HP team; the server passed the
> status check, and its system ROM has been updated to the latest version as
> well.
>
> We used this server with the 4 old CPUs for a long time and the system was
> very stable. It has RHEL AS 3.0 U5 installed (the kernel is 2.4.21-32.ELsmp).
> The crash happened about 1 hour after the processor upgrade was finished.
> We've also checked the system ROM settings: HT and Full Table APIC are
> enabled by default.
>
> I googled the crash output but found little information. I wonder how to
> search Red Hat's Bugzilla database so that I can get further information
> to confirm whether or not it is U5's kernel that is causing the
> trouble.
>
> The following is the crash information that appeared on the console.
>
> =======================cut=====================================
> Red Hat Enterprise Linux As release 3 ( U5)
> Kernel  2.4.21-32.ELsmp on an i686
>
> jydbserver1 login: CPU 4: Machine Check Exception:000000000000000000004
> <<>CP>#CPU 7: Machine Check Exception:000000000000000000<4>04
> <U0>Ke0>Kerconnue
> nic: U<4>nablea<0Exe  1: Machine Check ExceptionionExce00ion:<40>Kleon
> ItanieU
> ablani : U<<4>t4>n40>Kleon ItanieU
> ablani : U<<4>t4>>ncont<0>blieernel<<4<>y#4>Yevb a huam#rodbwaril #pr
> hhyove a
> Radhw##############################################################
> ###################################################################
> ####################################################################
> #######################
>
> ====================end cut==========================================
>
> Is it a kernel bug in U5 that causes the system crash?
> Or is there something we should set in the system ROM, or should we
> recompile the current kernel?
> btw: The hardware (including the server and its processors, memory, hard
> disks, controllers ...) has been checked again and again.
>
> I'd appreciate any opinions.
>
> --
> Joe Yu
> Senior Linux Solution Architect
> LinuxTea Professional Group
>



-- 
Joe Yu
Senior Linux Solution Architect
LinuxTea Professional Group


