Got my first EDAC error today
Steve Snyder
swsnyder at insightbb.com
Mon Apr 3 14:48:31 UTC 2006
On Monday 03 April 2006 10:21 am, Roger Heflin wrote:
> > -----Original Message-----
> > From: fedora-list-bounces at redhat.com
> > [mailto:fedora-list-bounces at redhat.com] On Behalf Of Steve Snyder
> > Sent: Sunday, April 02, 2006 6:54 AM
> > To: fedora-list at redhat.com
> > Subject: Got my first EDAC error today
> >
> > Got my first error report from the shiny-new EDAC driver
> > today. A kWriteD window popped up and displayed:
> >
> > EDAC MC0: UE page 0x2c, offset 0x0, grain 4096, row 0, labels
> > "": i82860 UE
> >
> > Great. Now where do I find how to interpret these error reports?
>
> Some questions:
>
> Does your system have ECC RAM? If you don't have ECC and/or your
> chipset is not supported, EDAC will pretty much only check PCI parity.
> Given that it is reporting i82860, I would guess that your chipset is
> supported, and that it believes that you have ECC.
Yes, I do have ECC RAM, and the BIOS is configured to use it for error
correction. Specifically, I have 2 sticks of 512MB dual-channel PC800
RDRAM.
> UE means uncorrectable error, which means that more than 1 bit was
> messed up in your memory. Generally you won't get these without also
> getting lots of single-bit (CE) errors.
Well, it is possible I've been getting single-bit errors and didn't know
it. Still, though, I would have expected uncorrectable RAM errors to
have crashed my machine, or at least generated alarming system errors, in
the past. Instead this machine has been rock-solid stable in the 3 years
I've had it, and I've been using the same RAM throughout that period.
Certainly, RAM can go bad, but the lack of "unexplained" lockups makes me
a little skeptical that uncorrectable RAM errors are occurring on a
regular basis.
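
As a rough aid to reading the report quoted above, here is a minimal Python sketch that splits the UE line into its fields and derives a physical address. It assumes that "page" is a page frame number, that "offset" is a byte offset within that page, and that pages are 4 KiB (PAGE_SIZE is an assumption that happens to match the reported grain of 4096). The function name parse_ue_report and the field names in the returned dictionary are made up for illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on 32-bit x86

# Matches only the UE layout shown in this thread; \s+ tolerates the
# line wrap that the mail client put before the labels field.
UE_PATTERN = re.compile(
    r'EDAC\s+MC(?P<mc>\d+):\s+UE\s+page\s+(?P<page>0x[0-9a-fA-F]+),\s+'
    r'offset\s+(?P<offset>0x[0-9a-fA-F]+),\s+grain\s+(?P<grain>\d+),\s+'
    r'row\s+(?P<row>\d+),\s+labels\s+"(?P<labels>[^"]*)":\s+(?P<driver>.+)'
)


def parse_ue_report(text):
    """Return the fields of an EDAC UE report as a dict, or None if no match."""
    match = UE_PATTERN.search(text)
    if match is None:
        return None
    page = int(match.group("page"), 16)
    offset = int(match.group("offset"), 16)
    return {
        "controller": int(match.group("mc")),
        "page": page,
        "offset": offset,
        "grain": int(match.group("grain")),
        "row": int(match.group("row")),
        "labels": match.group("labels"),
        "driver_message": match.group("driver").strip(),
        # Page frame number times page size, plus the offset within the page.
        "physical_address": page * PAGE_SIZE + offset,
    }


report = parse_ue_report(
    'EDAC MC0: UE page 0x2c, offset 0x0, grain 4096, row 0, labels\n'
    '"": i82860 UE'
)
print(hex(report["physical_address"]))
```

Under those assumptions the error sits at physical address 0x2c000, i.e. within the first 180 KB of memory, on memory controller 0, row 0, with an empty DIMM label.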
> You can check /proc/mc/0, which may give you better information. The
> "" is supposed to be a label for the DIMM location on the
> motherboard; no one has yet mapped the locations that will be listed to
> actual locations on most motherboards.
Actually, I can't check that:
$ ll /proc/mc*
ls: /proc/mc*: No such file or directory
$ ll /proc/edac*
ls: /proc/edac*: No such file or directory
The EDAC info doesn't seem to be brought out to the /proc filesystem, at
least not in the kernel-smp-2.6.16-1.2069_FC4 that I'm running.
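
One other place that may be worth checking is sysfs rather than /proc. The Python sketch below assumes a directory /sys/devices/system/edac/mc holding one mcN subdirectory per memory controller, each with ce_count and ue_count files. That layout is an assumption here and has not been confirmed on this kernel. The function name read_edac_counts is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller_name: {"ce_count": int|None, "ue_count": int|None}}.

    The default root and the counter file names are assumptions about the
    sysfs layout; an empty dict is returned if the directory is absent.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc_dir in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc_dir / name).read_text().strip())
            except (OSError, ValueError):
                # File missing, unreadable, or not a plain integer.
                entry[name] = None
        counts[mc_dir.name] = entry
    return counts
```

Calling read_edac_counts() and printing the result would show whether any correctable errors have been accumulating quietly alongside the one UE report. An empty result only means the assumed directory does not exist on that system.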
Thanks for the response.