Nvidia Signal 11 error
Nifty Hat Mitch
mitch48 at sbcglobal.net
Sat Jan 8 12:39:06 UTC 2005
On Wed, Jan 05, 2005 at 07:03:37PM +0200, Chadley Wilson wrote:
> I get quite an odd error with my nvidia mx440se
...
> all my 3d apps run for a while then suddenly terminate, no errors
> reported, logs are empty. I have run the apps from a terminal, in
> Quake3 Arena and celestia I get this when the apps terminate.
> Received signal 11, exiting...
Signal 11 is simply:
SIGSEGV 11 Core Invalid memory reference
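To confirm that an application really was killed by signal 11, rather than exiting on its own, its exit status can be checked from a small wrapper. This is a minimal Python sketch, assuming a Linux system. The child command is a stand-in that sends SIGSEGV to itself, not one of the real applications, and describe_exit is a name made up for illustration.

```python
import signal
import subprocess
import sys

def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # A negative return code means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode

# Stand-in child that sends SIGSEGV to itself (hypothetical test command).
child = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
result = subprocess.run(child)
print(describe_exit(result.returncode))
```

Replacing the stand-in command with the real application's command line would report the same way when it falls over.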
If the application runs for a while and then terminates, a couple
of things can be going on.
Hardware problem.
DRAM, VRAM, thermal runaway. Make sure that fans and heat
sinks are clean and functional. Modern chips (CMOS) generate
most of their heat when gates change state. Computation- or
logic-heavy applications can heat parts up to failure. Make
sure BIOS settings are sane and avoid overclocking. The
symptom is that things run for a while then terminate.
AGP board: 2x, 4x, 8x... What is the BIOS permitting, and
what is being selected when the driver loads?
Library collision.
nVidia 3D libraries and Mesa libraries occupy the same name
space. It is possible for lots of things to work with
incorrect libraries involved. One hint is that the nVidia
installer makes noise that the installation has been modified
if you reinstall the driver+libs. In my limited experience
nVidia has a library structure that can execute in hardware
or in software. Simple things without race conditions or
side effects will run just fine. When things get busy the
mixed-up libraries trip up. The symptom is that things run
for a while then terminate.
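One way to look for mixed-up libraries is to list which libGL files a running application actually has mapped. This is a rough Python sketch, assuming the Linux /proc/<pid>/maps layout. The function name gl_libraries and the sample paths are made up for illustration and are not taken from a real system.

```python
def gl_libraries(maps_text, needle="libGL"):
    """Return the distinct mapped file paths whose name contains needle."""
    found = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # The sixth field, when present, is the path of the mapped file.
        if len(fields) == 6 and needle in fields[5]:
            path = fields[5].strip()
            if path not in found:
                found.append(path)
    return found

# Sample text in the /proc/<pid>/maps layout; every path here is invented.
sample = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/celestia
40017000-4008f000 r-xp 00000000 08:02 135522 /usr/lib/libGL.so.1
4008f000-400a7000 rw-p 00078000 08:02 135522 /usr/lib/libGL.so.1
400b0000-40730000 r-xp 00000000 08:02 135530 /usr/lib/libGLcore.so.1
bffeb000-c0000000 rw-p bffeb000 00:00 0
"""
print(gl_libraries(sample))
```

For a live process, the text would come from reading /proc/<pid>/maps while the application is running. Each path can then be checked with rpm -qf to see which package owns it.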
Memory leak.
Applications and drivers can fail to reuse memory correctly
and can continue to allocate additional memory resources. As
memory is exhausted bad things can happen, i.e. it will run
then fall over. The symptom is that things run for a while
then terminate.
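A leak usually shows up as resident memory that keeps climbing while the application runs. This is a small Python sketch, assuming a Linux /proc filesystem that reports VmRSS in /proc/<pid>/status. The helper names rss_kib and is_growing are made up for illustration.

```python
import os

def rss_kib(pid):
    """Return the resident set size of pid in KiB, or None if not reported."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # The line looks like "VmRSS:    1234 kB".
                return int(line.split()[1])
    return None

def is_growing(samples, min_step_kib=0):
    """True if each sample exceeds the previous one by more than min_step_kib."""
    if len(samples) < 2:
        return False
    return all(b - a > min_step_kib for a, b in zip(samples, samples[1:]))

# Demonstrate on this process; a real check would sample the game's pid
# every few seconds and pass the collected values to is_growing().
print(rss_kib(os.getpid()))
```

Steady growth right up to the crash points toward a leak. A flat line points toward one of the other causes.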
Asynchronous bug.
Some events including interrupts happen at odd times. In
some code there are race conditions between the validation of
a memory block and its use. This can be the application or
the kernel (or both, sort of). If I recall, Quake, Arena and
company do lots of texture mapping, and lots of texture map
data transfer, with asynchronous signaling (hardware and
software), increased heat of chips and more. Some of these
might be more common on multi-processor systems. The symptom
is that things run for a while then terminate.
Kernel bugs:
kernel-2.6.9-1.11_FC2.i686.rpm contains this comment.
* Thu Dec 16 2004
- Better version of the PCI Posting fixes for agpgart.
- Add missing cache flush to the AGP code.
So try different kernel versions and tell us which you
are using. Always update and test the latest rpm.
Can you check for memory leaking? Can you compile or run under a
debugger? Can you run with debugging libraries (symbols)? Can you
enable a core dump and report the stack trace?
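Core dumps are often disabled by a zero soft limit on core file size. The Python sketch below, assuming a Unix system, raises that limit to the hard limit for one child process only. The helper names allow_core_dumps and run_with_core are made up for illustration, and the demo child just prints the limit it sees.

```python
import resource
import subprocess
import sys

def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

def run_with_core(command):
    """Run command with core dumps allowed and return its return code."""
    # preexec_fn runs in the child just before exec, so only the child
    # gets the raised limit.
    return subprocess.run(command, preexec_fn=allow_core_dumps).returncode

# Demo child: print the core limit it sees (stand-in for the real app).
show_limit = [sys.executable, "-c",
              "import resource; print(resource.getrlimit(resource.RLIMIT_CORE))"]
print(run_with_core(show_limit))
```

If the hard limit is also zero, it has to be raised by the administrator first. Where the core file lands depends on the system configuration. Once there is one, loading it with gdb <program> <corefile> and typing bt gives the stack trace to report.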
--
T o m M i t c h e l l
spam unwanted email.
SPAM, good eats, and a trademark of Hormel Foods.