
All coreutils binaries segfault, requiring complete reinstall of linux

I have really stepped in something here. 

This has happened three times on three different servers, two FC3 and one Advanced Server.
After creating a new database with Progress DB 9.1D09 (which we have done
for years on a variety of Red Hat versions), SOMETHING as yet unknown gets hosed
for the root user. The behavior is that ALL binaries in the coreutils
package (ls, ps, mv, rm... really handy) die with "Segmentation fault". Even
basename segfaults before it gets to the "Usage:..." display.

Other users can log in and run any of the commands above normally. However, any attempt
by other users to su fails. And if the system reboots, since mv, rm, and
practically every other util used during startup segfaults, that box ain't
coming up no more. Hello complete reinstall.

On the earlier occasions, after a reboot it was possible to come up on a boot/rescue CD,
but as soon as you chroot to the disk drive, everything starts segfaulting again.

This last server was FC3, and had run "yum update" a few days ago, so can 
be considered current.  This server has no activity, no one else has access to 
it, it just sits there.   An external hack doesn't sound likely, because there
is no access from the outside world to that server.   And the failure coincided
exactly with the creation of a Progress Database. 

FYI, Progress, while not a household name, is a sane commercial DB product which
runs on thousands of servers.

There were some interesting fingerprints in /bin...
[neal idiot bin]$ ls -lt /bin | more
total 6552
-rwxr-xr-x  1 root root  11549 Apr 25 13:17 arch
-rwxr-xr-x  1 root root  19937 Apr 25 13:17 aumix-minimal
-rwxr-xr-x  1 root root  22417 Apr 25 13:17 basename
-rwxr-xr-x  1 root root 623285 Apr 25 13:17 bash
-rwxr-xr-x  1 root root  26113 Apr 25 13:17 cat
-rwxr-xr-x  1 root root  42177 Apr 25 13:17 chgrp
-rwxr-xr-x  1 root root  41773 Apr 25 13:17 chmod
-rwxr-xr-x  1 root root  63437 Apr 25 13:17 cpio
-rwxr-xr-x  1 root root  36445 Apr 25 13:17 cut

Those dates make no sense. The checksums didn't match our other server (also yum'd
to be current), so I copied /bin/* from it, but no help. Then I copied /lib/tls/*
from that server as well, but same results. Root can do nuffin, ordinary users
can do anything. And interestingly, not ALL binaries in /bin segfault: vi, rpm,
cp, traceroute all work. So I can't say it's all of coreutils, but it's
most of them.

Since BASENAME is the simplest program that fails, I hacked the sources on 
another server that is running fine:

	int
	main (int argc, char **argv)
	{
	  char *name;

	  /* Numbered checkpoints, flushed at once so nothing is lost in a crash. */
	  printf("1"); fflush(stdout);

	  initialize_main (&argc, &argv);
	  printf("2"); fflush(stdout);

	  program_name = argv[0];
	  printf("3"); fflush(stdout);

	  setlocale (LC_ALL, "");
	  printf("4"); fflush(stdout);

	  bindtextdomain (PACKAGE, LOCALEDIR);
	  printf("5"); fflush(stdout);

	  textdomain (PACKAGE);
	  printf("6"); fflush(stdout);

	  atexit (close_stdout);
	  printf("7"); fflush(stdout);

	  /* ... rest of main() unchanged ... */
	}

Compiled, sent over to the crippled box, and ran it as normal user:

	[neal idiot ~]$ ./basename
	1234567./basename: too few arguments
	Try `./basename --help' for more information.

That is what you would expect. And as root:

	[root idiot neal]# ./basename
	Segmentation fault
	[root idiot neal]#            

Since not even the "1" gets printed, I surmise that it is crapping out BEFORE
even getting to main(), somewhere in the C startup initialization. Which puts
it out of my league to debug.

Neal Rhodes                    MNOP Ltd                       (770) 972-5430
President                  4737 Habersham Ridge         fax:  (770) 978-4741
                          Lilburn (atlanta) GA 30047    
