Extremely poor performance crunching random numbers under PIV-FC5

BankHacker bankhacker at gmail.com
Thu May 18 13:23:54 UTC 2006


Hi, I have installed Fedora Core 5 (FC5) on a Pentium IV 3 GHz with
4 GB RAM and a SATA disk.

I have noticed that the CPU runs extremely slowly in certain
situations, for example when generating random numbers. To
benchmark this particular situation I have written a small C program
that generates 10 million random numbers and measures the time consumed.

Surprisingly, when the program is compiled with the -static flag it
runs very fast, generating 10 million random numbers in under 0.4
seconds. When it is compiled without -static (that is, as a
dynamically linked binary), performance becomes very poor: the same
10 million calls take between 30 and 45 seconds.

I have tested both builds under other Linux distributions, such as
Debian, and there both run perfectly, doing the job in only 0.4
seconds. I have also tested both programs under FC3 and I get the
same results as under FC5. So I conclude that the problem only happens
when running Fedora!

This is the C code I have used to do the tests:

### test-cpu-2.c ##################################################
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <string.h>

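/* static rather than plain inline: under C99 rules a plain inline
   function without an external definition can fail to link */
static void randomize(void) {
    time_t seconds;

    time(&seconds);
    srand((unsigned int) seconds);
}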

int main(int argc, char ** argv) {
    int i, r, numero_ciclos, numero_ciclosM;
    clock_t start, end;
    char* buf;
    // Seed for srand48() further down (was previously uninitialized)
    time_t seconds = time(NULL);

    // Initialize the random number generator
    randomize();


    start = clock();
    // Allocate 0.1 Gb of memory
    buf=malloc(100*1024*1024);
    end = clock();
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    printf("Reservado 0.1 Gb de memoria en %.3f sec\n",
           (double)(end - start)/CLOCKS_PER_SEC);


    start = clock();
    // Write to 0.1 Gb of memory
    for(i=0; i<100*1024*1024; i++) {
        buf[i]='0';
    }
    end = clock();
    printf("Escritura sobre 0.1 Gb de memoria en %.3f sec\n",
           (double)(end - start)/CLOCKS_PER_SEC);
    free(buf);


    numero_ciclos = 10000000; numero_ciclosM = numero_ciclos / 1E6;


    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = rand();
    }
    end = clock();
    printf("%d M de rand() en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);


    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = sqrt(i);
    }
    end = clock();
    printf("%d M de sqrt(i) en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);


    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = log(i);
    }
    end = clock();
    printf("%d M de log(i) en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);



    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = log10(i);
    }
    end = clock();
    printf("%d M de log10(i) en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);


#ifdef __linux__
    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = random();
    }
    end = clock();
    printf("LINUX: %d M de random() en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);

    // Initialize the drand48-family random number generator
    srand48((unsigned int) seconds);

    start = clock();
    for(i=0; i<numero_ciclos; i++) {
        r = lrand48();
    }
    end = clock();
    printf("LINUX: %d M de lrand48() en %.3f sec (example.: %d)\n",
           numero_ciclosM, (double)(end - start)/CLOCKS_PER_SEC, r);
#endif

    return (0);
}
### test-cpu-2.c (the end) ########################################

First test:
 gcc test-cpu-2.c -o static-test-cpu-2 -lm -static

Second test:
 gcc test-cpu-2.c -o dynamic-test-cpu-2 -lm

This produces the following files:
	# file *-test-cpu-2

	 dynamic-test-cpu-2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), for GNU/Linux 2.2.5, not stripped

	 static-test-cpu-2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, statically linked, for GNU/Linux 2.2.5, not stripped

Running both binaries gives these results:

# ./static-test-cpu-2

	Reservado 0.1 Gb de memoria en 0.000 sec
	Escritura sobre 0.1 Gb de memoria en 0.410 sec
	10 M de rand() en 0.230 sec (example.: 1705120472)   <===============
	10 M de sqrt(i) en 0.020 sec (example.: 3162)
	10 M de log(i) en 0.050 sec (example.: 16)
	10 M de log10(i) en 0.050 sec (example.: 6)
	LINUX: 10 M de random() en 0.210 sec (example.: 1072609142)   <======
	LINUX: 10 M de lrand48() en 0.340 sec (example.: 1674848660)   <=====
	
# ./dynamic-test-cpu-2

	Reservado 0.1 Gb de memoria en 0.000 sec
	Escritura sobre 0.1 Gb de memoria en 0.410 sec
	10 M de rand() en 45.310 sec (example.: 661533760)   <===============
	10 M de sqrt(i) en 0.020 sec (example.: 3162)
	10 M de log(i) en 0.050 sec (example.: 16)
	10 M de log10(i) en 0.050 sec (example.: 6)
	LINUX: 10 M de random() en 37.610 sec (example.: 1311921343)   <=====
	LINUX: 10 M de lrand48() en 30.490 sec (example.: 839680703)   <=====


My kernel is the default FC5 kernel, in its SMP version so that
Hyper-Threading can be used:
# uname -a
	Linux obelix.breinestorm.net 2.6.15-1.2054_FC5smp #1 SMP Tue Mar 14 16:05:46 EST 2006 i686 i686 i386 GNU/Linux

This is my CPU description:
	# cat /proc/cpuinfo
	
	processor       : 0
	vendor_id       : GenuineIntel
	cpu family      : 15
	model           : 4
	model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
	stepping        : 1
	cpu MHz         : 2999.084
	cache size      : 1024 KB
	physical id     : 0
	siblings        : 2
	core id         : 0
	cpu cores       : 1
	fdiv_bug        : no
	hlt_bug         : no
	f00f_bug        : no
	coma_bug        : no
	fpu             : yes
	fpu_exception   : yes
	cpuid level     : 5
	wp              : yes
	flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
	bogomips        : 6007.68

	processor       : 1
	vendor_id       : GenuineIntel
	cpu family      : 15
	model           : 4
	model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
	stepping        : 1
	cpu MHz         : 2999.084
	cache size      : 1024 KB
	physical id     : 0
	siblings        : 2
	core id         : 0
	cpu cores       : 1
	fdiv_bug        : no
	hlt_bug         : no
	f00f_bug        : no
	coma_bug        : no
	fpu             : yes
	fpu_exception   : yes
	cpuid level     : 5
	wp              : yes
	flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
	bogomips        : 5997.58

I have tested both programs on other systems (not PIV, but Opteron
and PIII) running Fedora 3, 4 and 5, and there the results never show
poor performance.

I have also deactivated SELinux, and the results remain the same:
poor performance with or without SELinux.

So I can only conclude that the problem appears with the combination
of this machine (PIV 3 GHz, 4 GB RAM, SATA), Fedora, and dynamic
linking.

Any hint about what is happening, or word from somebody who sees the
same problem, would be very helpful.

Thanks in advance.

¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°
Juan Ignacio Perez Sacristan
webmaster at bankhacker.com
Linux, Perl, PHP, MySQL ... solutions.
http://www.bankhacker.com/
Zaragoza, Spain
¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°



