Extremely poor performance crunching random numbers under PIV-FC5

Thu May 18 16:59:07 UTC 2006

> Ahaha, so if I understand you the strace output is huge for the dynamic
> one and smaller for the static one?

Yes, but it is not huge, just slightly bigger: only the few extra
lines that I posted in my previous message.

> Can you post an example of one of the "10 million" syscalls that is
> present in the strace output for the dynamic case (if I understood you
> correctly)?

The trace output contains no entries for the individual rand() calls
(assuming the rand() function really makes a syscall at all ...).

> There are some magic things down
>
> /proc/sys/kernel/
>
> that you might want to meddle with to see if they affect the situation.  Eg
>
> echo "0" >/proc/sys/kernel/randomize_va_space
> echo "0" >/proc/sys/kernel/exec-shield

Both files originally contained "1". I set them to "0" as you
suggested. These are the results:

	# ./dynamic-test-cpu-2
	... 10 M de rand() en 44.840 sec (example.: 93862647) ...

	# echo "0" >/proc/sys/kernel/randomize_va_space
	# ./dynamic-test-cpu-2
	... 10 M de rand() en 44.800 sec (example.: 1936430894) ...

	# echo "0" >/proc/sys/kernel/exec-shield
	# ./dynamic-test-cpu-2
	... 10 M de rand() en 44.800 sec (example.: 677622031) ...

I am afraid the problem is still present.

> Another idea, perhaps to to give a static libm and keep the dynamic libc
> stuff  eg
>
> gcc blah.c /usr/lib/libm.a
>
> not sure if that will work but worth a try.  Then the ldd for the
> resulting binary should no longer reference libm.

OK. I compiled it against the static libm this way:
	gcc test-cpu-2.c -o libm-test-cpu-2 /usr/lib/libm.a -lm -O3

This produced:
	# file ./libm-test-cpu-2
	./libm-test-cpu-2: ELF 32-bit LSB executable, Intel 80386, version 1
	(SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs),
	for GNU/Linux 2.6.9, not stripped

These are the results:
	# ./libm-test-cpu-2
	Reservado 0.1 Gb de memoria en 0.000 segundos
	Escritura sobre 0.1 Gb de memoria en 0.230 segundos
	10 M de rand() en 44.800 sec (example.: 3392997)
	10 M de sqrt(i) en 0.180 sec (example.: 3162)
	10 M de log(i) en 0.920 sec (example.: 16)
	10 M de log10(i) en 0.950 sec (example.: 6)
	...

So the rand() results are identical, but the results for the other
mathematical functions are worse.

Running top while Fedora is executing any of the
(static/dynamic/libm)-test-cpu-2 binaries always shows a load of about
1.05, so the CPU is working at full throttle in every case.