[K12OSN] TuxMath test on 8 core machine with teamed NICs 16GB RAM
jim at winonacotter.org
Thu Dec 6 14:31:07 UTC 2007
Well, as promised, I ran a test yesterday afternoon to see how well TuxMath scales
on my hardware and network. To review, I have a dual quad-core 2.66GHz Xeon box (8 cores),
six 300GB SAS drives in RAID10, 16GB RAM, and six gigabit NICs teamed with adaptive load
balancing. The network is configured so that there are about 15-20 machines per gigabit
connection to the server. The server runs Edubuntu 7.04 with LDM_DIRECTX=true (no
encryption) and the 32-bit linux-image-server kernel so that it recognizes the added RAM.
So I set out yesterday to see how well my setup would run TuxMath on 30 clients
simultaneously. The answer: not good. I started 10 instances and things looked great, with
CPU load averages under 10% and RAM under 2GB. Then I started 10 more: CPU under 25%
(starting to get high), RAM under 3GB, and everything still moving at a decent speed and
loading quickly. Then I started 10 more, and all hell broke loose :-( My CPU average
started climbing drastically, the server came to a halt, all the clients that were running
slowed way down, and I couldn't quit any instance of TuxMath. The clients responded well
enough to exit the game and click quit, but the app stayed open and would not close. My
CPU average climbed to 148% (not sure how that is possible) with all instances of TuxMath
at the quit screen. So even with every instance just sitting at the quit screen and no
longer moving graphics, processor use kept climbing, while RAM usage was still under 4GB.
The only way I could get the system to recover was by killing all instances of TuxMath.
So, I don't think all the problems users are seeing with TuxMath are related to network
issues. It appears 15 or so instances of TuxMath on a single machine are the max, due to
escalating processor usage. The usage seemed to grow exponentially over time even with the
same number of instances running. And once overloaded, the system can't recover, even
when there are no more moving graphics. Given that my system can handle 75 simultaneous
Firefox sessions with Flash while switching back and forth to OpenOffice (on-line games
until the teacher walks by) without breaking a sweat at 1280x1024 resolution, I'd
venture to say the limitation is in the TuxMath code.
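To put numbers on "usage grew over time with the same number of instances", one option
is to log overall CPU busy time at fixed intervals while the clients sit idle. The sketch
below is only an illustration, not something that was run in this test: it assumes the
standard Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal),
and the function names are made up for the example.

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Assumed field order: user nice system idle iowait irq softirq steal.
    # Later guest fields are skipped because guest time is already part of user.
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    total = sum(values)
    return total - idle, total


def busy_percent(before, after):
    """Percent of all-core CPU time spent busy between two 'cpu' lines."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    if total1 == total0:
        return 0.0
    return 100.0 * (busy1 - busy0) / (total1 - total0)


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the all-core aggregate."""
    with open(path) as handle:
        return handle.readline()


def sample(interval=5.0, count=12):
    """Print one busy-percent reading per interval (hypothetical helper)."""
    previous = read_cpu_line()
    for _ in range(count):
        time.sleep(interval)
        current = read_cpu_line()
        print(time.strftime("%H:%M:%S"), round(busy_percent(previous, current), 1))
        previous = current
```

Calling sample(5.0, 60) on the server would give five minutes of readings. Because the
figure is averaged over all cores it cannot exceed 100%, which makes a steady climb easy
to spot.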
One more note: for every workstation, three instances of TuxMath showed up in htop. Two
of those processes were under 1% usage, while the third sucked all the power. Maybe this
is a hint as to what is going on. There are 30 machines in this lab, 15 on one gigabit
connection and 15 on another, so I'm fairly certain network load was never an issue. All
the problems seem to point to excessive draw on CPU cycles.
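The per-workstation pattern seen in htop could be captured for later comparison by
grouping process listings by user. This is a rough sketch under assumptions: it expects
text in the column layout produced by `ps -eo user,pid,pcpu,comm`, it assumes the process
name is "tuxmath", and the sample rows are invented for illustration rather than taken
from the test.

```python
from collections import defaultdict


def summarize(ps_output, command="tuxmath"):
    """Group matching processes by user, highest CPU first.

    Expects a header line followed by rows of: USER PID %CPU COMMAND.
    Returns {user: [(pid, pcpu), ...]}.
    """
    per_user = defaultdict(list)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # ignore blank or malformed rows
        user, pid, pcpu, comm = parts
        if comm.strip() != command:
            continue
        per_user[user].append((int(pid), float(pcpu)))
    return {
        user: sorted(procs, key=lambda proc: proc[1], reverse=True)
        for user, procs in per_user.items()
    }


# Invented sample data, shaped like the "two idle, one busy" pattern described above.
SAMPLE = """USER PID %CPU COMMAND
student01 101 0.5 tuxmath
student01 102 0.3 tuxmath
student01 103 95.0 tuxmath
student02 201 2.0 firefox
"""

if __name__ == "__main__":
    for user, procs in summarize(SAMPLE).items():
        print(user, procs)
```

On a live server the input could come from
subprocess.run(["ps", "-eo", "user,pid,pcpu,comm"], capture_output=True, text=True).stdout,
saved at each stage of the test (10, 20, 30 clients).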
No offense meant here to the programmers of TuxMath, it is great software and my kids
love it. I just hope this sheds some light on what is happening in a multiuser environment.
If anyone else has some suggestions on something to try, let me know. It may take a few
days to get the lab free and test, but I'll test it.