[K12OSN] More feedback on Fedora 10 + LTSP

David Hopkins dahopkins429 at gmail.com
Tue Apr 21 17:23:48 UTC 2009

Let me first say that this is going to sound like a rant in places.
Not much I can do about that, but ... FC10+LTSP5 has not been
performing well at all.  I am currently at a loss to explain why.
However, since I have to have sound working, LTSP5 seems to be the
only way to ensure that sound works correctly. I have CentOS+LTSP4.2
and that works well for everything except sound. So, the only option I
see is to get a distribution that is using LTSP5 working. Again, just
to be clear, I am using identical hardware for the comparison and
using the same login accounts, same file server, same dns, same
authentication server, etc. All hardware is 32-bit, both server and
clients. (I don't even want to deal with possible 64-bit server/32-bit
client issues at the moment.)

Now, here is what I and the elementary school tech teacher observed
today. The following is her write-up.

"Things did not go so well this morning.  When all 10 computers were
in use at the same time, the delay between mouse and screen was
significant. . .  The point of the lesson was to improve mouse
skills--not possible when there is a lag between their mouse movements
and action on their monitors.   We muddled through the first group of
10 students, and when the 2nd group began the exercise, I allowed the
first 10 students to open Tux Paint.  I thought because  Tux Paint is
running local, this would work.  Big Mistake!  The delay for everyone
increased dramatically, making it virtually impossible to complete the
mouse task in Starfall.   When I tried to "QUIT" Firefox on 3
computers, it took 3-4 minutes to return to the desktop.   Although I
was circulating the room, trying to assist students, I glanced at the
load several times--I never saw it rise above 6.  It mostly hovered
between 4 and 5.  It took more than 5 minutes to successfully close
the website from 10 computers.  During that time, I had 10 students
just waiting.

When my second class arrived, I did not even try to use the website.
We used Tux Paint today.  However, shortly after we got started, I
"banned" students from selecting a new piece of paper . . .  The few
who had tried that feature had their monitors hang for more than a
minute.  That task used to respond immediately.   There is also a
terrific feature that allows students to select any color from the
rainbow . . .  but choosing that feature takes more than 1 full minute
to accomplish."

This is on a system where with CentOS+LTSP4.2 I could run 25 systems
simultaneously without issues.  She was trying to use 10.

Notice that the load average never exceeded 6. This is a dual
hyperthreaded Xeon server (4 logical CPUs), so a load average of 4
would nominally mean 100% utilization, although that is a bit
misleading, as load averages of 6-8 perform quite well on all my other
systems. Also, the system was never using swap. In fact, memory usage
never exceeded 5GB.
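
The load-average arithmetic above can be sketched as follows. This is
only an illustration: the figure of 4 logical CPUs is an assumption
(two single-core Xeons with hyperthreading), and load_per_cpu is a
hypothetical helper name, not part of any LTSP tooling.

```python
import os


def load_per_cpu(load_avg, logical_cpus):
    """Return the load average divided by the number of logical CPUs."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return load_avg / logical_cpus


if __name__ == "__main__":
    # Assumption: 4 logical CPUs (dual Xeon, hyperthreading enabled).
    for load in (4.0, 5.0, 6.0):
        ratio = load_per_cpu(load, 4)
        print(f"load {load:.1f} on 4 logical CPUs -> {ratio:.2f} per CPU")

    # On a live server, compare with the real values where available.
    cpus = os.cpu_count()
    try:
        one_minute = os.getloadavg()[0]
    except OSError:
        one_minute = None
    if cpus and one_minute is not None:
        print(f"this host: {load_per_cpu(one_minute, cpus):.2f} per CPU")
```

A ratio near 1.0 means the run queue roughly matches the CPU count.
Linux load averages also count tasks in uninterruptible (usually I/O)
wait, so the ratio is only a rough indicator of CPU saturation.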

So, where is the bottleneck?  The starfall activity is flash-based (it
was the Earth Day activity).  I know that FF3+flash is going to load
the system.  But this issue is not as severe with FF2+Flash 9, except
that you don't get sound half the time.  FF3+Flash10 seems to really
slow down.  Also, it seems that network traffic is significantly
higher with FC10+LTSP5 using ldm than with gdm.  Can I switch back to
gdm as the default display manager, or is ldm it?  I have LDM_DIRECTX set
to TRUE so that ssh is only used for login/logout.  And, login/logout
now takes 30+ secs compared to about 2 seconds for CentOS+LTSP4.2.

For the local apps, launching FF3 can take over a minute. And then it
will be sluggish, even when the local hardware isn't using swap.

I have this suspicion that it is a network bandwidth issue. The only
difference there is that LTSP5 uses the ltsbr0 bridge setup while
LTSP4.2 does not.  To test this, I should be able to delete the bridge
and set up LTSP5 in the same dual NIC scenario as with LTSP4.2,
correct?  Though I am not sure I have the skills to do so without
breaking something else.  It might be as easy as deleting the ltsbr0
entry and then defining the IP address for the currently-slaved NIC to
be what the ltsbr0 was defined as.
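
Before touching the bridge, the bandwidth suspicion could be checked
by sampling the kernel's per-interface byte counters while the
clients are in use. The following is a minimal sketch, assuming a
Linux server that exposes /proc/net/dev; the function names and the
2-second interval are illustrative choices, not anything taken from
LTSP.

```python
import time


def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def mbit_per_s(before, after, seconds):
    """Convert two (rx, tx) byte counters into (rx, tx) megabits/second."""
    return tuple((a - b) * 8 / seconds / 1e6 for b, a in zip(before, after))


def sample(interval=2.0, path="/proc/net/dev"):
    """Read the counters twice and return per-interface rates."""
    with open(path) as f:
        first = parse_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_net_dev(f.read())
    return {name: mbit_per_s(first[name], second[name], interval)
            for name in second if name in first}


if __name__ == "__main__":
    try:
        rates = sample()
    except OSError as exc:
        print(f"could not read interface counters: {exc}")
    else:
        for name, (rx, tx) in sorted(rates.items()):
            print(f"{name:10s} rx {rx:8.2f} Mbit/s  tx {tx:8.2f} Mbit/s")
```

Running this on the server during a class, and watching the ltsbr0
line and its slaved NIC, would show whether traffic is anywhere near
the link speed before any bridge changes are attempted.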

I haven't had a chance to look at the stats from the switch (Amer.com
SS2R24G4i), but since I never changed the switch, only the OS, I don't
see why there would suddenly be an issue.

As for the Tux Paint issue, that is truly baffling.  I have the same
version of Tux Paint running on an older server and it is very
responsive. There is a hardware difference between the servers ... the
one that runs very well has CPUs with only 70% of the speed of the
newer server.  The other difference is again CentOS+LTSP4.2 (using gdm) vs
FC10+LTSP5 (and ldm).

So, something looks like it 'just isn't right' except I'm not getting
any disk I/O errors, I'm not getting a huge spike in the load ... the
system just isn't responsive.

At this point the teacher has really reached her limit as have I.  A
single login with a single client works fine.  Add a few more and I
get the above. I want LTSP5 to work but I can't stay with it given the
current performance issues.  And I have to start planning now for next
fall.  If upgrading to FC10+LTSP5 means all my current hardware is not
acceptable, then I have a huge issue.  I know that all my current
hardware works with FC10+LTSP5, but the performance I'm seeing is
horrible.  I have been advocating/using K12LTSP since 2003, I really
want this to work, but right now to say I am depressed with FC10+LTSP5
would be an understatement.

So ... help?  I'll be back at the school tonight to try and determine
what might be happening. And once there, sitting behind the state
firewalls, access to IRC is blocked, as are all other chat services.

Dave Hopkins
Newark Charter School
