[K12OSN] Openmosix & LTSP
James P. Kinney III
jkinney at localnetsolutions.com
Mon Apr 23 18:56:09 UTC 2007
On Mon, 2007-04-23 at 09:47 -0700, Sean Harbour wrote:
> >James P. Kinney III wrote:
> >> The direction that RedHat/Fedora is gearing up for is "Stateless Linux".
> >> http://fedoraproject.org/wiki/StatelessLinuxHOWTO
> >> I _don't_ see most
> >> schools going "COOL! Now we can roll our own custom environment!"
> >> anytime soon. It's a beast of a process.
> >I've always thought the right way to handle this would be a slight
> >variation of openmosix where you could designate one or more dedicated
> >servers that could run applications for anyone but in addition, your own
> >applications would have the option of running on your local machine
> >(only - not other clients) if it offers a reasonable amount of CPU and
> >RAM capacity. However I don't know enough about openmosix to know if
> >it actually has any concepts to associate users and their local
> >machines. You probably don't want your jobs running on some other client
> >that might be rebooted or unplugged at any time. Something like this
> >could automatically balance out the differences between thin and fat
> >clients without much custom tweaking.
> > Les Mikesell
> > lesmikesell at gmail.com
> Les, I've been having the same thoughts you so aptly described about using OpenMosix with LTSP. Like you, I think that the ability to selectively cluster appropriate clients would be a good thing. A first step might be as simple as selectively assigning an OpenMosix-enabled boot kernel to certain client machines, with a default kernel that is not OpenMosix-enabled for the other clients.
> Sadly, there doesn't appear to be much movement on the OpenMosix/LTSP front these days; some of the info appears to be 4 or 5 years old. I'm going to try to forward these questions to the OpenMosix people and see if I can get a response.
> Here's some info from the OpenMosix FAQ:
> 'What kind of impact will Client A see if the LTSP server migrates a process that it is running for Client A to Client B, and Client B suddenly drops off the network?'
> "In the LTSP+openMosix How-To I maintain ( http://openmosix.sourceforge.net/ltsp-omr4-1.html ), the LTSP clients do not migrate their local processes (basically Xwindows, which is all they really run). All processes originate from the server.
> There is a basic difference between hardware failure and shutdown. If a computer fails, then obviously everything goes down with it, too, but that is to be expected and is no different on non-openMosix machines. If client B is shut down, all foreign processes will be migrated away again and things keep running normally."
> So, it appears that the solution right now with OpenMosix to the problem of user-initiated reboots is to super glue the power cords to the clients, and make sure the power button only triggers a soft reboot. This isn't going to work for most clients. We need some sort of process affinity so we can optionally migrate heavy apps such as OO to the thick clients that are using them. With this method we should be able to add a lot more clients to any single server, and if Johnny unplugs his power cord it only affects him.
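The placement rule discussed in the quoted text (a user's heavy app may run on that user's own client if it is beefy enough, otherwise on a dedicated server, and never on somebody else's client) can be written down as a small sketch. This is only an illustration of the policy, not anything openMosix provides: the `Client` record, the thresholds, and the node name `ltsp-server` are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds for what counts as a "thick" client.
MIN_CPU_MHZ = 1000
MIN_FREE_RAM_MB = 256


@dataclass
class Client:
    """Hypothetical description of the machine a user is sitting at."""
    name: str
    cpu_mhz: int
    free_ram_mb: int


def choose_node(owner_client: Optional[Client], server: str = "ltsp-server") -> str:
    """Pick where a user's heavy process should run.

    Only two candidates are ever considered: the user's own client or a
    dedicated server. Other users' clients are never chosen, so a reboot
    or a pulled power cord only hurts the person who caused it.
    """
    if owner_client is None:
        return server
    thick_enough = (
        owner_client.cpu_mhz >= MIN_CPU_MHZ
        and owner_client.free_ram_mb >= MIN_FREE_RAM_MB
    )
    return owner_client.name if thick_enough else server
```

With these made-up numbers, a 2.4 GHz desktop with 1 GB free would get its own OO process, while a 300 MHz thin client would leave it on the server.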
I have been toying with having the LTSP servers PXE boot from a central
server, load everything into a RAM disk, do some per-machine configs
based on MAC addresses, then act as a single-system-image cluster. The
PIA aspect is that all apps (at least the heavyweight ones) must be
source-code tweaked to have either processor affinity (sadly a
compile-time option - needs fixing to be a run-time option) or
significantly enhanced multi-threading. Some apps like OO do fairly well
at the threading, but the process gets migrated all over the available
CPUs and that just wastes resources. Other apps like Firefox have rather
poor threading and tend to explode when threads cross (think
"Ghostbusters" and "Never cross the streams") :)
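For reference, here is a minimal sketch of what run-time affinity looks like on a stock Linux kernel, using Python's `os.sched_getaffinity` and `os.sched_setaffinity` (Linux-only wrappers around the scheduler calls). The helper name `pin_to_one_cpu` is made up for the example. It only covers CPU pinning on a single host; whether an openMosix kernel would honor this when deciding to migrate a process to another node is not something this sketch assumes.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> set:
    """Restrict a process to a single CPU at run time.

    pid 0 means the calling process. The CPU chosen is simply the
    lowest-numbered one the process is currently allowed to use, so
    the call stays valid inside a restricted cpuset.
    """
    allowed = os.sched_getaffinity(pid)   # CPUs the process may use now
    target = {min(allowed)}               # keep just one of them
    os.sched_setaffinity(pid, target)     # apply the new mask
    return os.sched_getaffinity(pid)      # report what is in effect
```

A launcher script could call this with the PID of an already running app, so the app itself would not need to be rebuilt.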
OpenMosix _does_ have a development section with a 2.6 kernel in it
(2.6.18 - I think). But virtualization is the current developer toy so
not much is being done in the clustering arena.
There is much work to do to make any of it really large-scale ready.
Dropping this stuff on an 8-socket, dual-core Opteron system only means
the apps have more CPUs to jump to :(
James P. Kinney III
CEO & Director of Engineering
Local Net Solutions, LLC
GPG ID: 829C6CA7 James P. Kinney III (M.S. Physics)
<jkinney at localnetsolutions.com>
Fingerprint = 3C9E 6366 54FC A3FE BA4D 0659 6190 ADC3 829C 6CA7