[olpc-software] graceful handling of out-of-memory conditions

Krzysztof Kowalczyk kkowalczyk at gmail.com
Fri Mar 31 12:13:24 UTC 2006


Relating to the OOM discussion.

The way Pocket PC is trying to solve this problem is:
* there's a configurable setting that represents a free memory
threshold that the OS considers to be "dangerously low". Usually it's set
to 2 MB
* when the amount of free memory falls below this threshold the OS
sends WM_HIBERNATE to all apps which means "please free up as much
memory as you can"
* if apps don't free up any memory, the OS does the OOM thing by
sending WM_CLOSE to background apps (it's the "please kill yourself"
message)
* if apps don't cooperate with WM_CLOSE, it shows a UI listing all
apps and asks the user to choose the one to kill
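
The escalation above can be sketched as a small simulation in portable C.
This is only an illustration of the policy, not Pocket PC code: the struct,
the function names and the "is free memory still below the threshold?"
test between stages are all made up here, and the real OS delivers
WM_HIBERNATE and WM_CLOSE as window messages rather than poking at app
state directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a running app; not part of any real API. */
struct app {
    const char *name;
    size_t used_kb;     /* memory the app currently holds */
    size_t cache_kb;    /* part of it the app can drop when asked to hibernate */
    int background;     /* nonzero if the app is not in the foreground */
    int honors_close;   /* nonzero if the app exits when asked to close */
};

enum oom_stage { OOM_NONE, OOM_HIBERNATE, OOM_CLOSE, OOM_ASK_USER };

/* Free memory left, given the total and what every app holds. */
static size_t free_kb(size_t total_kb, const struct app *apps, size_t n)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++)
        used += apps[i].used_kb;
    return used >= total_kb ? 0 : total_kb - used;
}

/* Runs the escalation once and returns the last stage that was needed. */
enum oom_stage handle_low_memory(size_t total_kb, size_t threshold_kb,
                                 struct app *apps, size_t n)
{
    assert(apps != NULL || n == 0);

    if (free_kb(total_kb, apps, n) >= threshold_kb)
        return OOM_NONE;

    /* Stage 1: ask every app to release what it can (the WM_HIBERNATE step). */
    for (size_t i = 0; i < n; i++) {
        size_t freed = apps[i].cache_kb < apps[i].used_kb
                           ? apps[i].cache_kb : apps[i].used_kb;
        apps[i].used_kb -= freed;
        apps[i].cache_kb = 0;
    }
    if (free_kb(total_kb, apps, n) >= threshold_kb)
        return OOM_HIBERNATE;

    /* Stage 2: ask background apps to exit, one at a time (the WM_CLOSE step). */
    for (size_t i = 0; i < n; i++) {
        if (apps[i].background && apps[i].honors_close) {
            apps[i].used_kb = 0;
            if (free_kb(total_kb, apps, n) >= threshold_kb)
                return OOM_CLOSE;
        }
    }

    /* Stage 3: nobody cooperative is left; the OS would show the kill list. */
    return OOM_ASK_USER;
}
```

With 8192 KB total and the usual 2048 KB threshold, an app holding a
1000 KB cache can be enough to stop at the first stage; an app that
ignores both requests pushes the decision to the user.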

This complexity is a byproduct of design decisions Microsoft made:
* the OS is multi-tasking so there can be multiple apps running at the same time
* closing the app (from the user's point of view) doesn't cause the process
to exit, it just gets put into the background. This improves responsiveness
when the user "launches" the app again but will eventually fill up all
the memory.

Palm OS or Amiga OS don't have any OOM handling. The apps are supposed
to handle malloc() failures properly or they die (sometimes taking the
OS with them).

While I agree with Havoc that handling malloc() failures takes more
code and is harder than ignoring failures and crashing, I disagree
with the conclusion that it's impossible to do or that it adds 30% to
exe size (an exaggeration).
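
To give a rough idea of what "handling malloc() failures" costs in code,
here is a minimal sketch in C. Everything in it (struct note, note_new(),
the try_alloc() wrapper and the allocs_before_failure knob) is a
hypothetical example, not taken from sqlite, cairo or any other project.
The pattern is simply: check every allocation, and on failure release
whatever was already allocated and report the error to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fault-injection knob: a negative value never fails;
   otherwise that many allocations succeed and every later one returns NULL. */
int allocs_before_failure = -1;

/* Thin wrapper over malloc() so that failures can be simulated. */
static void *try_alloc(size_t size)
{
    if (allocs_before_failure == 0)
        return NULL;
    if (allocs_before_failure > 0)
        allocs_before_failure--;
    return malloc(size);
}

struct note {
    char *title;
    char *body;
};

/* Returns a heap copy of s, or NULL if the allocation fails. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = try_alloc(len);
    if (p != NULL)
        memcpy(p, s, len);
    return p;
}

/* Frees a note; safe on NULL and on partially built notes. */
void note_free(struct note *n)
{
    if (n == NULL)
        return;
    free(n->title);
    free(n->body);
    free(n);
}

/* Returns a new note, or NULL on out-of-memory with nothing leaked. */
struct note *note_new(const char *title, const char *body)
{
    assert(title != NULL && body != NULL);

    struct note *n = try_alloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->title = NULL;
    n->body = NULL;

    n->title = copy_string(title);
    if (n->title != NULL)
        n->body = copy_string(body);

    if (n->body == NULL) {   /* either copy failed: unwind and report */
        note_free(n);
        return NULL;
    }
    return n;
}
```

The extra code is a handful of checks and one cleanup path. The caller
gets NULL back and can decide what to do (drop a cache, show an error,
retry later) instead of crashing.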

I know for a fact that Exchange Server or SQL Server won't crash just
because some malloc() failed so open-source developers on projects
1/10th of the size should be able to do it as well. Given enough
eyeballs and all that. For example, sqlite and cairo do the right
thing.

To go on a tangent: to me the "how to handle OOM in kernel" discussion
just shows that Linux (in the broad sense of Linux Kernel + X +
standard Linux libraries and apps) is a wildly inappropriate technical
choice for OLPC.

The Linux stack consists of millions of lines of C code written with
desktop PCs in mind, targeting specs of at least a 1 GHz processor, 0.5
GB of RAM, gigabytes of hard-drive space, and high-resolution screens.

The design of the whole system is optimized for those specs (the UI
assumes there's plenty of space to waste on menu bar etc. and
programmers assume it's ok to pretend that memory is infinite, etc.).

At the same time it lacks some of the things that proved to be useful,
like a good IPC mechanism or a system-wide, standard scripting language
that can be used both to script the apps and as a high-level language
for writing a large class of apps (present in Amiga OS (ARexx, ports),
Newton (NewtonScript), Mac OS (AppleScript)).

No matter how hard you try, you're not going to do a good job at
scaling this stack down to a reasonable size and it will still lack
the good features.

And it's not like no one created a good OS for low-powered devices before.

Amiga OS, released in 1985, took 0.5 MB of ROM, ran beautifully on a 7 MHz
68k processor with 0.5 MB of RAM. It had pre-emptive multi-tasking,
shared libraries, a decent UI library, an IPC mechanism (ports) that worked
well and a system-wide scripting language that could be used to script
apps (something like AppleScript but based on Rexx). Written by a
handful of people in 2 years.

Newton OS: also a beautiful design, running on much weaker machines than
OLPC, with NewtonScript, a scripting language powerful enough to
write end-user apps in.

A more recent example: Danger's SideKick OS, which is basically a Java
VM. Runs (fast) on a 50 MHz ARM processor with 32 MB of RAM and 32 MB of
ROM. Has a decent UI and writing apps in Java is much easier than
in C (the whole malloc() problem goes away).

And of course the things that Alan Kay and a handful of others at
Xerox Parc did with the Smalltalk system.

Even a Lisp Machine, an elephant compared to those, would fit without
problems on OLPC hardware.

Palm OS - as weak an OS as it is, its designers were the first ones to
really understand the PDA problem and create a solution that worked.
All previous and most later attempts at a PDA were big failures.

Any of those systems seems like a better choice for OLPC than a Linux
system. Or a system designed and implemented from scratch, that uses
the good things that can be learned from those systems (and Linux, if
there really is anything to learn from it other than that, given a chance
to redo things, you would probably want to do everything differently).

-- kjk



