[olpc-software] graceful handling of out-of-memory conditions
David Zeuthen
davidz at redhat.com
Tue Mar 28 13:59:20 UTC 2006
Hi,
On Tue, 2006-03-28 at 15:46 +0800, Jaya Kumar wrote:
> - oom killer selects the pid to murder by picking the pid that has the
> highest mem size ( mm->total_mm + (all of it's children's mem size/2))
> - long running + if_niced. this sounds nice and intuitive to me.
> - the core oom killer function is badness() here
> http://sosdg.org/~coywolf/lxr/source/mm/oom_kill.c#L46
>
> I think the kernel already gives you the capability you request, right?
Sounds good to me; so
- init(1) sets /proc/pid/oom_adj to OOM_DISABLE
- All child processes inherit this; HAL, NM, the X server, the panel,
the WM etc.
- Our desktop shell sets /proc/pid/oom_adj to something != OOM_DISABLE
for child processes launched (e.g. Firefox, Abiword etc.)
- For activation via IPC, teach activators about tweaking
/proc/pid/oom_adj too. AFAIK this currently includes the D-BUS
session bus and ORBit2 though we want to get rid of the latter. This
is probably a small patch we can get upstream
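To make the list above concrete, here is a rough sketch of what the shell (or an activator) could do when launching an app. It is only an illustration under some assumptions: the helper names (set_oom_adj, launch_killable) and the proc_root parameter are made up for this example, OOM_DISABLE is taken to be -17 as in the kernel headers, and the value 0 for "killable" is just a guess at a sensible default. Newer kernels expose oom_score_adj instead of oom_adj, so the file name may need adjusting.

```python
import os
import subprocess

# Assumption: -17 is the kernel's OOM_DISABLE value for the oom_adj interface.
OOM_DISABLE = -17


def oom_adj_path(pid, proc_root="/proc"):
    """Return the oom_adj path for a pid (or "self"). proc_root is a test hook."""
    return os.path.join(proc_root, str(pid), "oom_adj")


def set_oom_adj(pid, value, proc_root="/proc"):
    """Write an oom_adj value for the given pid."""
    with open(oom_adj_path(pid, proc_root), "w") as f:
        f.write(str(value))


def launch_killable(argv, oom_adj=0):
    """Hypothetical launcher: start argv with OOM killing re-enabled.

    The child inherits OOM_DISABLE from the shell, so it resets its own
    oom_adj after fork() and before exec().
    """
    def make_killable():
        set_oom_adj("self", oom_adj)

    return subprocess.Popen(argv, preexec_fn=make_killable)
```

The shell would call something like launch_killable(["firefox"]) for user-launched apps, while everything started by init keeps the inherited OOM_DISABLE setting.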
So on OOM only applications launched by the user will be killed. Of
course this only happens when the system is actually OOM, which can
happen in the following scenarios:
1. Our desktop shell is leaking; we fix this via QA and testing;
btw, said shell will grow and shrink a bit over time; for example
if you attach e.g. a USB storage device, hal will launch a new
hald-addon-storage process (it's tiny though) to poll for media
changes...
2. User is running only one "application" (e.g. Firefox) and this
app needs fixing; well, in this case the app needs fixing
_anyway_ - we are just guaranteed the app is nuked instead of
some vital system process
3. User is running multiple "apps"; maybe we shouldn't allow it but
this is probably too restrictive.. Or allow it with the caveat
that it may be killed - tough to tell the users (kids) about this
in an easy way though. Consider this totally scary dialog
+----------------------------------------------------+
| You are running more than one (legacy) application |
| already. Your applications may exit unexpectedly |
| and you may lose all your work if running more |
| than one (legacy) application. |
| |
| [Go Ahead Already] [Don't Launch App] |
+----------------------------------------------------+
It may sound silly but I think it's what we need to do. Of course
we should be able to tag applications in their .desktop file
with "OLPC_OOM_Handling_and_Cooperation_with_WM=True" if apps save
state on OOM and know how to cooperate with the WM. Then the WM
and/or panel simply iconifies the app on exit
Then we don't show said silly dialog if all apps launched have
this key set to True. Btw, in the dialog, "legacy application"
refers to one that doesn't have the key
OLPC_OOM_Handling_and_Cooperation_with_WM=True
in the .desktop file.
Of course, wording etc. needs to be fixed, but this is what
I think we should do.
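The dialog decision from scenario 3 could be sketched roughly like this. It assumes the .desktop files are plain "[Desktop Entry]" key files that Python's configparser can read, and the function names (is_oom_cooperative, should_warn) are invented for the example; only the key name comes from the proposal above.

```python
import configparser

# Key name as proposed above; everything else here is illustrative.
OOM_KEY = "OLPC_OOM_Handling_and_Cooperation_with_WM"


def is_oom_cooperative(desktop_text):
    """True if the .desktop contents set the proposed key to True."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # .desktop keys are case sensitive
    parser.read_string(desktop_text)
    if not parser.has_section("Desktop Entry"):
        return False
    value = parser.get("Desktop Entry", OOM_KEY, fallback="")
    return value.strip().lower() == "true"


def should_warn(running_apps, new_app):
    """Show the dialog only if more than one app would be running and
    at least one of them is a "legacy" app (key missing or not True)."""
    apps = list(running_apps) + [new_app]
    if len(apps) <= 1:
        return False
    return not all(is_oom_cooperative(app) for app in apps)
```

Here running_apps and new_app are the text contents of the respective .desktop files; a real shell would more likely cache the flag per application when it is launched.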
What do you think?
Once we know how lean our "desktop shell" is, we know exactly how much
memory is left for apps before they are OOM; to test this is easy, we
can just write a small app that allocates a bunch of memory. Then we can
also verify that the kernel indeed nukes the right process, i.e. the
app.
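Such a test app could look something like the sketch below. It is a minimal example, not a finished tool: the function name, the chunk size and the assumed 4096-byte page size are all placeholders. It touches every page so the memory is really committed rather than just reserved.

```python
import sys

PAGE_SIZE = 4096  # assumption: typical page size; adjust for the target


def allocate(chunk_mb, max_chunks):
    """Allocate up to max_chunks chunks of chunk_mb MiB each and touch
    every page, reporting progress on stderr. Returns the chunks so the
    caller keeps them alive."""
    chunks = []
    for i in range(max_chunks):
        buf = bytearray(chunk_mb * 1024 * 1024)
        # Write one byte per page so the kernel has to back it with RAM.
        for offset in range(0, len(buf), PAGE_SIZE):
            buf[offset] = 1
        chunks.append(buf)
        print("allocated %d MiB" % ((i + 1) * chunk_mb), file=sys.stderr)
        sys.stderr.flush()
    return chunks
```

Launched from the shell with a very large max_chunks, the last "allocated N MiB" line printed before the process is killed gives an estimate of how much memory is left for apps, and checking which process actually died verifies the oom_adj setup.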
Cheers,
David