[libvirt] (Dropping) OOM Handling in libvirt

Markus Armbruster armbru at redhat.com
Wed May 22 08:27:06 UTC 2019


Eric Blake <eblake at redhat.com> writes:

> On 5/13/19 8:28 AM, Michal Privoznik wrote:
>> On 5/13/19 12:17 PM, Daniel P. Berrangé wrote:
>>> This is a long mail about ENOMEM (OOM) handling in libvirt. The executive
>>> summary is that it is not worth the maint cost because:
>>>
>
>>> The long answer follows...
>> 
>> I'm up for dropping OOM handling. Since we're linking with a lot of
>> libraries I don't think we can be confident that we won't abort() even
>> now anyway.
>
> I can live with the decision to drop OOM handling, as long as we still
> try to gracefully handle any user-requested large allocation (such as
> g_try_malloc) and only default to abort-on-failure for the more common
> case of small allocations (such as g_malloc).
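
For concreteness, the split Eric describes would look roughly like this
(my sketch, relying only on the documented glib semantics that g_malloc()
aborts on failure while g_try_malloc() returns NULL; the sizes and
variable names are made up):

#include <glib.h>
#include <stdio.h>

int main(void)
{
    /* Small, fixed-size allocation: failure is treated as fatal,
     * because g_malloc() aborts instead of returning NULL. */
    char *small = g_malloc(256);

    /* Large, user-controlled allocation: report failure to the caller
     * rather than aborting. */
    gsize user_requested = 2UL * 1024 * 1024 * 1024;  /* e.g. 2 GiB from user input */
    char *big = g_try_malloc(user_requested);
    if (big == NULL) {
        fprintf(stderr, "allocation of %" G_GSIZE_FORMAT " bytes failed\n",
                user_requested);
        g_free(small);
        return 1;
    }

    g_free(big);
    g_free(small);
    return 0;
}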

This argument relies on the "implicit assumption that malloc will
actually return ENOMEM when memory is exhausted" (Daniel's wording).
His whole section is worth (re-)reading; I append it for your
convenience.

The way Linux behaves reduces the benefit of attempting to "gracefully
handle any user-requested large allocation".  It also makes such handling
harder to test, which increases its risk.

Is it worthwhile?

We had this discussion in QEMU (which has the policy you propose, follows
it except when it doesn't, and leaves the error recovery largely
untested).  I think it's not worthwhile for QEMU, but others have
different opinions.


[...]


The complication of Linux
=========================

Note that all of the above discussion had the implicit assumption that malloc
will actually return ENOMEM when memory is exhausted.

On Linux at least this is not the case in a default install. Linux tends
to enable memory overcommit, causing the kernel to satisfy malloc requests
even if they exceed available RAM + swap. The allocated memory won't even be
paged in to physical RAM until the page is written to. If there is
insufficient memory to make a page resident in RAM, then the OOM killer is
set free. Some poor victim will get killed off.
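
A minimal hypothetical sketch of this behaviour (64-bit only; the
overcommit policy itself is selected via /proc/sys/vm/overcommit_memory):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t chunk = (size_t)1024 * 1024 * 1024;  /* 1 GiB per request */
    size_t total = 0;

    for (int i = 0; i < 1024; i++) {                  /* up to 1 TiB in total */
        /* The pointers are deliberately leaked; this is only a demo. */
        if (malloc(chunk) == NULL) {                  /* rarely seen on a default install */
            fprintf(stderr, "got ENOMEM after %zu GiB\n", total >> 30);
            return 1;
        }
        total += chunk;
        /* The chunks are never written to, so none of them ever becomes
         * resident; writing to them is what would wake the OOM killer. */
    }

    printf("malloc() handed out %zu GiB of untouched virtual memory\n",
           total >> 30);
    return 0;
}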

It is possible to disable RAM overcommit and also disable or deprioritize the
OOM killer. This might make it possible to actually see ENOMEM from an
allocation, but few if any developers or applications run in this scenario,
so it should be considered untested in reality.

With cgroups it is possible to set RAM and swap usage limits. In theory you
can disable the OOM killer per-cgroup and this will cause ENOMEM to be
reported to the app. In my testing with a trivial C program that simply
mallocs a massive array and then memsets it, it is hard to get ENOMEM to
happen reliably. Sometimes the app ends up just hanging when testing this.
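
A rough reconstruction of that kind of trivial test program (a hypothetical
sketch; the sizes are arbitrary, and the cgroup memory limit and OOM killer
setting are assumed to be configured outside the program):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    /* Size in MiB from argv[1], defaulting to 4 GiB (both arbitrary). */
    size_t size = (argc > 1) ? (size_t)strtoull(argv[1], NULL, 10) << 20
                             : (size_t)4 << 30;

    char *buf = malloc(size);
    if (buf == NULL) {
        /* The ENOMEM path under discussion; in practice hard to reach. */
        fprintf(stderr, "malloc: %s\n", strerror(errno));
        return 1;
    }

    /* Touching the pages forces them to become resident; this is where
     * the process tends to be OOM-killed (or to hang) rather than ever
     * having seen ENOMEM from malloc(). */
    memset(buf, 0x5a, size);

    puts("survived");
    free(buf);
    return 0;
}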

Other operating systems have different behaviour and so are more likely to
really report ENOMEM to application code, but even if you can get ENOMEM
on other OSes, the earlier points illustrate that it's not worth worrying
about.

End quote.




