[libvirt] Increasing TasksMax when creating machines via systemd

Jim Fehlig jfehlig at suse.com
Thu May 23 17:53:55 UTC 2019


On 5/23/19 9:22 AM, Daniel P. Berrangé wrote:
> On Wed, May 22, 2019 at 05:16:38PM -0600, Jim Fehlig wrote:
>> Hi All,
>>
>> I recently received an internal bug report of a VM "crashing" due to hitting
>> thread limits. It seems there was an assert in pthread_create within the VM
>> when hitting the limit enforced by the pids controller on the host:
>>
>> Apr 28 07:45:46 lpcomp02007 kernel: cgroup: fork rejected by pids controller
>> in /machine.slice/machine-qemu\x2d90028\x2dinstance\x2d0000634b.scope
>>
>> The user has TasksMax set to infinity in machine.slice, but apparently that
>> is not inherited by child scopes, whose limit appears to be hardcoded to 16384:
>>
>> https://github.com/systemd/systemd/blob/51aba17b88617515e037e8985d3a4ea871ac47fe/src/machine/machined-dbus.c#L1344
>>
>> The TasksMax property can be set when creating the machine as is done in the
>> attached proof of concept patch. Question is whether this should be a
>> tunable? My initial thought when seeing the report was TasksMax could be
>> calculated based on number of vcpus, iothreads, emulator threads, etc. But
>> it appears that could be quite tricky. The following mail thread describes
>> the basic scenario encountered by my user
>>
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008174.html
>>
>> As you can see, many rbd images attached to a VM can result in an awful lot
>> of threads. 300 images could result in 720K threads! We could punt and set
>> the limit to infinity, but it exists for a reason - fork bomb prevention. A
>> potential compromise between a hardcoded value and per-VM tunable is a
>> driver tunable in qemu.conf. If a per-VM tunable is preferred, suggestions
>> on where to place it and what to call it would be much appreciated :-).
> 
> Yeah, RBD is problematic as you can't predict how many threads it will
> use.
> 
> We currently have a "max_processes" setting in qemu.conf for the ulimit
> base process limit. This applies to the user as a whole though, not the
> cgroup.
> 
> On Fedora we don't seem to have any "tasks_max" cgroup setting or TasksMax
> systemd setting, at least when running with cgroups v1, so we can't set that
> unconditionally.

AFAICT, the TasksMax scope property maps to pids.max in the pids controller
hierarchy. E.g., with the hardcoded 32k value in the POC patch:

# cat /sys/fs/cgroup/pids/machine.slice/machine-qemu\\x2d2\\x2dsles15.scope/pids.max
32768
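
To check that mapping programmatically rather than with cat, something along
these lines could work. This is only a sketch: parse_pids_max and
read_pids_max are hypothetical helper names, not libvirt or systemd functions,
and reporting "max" as UINT64_MAX is a convention chosen here. It relies on
pids.max holding either the string "max" or a decimal count.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse the contents of a pids.max file, which is
 * either "max" (no limit) or a decimal task count, optionally followed
 * by a newline. Returns 0 on success, -1 on malformed input.
 * "max" is reported as UINT64_MAX (a convention of this sketch). */
int parse_pids_max(const char *text, uint64_t *limit)
{
    char *end = NULL;
    unsigned long long val;

    if (strncmp(text, "max", 3) == 0 &&
        (text[3] == '\0' || text[3] == '\n')) {
        *limit = UINT64_MAX;
        return 0;
    }

    /* Reject signs and leading whitespace that strtoull would accept. */
    if (text[0] < '0' || text[0] > '9')
        return -1;

    errno = 0;
    val = strtoull(text, &end, 10);
    if (errno != 0 || (*end != '\0' && *end != '\n'))
        return -1;

    *limit = (uint64_t)val;
    return 0;
}

/* Hypothetical helper: read and parse pids.max from a cgroup file path. */
int read_pids_max(const char *path, uint64_t *limit)
{
    char buf[64];
    FILE *fp = fopen(path, "r");
    int ret = -1;

    if (!fp)
        return -1;
    if (fgets(buf, sizeof(buf), fp))
        ret = parse_pids_max(buf, limit);
    fclose(fp);
    return ret;
}
```

Calling read_pids_max() on the scope's pids.max path shown above would be
expected to yield 32768 with the POC patch applied.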

Regards,
Jim

> 
> I'd be inclined to have a new qemu.conf setting "max_tasks". If this is
> set to 0, then we should just set TasksMax to infinity, otherwise honour
> the setting.
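
A rough sketch of how the proposed "max_tasks" semantics could translate into
the value passed as the TasksMax property. The helper name
tasks_max_from_config is hypothetical, and this assumes systemd treats
UINT64_MAX as "infinity" for TasksMax, which should be verified before
relying on it.

```c
#include <stdint.h>

/* Hypothetical helper: translate a "max_tasks" qemu.conf value into the
 * uint64 passed as the TasksMax property when creating the machine.
 * 0 means "no limit"; ASSUMPTION: systemd interprets UINT64_MAX as
 * infinity for TasksMax. Any other value is honoured as-is. */
uint64_t tasks_max_from_config(unsigned int max_tasks)
{
    if (max_tasks == 0)
        return UINT64_MAX;
    return (uint64_t)max_tasks;
}
```

The result would replace the hardcoded UINT64_C(32768) in the "TasksMax", "t"
argument pair of the POC patch.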
> 
> 
>> From 0583ee3b26b2ee43efe8d25226eceb8547400d97 Mon Sep 17 00:00:00 2001
>> From: Jim Fehlig <jfehlig at suse.com>
>> Date: Wed, 22 May 2019 17:12:14 -0600
>> Subject: [PATCH] systemd: set TasksMax when calling CreateMachine
>>
>> An example of how to set TasksMax when creating a scope for a machine.
>>
>> Signed-off-by: Jim Fehlig <jfehlig at suse.com>
>> ---
>>   src/util/virsystemd.c | 10 ++++++----
>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/util/virsystemd.c b/src/util/virsystemd.c
>> index 3f03e3bd63..6177447bdb 100644
>> --- a/src/util/virsystemd.c
>> +++ b/src/util/virsystemd.c
>> @@ -341,10 +341,11 @@ int virSystemdCreateMachine(const char *name,
>>                                 (unsigned int)pidleader,
>>                                 NULLSTR_EMPTY(rootdir),
>>                                 nnicindexes, nicindexes,
>> -                              3,
>> +                              4,
>>                                 "Slice", "s", slicename,
>>                                 "After", "as", 1, "libvirtd.service",
>> -                              "Before", "as", 1, "virt-guest-shutdown.target") < 0)
>> +                              "Before", "as", 1, "virt-guest-shutdown.target",
>> +                              "TasksMax", "t", UINT64_C(32768)) < 0)
>>               goto cleanup;
>>   
>>           if (error.level == VIR_ERR_ERROR) {
>> @@ -382,10 +383,11 @@ int virSystemdCreateMachine(const char *name,
>>                                 iscontainer ? "container" : "vm",
>>                                 (unsigned int)pidleader,
>>                                 NULLSTR_EMPTY(rootdir),
>> -                              3,
>> +                              4,
>>                                 "Slice", "s", slicename,
>>                                 "After", "as", 1, "libvirtd.service",
>> -                              "Before", "as", 1, "virt-guest-shutdown.target") < 0)
>> +                              "Before", "as", 1, "virt-guest-shutdown.target",
>> +                              "TasksMax", "t", UINT64_C(32768)) < 0)
>>               goto cleanup;
>>       }
>>   
>> -- 
>> 2.21.0
>>
> 
> 
> 
> Regards,
> Daniel
> 



