[Crash-utility] [PATCH 3/4] remove unreachable (and slow) code
Greg Thelen
gthelen at google.com
Mon Apr 9 19:36:09 UTC 2018
On Mon, Apr 9, 2018 at 11:32 AM Dave Anderson <anderson at redhat.com> wrote:
> ----- Original Message -----
> > task_exists() scans known tasks for a given task address.
> >
> > refresh_radix_tree_task_table() looks like:
> > hq_open()
> > for cpus {
> > hq_enter(per_cpu_idle_thread)
> > }
> > for ... {
> > hq_enter(task)
> > }
> > hq_close()
> >
> > tt->running_tasks = 0
> > for task in retrieve_list(hq_entries) {
> > if (task_exists()) {
> > duplicate task address
> > continue
> > }
> > add_context()
> > }
> >
> > Because retrieve_list returns unique tasks, the above task_exists()
> > check will never find a duplicate task. So remove the check. This
> > converts the above loop from O(N^2) to O(N) which saves considerable
> > startup time when there are a large number of tasks. On a 1M task dump
> > this patch reduces startup time: 44m => 1m.
> >
> > I was not able to test the other pid table routines, but I suspect they
> > may also benefit from similar removal.
> Hi Greg,
> I'm thinking that the task_exists() removal may be the most beneficial
> piece of the patch set, right?
This patch (3/4) is the most beneficial part for startup time. But after
startup, 'ps' is still slow with a lot of tasks. We can speed up 'ps' with
the final patch (4/4) ("index task_context by task"), which depends on 1/4
("refactor store_context => add_context") and 2/4 ("refactor task_to_pid").
In my experiments on a 1M task dump:
* patch 3/4 saves ~45m of startup time
* patches 1,2,4/4 saves ~45m of ps time
Together patches 1-4 save ~90m for startup+ps.
> I have a set of ~250 vmcore files gathered over the years, but only 2 of
> them are from kernels using the idr radix tree scheme. I timed how long
> it took to simply bring up all 250 vmcores to their crash prompt and then
> quit immediately. These are the total times running with and without your
> patch:
> without the patch:
> real 46m44.023s
> user 17m29.956s
> sys 0m58.940s
> with your patch applied:
> real 46m39.752s
> user 17m36.496s
> sys 0m58.269s
> But as you've noted, given that all of the various refresh_xxx functions
> use hq_enter(), the task_exists() call is pretty much useless.
> I'm going to remove the task_exists() call from the remaining refresh_xxx
> functions, rerun the test, and see how it affects the numbers. I suspect
> it will help a bit, although the task count in the ~250 vmcores range from
> 22 to 4178, with only ~20 dumpfiles having more than 1000 tasks.
Thanks. I also suspect the other refresh_xxx routines will benefit.
Though I don't have such dumps handy to test.
More information about the Crash-utility
mailing list