[Crash-utility] [PATCH] Add -T option to configure task table via a file

Thu Jan 17 14:17:04 UTC 2019

----- Original Message -----
> 
> 
> > -----Original Message-----
> > ----- Original Message -----
> > > I faced an incomplete vmcore last week, which was generated because
> > > the system running in the kdump kernel was somehow rebooted in the
> > > middle of copying the vmcore.
> > >
> > > Unfortunately, in the incomplete vmcore, most of the tasks could not
> > > be detected via the PID hash table because the objects relevant to the
> > > PID hash, including the ptes needed to reference those objects, were lost.
> > >
> > > Although I successfully found many task_struct objects through
> > > other data structures, such as the circular list of task_struct::tasks
> > > and the run queue, crash sub-commands fail with the following
> > > message if a given object is not contained in the task table:
> > >
> > >     crash> bt 0xffffffff00000000
> > >     bt: invalid task or pid value: 0xffffffff00000000
> > >
> > > To address this issue, I made a patch that adds a command-line option
> > > to pass a list of addresses of task_struct objects, so that crash
> > > tries to detect them as part of the task table.
> > >
> > > I made this in a very short time, and there may be a better interface
> > > than a command-line option.
> > >
> > > Tested on top of crash-7.2.5.
> > 
> > Yeah, what bothers me about this patch is that even though it worked for
> > your particular half-baked vmcore, it may never be of any help to anybody
> > else in the future.
> > 
> > It's similar in nature to patches that have been posted that address a
> > particular unique kernel bug that was seen in one vmcore, but it would be
> > highly unlikely that the circumstances would ever be seen again.
> 
> I'm posting this patch because I think this could be useful for everyone...
> 
> It was unfortunate that an incomplete vmcore was generated and sent to us,
> but there are cases where engineers have to investigate issues based on an
> incomplete vmcore.
> 
> I also think there would be other cases where restoring task_struct could
> fail due to pure software issues, for example memory corruption bugs, and I
> think it natural that crash doesn't behave as expected when kernel data
> structures are in an abnormal state.
> 
> I think options such as --minimal and --no_kmem_cache are to deal with
> such cases and this feature is similar in this sense.

Yes, but --minimal is for extreme situations, and doesn't really apply in this case
given the small subset of available commands.  --no_kmem_cache is more for 
situations where for whatever reason the slab subsystem initialization fails
but it's not necessary to kill the whole session.

There is "crash --active", which gathers just the active tasks on each cpu
without using the normal task-gathering function.  Were you aware of that
option?

> 
> By the way, I feel like I saw vmcores where some error messages
> were output during "(gathering task table data)" in the past and
> I guess some tasks were missing there but this was the first case I actually
> needed to try to restore them.
> 
> > 
> > But in this case, it's even more unlikely given that it's dealing with
> > an incomplete vmcore.  You were lucky that you were able to even
> > bring up a crash session at all -- and then were able to generate
> > a task list after that.
> 
> It was incomplete, but it was about 98% complete. The detection from the
> PID hash was affected by the loss of the remaining 2%.
> 
> > 
> > Following the task_struct.tasks list doesn't gather all of the
> > tasks in a task group, so it doesn't create a fully populated task
> > list, correct?
> 
> Yes, I needed to repeatedly iterate from successfully detected task_struct
> objects, in order, until all the target tasks were covered, and as you
> guessed, there's no guarantee that I found all the task_struct objects,
> so I said 'many of'.

Ok, I suppose it's useful but not necessary to gather all tasks.  As I mentioned
before, --active will gather the tasks running at the time of the crash, which
are typically the most important.

But I understand your point, where there may be other non-active tasks whose
backtrace or whatever may be useful.

> 
> > 
> > Plus it doesn't make sense to add it unless it's documented *how* to
> > create the task list to begin with.
> 
> How about writing a use case in the help message and/or the manual page?
> 
>   -T file
>       Make crash treat the task_struct objects listed in file as
>       part of the task table. This is useful when tasks of interest
>       are missing from the task table. You may find the relevant
>       task_struct objects in various kernel data structures:
> 
>       From task_struct::tasks:
> 
>         crash> list task_struct.tasks -s task_struct.pid,comm -h ffff88003daa8000
>         ffff88003daa8000
>           pid = 1
>           comm = "systemd\000\060\000\000\000\000\000\000"
>         ffff88003daa8fd0
>           pid = 2
>           comm = "kthreadd\000\000\000\000\000\000\000"
>         ffff88003daa9fa0
>           pid = 3
>           comm = "ksoftirqd/0\000\000\000\000"
>         ...<snip>...
> 
>       From runqueue:
> 
>         crash> runq
>         CPU 0 RUNQUEUE: ffff88003fc16cc0
>           CURRENT: PID: 2188   TASK: ffff8800360f8000  COMMAND: "foobar.sh"
>           RT PRIO_ARRAY: ffff88003fc16e50
>              [no tasks queued]
>           CFS RB_ROOT: ffff88003fc16d68
>              [no tasks queued]
> 
>         CPU 1 RUNQUEUE: ffff88003fd16cc0
>           CURRENT: PID: 1      TASK: ffff88003daa8000  COMMAND: "systemd"
>           RT PRIO_ARRAY: ffff88003fd16e50
>              [no tasks queued]
>           CFS RB_ROOT: ffff88003fd16d68
>              [120] PID: 19054  TASK: ffff88000b684f10  COMMAND: "kworker/1:0"
>              [120] PID: 3863   TASK: ffff88003bd02f70  COMMAND: "emacs"
> 
> This might be too lengthy for the option section.

Right, it could be more verbose under the help page section, but less so in the
man page and in "crash --help".
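
For whatever documentation ends up describing the file, the input format
needs to be spelled out.  The thread doesn't show the format the patch
accepts, so the following is only a minimal sketch under an assumed format:
one hexadecimal task_struct address per line, with blank lines and '#'
comment lines ignored.  read_task_addresses() is a hypothetical helper name,
not a function from the patch or from crash.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): read task_struct addresses
 * from a stream, one hexadecimal value per line (assumed format).
 * Blank lines and lines starting with '#' are skipped.
 * Returns the number of addresses stored in out[], or -1 if a line
 * cannot be parsed as a single hexadecimal value.
 */
static int
read_task_addresses(FILE *fp, unsigned long long *out, int max)
{
        char buf[256];
        int cnt = 0;

        assert(fp != NULL && out != NULL);

        while (cnt < max && fgets(buf, sizeof(buf), fp)) {
                char *p = buf, *end;
                unsigned long long addr;

                /* skip leading whitespace, then blank and comment lines */
                while (isspace((unsigned char)*p))
                        p++;
                if (*p == '\0' || *p == '#')
                        continue;

                /* base 16 accepts both "ffff..." and "0xffff..." */
                errno = 0;
                addr = strtoull(p, &end, 16);
                if (errno || end == p)
                        return -1;

                /* reject trailing junk after the address */
                while (isspace((unsigned char)*end))
                        end++;
                if (*end != '\0')
                        return -1;

                out[cnt++] = addr;
        }
        return cnt;
}
```

A file built from the "list" or "runq" output above would then just contain
the TASK addresses, e.g. ffff88003daa8000 and ffff8800360f8000, one per line.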

> 
> > 
> > I don't know, let me think about this...
> 
> I don't think the current design is the best.
> 
> For example, it might be better to be able to update the task table at
> runtime via some crash sub-command. Looking at the source code, it appears
> that the task table is updated at command execution when needed, so it may
> not be so difficult?
> 
> void
> exec_command(void)
> {
> ...<snip>...
>         if ((ct = get_command_table_entry(args[0]))) {
>                 if (ct->flags & REFRESH_TASK_TABLE) {
>                         if (XEN_HYPER_MODE()) {
> #ifdef XEN_HYPERVISOR_ARCH
>                                 xen_hyper_refresh_domain_context_space();
>                                 xen_hyper_refresh_vcpu_context_space();
> #else
>                                 error(FATAL, XEN_HYPERVISOR_NOT_SUPPORTED);
> #endif
>                         } else if (!(pc->flags & MINIMAL_MODE)) {
>                                 tt->refresh_task_table();
>                                 sort_context_array();
>                                 sort_tgid_array();
>                         }
>                 }

Right, although each of the several task table gathering plugin functions
runs only once if it's a dumpfile.  But that could be adjusted for a
run-time command option that wants to add one or more tasks to the table.

Perhaps refresh_active_task_table() could be modified to allow the addition
of extra tasks, either by command line option, or allowing the function to be
re-run during runtime as directed by a "task -a task-address" or "task -A file"
option.
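
To illustrate the kind of bookkeeping that would imply, here is a rough
sketch of a list of extra task addresses that can be appended to at runtime
without introducing duplicates.  It is not based on crash's actual
task_table layout; struct extra_tasks and extra_tasks_add() are hypothetical
names used only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for extra task storage; crash's real table differs. */
struct extra_tasks {
        unsigned long long *addrs;      /* task_struct addresses */
        size_t cnt;                     /* entries in use */
        size_t cap;                     /* entries allocated */
};

/*
 * Add one task_struct address unless it is already present.
 * Returns 1 if added, 0 if it was a duplicate, -1 on allocation failure.
 */
static int
extra_tasks_add(struct extra_tasks *et, unsigned long long addr)
{
        size_t i;

        assert(et != NULL);

        /* linear scan is fine for a hand-built list of addresses */
        for (i = 0; i < et->cnt; i++)
                if (et->addrs[i] == addr)
                        return 0;

        /* grow the array geometrically when it is full */
        if (et->cnt == et->cap) {
                size_t ncap = et->cap ? et->cap * 2 : 16;
                unsigned long long *n;

                n = realloc(et->addrs, ncap * sizeof(*n));
                if (!n)
                        return -1;
                et->addrs = n;
                et->cap = ncap;
        }

        et->addrs[et->cnt++] = addr;
        return 1;
}
```

A "task -a" style option would call something like this once per address,
and a "task -A file" style option once per line of the file, before the
context arrays are re-sorted.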

Thanks,
  Dave