[Crash-utility] [PATCH] runq: make tasks in throttled cfs_rqs/rt_rqs displayed

Dave Anderson anderson at redhat.com
Thu Oct 25 18:26:35 UTC 2012



----- Original Message -----
> Hello Dave,
> 
> Sorry for not testing the patch thoroughly enough. I also think we
> should discuss the first patch. I have done some tests with the
> patch, and I have attached the updated version. Could you please
> test it on your machine again?


Hello Zhang,

I applied only your new patch1 (the old patch2 no longer applies
after this new patch1), and I see this:

 $ make warn
  ...
  cc -c -g -DX86_64  -DGDB_7_3_1  task.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector 
  task.c: In function ‘dump_CFS_runqueues’:
  task.c:7693:6: warning: variable 'tot' set but not used [-Wunused-but-set-variable]
  ...
 
And I still (always) see the same problem with a live kernel:

 crash> set
     PID: 25998
 COMMAND: "crash"
    TASK: ffff88020fd9dc40  [THREAD_INFO: ffff88017b6d2000]
     CPU: 2
   STATE: TASK_RUNNING (ACTIVE)
 crash> runq
 CPU 0 RUNQUEUE: ffff88021e213cc0
   CURRENT: PID: 0      TASK: ffffffff81c13420  COMMAND: "swapper/0"
   RT PRIO_ARRAY: ffff88021e213e28
      [no tasks queued]
   CFS RB_ROOT: ffff88021e213d58
      GROUP CFS RB_ROOT: ffff88020ec3b800runq: invalid kernel virtual address: 48  type: "cgroup dentry"
 crash>

I also still see numerous instances of the error above with some 
(but not all) of my "snapshot" dumpfiles, where your dump_task_group_name()
function is encountering (and trying to use) a NULL cgroup address here:

 static void
 dump_task_group_name(ulong group)
 {
         ulong cgroup, dentry, name;
         char *dentry_buf;
         int len;
         char tmp_buf[100];
 
         readmem(group + OFFSET(task_group_css) + OFFSET(cgroup_subsys_state_cgroup),
                 KVADDR, &cgroup, sizeof(ulong),
                 "task_group css cgroup", FAULT_ON_ERROR);
         readmem(cgroup + OFFSET(cgroup_dentry), KVADDR, &dentry, sizeof(ulong),
                 "cgroup dentry", FAULT_ON_ERROR);
  
Here are the examples, where it always happens on the "crash" process while
it's performing the snapshot file creation:

2.6.38.2-9.fc15 snapshot:
 
 crash> runq
 CPU 0 RUNQUEUE: ffff88003fc13840
   CURRENT: PID: 1180   TASK: ffff88003bea2e40  COMMAND: "crash"
   RT PRIO_ARRAY: ffff88003fc13988
      [no tasks queued]
   CFS RB_ROOT: ffff88003fc138d8
      GROUP CFS RB_ROOT: ffff880037ef1b00runq: invalid kernel virtual address: 38  type: "cgroup dentry"
 crash>
 
 
2.6.40.4-5.fc15 snapshot:
 
 crash> runq
 ...
 CPU 1 RUNQUEUE: ffff88003fc92540
   CURRENT: PID: 1341   TASK: ffff880037409730  COMMAND: "crash"
   RT PRIO_ARRAY: ffff88003fc92690
      [no tasks queued]
   CFS RB_ROOT: ffff88003fc925d8
      GROUP CFS RB_ROOT: ffff880037592f00runq: invalid kernel virtual address: 38  type: "cgroup dentry"
 crash>
 
3.5.1-1.fc17 snapshot:
 
 crash> runq
 ...
 CPU 1 RUNQUEUE: ffff88003ed13800
   CURRENT: PID: 31736  TASK: ffff88007c46ae20  COMMAND: "crash"
   RT PRIO_ARRAY: ffff88003ed13968
      [no tasks queued]
   CFS RB_ROOT: ffff88003ed13898
      GROUP CFS RB_ROOT: ffff88003deb3000runq: invalid kernel virtual address: 48  type: "cgroup dentry"
 crash>
 
3.1.7-1.fc16 snapshot:
 
 crash> runq
 ...
 CPU 2 RUNQUEUE: ffff88003e253180
   CURRENT: PID: 1495   TASK: ffff880037a60000  COMMAND: "crash"
   RT PRIO_ARRAY: ffff88003e2532d0
      [no tasks queued]
   CFS RB_ROOT: ffff88003e253218
      GROUP CFS RB_ROOT: ffff8800277f8500runq: invalid kernel virtual address: 38  type: "cgroup dentry"
 crash>

3.2.6-3.fc16 snapshot:

 crash> runq
 ...
 CPU 0 RUNQUEUE: ffff88003fc13780
   CURRENT: PID: 1383   TASK: ffff88003c932e40  COMMAND: "crash"
   RT PRIO_ARRAY: ffff88003fc13910
      [no tasks queued]
   CFS RB_ROOT: ffff88003fc13820
      GROUP CFS RB_ROOT: ffff88003a432c00runq: invalid kernel virtual address: 38  type: "cgroup dentry"
 crash>

But I also saw the error above on this 3.2.1-0.8.el7.x86_64 kernel
that actually crashed:

 crash> runq
 ...
 CPU 3 RUNQUEUE: ffff8804271d43c0
   CURRENT: PID: 11615  TASK: ffff88020c50a670  COMMAND: "runtest.sh"
   RT PRIO_ARRAY: ffff8804271d4590
      [no tasks queued]
   CFS RB_ROOT: ffff8804271d44a0
      GROUP CFS RB_ROOT: ffff88041e0d2760runq: invalid kernel virtual address: 38  type: "cgroup dentry"
 crash>


> will be fixed in patch2 later.

With respect to your patch2:

 +#define MAX_THROTTLED_RQ 100
 +struct throttled_rq {
 +       ulong rq;
 +       int depth;
 +       int prio;
 +};
 +static struct throttled_rq throttled_rt_rq_array[MAX_THROTTLED_RQ];
 +static struct throttled_rq throttled_cfs_rq_array[MAX_THROTTLED_RQ];

Can you please dynamically allocate the throttled_rt_rq_array and 
throttled_cfs_rq_array arrays with GETBUF(), perhaps in the 
task_group_offset_init() function?  They are only needed when
"runq" is executed, and then only if the kernel version supports
them.  You can FREEBUF() them at the bottom of dump_CFS_runqueues(),
and if the command fails prematurely, they will be FREEBUF()'d 
automatically by restore_sanity().

But this leads to the larger question of showing the task_group
data.  Consider that the current "runq" command does what it says
it does: 

 crash> help runq
 NAME
   runq - run queue
 
 SYNOPSIS
   runq [-t]
 
 DESCRIPTION
   With no argument, this command displays the tasks on the run queues
   of each cpu.
  
    -t  Display the timestamp information of each cpu's runqueue, which is the
        rq.clock, rq.most_recent_timestamp or rq.timestamp_last_tick value,
        whichever applies; following each cpu timestamp is the last_run or 
        timestamp value of the active task on that cpu, whichever applies, 
        along with the task identification.
   ...
  
Now, your patch adds significant complexity to the runq handling code
and to its future maintainability.  I'm wondering whether your patch
could be modified so that the task_group info is only displayed via a
new flag, let's say "runq -g".  There has been considerable churn in
this area of the kernel code, and it worries me that this patch could
unnecessarily break the simple display of the queued tasks.

Thanks,
  Dave



