[Crash-utility] [PATCH RFC 0/1] Filter ps output by scheduling policy

Dave Anderson anderson at redhat.com
Fri Oct 20 14:50:33 UTC 2017


----- Original Message -----
> In CEE we often analyze vmcores from customers and sometimes want
> to filter tasks by scheduling policy, for instance, to identify
> if customer runs realtime tasks. Doing this via foreach grepping
> and then feeding back pointers to task_struct and grepping it again
> is very slow, especially if customer runs thousands of tasks.
> 
> So, let's add another option for ps to filter tasks by scheduling
> policy

OK, let's do it -- this is an excellent idea!

Upon a quick examination and test of the patch, there is 1 bug, and I've 
got a couple usability suggestions.

First, in 2.6.22 this kernel patch changed the size of the task_struct.policy
member:
  
  commit 97dc32cdb1b53832801159d5f634b41aad9d0a23
  Author: William Cohen <wcohen at redhat.com>
  Date:   Tue May 8 00:23:41 2007 -0700
  
      reduce size of task_struct on 64-bit machines
     
      This past week I was playing around with that pahole tool
      (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size
      of various struct in the kernel.  I was surprised by the size of the
      task_struct on x86_64, approaching 4K.  I looked through the fields in
      task_struct and found that a number of them were declared as "unsigned
      long" rather than "unsigned int" despite them appearing okay as 32-bit
      sized fields.  On x86_64 "unsigned long" ends up being 8 bytes in size and
      forces 8 byte alignment.  Is there a reason there a reason they are
      "unsigned long"?
  
  ... [ cut ] ...
  
  @@ -825,7 +825,7 @@ struct task_struct {
          unsigned long long sched_time; /* sched_clock time spent running */
          enum sleep_type sleep_type;
  
  -       unsigned long policy;
  +       unsigned int policy;
          cpumask_t cpus_allowed;
          unsigned int time_slice, first_time_slice;
  
You task_policy() function reads it from the task_struct using UINT():

+/*
+ * Return task's policy as bitmask bit.
+ */
+static ulong
+task_policy(ulong task)
+{
+       ulong policy = 0;
+
+       fill_task_struct(task);
+
+       if (!tt->last_task_read)
+               return policy;
+
+       policy = 1 << UINT(tt->task_struct + OFFSET(task_struct_policy));
+
+       return policy;
+}
+

But that fails with an older kernel that has a "long" policy and has
a big-endian architecture.  For example, here is the output of your
patch on a 2.6.18 RHEL5 kernel on a ppc64 machine:
  
  crash> ps -y FIFO
     PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
  crash> 
  
If I change task_policy() to use ULONG():
  
  //      policy = 1 << UINT(tt->task_struct + OFFSET(task_struct_policy));
          policy = 1 << ULONG(tt->task_struct + OFFSET(task_struct_policy));
  
it works as expected:

  crash> ps -y FIFO
     PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
        2      1   0  c0000000e6ec00b0  IN   0.0       0      0  [migration/0]
        4      1   0  c0000000e6ec08c0  IN   0.0       0      0  [watchdog/0]
        5      1   1  c0000000074b10d0  IN   0.0       0      0  [migration/1]
        7      1   1  c0000000e6ec10d0  IN   0.0       0      0  [watchdog/1]
        8      1   2  c0000000074b00b0  IN   0.0       0      0  [migration/2]
       10      1   2  c0000000074bba20  IN   0.0       0      0  [watchdog/2]
       11      1   3  c0000000e6ec20f0  IN   0.0       0      0  [migration/3]
       13      1   3  c0000000074bb210  IN   0.0       0      0  [watchdog/3]
       14      1   4  c0000000e6ec3110  IN   0.0       0      0  [migration/4]
       16      1   4  c0000000074baa00  IN   0.0       0      0  [watchdog/4]
       17      1   5  c0000000e6ec4130  IN   0.0       0      0  [migration/5]
       19      1   5  c0000000e6ec4940  IN   0.0       0      0  [watchdog/5]
       20      1   6  c0000000074b99e0  IN   0.0       0      0  [migration/6]
       22      1   6  c0000000e6ec5960  IN   0.0       0      0  [watchdog/6]
       23      1   7  c0000000074b91d0  IN   0.0       0      0  [migration/7]
       25      1   7  c0000000074b89c0  IN   0.0       0      0  [watchdog/7]
  crash> 

So I suggest saving the task_struct.policy member size in the size_table
using MEMBER_SIZE_INIT() -- similar to size_table.task_struct_flags -- and
then use that as a qualifier for whether to use UINT() or ULONG().

Secondly, the invocation could be made a bit more user-friendly and less verbose.
The help page (besides needing commas at the end of the strings to force linefeeds)
shows this:

crash> help ps
...
       -r  display resource limits (rlimits) of selected, or all, tasks.
       -S  display a summary consisting of the number of tasks in a task state.
-y policy  only show tasks with given scheduling policy. Policy can be:           NORMAL, FIFO, RR, BATCH, ISO, IDLE, DEADLINE. Multiple policies           can be specified by passing multiple -y parameters
EXAMPLES
...

Given that a "task" or "struct" dump of the task_struct.policy field 
shows the value as a number, a user might be forced to go look up
which #define is associated with the numeric value before using the
command.  So I think it would be helpful to allow the -y argument 
entry to accept either a string as you have now, or alternatively as its 
numeric value. 

Also, can you use the conventional manner of accepting a comma-separated 
string for multiple arguments, instead of requiring multiple -y entries?

Other than that, it looks really good!

Thanks,
  Dave



.
> 
> Oleksandr Natalenko (1):
>   task: also filter ps output by ->policy
> 
>  defs.h | 12 ++++++++++
>  help.c |  7 ++++--
>  task.c | 86
>  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  3 files changed, 98 insertions(+), 7 deletions(-)
> 
> --
> 2.14.2
> 
> 




More information about the Crash-utility mailing list