[Crash-utility] patch for slight modification to runq -g command

Thu Nov 7 21:57:58 UTC 2013

Hi Anthony,

With respect to the nr_running and h_nr_running displays, since you
can already "see" the number of tasks queued underneath each particular
group, I'm not convinced that they're worth displaying.

In your first post you mentioned:

> Since the way we crash the system by messing up the nr_running and h_nr_running,
> so we also display those two fields at the same time. Here’s an example of before and after.

Are you saying that you purposely modify those two values in order to force
a crash? 

Anyway, I bring this up because their display is kind of ugly, and also because
in the output logs of my test of your patch, I see this particular instance on
a 3.6.0 kernel, where the crash was generated by entering
"echo c > /proc/sysrq-trigger":

  crash> bt
  PID: 1212   TASK: ffff880035f60000  CPU: 1   COMMAND: "bash"
   #0 [ffff88007831fa20] machine_kexec at ffffffff8103e465
   #1 [ffff88007831fa90] crash_kexec at ffffffff810c6658
   #2 [ffff88007831fb60] oops_end at ffffffff815d5bf8
   #3 [ffff88007831fb90] no_context at ffffffff815c7dae
   #4 [ffff88007831fbf0] __bad_area_nosemaphore at ffffffff815c7f98
   #5 [ffff88007831fc40] bad_area at ffffffff815c81f0
   #6 [ffff88007831fc70] do_page_fault at ffffffff815d87d1
   #7 [ffff88007831fd80] page_fault at ffffffff815d5025
      [exception RIP: sysrq_handle_crash+22]
      RIP: ffffffff81388986  RSP: ffff88007831fe38  RFLAGS: 00010092
      RAX: 000000000000000f  RBX: ffffffff8192dc20  RCX: 00000000000014ff
      RDX: 000000000000332f  RSI: 0000000000000046  RDI: 0000000000000063
      RBP: ffff88007831fe38   R8: ffffffff81b26580   R9: 0000000000000397
      R10: 0000000000000002  R11: 0000000000000396  R12: 0000000000000063
      R13: 0000000000000286  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #8 [ffff88007831fe40] __handle_sysrq at ffffffff813890a7
   #9 [ffff88007831fe80] write_sysrq_trigger at ffffffff8138915a
  #10 [ffff88007831feb0] proc_reg_write at ffffffff811ea879
  #11 [ffff88007831ff00] vfs_write at ffffffff8118991c
  #12 [ffff88007831ff30] sys_write at ffffffff81189c4a
  #13 [ffff88007831ff80] system_call_fastpath at ffffffff815dcae9
      RIP: 00007f64d1a94530  RSP: 00007fffbb0c1248  RFLAGS: 00010246
      RAX: 0000000000000001  RBX: ffffffff815dcae9  RCX: 00000000fbad2a84
      RDX: 0000000000000002  RSI: 00007f64d23ab000  RDI: 0000000000000001
      RBP: 00007f64d23ab000   R8: 000000000000000a   R9: 00007f64d23a4740
      R10: 0000000000000001  R11: 0000000000000246  R12: 0000000000000002
      R13: 00007f64d1d61280  R14: 0000000000000002  R15: 00007f64d1d61280
      ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
  crash>

The "runq -g" output for that cpu looks like this:

  CPU 1
    CURRENT: PID: 1212  CFS: ffff880035cc2f00 TASK: ffff880035f60000  COMMAND: "bash"
    TASK_GROUP RT_RQ: ffff88007fa541e8
    RT PRIO_ARRAY: ffff88007fa541e8
       [no tasks queued]
    TASK_GROUP CFS_RQ: ffff88007fa540f0
    CFS RB_ROOT: ffff88007fa54118
       GROUP: ffff880078af7800 CFS_RQ: ffff880035cc2f00 RB_ROOT: ffff880035cc2f28 nr_running: 4294967297 h_nr_running: 201908650262921217 
          [120] PID: 1212   TASK: ffff880035f60000  COMMAND: "bash"

I don't understand where those values are coming from, because if
I look at the CFS_RQ, it shows this:

  crash> cfs_rq.nr_running,h_nr_running ffff880035cc2f00
    nr_running = 1
    h_nr_running = 1
  crash>

I also see this occurring on live "snapshot" dumps -- which I understand given
that the kernel's runqueue data structures are being changed while the dump
is being created.  But I don't understand why it's happening in the situation
above.
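
One thing that may be worth checking, offered as a guess rather than a
diagnosis: both odd values have 1 in their low 32 bits, which is what a
64-bit read of a 32-bit counter would look like if the neighboring member
were picked up as the upper half.  The sketch below only does that arithmetic
on the two numbers from the "runq -g" output above.  It assumes, unverified
here, that the two fields are 32 bits wide on this kernel and are being read
as 64-bit values; split_u64 is a name made up for this illustration.

```python
# Assumption (not confirmed in this thread): nr_running and h_nr_running
# are 32-bit fields that are being read as 64-bit values.

def split_u64(value):
    """Split a 64-bit value into its (high, low) 32-bit halves."""
    return value >> 32, value & 0xFFFFFFFF

# Values exactly as displayed on the GROUP line of the "runq -g" output.
shown = {
    "nr_running": 4294967297,
    "h_nr_running": 201908650262921217,
}

for name, value in shown.items():
    high, low = split_u64(value)
    print(f"{name}: high 32 bits = {high}, low 32 bits = {low}")
```

The low half is 1 in both cases, which matches what the cfs_rq display
reports.  For nr_running the high half is also 1, which would fit
h_nr_running sitting directly after it in the structure, but that layout is
an assumption and would need to be checked against the 3.6.0 cfs_rq
definition.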

Dave

----- Original Message -----
> 
> 
> ----- Original Message -----
> > Hi Dave,
> > 
> > I have cleaned up the code and added another change.
> 
> OK thanks -- the patch runs through my sample set of vmcores with no problem.
> 
> > The current running task is not in the rb tree (rb_root), so runq -g
> > displays it like:
> > 
> >   CURRENT: PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >   RT PRIO_ARRAY: ffff880002493820
> >      [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >   CFS RB_ROOT: ffff880002493710
> >      GROUP CFS RB_ROOT: ffff882d609ce830 <TDAT>
> >         GROUP CFS RB_ROOT: ffff883f0bcbfa30 <User>
> >                [no tasks queued]
> > 
> > I can understand why the current running task is not displayed.
> > However, the "-g" option displays all the task_groups the task
> > belongs to but at the end it shows "[no tasks queued]". That is
> > just strange.  The new change is to display the task that is running like:
> > 
> >   CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >   RT PRIO_ARRAY: ffff880002493820
> >      [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >   CFS RB_ROOT: ffff880002493710
> >      GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT: ffff882d609ce830 <TDAT> nr_running: 1 h_nr_running: 1
> >         GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT: ffff883f0bcbfa30 <User> nr_running: 1 h_nr_running: 1
> >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
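
The behavior described above can be pictured with a toy model.  This is only
an illustration, not crash or kernel code: the class and function names are
invented, and a plain list stands in for the rb tree.  It shows why a walk of
the tree alone reports nothing for a group whose only runnable task is the
one on the CPU, while the count still says 1.

```python
class ToyCfsRq:
    """Toy stand-in for a CFS run queue; not the kernel's data layout."""

    def __init__(self):
        self.queued = []      # stands in for the rb tree of waiting tasks
        self.curr = None      # the task currently on the CPU
        self.nr_running = 0   # counts waiting tasks plus the running one

    def enqueue(self, task):
        self.queued.append(task)
        self.nr_running += 1

    def pick_next(self):
        # The picked task leaves the tree for as long as it is running.
        self.curr = self.queued.pop(0)
        return self.curr


def tasks_from_tree(rq):
    """What a display sees if it only walks the tree."""
    return list(rq.queued)


def tasks_with_current(rq):
    """What a display sees if it also reports the running task."""
    running = [rq.curr] if rq.curr is not None else []
    return running + list(rq.queued)


rq = ToyCfsRq()
rq.enqueue("actmain")
rq.pick_next()

print(tasks_from_tree(rq))      # empty: reads as "[no tasks queued]"
print(tasks_with_current(rq))   # includes "actmain"
print(rq.nr_running)
```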
> 
> OK -- I guess I understand why it probably makes sense to duplicate the
> CURRENT task underneath its own GROUP list -- but if that is done, then
> why clutter the CURRENT line with the CFS_RQ address?  And it's not clear
> to me why in your example above, the CFS address of ffff88039351a800
> doesn't show up as the CFS_RQ address above the "actmain" line?
> 
> Taking a simple example, I see this:
> 
>  crash> runq -g
>  CPU 0
>    CURRENT: PID: 0     CFS: ffff88000c7d6aa8 TASK: ffffffff8178ba60  COMMAND: "swapper"
>    TASK_GROUP RT_RQ: ffff88000c7d6b58
>    RT PRIO_ARRAY: ffff88000c7d6b58
>       [no tasks queued]
>    TASK_GROUP CFS_RQ: ffff88000c7d6aa8
>    CFS RB_ROOT: ffff88000c7d6ad0
>       [no tasks queued]
>  
>  CPU 1
>    CURRENT: PID: 1268  CFS: ffff88000c9b5aa8 TASK: ffff88002f11c620  COMMAND: "bash"
>    TASK_GROUP RT_RQ: ffff88000c9b5b58
>    RT PRIO_ARRAY: ffff88000c9b5b58
>       [no tasks queued]
>    TASK_GROUP CFS_RQ: ffff88000c9b5aa8
>    CFS RB_ROOT: ffff88000c9b5ad0
>       [120] PID: 1268   TASK: ffff88002f11c620  COMMAND: "bash"
> 
>  crash>
>   
> Where the newly-interspersed CFS address redundantly shows the TASK_GROUP CFS_RQ
> below.  But adding the CFS address to the "swapper" line doesn't seem to make
> much sense, or help in any way, since the idle task is a special case that never
> gets queued.  And since the CFS address in the "bash" line is redundant with the
> TASK_GROUP CFS_RQ below, why bother showing it?
> 
> And in a more complicated example, with your patch, the "qemu-kvm" task also
> shows up underneath its group:
> 
>  CPU 0
>    CURRENT: PID: 3144  CFS: ffff88022aab2600 TASK: ffff88022a446040  COMMAND: "qemu-kvm"
>    TASK_GROUP RT_RQ: ffff880133c16148
>    RT PRIO_ARRAY: ffff880133c16148
>       [no tasks queued]
>    TASK_GROUP CFS_RQ: ffff880133c16028
>    CFS RB_ROOT: ffff880133c16058
>       GROUP: ffff88012b880800 CFS_RQ: ffff88022ac8f000 RB_ROOT: ffff88022ac8f030 <libvirt> nr_running: 1 h_nr_running: 1
>          GROUP: ffff88012c078000 CFS_RQ: ffff88022c075000 RB_ROOT: ffff88022c075030 <qemu> nr_running: 1 h_nr_running: 1
>             GROUP: ffff88012b0fb400 CFS_RQ: ffff88022af94c00 RB_ROOT: ffff88022af94c30 <guest1> nr_running: 1 h_nr_running: 1
>                GROUP: ffff88022c6bbc00 CFS_RQ: ffff88022aab2600 RB_ROOT: ffff88022aab2630 <vcpu1> nr_running: 1 h_nr_running: 1
>                   [120] PID: 3144   TASK: ffff88022a446040  COMMAND: "qemu-kvm"
> 
> And note that its interspersed CFS address of ffff88022aab2600 is redundantly
> shown as the CFS_RQ of its GROUP down below.
> 
> So I don't understand why your example shows different CFS addresses in the
> CURRENT line vs. the GROUP CFS_RQ address above the queued "actmain" task:
> 
> >   CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >   RT PRIO_ARRAY: ffff880002493820
> >      [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >   CFS RB_ROOT: ffff880002493710
> >      GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT: ffff882d609ce830 <TDAT> nr_running: 1 h_nr_running: 1
> >         GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT: ffff883f0bcbfa30 <User> nr_running: 1 h_nr_running: 1
> >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
> 
> Am I missing something?  Or is there a cut-and-paste error?
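
To make the comparison concrete, here is a small sketch that pulls the CFS
address off the CURRENT line and the CFS_RQ address off the innermost GROUP
line, using excerpts of the two outputs quoted above (wrapped lines rejoined,
unrelated lines dropped).  The function name and the regular expressions are
made up for this illustration and are not part of crash.

```python
import re


def cfs_addresses(runq_text):
    """Return (CFS on the CURRENT line, CFS_RQ of the innermost GROUP line)."""
    current = re.search(r"CURRENT:.*?\bCFS: ([0-9a-f]+)", runq_text)
    groups = re.findall(r"GROUP: [0-9a-f]+ CFS_RQ: ([0-9a-f]+)", runq_text)
    return (current.group(1) if current else None,
            groups[-1] if groups else None)


# Excerpts copied from the "runq -g" outputs quoted in this thread.
actmain = """\
CURRENT: PID: 9048  CFS: ffff88039351a800 TASK: ffff8808b07e4200  COMMAND: "actmain"
TASK_GROUP CFS_RQ: ffff8800024936e0
   GROUP: ffff884052bc9800 CFS_RQ: ffff882d609ce800 RB_ROOT: ffff882d609ce830 <TDAT>
      GROUP: ffff884058f1d000 CFS_RQ: ffff883f0bcbfa00 RB_ROOT: ffff883f0bcbfa30 <User>
"""

qemu_kvm = """\
CURRENT: PID: 3144  CFS: ffff88022aab2600 TASK: ffff88022a446040  COMMAND: "qemu-kvm"
TASK_GROUP CFS_RQ: ffff880133c16028
   GROUP: ffff88012b880800 CFS_RQ: ffff88022ac8f000 RB_ROOT: ffff88022ac8f030 <libvirt>
      GROUP: ffff88012c078000 CFS_RQ: ffff88022c075000 RB_ROOT: ffff88022c075030 <qemu>
         GROUP: ffff88012b0fb400 CFS_RQ: ffff88022af94c00 RB_ROOT: ffff88022af94c30 <guest1>
            GROUP: ffff88022c6bbc00 CFS_RQ: ffff88022aab2600 RB_ROOT: ffff88022aab2630 <vcpu1>
"""

for label, text in (("actmain", actmain), ("qemu-kvm", qemu_kvm)):
    current_cfs, innermost_cfs_rq = cfs_addresses(text)
    verdict = "match" if current_cfs == innermost_cfs_rq else "differ"
    print(f"{label}: CURRENT CFS {current_cfs}, "
          f"innermost CFS_RQ {innermost_cfs_rq}: {verdict}")
```

On these excerpts the "qemu-kvm" addresses agree and the "actmain" ones do
not, which is the discrepancy being asked about.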
> 
> Dave
> 
>