[Crash-utility] patch for slight modification to runq -g command
Chen, Anthony
Anthony.Chen at Teradata.com
Fri Oct 18 00:41:47 UTC 2013
Dave,
Thanks for the comment. I'll rework the patch.
Anthony
> -----Original Message-----
> From: crash-utility-bounces at redhat.com [mailto:crash-utility-
> bounces at redhat.com] On Behalf Of Dave Anderson
> Sent: Thursday, October 17, 2013 12:46 PM
> To: Discussion list for crash utility usage, maintenance and development
> Subject: Re: [Crash-utility] FW: patch for slight modification to runq -g
> command
>
>
>
> ----- Original Message -----
> >
> >
> >
> >
> > Hi,
> >
> >
> >
> > We’re debugging an in-house application that makes use of hard limit
> > extensively. We ran into a lot of timing windows (all our own making)
> and we
> > use runq -g command a lot. The current runq -g display only the
> RBROOT
> > pointer. It really is a bit inconvenient to traverse the task_group
> > hierarchy. It would be nice to have the command also display the
> > corresponding task_group, cfs_rq pointer (at a minimum). Since the
> way we
> > crash the system by messing up the nr_running and h_nr_running, so
> we also
> > display those two fields at the same time. Here’s an example of before
> and
> > after.
> >
> >
> >
> > CPU 4
> >
> > CURRENT: PID: 0 TASK: ffff8840668c0380 COMMAND: "swapper"
> >
> > TASK_GROUP RT_RQ: ffff8800027d3820
> >
> > RT PRIO_ARRAY: ffff8800027d3820
> >
> > [no tasks queued]
> >
> > TASK_GROUP CFS_RQ: ffff8800027d36e0
> >
> > CFS RB_ROOT: ffff8800027d3710
> >
> > GROUP CFS RB_ROOT: ffff883ff69bcc30 <TDAT>
> >
> > GROUP CFS RB_ROOT: ffff884006290e30 <User>
> >
> > GROUP CFS RB_ROOT: ffff88400641c430 <TDWMVP1>
> >
> > GROUP CFS RB_ROOT: ffff88400646b030 <ServDown1:0>
> >
> > GROUP CFS RB_ROOT: ffff884006492630 <ServOrder1:1>
> >
> > GROUP CFS RB_ROOT: ffff883ff058fe30 <TDWMWD57> (THROTTLED)
> >
> > GROUP CFS RB_ROOT: ffff88047889ee30 <S:WD:3d:35fa>
> >
> > [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain"
> >
> > <<< more throttled groups removed >>>
> >
> >
> >
> > CPU 4
> >
> > CURRENT: PID: 0 CFS: ffff8800027d36e0 TASK: ffff8840668c0380
> COMMAND:
> > "swapper"
> >
> > TASK_GROUP RT_RQ: ffff8800027d3820
> >
> > RT PRIO_ARRAY: ffff8800027d3820
> >
> > [no tasks queued]
> >
> > TASK_GROUP CFS_RQ: ffff8800027d36e0
> >
> > CFS RB_ROOT: ffff8800027d3710
> >
> > GROUP: ffff88405394d000 CFS: ffff883ff69bcc00 RB_ROOT:
> ffff883ff69bcc30
> > <TDAT> (0 0)
> >
> > GROUP: ffff88405906c400 CFS: ffff884006290e00 RB_ROOT:
> ffff884006290e30
> > <User> (0 0)
> >
> > GROUP: ffff884055081000 CFS: ffff88400641c400 RB_ROOT:
> ffff88400641c430
> > <TDWMVP1> (0 0)
> >
> > GROUP: ffff884055081c00 CFS: ffff88400646b000 RB_ROOT:
> ffff88400646b030
> > <ServDown1:0> (0 0)
> >
> > GROUP: ffff8840580fd400 CFS: ffff884006492600 RB_ROOT:
> ffff884006492630
> > <ServOrder1:1> (0 0)
> >
> > GROUP: ffff884058f58c00 CFS: ffff883ff058fe00 RB_ROOT:
> ffff883ff058fe30
> > <TDWMWD57> (7 9) (THROTTLED)
> >
> > GROUP: ffff8808fb976000 CFS: ffff88047889ee00 RB_ROOT:
> ffff88047889ee30
> > <S:WD:3d:35fa> (1 1)
> >
> > [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain"
> >
> > <<< more throttled groups removed >>>
> >
> >
> >
> > I have attached the patch we use to display additional information.
> Could you
> > please take a look at my proposal to see if it is possible that you include
> > this kind of display format.
> >
> >
> >
> > Thanks,
> >
> > Anthony
>
> A couple quick observations...
>
> Adding the CFS: address makes sense, but besides you and me and
> whoever
> reads the patch/code, how would anybody know what the two numbers
> inside
> the parentheses mean? It could be documented in the help page, but I
> wonder if it could be made more obvious somehow?
>
> Anyway, I ran a test on a sample set of vmcores, and this addition
> will only work with relevant kernel versions. For example, in these
> four dumpfiles (and therefore their kernel versions), the command
> fails because the cfs_rq.h_nr_running member does not exist:
>
> 2.6.38.2-9.fc15:
>
> CPU 0
> CURRENT: PID: 1180 CFS: ffff880037ef1b00 TASK: ffff88003bea2e40
> COMMAND: "crash"
> TASK_GROUP RT_RQ: ffff88003fc13988
> RT PRIO_ARRAY: ffff88003fc13988
> [no tasks queued]
> TASK_GROUP CFS_RQ: ffff88003fc138b0
> CFS RB_ROOT: ffff88003fc138d8
> GROUP: ffff88003c610a00 CFS: ffff880037ef1b00 RB_ROOT:
> ffff880037ef1b28
> runq: invalid structure member offset: cfs_rq_h_nr_running
> FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
>
>
> 2.6.40.4-5.fc15:
>
> CPU 1
> CURRENT: PID: 1341 CFS: ffff880037592f00 TASK: ffff880037409730
> COMMAND: "crash"
> TASK_GROUP RT_RQ: ffff88003fc92690
> RT PRIO_ARRAY: ffff88003fc92690
> [no tasks queued]
> TASK_GROUP CFS_RQ: ffff88003fc925b0
> CFS RB_ROOT: ffff88003fc925d8
> GROUP: ffff88003a84b800 CFS: ffff880037592f00 RB_ROOT:
> ffff880037592f28
> runq: invalid structure member offset: cfs_rq_h_nr_running
> FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
>
>
> 2.6.32-131.0.15.el6:
>
> CPU 0
> CURRENT: PID: 28263 CFS: ffff8800794b7140 TASK: ffff880037aaa040
> COMMAND: "loop.ABA"
> TASK_GROUP RT_RQ: ffff88000a216098
> RT PRIO_ARRAY: ffff88000a216098
> [no tasks queued]
> TASK_GROUP CFS_RQ: ffff88000a215fe8
> CFS RB_ROOT: ffff88000a216010
> GROUP: ffff8800785bc200 CFS: ffff8800784d7c80 RB_ROOT:
> ffff8800784d7ca8 <A>
> runq: invalid structure member offset: cfs_rq_h_nr_running
> FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
>
>
> 3.1.7-1.fc16:
>
> CPU 2
> CURRENT: PID: 1495 CFS: ffff8800277f8500 TASK: ffff880037a60000
> COMMAND: "crash"
> TASK_GROUP RT_RQ: ffff88003e2532d0
> RT PRIO_ARRAY: ffff88003e2532d0
> [no tasks queued]
> TASK_GROUP CFS_RQ: ffff88003e2531f0
> CFS RB_ROOT: ffff88003e253218
> GROUP: ffff88002722b000 CFS: ffff8800277f8500 RB_ROOT:
> ffff8800277f8528
> runq: invalid structure member offset: cfs_rq_h_nr_running
> FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair()
>
> So you'll need to check VALID_MEMBER(cfs_rq_h_nr_running) before
> attempting
> to print it, and maybe just show the cfs_rq.nr_running value.
>
> When adding new items to the offset_table, they should be added to
> the
> end of the structure so that extension modules will continue to be able
> to access any offsets as expected without having to be re-compiled.
> But you can display the new item in dump_offset_table() wherever
> you'd
> like, typically grouped with similar/associated items -- although in your
> patch you did this:
>
> + fprintf(fp, " cfs_rq_nr_running: %ld\n",
> + OFFSET(cfs_rq_nr_running));
> + fprintf(fp, " cfs_rq_h_nr_running: %ld\n",
> + OFFSET(cfs_rq_h_nr_running));
>
> The previously-existing cfs_rq_nr_running display already exists,
> so I suggest that you just move your new cfs_rq_h_nr_running display
> underneath it. (and please line up the "help -o" output of the new
> item as well).
>
> Dave
>
>
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
More information about the Crash-utility
mailing list