[Crash-utility] [PATCH v2] ps: Add support to "ps -l|-m" to properly display process list

Thu Mar 3 07:50:07 UTC 2022

Hi Austin,

-----Original Message-----
> > > This is because output of "ps -l|-m" depends on task_struct.sched_info.last_arrival.
> > >
> > > Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, 'sched_info' field is not included
> > > in task_struct.
> > >
> > > So we make "ps -l|-m" option to access 'exec_start' field of sched_entity
> > > where 'exec_start' is task_struct.se.exec_start.
> >
> > Could you describe what the exec_start means?  When is it updated?
> >
> 
> The 'task_struct.se.exec_start' field holds the timestamp at which the
> process most recently executed. It is updated in the following cases:
> 
>  - enqueued to runqueue
>  - dequeued from runqueue
>  - scheduler tick is invoked
>  - etc
> 
> So I guess 'task_struct.se.exec_start' can serve as a statistic that
> indicates when the process most recently ran.
> 
> From the CFS scheduler's point of view, 'task_struct.se.exec_start' is
> updated within update_curr(), which is reached through various call
> paths, for example:
> 
>  - enqueue_task_fair, dequeue_task_fair, task_tick_fair,
>    check_preempt_wakeup, ...

Thank you for looking into this.

As for when se.exec_start is updated, I think you are right.
(As I understand it, it is probably the last time or tick at which a task
was in a runqueue, regardless of whether it got a CPU.)

But I found a problem: se.exec_start is taken from rq->clock_task,
not from rq->clock like last_arrival.  The rq->clock_task may not include
irq/steal time, please see update_rq_clock_task().
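
To illustrate why the two clocks drift apart, here is a toy model of the
relationship between rq->clock and rq->clock_task. It is only a sketch,
assuming that irq and steal time accounting are both enabled; ToyRq and
update() are hypothetical names for this example, not kernel or crash code.

```python
from dataclasses import dataclass


@dataclass
class ToyRq:
    """Toy stand-in for a runqueue; both fields are in nanoseconds."""
    clock: int = 0       # models rq->clock
    clock_task: int = 0  # models rq->clock_task

    def update(self, delta_ns: int, irq_ns: int = 0, steal_ns: int = 0) -> None:
        # rq->clock advances by the full elapsed time.
        self.clock += delta_ns
        # irq time is removed first, and can never exceed the elapsed time.
        irq_ns = min(irq_ns, delta_ns)
        task_delta = delta_ns - irq_ns
        # steal time is then removed from whatever is left.
        steal_ns = min(steal_ns, task_delta)
        task_delta -= steal_ns
        # rq->clock_task only advances by the remaining "task" time.
        self.clock_task += task_delta


rq = ToyRq()
rq.update(1_000_000, irq_ns=200_000, steal_ns=100_000)
print(rq.clock, rq.clock_task)  # 1000000 700000
```

The gap between the two fields only grows over uptime, and it grows at a
different rate on each CPU depending on how much irq/steal time that CPU sees.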

This causes the following issues with a vmcore generated on a machine
that had been running for a while (273 days):

crash> ps -m | head
[  0 00:59:36.582] [RU]  PID: 4023608  TASK: ffff916f7c6b1840  CPU: 15  COMMAND: "makedumpfile"
        ^^^^^^^^^(1)
[  0 00:59:37.831] [ID]  PID: 413      TASK: ffff916f772d3080  CPU: 15  COMMAND: "kworker/15:1H"
[  0 00:59:39.765] [IN]  PID: 3929504  TASK: ffff916f5f0748c0  CPU: 15  COMMAND: "respawn_actlog"
[  0 00:59:40.650] [IN]  PID: 1974     TASK: ffff91647dc53080  CPU: 15  COMMAND: "CPU 2/KVM"
[  0 00:59:41.925] [IN]  PID: 1297     TASK: ffff916f63c46100  CPU: 15  COMMAND: "NetworkManager"
[  0 00:59:42.944] [ID]  PID: 3763057  TASK: ffff9160c4519840  CPU: 15  COMMAND: "kworker/15:0"
[  0 00:59:42.944] [IN]  PID: 101      TASK: ffff916040c91840  CPU: 15  COMMAND: "migration/15"
[  0 00:59:43.078] [IN]  PID: 100      TASK: ffff916040c5b080  CPU: 15  COMMAND: "watchdog/15"
[  0 00:59:47.533] [IN]  PID: 1292     TASK: ffff916f63c43080  CPU: 15  COMMAND: "lsmd"
[  0 00:59:49.089] [IN]  PID: 113105   TASK: ffff9160412248c0  CPU: 15  COMMAND: "kvm-nx-lpage-re"
                                                               ^^^^^^^(2)
(1) large difference from zero
(2) large differences among CPUs (probably due to the differences of irq time)
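
The offset in (1) is what one would expect if the displayed value is a
reference clock minus the per-task timestamp, and the two come from
different clocks. A minimal sketch under that assumption follows; the
helper names and all numbers are hypothetical, chosen only to reproduce
the first line of the output above, and the formatting merely mimics the
"[days hh:mm:ss.mmm]" layout.

```python
NSEC_PER_MSEC = 1_000_000


def format_elapsed(delta_ns: int) -> str:
    """Format a nanosecond delta as '[days hh:mm:ss.mmm]'."""
    ms_total = delta_ns // NSEC_PER_MSEC
    sec_total, ms = divmod(ms_total, 1000)
    days, rem = divmod(sec_total, 86_400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"[{days:3d} {hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}]"


def elapsed_since_last_run(reference_ns: int, last_run_ns: int) -> str:
    # The clamp to zero is a guard for this sketch only.
    return format_elapsed(max(reference_ns - last_run_ns, 0))


# Hypothetical values: 273 days of uptime, of which about 59 minutes
# were irq/steal time that rq->clock_task never counted.
rq_clock = 273 * 86_400 * 10**9
rq_clock_task = rq_clock - 3_576_582 * NSEC_PER_MSEC
exec_start = rq_clock_task  # a task that is running right now

# Reference from rq->clock, timestamp from rq->clock_task: large offset.
print(elapsed_since_last_run(rq_clock, exec_start))       # [  0 00:59:36.582]
# Reference and timestamp from the same clock: no offset.
print(elapsed_since_last_run(rq_clock_task, exec_start))  # [  0 00:00:00.000]
```

Using a matching reference removes the offset on one CPU, but since each
CPU accumulates a different amount of irq/steal time, timestamps from
different CPUs still would not be comparable with each other.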

(1) might be solved by using rq->clock_task, but (2) would still be misleading
and confusing.  So for now I think that the "ps -l|-m" options should not use
se.exec_start.
What do you think?

Thanks,
Kazu