[Crash-utility] improve ps performance

Dave Anderson anderson at redhat.com
Fri Sep 19 19:15:08 UTC 2014



----- Original Message -----
> 
> Hello Pan,
> 
> I've updated the patch I attached yesterday with a change that
> caches the most-recent tgid search result.  From ~70% to ~90% of
> the time, either the last tgid entry or the very next one in the
> tgid_array is the one being searched for, so it's not necessary
> to call bsearch() every time.  "help -t" will show the cache-hit
> statistics.
> 
> Thanks,
>   Dave
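
For reference, the caching idea quoted above can be sketched roughly as
follows.  This is only an illustration, not the code in the patch: the
struct layout, the field names and tgid_lookup() are hypothetical, and
the array is assumed to be sorted by tgid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry type; not crash's actual definition. */
struct tgid_entry {
    unsigned long tgid;
    unsigned long task;
};

/* Hypothetical lookup state, including the hit counters. */
struct tgid_cache {
    struct tgid_entry *array;   /* sorted by tgid, ascending */
    size_t count;
    size_t last;                /* index of the most recent hit */
    int valid;                  /* nonzero once 'last' holds a real hit */
    unsigned long cache_hits;
    unsigned long searches;
};

static int cmp_tgid(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    unsigned long t = ((const struct tgid_entry *)elem)->tgid;

    return (k > t) - (k < t);
}

static struct tgid_entry *tgid_lookup(struct tgid_cache *c, unsigned long tgid)
{
    struct tgid_entry *e;

    assert(c != NULL);
    c->searches++;

    if (c->valid) {
        /* Same entry as last time? */
        if (c->array[c->last].tgid == tgid) {
            c->cache_hits++;
            return &c->array[c->last];
        }
        /* The very next entry in the array? */
        if (c->last + 1 < c->count && c->array[c->last + 1].tgid == tgid) {
            c->cache_hits++;
            c->last++;
            return &c->array[c->last];
        }
    }

    /* Cache miss: fall back to the binary search. */
    e = bsearch(&tgid, c->array, c->count, sizeof(*c->array), cmp_tgid);
    if (e) {
        c->last = (size_t)(e - c->array);
        c->valid = 1;
    }
    return e;
}
```

A caller would zero the structure, point it at the sorted array, and
compare cache_hits against searches to get the hit ratio.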

Hello Pan,

This patch as written needs to be made less restrictive for use
on a live system.

When running on a live system that has many tasks constantly 
forking/exec'ing, the "ps" command may occasionally fail like so:

  crash> ps
       PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
        0      0   0  ffffffff81c13440  RU   0.0       0      0  [swapper/0]
        0      0   1  ffff88021282d330  RU   0.0       0      0  [swapper/1]
  >     0      0   2  ffff88021282dac0  RU   0.0       0      0  [swapper/2]
        0      0   3  ffff88021282e250  RU   0.0       0      0  [swapper/3]
        1      0   1  ffff880212828000  IN   0.0   50140   3120  systemd
        2      0   3  ffff880212828790  IN   0.0       0      0  [kthreadd]
  ... [ cut ] ... 
     7578  27670   0  ffff8801f45e3c80  DE   0.0       0      0  cc
     7622  27668   1  ffff880210ee3c80  ZO   0.0       0      0  info
     7629  27667   1  ffff8801075bd330  DE   0.0       0      0  rev
     7631  27680   0  ffff8801075bf170  ZO   0.0       0      0  printenv
     7635  27685   3  ffff880108bbe9e0  ZO   0.0       0      0  ypwhich
  ps: bsearch for tgid failed: task: ffff880210ee6250 tgid: 7654
  crash>

Without this patch, the search for the matching tgid would not generate 
an error at all, but just quietly continue.

The problem is that task.tgid may change on a live system or, more
likely, the task itself may have been re-used.

I would like to fix it by simply ignoring tgid bsearch failures on live
systems, and just use the RSS stats stored in the per-tgid mm_struct.
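
A minimal sketch of that behavior, under stated assumptions: the names
(struct task_entry, task_rss_pages(), the "live" flag) are hypothetical
and not taken from the patch, and the per-task count added on a
successful search is only a stand-in for whatever the patch accumulates
per tgid.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical entry, sorted by tgid; not crash's actual definition. */
struct task_entry {
    unsigned long tgid;
    long task_rss;      /* stand-in for a per-task RSS contribution */
};

static int cmp_task_tgid(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    unsigned long t = ((const struct task_entry *)elem)->tgid;

    return (k > t) - (k < t);
}

/*
 * Returns 0 on success, -1 on a hard failure.  A failed search is
 * only treated as an error when not running on a live system.
 */
static int task_rss_pages(int live, const struct task_entry *array,
                          size_t count, unsigned long tgid,
                          long mm_rss, long *rss)
{
    const struct task_entry *e;

    assert(rss != NULL);
    *rss = mm_rss;      /* always start from the mm_struct counters */

    e = bsearch(&tgid, array, count, sizeof(*array), cmp_task_tgid);
    if (e == NULL) {
        if (live)
            return 0;   /* task changed underneath us: keep mm_rss only */
        fprintf(stderr, "bsearch for tgid failed: tgid: %lu\n", tgid);
        return -1;
    }

    *rss += e->task_rss;
    return 0;
}
```

With this shape, a tgid that has vanished from the array on a live
system yields the mm_struct value alone, while the same miss against a
dumpfile is still reported.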

Does that work for you?

Dave




