[Crash-utility] [PATCH 0/3] Add a "percpu" command
Petr Tesarik
ptesarik at suse.cz
Sat Oct 12 19:41:49 UTC 2013
On Fri, 11 Oct 2013 15:54:54 -0400 (EDT)
Dave Anderson <anderson at redhat.com> wrote:
> ----- Original Message -----
> > Hi all,
> >
> > a few people complained to me that the crash utility lacks a
> > user-friendly interface for percpu variables, especially if they are
> > dynamically allocated. I implemented a new command which should make
> > these people happy.
> >
> > Tested on kernel 3.11, but I didn't really add anything
> > version-dependent (instead I used the existing infrastructure).
> >
> > Comments welcome!
> >
> > Petr Tesarik
>
>
> Hi Petr,
Hi Dave,
> First let me preface this by saying that I like this concept a lot.
Glad you like it. I'm pretty sure we'll find a way to implement this
feature in a way that is likable by everybody.
> A couple comments...
>
> First, a minor nit -- "make warn" shows this:
>
> $ make warn
> ...
> cc -c -g -DX86_64 -DGDB_7_6 symbols.c -I./gdb-7.6/bfd -I./gdb-7.6/include -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
> symbols.c: In function 'cmd_percpu':
> symbols.c:6071:2: warning: implicit declaration of function 'isdigit' [-Wimplicit-function-declaration]
> symbols.c:6143:6: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
> ...
Ah, yes, I forgot about "make warn" when updating the patch set.
Fixing these minor issues should be easy enough.
> As a quick test, I created a command input file that looks like this:
>
> $ cat input
> percpu -a irq_stack_union
> percpu -a gdt_page
> percpu -a exception_stacks
> percpu -a cpu_llc_shared_map
> percpu -a cpu_core_map
> percpu -a cpu_sibling_map
> percpu -a cpu_llc_id
> percpu -a cpu_number
> percpu -a x86_bios_cpu_apicid
> percpu -a x86_cpu_to_apicid
> percpu -a cpu_loops_per_jiffy
> percpu -a xen_vcpu_info
> percpu -a xen_vcpu
> percpu -a idt_desc
> percpu -a shadow_tls_desc
> ...
>
> where I displayed all of the per-cpu variables that are seen in
> the kernel's per-cpu symbol list, and it worked nicely.
>
> Your help page shows this:
>
> crash> help percpu
>
> NAME
> percpu - percpu variables
>
> SYNOPSIS
> percpu [-dx][-a] [-c cpu] [cpu]... [struct|union|*] [struct_name] [address|symbol]
>
> So my test used the "[-a]" and "[symbol]" arguments.
>
> But it's not clear to me when you would use the "[struct|union|*]",
> "[struct_name]" and/or the "[address]" arguments? The remainder of
> the help page give no explanation of what they are, nor does it
> contain any examples of how they would be used. I'm presuming that
> it has something to do with dynamically-allocated per-cpu variables?
There is a second example in the help:
Show a per-cpu variable on all processors:
crash> percpu -a
[0]: ffff88011e21a468
struct disk_stats {
sectors = {11197190, 7550896},
ios = {360193, 159113},
merges = {11723, 22075},
ticks = {6180943, 11137498},
io_ticks = 1781827,
time_in_queue = 16785949
}
[and so on...]
The address (0x1a468) was taken from the dkstats field of a struct
hd_struct on my system. The "struct", "union" and "*" variants are
needed only to disambiguate from a symbol name (if needed) and they are
similar to the "struct", "union" and "*" commands.
> Now, as to the functionality of the command, currently the "p" command
> does this when presented with a percpu symbol:
>
> crash> p idle_threads
> PER-CPU DATA TYPE:
> struct task_struct *idle_threads;
> PER-CPU ADDRESSES:
> [0]: ffff88021e20e360
> [1]: ffff88021e24e360
> [2]: ffff88021e28e360
> [3]: ffff88021e2ce360
Yes, I know. The crash utility is in fact not so bad when it comes to
global percpu variables.
>[...]
> Your command pulls out the calculated per-cpu address and prints
> it exactly as the "p" command does, and then follows it with the
> symbolic display:
>
> crash> percpu -a idle_threads
> [0]: ffff88021e20e360
> (struct task_struct *) 0xffffffff81c13440 <init_task>
> [1]: ffff88021e24e360
> (struct task_struct *) 0xffff88021282d330
> [2]: ffff88021e28e360
> (struct task_struct *) 0xffff88021282dac0
> [3]: ffff88021e2ce360
> (struct task_struct *) 0xffff88021282e250
> crash>
>
> Since this patch essentially adds a command that does the
> same thing as the "p" command does with regular symbols, I
> wonder if it would be better suited to be merged with the
> "p" command? Did you try doing something like that?
Yes, I thought about extending one of the existing commands.
These are the issues I encountered:
1. The "percpu" combines both "*" and "p". The "*" functionality
(which you didn't test) is in fact primary, and I wanted to get it
with as little typing as possible. If you can propose an acceptably
short variant of "percpu 0 disk_stats 0x1a468" using the existing
commands, I will in fact prefer it to a standalone command.
2. Many options to the cmd_datatype_common were incompatible with the
"percpu" display (e.g. "-f"), some where meaningless (e.g. "-u").
3. I'd have to implement percpu twice (once for "p" and once for
cmd_datatype_common.
4. Percpu without any argument uses the current CPU. Try something like
"set -c 0", "percpu current_task", then
"set -c 1", "percpu current_task"
This may not be needed so often, so I'm fine with making it an
option.
In general, I think it's a matter of taste, and if you dislike new
commands, it all boils down to finding a suitable syntax to extend the
existing commands. Unfortunately, "-c" (as CPU) is already taken for
count, and "-p" (as processor) is already taken for pointer
dereference. :-(
I can think of:
A. "-C" (but it requires a shift key, and two options that only
differ in case may be confusing)
B. any other random letter ("-a" is free).
If you find an option letter (let's mark it "-?" here), it could be
used like this:
p -? current_task # global var on the current CPU
p -?1-3 current_task # global var on selected CPUs
p -?a current_task # global var on all CPUs
struct -? disk_stats 0x1a468 # dynamic var
*disk_stats -?a 0x1a468 # dynamic var on all CPUs
Petr T
More information about the Crash-utility
mailing list