[Crash-utility] [PATCH v2 00/25] support hiding data of offline cpu

qiaonuohan qiaonuohan at cn.fujitsu.com
Tue Sep 30 02:59:45 UTC 2014


On 09/30/2014 04:00 AM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>> For those who don't care about the removed cpu, data of offline cpu is
>> bothering. This patchset is trying to hide data of offline cpu. Please
>> check each patch to see what is hiden.
>>
>> The last version is here:
>> https://www.redhat.com/archives/crash-utility/2014-September/msg00000.html
>>
>> Changelog:
>> v1->v2:
>> 1. patch 1: add environment variable and its argument to hide/show
>>     data of offline cpu.
>> 2. add description of environment variable to crash.8
>> 3. remove the restrict of x86_64
>> 4. add "<OFFLINE>" to indicate those hiden data.
>> 5. patch 22/23: addd "<OFFLINE>" to indicate idle task on offline
>>     cpu
>>
>> Qiao Nuohan (25):
>>    add a flag to display/hide data of offline cpus
>>    add an API to check whether to hide a cpus' data
>>    x86_64: modify help -m/-M to hide offline cpus' data
>>    x86_64: modify bt -E to hide offline cpus' data
>>    x86_64: modify mach to hide offline cpus' data
>>    x86_64: modify mach -c to hide offline cpus' data
>>    modify help -r to hide offline cpus' registers
>>    modify bt -c to hide offline cpus' data
>>    modify display_sys_stats() to hide cpus' number
>>    modify help -k to hide offline cpus' number
>>    modify set -c to hide offline cpu
>>    modify irq -s to hide offline cpus' data
>>    modify irq -a to hide offline cpus' data
>>    modify timer -r to hide offline cpus' data
>>    modify timer to hide offline cpus' data
>>    modify ptov offset:cpuspec to hide offline cpus' data
>>    fix max_cpudata_limit() when offlined cpu exists
>>    modify kmem -o to hide offline cpus' data
>>    modify kmem -S(SLUB) to hide offline cpus' data
>>    modify struct/union/* [:cpuspec] to hide offline cpus' data
>>    modify command p to hide offline cpus' data
>>    modify ps to indicate idle task on offline cpu
>>    modify print_task_header() to indicate idle task on offline cpu
>>    modify ps -l/-m -C cpu to hide offline cpus' data
>>    modify runq to hide offline cpus' data
>>
>>   crash.8    |  4 +++
>>   defs.h     |  3 ++
>>   diskdump.c |  6 +++-
>>   help.c     |  6 ++++
>>   kernel.c   | 83 +++++++++++++++++++++++++++++++++++++++++++++++++----
>>   main.c     | 15 +++++++++-
>>   memory.c   | 30 +++++++++++++++----
>>   netdump.c  | 25 ++++++++++++++--
>>   symbols.c  | 15 +++++++++-
>>   task.c     | 97
>>   ++++++++++++++++++++++++++++++++++++++++++++++++++++----------
>>   tools.c    | 15 ++++++++++
>>   x86_64.c   | 74 +++++++++++++++++++++++++++++++++++++++--------
>>   12 files changed, 331 insertions(+), 42 deletions(-)
>>
>> --
>> 1.8.5.3
>
>
> Hello Qiao,
>
> In the future, can you please post the patches as attachments?
> Unfortunately my Zimbra email web interface replace tabs with spaces,
> so I have to manually cut-and-paste each patch from the crash-utility
> mailing list archives.

I see.

>
> I have a problem with a number of the patches, some of which
> I have NAK'd completely, and others I would suggest changing
> the output slightly.
>
> As I recall, one of your arguments for doing this is to avoid
> verbose display of offline cpu data.  But in a few of your
> patches, it simply replaces a one-line per-cpu piece of data
> with the<OFFLINE>  string.  To me it makes more sense to
> continue to display the data with an "[OFFLINE]" tag appended
> to the end of the output line.
>
> Also, I had asked that there be *no* behavioral changes
> unless the "offline" environment variable was set to "hide".
> But there are a few patches that change things regardless of
> the setting.  So for the most part, if any of the individual
> patches do *not* use your hide_offline_cpu() function, then
> it's highly likely that it has been NAK'd.
>
> Lastly, can you make your output show "[OFFLINE]" instead
> of "<OFFLINE>"?   I'd prefer to keep the usage of "[ brackets ]"
> as "informational" in nature, whereas "<  braces>" are typically
> used to surround symbol names.

I would change it.

>
> Here are the patches that I have NAK'd or would like changed:
>
> [PATCH v2 03/25] x86_64: modify help -m/-M to hide offline cpus' data
>
>    NAK.  This patch makes no sense to me.  It's a debug command that
>    shows the contents of the crash-internal machdep and machine_specific
>    data structures.  What possible benefit would there be to hide the
>    legitimate addresses/contents?  Please leave them as they are.
>
> [PATCH v2 05/25] x86_64: modify mach to hide offline cpus' data
>
>    Please continue to display the stack addresses, but *followed by*
>    "[OFFLINE]".
>
> [PATCH v2 09/25] modify display_sys_stats() to hide cpus' number
>
>    Here you are changing crash behavior regardless of the "offline"
>    setting.  While it is true that you are now mimicking the PPC64
>    behaviour, that is because I typically allow IBM to do their own
>    thing with respect to the PPC64 (and S390/S390X) architectures.
>    What I would suggest you do is this: for all architectures except
>    PPC64, continue to show kt->cpus, but append the number of offlined
>    cpus after it, i.e., something like this:
>
>        KERNEL: ../kdump/vmlinux
>      DUMPFILE: ../kdump/vmcore  [PARTIAL DUMP]
>          CPUS: 4 [1 OFFLINE]
>          DATE: Tue Sep 23 14:54:26 2014
>        UPTIME: 00:02:45

I see. ppc64 bothers me.

>
> [PATCH v2 10/25] modify help -k to hide offline cpus' number
>
>    NAK.  "help -k" is a debug option whose purpose is to dump the
>    contents of the crash-internal kernel_table; accordingly, the
>    "cpus" output should show the contents of the kt->cpus member
>    regardless whether there are offline cpus.
>
> [PATCH v2 13/25] modify irq -a to hide offline cpus' data
>
>    NAK.  This command essentially shows the contents of a
>    cpu mask in a kernel data structure -- regardless whether
>    the cpu is offline.
>
> [PATCH v2 16/25] modify ptov offset:cpuspec to hide offline cpus' data
>
>    There is no reason to *not* show the per-cpu VIRTUAL address, and
>    you're not saving any space.  Instead please show the virtual address
>    *followed by* "[OFFLINE]".
>
> [PATCH v2 18/25] modify kmem -o to hide offline cpus' data
>
>    There is no reason to *not* show the per-cpu OFFSET values, and
>    you're not saving any space.  Instead please show the offset value
>    *followed by* "[OFFLINE]".
>
> [PATCH v2 22/25] modify ps to indicate idle task on offline cpu
>
>    NAK.  This patch changes behavior regardless of the "offline" setting.
>    The swapper task *does* exist, and it actually is the active task on
>    that (offline) cpu.  Plus you're doing it only for the active task
>    on the offline cpu -- but there would still be several other kernel
>    tasks shown for that cpu even when it is offline.

The swapper task of offline cpu showed by ps command is strange to me.
It shows the task is still running, with ">" at the start of the line
and "ST" is "RU", but the cpu is halt. So I think it is needed to change
the display of these tasks.

I will change the display when "offline" is "hide". And if you don't
like add "OFFLINE" at the end, What about change "RU" to other(like "OF")
and delete ">", just like:

<cut>
    PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
...
       0      0   2  ffff88003dad5b00  OF   0.0       0      0  [swapper/2] <OFFLINE>
...
<cut>

>
> [PATCH v2 23/25] modify print_task_header() to indicate idle task on offline cpu
>
>    NAK.  This patches changes behavior regardless of the "offline" setting.
>    The print_task_header() function is called by 53 different functions, and
>    I'm not going to even bother checking each one for relevancy.

Same as 22/25, change "ST" to "OF"?

>
> Thanks,
>    Dave
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> .
>


-- 
Regards
Qiao Nuohan




More information about the Crash-utility mailing list