[Crash-utility] Handle the NT_PRSTATUS lost for the "bt" command

Toshikazu Nakayama nakayama.ts at ncos.nec.co.jp
Mon Jun 18 09:48:44 UTC 2012


The purpose of this patch is to make the "bt" command work on a diskdump
in which some NT_PRSTATUS notes could not be saved because an IPI was lost.
I think an IPI can be lost during a panic when the system has crashed badly.

I noticed that "bt" failed in my ppc environment
when the NT_PRSTATUS notes of some CPUs were lost during IPI delivery.
So I made the CPU-to-prstatus map in diskdump more accurate
by checking whether each CPU's crash_notes field is valid.
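
The idea can be sketched roughly as below. This is only an illustration,
not the code in the attached patches: struct note, map_notes() and the
valid[] array are hypothetical names, and it assumes the saved notes are
stored packed in ascending CPU order. CPUs whose crash_notes entry is
invalid get a NULL slot instead of shifting the remaining notes down.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a saved NT_PRSTATUS note; not crash's real type. */
struct note {
        int pid;
};

/*
 * Build a table indexed by CPU number from notes stored packed in the dump.
 * valid[cpu] is nonzero when that CPU's crash_notes entry held a note
 * (assumption: packed notes appear in ascending CPU order).
 * Returns a calloc'ed table in which NULL marks a lost note,
 * or NULL if the allocation fails.
 */
static struct note **map_notes(struct note *packed, int num_notes,
                               const int *valid, int cpus)
{
        struct note **percpu;
        int cpu, next = 0;

        assert(valid != NULL && cpus > 0);

        percpu = calloc(cpus, sizeof(*percpu));
        if (!percpu)
                return NULL;

        for (cpu = 0; cpu < cpus; cpu++) {
                if (!valid[cpu]) {
                        fprintf(stderr,
                                "WARNING: catch lost crash_notes at cpu#%d\n",
                                cpu);
                        continue;       /* slot stays NULL */
                }
                if (next < num_notes)
                        percpu[cpu] = &packed[next++];
        }
        return percpu;
}
```

A caller would then look up percpu[cpu] and treat NULL as "no registers
saved for this CPU" rather than reading a note that belongs to another CPU.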

I tested this problem by patching the kernel as follows:
- kernel/kexec.c
void crash_save_cpu(struct pt_regs *regs, int cpu)
{
+        if (current->pid == 0)
+                /* this cpu was idle; nothing to capture */
+                return;

This looks like a terrible and impractical test case, but I actually
found this code in the kernel of the distro I use.
I couldn't reproduce an actual lost-IPI case, so I use this change
as an example of what happens when an IPI cannot be delivered to other CPUs.

=> Take a diskdump with sysrq+c and makedumpfile.

crash> help -D | grep notes
  num_prstatus_notes: 1
           notes_buf: 10ba91a8
            notes[0]: 10ba91a8
crash> help -k | grep cpus
          cpus: 8
 cpus_override: (null)
crash> bt
PID: 1001   TASK: ea62b000  CPU: 2   COMMAND: "bash"
Segmentation fault

Since the seven idle cpus did not save an NT_PRSTATUS note,
crash could not handle CPU#2's note, which is stored in the slot for CPU#0.
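
To illustrate the mismatch between a CPU number and the position of its
note in the dump, here is a small sketch. The function name and the
valid[] array are hypothetical, and it assumes that only CPUs with a
valid crash_notes entry contribute a note, stored in ascending CPU order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the position of cpu's note among the packed notes,
 * or -1 if that CPU is out of range or saved no note.
 * valid[i] is nonzero when CPU i saved a note.
 */
static int packed_index_for_cpu(const int *valid, int cpus, int cpu)
{
        int i, pos = 0;

        assert(valid != NULL);

        if (cpu < 0 || cpu >= cpus || !valid[cpu])
                return -1;

        /* Count the notes saved by lower-numbered CPUs. */
        for (i = 0; i < cpu; i++)
                if (valid[i])
                        pos++;
        return pos;
}
```

With only CPU#2 valid out of eight CPUs, the note for CPU#2 sits at
position 0, so using the CPU number directly as the index looks past
the single saved note.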

With this patch, crash works with a correct map from CPU to prstatus note.

WARNING: catch lost crash_notes at cpu#0
WARNING: catch lost crash_notes at cpu#1
WARNING: catch lost crash_notes at cpu#3
WARNING: catch lost crash_notes at cpu#4
WARNING: catch lost crash_notes at cpu#5
WARNING: catch lost crash_notes at cpu#6
WARNING: catch lost crash_notes at cpu#7
crash.fix> help -D | grep notes
  num_prstatus_notes: 1
           notes_buf: 107a3378
            notes[2]: 107a3378
crash.fix> help -k | grep cpus
          cpus: 8
 cpus_override: (null)
crash.fix> bt
PID: 1001   TASK: ea62b000  CPU: 2   COMMAND: "bash"

R0:  00000001   R1:  eb793e60   R2:  ea62b000   R3:  00000063
R4:  00000000   R5:  ffffffff   R6:  c043ba2c   R7:  00000000
R8:  00008000   R9:  00000000   R10: 00000000   R11: eb793e70
R12: 28242444   R13: 100b8448   R14: 100b07b8   R15: 100b0894
R16: 00000000   R17: 00000000   R18: 00000000   R19: 1006d270
R20: 00000000   R21: 100f0430   R22: 00000000   R23: 00000001
R24: c08f1ac8   R25: 00029002   R26: c08f1bac   R27: c08d0000
R28: 00000000   R29: c09ada48   R30: 00000063   R31: eb793e60
NIP: c0423378   MSR: 00021002   OR3: c09ada48   CTR: c0423344
LR:  c0423d8c   XER: 00000000   CCR: 28242444   MQ:  00008000
DAR: 00000000 DSISR: 00800000        Syscall Result: eb793e60
 NIP [00000000c0423378] sysrq_handle_crash
 LR  [00000000c0423d8c] __handle_sysrq

 #0 [eb793e60] sysrq_handle_crash at c0423378
  : snip

Thanks,
Toshi

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-use-calloc-for-nt_prstatus_percpu.patch
Type: text/x-patch
Size: 1154 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20120618/8b0b59d3/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-handle-cpus-which-lost-crash_notes.patch
Type: text/x-patch
Size: 3992 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20120618/8b0b59d3/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-ppc-use-kt-cpus-instead-of-dd-num_prstatus_notes.patch
Type: text/x-patch
Size: 1337 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20120618/8b0b59d3/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-move-the-common-structures-from-machdep-to-kernel.c.patch
Type: text/x-patch
Size: 3902 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20120618/8b0b59d3/attachment-0003.bin>

