[Crash-utility] About the use of 'gcore'

Patrick Agrain patrick.agrain at alcatel-lucent.com
Tue Jan 7 06:47:30 UTC 2014


Hello Dave,

Thanks for the answers. I'll check your suggestions.

More answers inline below...

On 06/01/2014 22:08, Dave Anderson wrote:
>
> ----- Original Message -----
>> Hello all,
>>
>> First of all, I wish you all a Happy New Year (with fewer crashes, but still
>> enhanced tools...)
>>
>> Thanks for the links, they were very useful.
>> I dug further into analyzing user space, but it seems that I have hit a
>> dead end.
>> Below is a snapshot of the kernel / userland stack dump.
>>
>> What I've done :
>> - The crash is triggered by a page fault inside a kernel module (writing 0 to
>> 0xFFFFFFFF, the classic).
>> - I used gcore to create the 'core.<pid>.bash' file (bash being the user task
>> running at the time of the crash).
> I'm curious as to how the bash task was related to the module crash?
> Did the bash task write to a procfs interface that the module created
> to then generate the "write 0 to 0xFFFFFFFF"?  Does the crash utility
> indicate that the bash task is the panic task?  And if so, what does
> its "bt" show?  (i.e., the kernel-mode backtrace)
That's correct.
I wrote a kernel module (timecrash.ko) that triggers the page fault after a
timeout elapses.
The timer is armed by: echo <timeout_in_seconds> > /proc/tocrashme

The 'bt' command shows the following:
PID: 892    TASK: c274e550  CPU: 0   COMMAND: "bash"
  #0 [c2699d20] crash_kexec at c0492ecc
  #1 [c2699d78] oops_end at c07ebbb2
  #2 [c2699d90] no_context at c042d389
  #3 [c2699db8] __bad_area_nosemaphore at c042d4b3
  #4 [c2699df8] bad_area_nosemaphore at c042d57d
  #5 [c2699e04] __do_page_fault at c042da5c
  #6 [c2699e88] do_page_fault at c07ed531
  #7 [c2699ea4] error_code (via page_fault) at c07eaf3d
     EAX: 00000028  EBX: 00000003  ECX: c09e6514  EDX: 00000000  EBP: c2699f20
     DS:  007b      ESI: 00000000  ES:  007b      EDI: 094a5408  GS:  00e0
     CS:  0060      EIP: f87ad1c6  ERR: ffffffff  EFLAGS: 00010296
  #8 [c2699ed8] proc_crash_setdelay at f87ad1c6 [timecrash]
  #9 [c2699f24] proc_file_write at c0572856
#10 [c2699f44] proc_reg_write at c056d5dd
#11 [c2699f68] vfs_write at c051f637
#12 [c2699f90] sys_write at c051ff38
#13 [c2699fb0] system_call at c07ea7ad
     EAX: 00000004  EBX: 00000001  ECX: 094a5408  EDX: 00000003
     DS:  007b      ESI: 00000003  ES:  007b      EDI: 094a5408
     SS:  007b      ESP: bfd1b6d8  EBP: bfd1b704  GS:  0033
     CS:  0073      EIP: b776a416  ERR: 00000004  EFLAGS: 00000246
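
The user-mode register frame at #13 can be read as the arguments of the interrupted system call. Below is a minimal Python sketch of that decoding, assuming the usual i386 convention (system call number in orig_eax, which crash prints as ERR, and the first three arguments in EBX, ECX, EDX). The helper name and the two-entry syscall table are illustrative only; the register values are copied from the 'bt' output above.

```python
# Register values copied from the user-mode frame at #13 of the 'bt' output.
regs = {
    "ERR": 0x00000004,  # orig_eax: system call number
    "EBX": 0x00000001,  # first argument
    "ECX": 0x094A5408,  # second argument
    "EDX": 0x00000003,  # third argument
}

# Tiny excerpt of the i386 system call table (illustrative, not exhaustive).
I386_SYSCALLS = {3: "read", 4: "write"}


def describe_syscall(regs):
    """Return a readable description of the interrupted i386 system call."""
    name = I386_SYSCALLS.get(regs["ERR"], "unknown")
    if name in ("read", "write"):
        return "%s(fd=%d, buf=0x%08x, count=%d)" % (
            name, regs["EBX"], regs["ECX"], regs["EDX"])
    return "%s(0x%x, 0x%x, 0x%x)" % (
        name, regs["EBX"], regs["ECX"], regs["EDX"])


print(describe_syscall(regs))  # write(fd=1, buf=0x094a5408, count=3)
```

A 3-byte write to fd 1 would be consistent with bash's built-in echo writing a short value to its redirected standard output, i.e. the echo into /proc/tocrashme.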

>
>> - I then followed a hypothetical chain of saved EBP values (shown between
>> { }); the saved EIP value (between [ ]) is the word stored right next to each.
>>
>> The purpose of this study is to find a method to analyze future crashes from
>> kernel space down to user-space applications.
>>
>> Do you have an idea why the user-space memory is not dumped?
>> Should I use an extension other than 'gcore'?
>>
>> Thanks in advance.
>> Best regards,
>> Patrick Agrain
>>
>>
>> -------
>> ===============================================================================
>> --------------------- Go down into User Space Territory -----------------------
>>
>> Last pt_regs of kernel stack is:
>> | pt_regs
>> 00000001 094a5408 00000003 ..~......TJ..... | bx cx dx
>> c2699fc0: 00000003 094a5408 bfd1b704 00000004 .....TJ......... | si di bp ax
>> c2699fd0: 0000007b ffff007b c07e0000 00000033 {...{.....~.3... | ds es fs gs
>> c2699fe0: 00000004 b776a416 00000073 00000246 ......v.s...F... | orig_eax ip cs flags
>> c2699ff0: bfd1b6d8 0000007b cccccccc cccccccc ....{........... | sp ss padding
>> |
>> |----------------------------------------------------------------|
>> |
>> (gdb) x/32xw 0xbfd1b680 |
>> 0xbfd1b680: 0xbfd1b6d0 0x0000000f 0x094b4568 0x080c90b9 |
>> 0xbfd1b690: 0x094b4568 0x080cd160 0x00001936 0x00000001 |
>> 0xbfd1b6a0: 0x094ab9c8 0x00000000 0x094b4b48 0xbfd1b7c8 |
>> 0xbfd1b6b0: 0x080ce9e8 0x094b4b48 0x094b4b48 0xbfd1b728 |
>> 0xbfd1b6c0: 0x094aed28 0x00000020 0x00000000 0x00000070 |
>> 0xbfd1b6d0: 0x094b4588 0x080cc080 0xb7698b43 0xb7757ff4   <-- sp (0xbfd1b6d8) points at 0xb7698b43
>> 0xbfd1b6e0: 0xb76343b4 0x00000001 0x094a5408 0x00000003
>> 0xbfd1b6f0: 0xb77584e0 0x080cc080 0xbfd1b728 0xb77584e0
>>
>> |------------------------------------------ Hypothesis: this is an EBP value...
>> v
>> 0xbfd1b700: 0x00000003 {0xbfd1b72c} [0xb7635c90] 0xb77584e0
>> 0xbfd1b710: 0x094a5408 0x00000003 0x094b4b48 0xbfd1b7c8
>> 0xbfd1b720: 0xb7757ff4 0xb77584e0 0x0000000a {0xbfd1b750}
>> 0xbfd1b730: [0xb7634e80] 0xb77584e0 0x094a5408 0x00000003
>> 0xbfd1b740: 0x0000000a 0xb7757ff4 0xb77584e0 0x0000000a
>> 0xbfd1b750: {0xbfd1b768} [0xb7637d2a] 0xb77584e0 0x0000000a
>> 0xbfd1b760: 0xb7757ff4 0xb77584e0 {0xbfd1b788} [0xb76312b5] >-|
>> 0xbfd1b770: 0xb77584e0 0x0000000a 0xb75c9940 0x094a3e48 |
>> 0xbfd1b780: 0x00000001 0x00000000 0x00000000 0x0809b64b |
>> |
>> Disassemble try: EIP at 0xb76312b5 <---------------------------------------------|
>> (gdb) disassemble 0xb7631200, 0xb7631300
>> Dump of assembler code from 0xb7631200 to 0xb7631300:
>> 0xb7631200: Cannot access memory at address 0xb7631200
>> (gdb)
>> ----------
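
The { } / [ ] annotation in the dump above amounts to a frame-pointer walk. The following minimal Python sketch reproduces it, assuming (as the post hypothesizes) that the user-space code keeps EBP as a frame pointer, so that each frame stores the caller's EBP at [EBP] and the return address at [EBP+4]. The function name is illustrative; the stack words are copied from the gdb dump, and the starting EBP comes from pt_regs (bp = bfd1b704).

```python
# Stack words copied from the gdb 'x/32xw' dump (address -> 32-bit value).
# Only the words needed for the walk are listed.
stack = {
    0xBFD1B704: 0xBFD1B72C, 0xBFD1B708: 0xB7635C90,
    0xBFD1B72C: 0xBFD1B750, 0xBFD1B730: 0xB7634E80,
    0xBFD1B750: 0xBFD1B768, 0xBFD1B754: 0xB7637D2A,
    0xBFD1B768: 0xBFD1B788, 0xBFD1B76C: 0xB76312B5,
}


def walk_frames(memory, ebp, max_frames=16):
    """Follow saved-EBP links; return a list of (frame_ebp, return_address)."""
    frames = []
    while len(frames) < max_frames:
        saved_ebp = memory.get(ebp)      # caller's EBP stored at [EBP]
        ret_addr = memory.get(ebp + 4)   # return address stored at [EBP+4]
        if saved_ebp is None or ret_addr is None:
            break  # ran out of dumped memory
        frames.append((ebp, ret_addr))
        if saved_ebp <= ebp:
            break  # a valid chain moves towards higher addresses
        ebp = saved_ebp
    return frames


for frame_ebp, ret in walk_frames(stack, 0xBFD1B704):
    print("ebp=0x%08x  return=0x%08x" % (frame_ebp, ret))
```

With the values above the walk stops after four frames, the last return address being 0xb76312b5, because the word at 0xbfd1b788 is not part of the listed data. If the code was built without frame pointers, the chain is meaningless and this approach does not apply.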
> Anyway, I'm guessing that the 0xb76312b5 IP address is in some
> library, probably libc?  If you do a "vm" on the active bash task
> from within the crash utility, you will see where it comes from.
> Try reading the user-space address from the crash utility to see
> if it was available to copy to the core.<pid>.bash file, i.e.,
> try this command:
>
>   crash> rd -u 0xb76312b5
>
> The command above presumes that you are in the context of the
> "bash" task while running crash.  (i.e., if you enter "set" alone,
> it shows that particular task)
>
> Dave
>
>   
>> On 17/12/2013 19:12, Buland Kumar Singh wrote:
>>
>>
>>
>> Hi Patrick,
>>
>> The following links may also be helpful for understanding gdb and
>> its usage for application core analysis.
>>
>> http://web.eecs.umich.edu/~sugih/pointers/gdb_core.html
>> https://sourceware.org/gdb/onlinedocs/gdb/
>>
>> -- BKS
>>
>>
>> On 17 December 2013 21:36, Patrick Agrain <patrick.agrain at alcatel-lucent.com> wrote:
>>
>>
>> Hello all,
>>
>> Now that we have dumped the kernel stack, I'm interested in the user
>> process we came from just before the 'panic'.
>> Googling around, I found mention of the 'gcore' extension.
>>
>> I compiled version 1.22 and installed it.
>> Using it with crash 6.1.0-1.el6, I get a file core.845.bash for the 'bash'
>> process (in which I triggered the kernel panic):
>>
>>
>>
>> crash> gcore -v 1 845
>> gcore: Opening file core.845.bash ...
>> gcore: done.
>> gcore: Writing ELF header ...
>> gcore: done.
>> gcore: Retrieving and writing note information ...
>> gcore: done.
>> gcore: Writing PT_NOTE program header ...
>> gcore: done.
>> gcore: Writing PT_LOAD program headers ...
>> gcore: done.
>> gcore: Writing PT_LOAD segment ...
>> gcore: PT_LOAD[0]: 8048000 - 8048000
>> gcore: PT_LOAD[1]: 80e2000 - 80e9000
>> gcore: PT_LOAD[2]: 80e9000 - 80ed000
>> gcore: PT_LOAD[3]: 94a2000 - 94d1000
>> gcore: PT_LOAD[4]: b7374000 - b7374000
>> gcore: PT_LOAD[5]: b7375000 - b7376000
>> gcore: PT_LOAD[6]: b7376000 - b7377000
>> gcore: PT_LOAD[7]: b7377000 - b7377000
>> gcore: PT_LOAD[8]: b737e000 - b737e000
>> gcore: PT_LOAD[9]: b737f000 - b737f000
>> gcore: PT_LOAD[10]: b73bb000 - b73bb000
>> gcore: PT_LOAD[11]: b75bb000 - b75bb000
>> gcore: PT_LOAD[12]: b75c7000 - b75c8000
>> gcore: PT_LOAD[13]: b75c8000 - b75c9000
>> gcore: PT_LOAD[14]: b75c9000 - b75ca000
>> gcore: PT_LOAD[15]: b75ca000 - b75ca000
>> gcore: PT_LOAD[16]: b7756000 - b7758000
>> gcore: PT_LOAD[17]: b7758000 - b7759000
>> gcore: PT_LOAD[18]: b7759000 - b775c000
>> gcore: PT_LOAD[19]: b775c000 - b775c000
>> gcore: PT_LOAD[20]: b775f000 - b7760000
>> gcore: PT_LOAD[21]: b7760000 - b7761000
>> gcore: PT_LOAD[22]: b7761000 - b7761000
>> gcore: PT_LOAD[23]: b7764000 - b7765000
>> gcore: PT_LOAD[24]: b7769000 - b776a000
>> gcore: PT_LOAD[25]: b776a000 - b776b000
>> gcore: PT_LOAD[26]: b776b000 - b776b000
>> gcore: PT_LOAD[27]: b7789000 - b778a000
>> gcore: PT_LOAD[28]: b778a000 - b778b000
>> gcore: PT_LOAD[29]: bfd07000 - bfd1d000
>> gcore: done.
>> Saved core.845.bash
>> crash>
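
One way to use this listing is to check whether a given user-space address falls inside a range that gcore reports as written. The Python sketch below does that for two addresses that appear later in the thread: the user stack pointer 0xbfd1b6d8 and the return address 0xb76312b5. Two assumptions are made here and are not confirmed by the output itself: that a line printed with identical start and end values means no bytes were written for that mapping, and that the address layout is the same for the task analyzed later (the PID differs). Function names are illustrative.

```python
# (index, start, end) triples copied from the 'gcore -v 1' output above.
# Only the segments around the addresses of interest are listed.
pt_load = [
    (14, 0xB75C9000, 0xB75CA000),
    (15, 0xB75CA000, 0xB75CA000),
    (16, 0xB7756000, 0xB7758000),
    (29, 0xBFD07000, 0xBFD1D000),
]


def covering_segment(segments, addr):
    """Return the index of the segment whose printed range holds addr."""
    for index, start, end in segments:
        if start <= addr < end:
            return index
    return None


def preceding_segment(segments, addr):
    """Return the segment with the highest start at or below addr."""
    best = None
    for seg in segments:
        if seg[1] <= addr and (best is None or seg[1] > best[1]):
            best = seg
    return best


for addr in (0xBFD1B6D8, 0xB76312B5):
    index = covering_segment(pt_load, addr)
    if index is not None:
        print("0x%08x: inside PT_LOAD[%d]" % (addr, index))
    else:
        near = preceding_segment(pt_load, addr)
        print("0x%08x: not in any printed range; nearest below is "
              "PT_LOAD[%d] (%x - %x)" % (addr, near[0], near[1], near[2]))
```

Under those assumptions, the stack address lies inside PT_LOAD[29], while 0xb76312b5 lies in no printed range, the nearest entry below it being PT_LOAD[15] (b75ca000 - b75ca000). That would be consistent with gdb reading the stack but reporting "Cannot access memory" for the code address.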
>>
>> So far, so good... But
>>
>> Question: are there any hints anywhere on how to use this core.<pid> file?
>>
>> Thanks in advance.
>> Regards,
>> Patrick Agrain
>>
>> --
>> Crash-utility mailing list
>> Crash-utility at redhat.com
>> https://www.redhat.com/mailman/listinfo/crash-utility



