[Crash-utility] handling missing kdump pages in diskdump format

Dave Anderson anderson at redhat.com
Thu Mar 29 13:13:12 UTC 2007


Ken'ichi Ohmichi wrote:

> Hi,
>
> 2007/03/23 09:26:02 +0900, "Ken'ichi Ohmichi" <oomichi at mxs.nes.nec.co.jp> wrote:
> >>> 1)  Makedumpfile patch:  Ken'ichi Ohmichi's email of Wed, 7 Mar 2007
> >>> 10:43:38 +0900 contained the patch "point_same_zero_page.patch".  That
> >>> patch contains the nice solution to remove redundant zero page images
> >>> from the diskdump dump file by pointing the page descriptors of zero
> >>> pages to a common zero image.  I suggest that this patch should be
> >>> applied to makedumpfile as soon as possible, without waiting on a
> >>> possible solution to the ELF situation.  As described in my report, ELF
> >>> and diskdump dump files have not shown identical behavior in the past.
> >>> This patch makes diskdump dump files more accurate, and leaves ELF dump
> >>> files at the same level of accuracy that they have always had.
> >
> >I agree with Bob. I will merge the patch "point_same_zero_page.patch" into
> >a new makedumpfile. But this change is very important, and I want to verify
> >that it is correct by running many tests.
> >I will release a new makedumpfile by next weekend.
>
> I checked whether this change is correct as follows
> (the patches mentioned below are attached to this mail):
> - makedumpfile-1.1.2 with "point_same_zero_page2.patch" creates a dumpfile.
> - crash-4.0-3.21 with "not-access-excluded-page.patch" analyzes the dumpfile.
> - The analysis result of the dumpfile is compared with /proc/vmcore's.
>
> On i386 linux-2.6.19, the subcommand "foreach bt" showed a difference
> between the result for the dumpfile (excluding free pages) and the
> result for /proc/vmcore.
> With crash-4.0-3.21 without "not-access-excluded-page.patch", there is
> no difference. In short, the difference arises because the excluded
> pages are treated as inaccessible pages.
>
> The diff between the /proc/vmcore analysis result and the dumpfile's
> is as follows:
>
> --- result-vmcore.txt   2007-03-28 14:01:20.000000000 +0900
> +++ result-dumpfile-d16.txt     2007-03-28 14:01:06.000000000 +0900
> @@ -1,7 +1,24 @@
>  crash> foreach bt
>  PID: 0      TASK: c037c440  CPU: 0   COMMAND: "swapper"
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b2a0) excluded from dump
> +bt: diskdump: paddr(44b2a0) excluded from dump
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b2a1) excluded from dump
> +bt: diskdump: paddr(44b2a1) excluded from dump
> +bt: cannot resolve stack trace:
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b1ae) excluded from dump
> +bt: diskdump: paddr(44b2a1) excluded from dump
> +bt: diskdump: paddr(44b2a1) excluded from dump
>   #0 [c0446f54] schedule at c0311ef0
> - #1 [c0446fcc] cpu_idle at c0102c8d
> +bt: text symbols on stack:
> +    [c0446fb0] mwait_idle_with_hints at c0102235
> +    [c0446fcc] cpu_idle at c0102c92
> +    [c0446fd4] start_kernel at c044b56f
> +    [c0446fdc] unknown_bootoption at c044b577
>
>  PID: 0      TASK: c812f050  CPU: 1   COMMAND: "swapper"
>   #0 [c8127f3c] schedule at c0311ef0
> _
>
> The physical address 0x44b1ae belongs to the symbol start_kernel.
> The text of start_kernel is freed by free_initmem() while the kernel
> is booting, so it is not a problem that this text is excluded as free pages.
> I did not research this problem in detail. I guess that the crash
> utility expects to be able to read the text of each process.
>

The x86 backtracer is constantly reading text to determine the size
of a stack frame from a given return address.  There are a number of
traps in the backtrace code to recognize "cpu_idle" as a backtrace
ending point, but it must still be reading back farther.
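
To make the failure mode concrete, here is a minimal sketch of a dump
reader in which zero pages share one common image while excluded pages
have no image at all, so a read that lands on an excluded page is
reported as an error instead of returning data.  Everything in it
(struct toy_dump, toy_init, toy_read, the READ_* codes, the 8-page
layout) is hypothetical and for illustration only; none of it is taken
from crash or makedumpfile.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define NR_PAGES  8u

/* Hypothetical return codes, not the crash utility's real ones. */
enum { READ_OK = 0, READ_EXCLUDED = -1, READ_RANGE = -2 };

/* Hypothetical in-memory model of a filtered dump file. */
struct toy_dump {
    uint8_t dumped[NR_PAGES];      /* 1 = page is present in the dump */
    int     desc[NR_PAGES];        /* per-page index into images[] */
    uint8_t images[2][PAGE_SIZE];  /* images[0] is the shared zero page */
};

/* Pages 0 and 1 are zero pages sharing images[0]; page 2 holds data
 * in images[1]; pages 3..7 are excluded (for example, free pages). */
static void toy_init(struct toy_dump *d)
{
    memset(d, 0, sizeof(*d));
    d->dumped[0] = 1;              /* desc[0] == 0: common zero image */
    d->dumped[1] = 1;              /* desc[1] == 0: same zero image */
    d->dumped[2] = 1;
    d->desc[2] = 1;
    memset(d->images[1], 0xAB, PAGE_SIZE);
}

/* Read len bytes at physical address paddr, within a single page. */
static int toy_read(const struct toy_dump *d, uint64_t paddr,
                    void *buf, size_t len)
{
    uint64_t pfn = paddr / PAGE_SIZE;
    size_t off = (size_t)(paddr % PAGE_SIZE);

    assert(d != NULL && buf != NULL);
    if (pfn >= NR_PAGES || off + len > PAGE_SIZE)
        return READ_RANGE;
    if (!d->dumped[pfn])
        return READ_EXCLUDED;      /* caller must cope with this */
    memcpy(buf, d->images[d->desc[pfn]] + off, len);
    return READ_OK;
}
```

In this model a zero page still reads back as zeroes, while a caller
such as a backtracer that reads text from an excluded page gets
READ_EXCLUDED and has to decide how to proceed.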

>
> I think that, in addition to the change in handling the excluded
> pages, other changes to the crash utility are also necessary.
>

Can you make the vmlinux/vmcore pair available to me?  Then I
can fix this particular issue as well as test my implementation of the
crash utility's excluded page handling.

Thanks,
  Dave




