[Crash-utility] handling missing kdump pages in diskdump format

Dave Anderson anderson at redhat.com
Thu Nov 9 13:53:36 UTC 2006


Takao Indoh wrote:

> Hi Bob, Dave,
>
> On Wed, 08 Nov 2006 16:08:08 -0500, Dave Anderson wrote:
>
> >Bob Montgomery wrote:
> >
> >> I've been experimenting with the makedumpfile utility for kdump on ia64.
> >> One of my experiments was to verify that a page that should have been
> >> missing indeed was missing.  I used crash 4.0-3.8 to look for a user
> >> page that should have been omitted from the dump.
> >>
> >> crash> x/xg 0xe0000040fc00c000
> >> 0xe0000040fc00c000:     0x0000000000000000
> >>
> >> On a full dump from makedumpfile as well as on a straight copy of
> >> vmcore, crash reports this:
> >>
> >> crash> x/xg 0xe0000040fc00c000
> >> 0xe0000040fc00c000:     0x00010102464c457f
> >>
> >> The dumpfiles created by makedumpfile appear to crash as diskdump files,
> >> and crash appears to excuse missing pages and report 0x0 contents here:
> >>
> >> diskdump.c:read_diskdump, line 454:
> >>
> >>        if (!page_is_dumpable(pfn)) {
> >>                 memset(bufptr, 0, cnt);
> >>                 return cnt;
> >>
> >> Shouldn't there be some indication that a requested page is missing as
> >> opposed to being legitimately full of zeros?
> >>
> >> Bob Montgomery
> >
> >Hi Bob,
> >
> >That decision was made by the diskdump developers
> >when I accepted their diskdump-specific work into the
> >crash utility.
> >
> >As I recall, I kind of agreed with you at the time, but there
> >was some compelling reason that they had that argued for
> >passing a zero-filled page -- although I don't recall what it
> >was?
> >
> >I don't believe it had anything to do with user pages,
> >but in the case of some of the other kernel pages that can
> >be left out of the dump, passing back a zero-filled page
> >was a cleaner way to handle whatever the problem was.
> >
> >Anyway, I've long since forgotten why they wanted it
> >that way...
> >
> >Perhaps the diskdump guys on this list can refresh my
> >memory?
>
> My memory is also unclear...
> As far as my dim memory goes, I decided the following policy when I
> wrote this code.
>
> 1) If the page is not found because of an invalid address (e.g. the address
>    points to a memory hole), read_diskdump should return an error.
> 2) If the page is not found because of a partial dump, read_diskdump
>    should return a zero-filled page.
>
> Perhaps I decided the 2nd policy because of simplicity of implementation.
> Bob points out that crash should do something instead of passing
> zero-filled page (e.g. showing message) in the 2nd case, right?
>
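
To make sure I am reading that two-case policy correctly, here is a
minimal sketch of it.  This is not the diskdump.c code: the struct,
the bitmap layout, the SKETCH_READ_ERROR value and every sketch_*
name are made up for illustration, and a simple upper pfn bound
stands in for the real memory-hole check.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_READ_ERROR (-1)      /* hypothetical error return value */

/* Hypothetical dump description: a pfn limit plus a dumped-page bitmap. */
struct sketch_dump {
    unsigned long max_pfn;          /* pfns at or above this are invalid */
    const unsigned char *bitmap;    /* bit set = page is in the dumpfile */
};

/* Returns nonzero if the bitmap says the page was written to the dump. */
static int sketch_page_present(const struct sketch_dump *d, unsigned long pfn)
{
    return (d->bitmap[pfn >> 3] >> (pfn & 7)) & 1;
}

/*
 * Case 1: the address is invalid, so return an error.
 * Case 2: the page was left out of a partial dump, so return zeros.
 */
static int sketch_read(const struct sketch_dump *d, unsigned long pfn,
                       void *bufptr, int cnt)
{
    assert(d != NULL && bufptr != NULL && cnt > 0);

    if (pfn >= d->max_pfn)
        return SKETCH_READ_ERROR;           /* case 1 */

    if (!sketch_page_present(d, pfn)) {
        memset(bufptr, 0, cnt);             /* case 2 */
        return cnt;
    }

    memset(bufptr, 0x7f, cnt);  /* placeholder for copying real page data */
    return cnt;
}
```

In this sketch the caller can tell case 1 from a good read, but case 2
looks exactly like a page that really is full of zeros, which is the
point Bob raised.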

If I recall, displaying an error message at the point of the
zero-filled page return could happen at an inopportune
time and mess up the output of some command.  In other words,
the code that read the zero-filled page was expecting
some type of known data, and when it received a zero,
it quietly and gracefully continued.  I think...

What you could do is set up makedumpfile to exclude as many
pages as it is capable of excluding, and instead of returning
a zero-filled page in read_diskdump(), insert an
"error(FATAL,...)" call there.  It would be interesting
to note (1) whether the crash session comes up, and (2) whether
it affects commands that would otherwise work.
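
Roughly, the experiment would look like the sketch below.  It is a
stand-alone approximation, not a patch: error() and page_is_dumpable()
here are local stand-ins, the "odd pfns are missing" rule and the page
shift are invented, and I am assuming that a FATAL error unwinds out
of the read instead of returning to it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define FATAL 1                 /* stand-in for the FATAL error type */
#define SKETCH_PAGE_SHIFT 14    /* hypothetical 16 KB pages */

static jmp_buf sketch_env;      /* where the stand-in error() unwinds to */

/* Stand-in for error(): print the message, then unwind on FATAL. */
static void error(int type, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    if (type == FATAL)
        longjmp(sketch_env, 1);
}

/* Hypothetical lookup: pretend every odd pfn was excluded from the dump. */
static int page_is_dumpable(unsigned long pfn)
{
    return (pfn & 1) == 0;
}

/* Reduced read path: only the missing-page branch is of interest. */
static int read_page_sketch(unsigned long paddr, void *bufptr, int cnt)
{
    unsigned long pfn = paddr >> SKETCH_PAGE_SHIFT;

    assert(bufptr != NULL && cnt > 0);
    if (!page_is_dumpable(pfn)) {
        /* was: memset(bufptr, 0, cnt); return cnt; */
        error(FATAL, "page excluded from dump: pfn %lu\n", pfn);
    }
    memset(bufptr, 0xab, cnt);  /* placeholder for the real page copy */
    return cnt;
}

/* Driver: returns bytes read, or -1 if the read hit an excluded page. */
static int try_read(unsigned long paddr, void *bufptr, int cnt)
{
    if (setjmp(sketch_env))
        return -1;
    return read_page_sketch(paddr, bufptr, cnt);
}
```

With that in place, any read that touches an excluded page fails
loudly, so it should be obvious which commands depend on the
zero-filled behavior.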

Dave






