[Crash-utility] [PATCH v2] Add read diagnostics to crash

Dave Anderson anderson at redhat.com
Tue Jan 10 19:43:24 UTC 2012



----- Original Message -----
> Hi Dave,
> 
> On 01/10/2012 09:24 AM, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> >> Re-posting the patch as an attachment. I'm sorry for the mess on the
> >> first pass.
> >>
> >> --
> >> David.
> >
> > Hi Dave,
> >
> > Your patch has merit, but I'm going to pare it down a bit.  I use CRASHDEBUG(8)
> > fairly frequently, but with your changes as-is, it's far too verbose.  It's also
> > a bit redundant -- here's an example of a CRASHDEBUG(8) kdump read
> > currently:
> >
> >    crash>  rd ffff88003c1a5cc0
> >    <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
> >    <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff0096b5a8>
> >        addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
> >    ffff88003c1a5cc0:  0000000000000000                    ........
> >         text hit rate: 65% (3505 of 5378)
> >    crash>
> >
> > And then with your patch:
> >
> >    crash>  rd ffff88003c1a5cc0
> >    <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
> >    <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff1c97b6e8>
> >        addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
> >    rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
> >    rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
> >    rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
> >    rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
> >    rd: seek and read offset: 3c195fb8
> >    ffff88003c1a5cc0:  0000000000000000                    ........
> >         text hit rate: 65% (3505 of 5378)
> >    crash>
> >
> > The "<readmem: ...>" and subsequent "addr: ..." lines come from readmem().
> > I don't understand the need for the next 3 lines, as they are completely
> > redundant:
> 
> Except that one intention was to show program flow (failures that didn't
> happen) as well as continuation of program state.
> 
> *	read_kdump() entered
> rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
> 
> *	None of the conditional fatal error exits in read_kdump()
> rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0
> phys=3c1a5cc0
> 
> *	read_netdump() entered
> rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
> 
> *	switch (DUMPFILE_FORMAT(nd->flags)) VIA case KDUMP_ELF64 show calculated offset
> rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
> 
> *	Not the if (FLAT_FORMAT()) read case rd: seek and read offset: 3c195fb8
> 
> >    rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
> >    rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
> >    rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
> > I don't argue with the displays of the resultant file offsets, but I may move
> > them to CRASHDEBUG(9):
> 
> I wouldn't object and can re-post a CRASHDEBUG(9) version myself if
> you'd prefer.

But there is still no need to redundantly display the virtual and physical
addresses.  And the displays of the calculated file offsets, zero-fill, etc.
will still end up showing the complete function sequence by putting the 
function name in the output.  For example, the updated readmem() output would
show a call to read_kdump(), but the file offset display would show that it
has transitioned to read_netdump(), so there's no need for any debug output
at all in read_kdump().  

> 
> >    rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
> >    rd: seek and read offset: 3c195fb8
> >
> > And the same goes for compressed kdumps:
> >
> >    crash>  rd ffff81005286e0c0
> >    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
> >    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff11d16958>
> >        addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
> >    ffff81005286e0c0:  0000000000000000                    ........
> >         text hit rate: 66% (239 of 357)
> >    crash>
> >
> > With the patch, aside from the "pfn" displays, it's redundant
> > information:
> 
> Except for the implied flow and continuation of state. I've been asked
> to look into alleged crash bugs that were no more than corrupt or
> incomplete dumps that required either hexdump style deciphering of the
> structure or a method of crash exposing its own deciphering of the
> structure because the program starts and exits with no more than an
> error code that can't always identify a unique point of failure and
> you'd need to find the cause in source anyway. For example, a case where
> an incomplete crash dump has page-present flags showing the page not
> present for kernel data structures (every single page present flag is
> zero in-fact). IOW, no crash bug because kdump created an incomplete
> crash dump.
> 
> >    crash>  rd ffff81005286e0c0
> >    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
> >    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE),
> >    7fffd3ca4478>
> >        addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
> >    rd: read_diskdump(-1, 7fffd3ca4478, 8, ffff81005286e0c0,
> >    5286e0c0), pfn=5286e
> >    rd: Buffering page with pfn=5286e and current physaddr=5286e000
> >    ffff81005286e0c0:  0000000000000000                    ........
> >         text hit rate: 66% (239 of 357)
> >    crash>
> >
> > For starters, I think the patch can be simplified by modifying the "addr: ..."
> > line that is displayed by readmem() just prior to issuing it to the configured
> > pc->readmem function so that it shows the receiving function.  So that line,
> > which may be displayed several times if the read crosses a page boundary, could
> > be reworked to show something like this:
> >
> >    crash>  rd ffff81005286e0c0
> >    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
> >    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE),
> >    7fffd3ca4478>
> >    <read_netdump: addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
> >    ...
> >
> > and then a line or two from the relevant read functions would show
> > the file offset that was resolved from the physical address.
> >
> > Also, the new function that translates the pc->readmem function address to a
> > string for display could also be used in a single location in main.c instead of
> > having to put debug statements in each location where where pc->readmem is
> > initialized.
> 
> Isn't there at least one case, e.g. x86_64_xen_kdump_p2m_create() that
> replaces and restores the pc->readmem value at runtime?

That is correct -- it's a one-time deal.  I'll leave a debug statement
in place for those two cases (x86 and x86_64).  

> 
> > So let me play around with this a bit...
> 
> You are welcome too but I would be happy to take responsibility and
> re-work it too. If you do go ahead with a re-work then please consider
> my goal of monitoring program flow.

I will absolutely keep that in mind -- I'll post the patch for your review
before checking it in.

Dave




More information about the Crash-utility mailing list