[Crash-utility] [PATCH v2] Add read diagnostics to crash

David Mair dmair at suse.com
Tue Jan 10 18:30:32 UTC 2012


Hi Dave,

On 01/10/2012 09:24 AM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>> Re-posting the patch as an attachment. I'm sorry for the mess on the
>> first pass.
>>
>> --
>> David.
>
> Hi Dave,
>
> Your patch has merit, but I'm going to pare it down a bit.  I use CRASHDEBUG(8)
> fairly frequently, but with your changes as-is, it's far too verbose.  It's also
> a bit redundant -- here's an example of a CRASHDEBUG(8) kdump read currently:
>
>    crash>  rd ffff88003c1a5cc0
>    <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
>    <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff0096b5a8>
>        addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
>    ffff88003c1a5cc0:  0000000000000000                    ........
>         text hit rate: 65% (3505 of 5378)
>    crash>
>
> And then with your patch:
>
>    crash>  rd ffff88003c1a5cc0
>    <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
>    <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff1c97b6e8>
>        addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
>    rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
>    rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
>    rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
>    rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
>    rd: seek and read offset: 3c195fb8
>    ffff88003c1a5cc0:  0000000000000000                    ........
>         text hit rate: 65% (3505 of 5378)
>    crash>
>
> The "<readmem: ...>" and subsequent "addr: ..." lines come from readmem().
> I don't understand the need for the next 3 lines, as they are completely
>   redundant:

Except that one intention was to show program flow (failures that didn't 
happen) as well as continuation of program state.

*	read_kdump() entered
rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0

*	None of the conditional fatal error exits in read_kdump()
rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0

*	read_netdump() entered
rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)

*	switch (DUMPFILE_FORMAT(nd->flags)) VIA case KDUMP_ELF64
*	show calculated offset
rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8

*	Not the if (FLAT_FORMAT()) read case
rd: seek and read offset: 3c195fb8

>    rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
>    rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
>    rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
>
> I don't argue with the displays of the resultant file offsets, but I may move
> them to CRASHDEBUG(9):

I wouldn't object and can re-post a CRASHDEBUG(9) version myself if 
you'd prefer.

>    rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
>    rd: seek and read offset: 3c195fb8
>
> And the same goes for compressed kdumps:
>
>    crash>  rd ffff81005286e0c0
>    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
>    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff11d16958>
>        addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
>    ffff81005286e0c0:  0000000000000000                    ........
>         text hit rate: 66% (239 of 357)
>    crash>
>
> With the patch, aside from the "pfn" displays, it's redundant information:

Except for the implied flow and continuation of state. I've been asked 
to look into alleged crash bugs that were no more than corrupt or 
incomplete dumps that required either hexdump style deciphering of the 
structure or a method of crash exposing its own deciphering of the 
structure because the program starts and exits with no more than an 
error code that can't always identify a unique point of failure and 
you'd need to find the cause in source anyway. For example, a case where 
an incomplete crash dump has page-present flags showing the page not 
present for kernel data structures (every single page present flag is 
zero in-fact). IOW, no crash bug because kdump created an incomplete 
crash dump.

>    crash>  rd ffff81005286e0c0
>    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
>    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fffd3ca4478>
>        addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
>    rd: read_diskdump(-1, 7fffd3ca4478, 8, ffff81005286e0c0, 5286e0c0), pfn=5286e
>    rd: Buffering page with pfn=5286e and current physaddr=5286e000
>    ffff81005286e0c0:  0000000000000000                    ........
>         text hit rate: 66% (239 of 357)
>    crash>
>
> For starters, I think the patch can be simplified by modifying the "addr: ..."
> line that is displayed by readmem() just prior to issuing it to the configured
> pc->readmem function so that it shows the receiving function.  So that line,
> which may be displayed several times if the read crosses a page boundary, could
> be reworked to show something like this:
>
>    crash>  rd ffff81005286e0c0
>    <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
>    <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fffd3ca4478>
>    <read_netdump: addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
>    ...
>
> and then a line or two from the relevant read functions would show
> the file offset that was resolved from the physical address.
>
> Also, the new function that translates the pc->readmem function address to a
> string for display could also be used in a single location in main.c instead of
> having to put debug statements in each location where where pc->readmem is
> initialized.

Isn't there at least one case, e.g. x86_64_xen_kdump_p2m_create() that 
replaces and restores the pc->readmem value at runtime?

> So let me play around with this a bit...

You are welcome too but I would be happy to take responsibility and 
re-work it too. If you do go ahead with a re-work then please consider 
my goal of monitoring program flow.

Thanks,
David




More information about the Crash-utility mailing list