[Crash-utility] x86 remap allocator in kernel 3.0

Dave Anderson anderson at redhat.com
Thu Jan 12 21:01:08 UTC 2012



----- Original Message -----
> Hi Dave,
>
> On 01/12/2012 12:58 PM, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> >> On 01/11/2012 06:53 AM, Petr Tesarik wrote:
> >>> Dne Út 10. ledna 2012 19:23:24 Petr Tesarik napsal(a):
> >>>> Dne Út 10. ledna 2012 19:14:32 Petr Tesarik napsal(a):
> >> ... [ cut ] ...
> >>>> crash>   vtop f2800080
> >>>> VIRTUAL   PHYSICAL
> >>>> f2800080  1fde00080
> >>>>
> >>>> PAGE DIRECTORY: c08ed000
> >>>>     PGD: c08ed018 =>   8ea001
> >>>>     PMD:   8eaca0 =>   80000001fde001e3
> >>>>    PAGE: 1fde00000  (2MB)
> >>>>
> >>>>         PTE         PHYSICAL   FLAGS
> >>>> 80000001fde001e3  1fde00000
> >>>>   (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
> >>>>
> >>>>     PAGE     PHYSICAL   MAPPING    INDEX CNT FLAGS
> >>>
> >>> BTW the data from struct page is really missing here. I traced
> >>> this down to an
> >>> integer overflow in dump_memory_nodes():
> >> ... [ cut ] ...
> >>> David (Mair), could you address this, as already discussed in
> >>> private mails,
> >>> please?
> >>
> >> The attached patch fixes this for me.
> >
> > Hmmm, not so much for me...
> >
> > When I test the patch on RHEL5, RHEL6 and Fedora x86 kernels, the
> > command always fails like this:
> >
> > crash>  kmem -n
> > NODE    SIZE    PGLIST_DATA   BOOTMEM_DATA   NODE_ZONES
> >    0    262075     c0a3a680      c0aa5ce8      c0a3a680
> >                                                c0a3b1c0
> >                                                c0a3bd00
> >                                                c0a3c840
> > MEM_MAP      START_PADDR    START_MAPNR
> > Segmentation fault
> > $
> >
> > I haven't look into it, and this is probably not related:
> >
> > cc -c -g -DX86 -m32 -D_FILE_OFFSET_BITS=64 -DGDB_7_3_1  memory.c
> > -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes
> > -fstack-protector
> > memory.c: In function 'dump_memory_nodes':
> > memory.c:13410: warning: cast to pointer from integer of different
> > size
> > memory.c:13199: warning: 'node_start_paddr' may be used
> > uninitialized in this function
> >
> > But from under gdb:
> >
> > crash>  kmem -n
> > [Detaching after fork from child process 16870.]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x0809cd94 in mkstring (s=0xffe8a8bc "  c0a3a680  ", size=16,
> > flags=133, opt=0x1000<Address 0x1000 out of bounds>)
> >      at tools.c:1620
> > 1620                    sprintf(s, "%llx", *((ulonglong *)opt));
> > (gdb) bt
> > #0  0x0809cd94 in mkstring (s=0xffe8a8bc "  c0a3a680  ", size=16,
> > flags=133, opt=0x1000<Address 0x1000 out of bounds>)
> >      at tools.c:1620
> > #1  0x080b1ad7 in dump_memory_nodes (initialize=0) at
> > memory.c:13406
> > #2  0x080d46cc in cmd_kmem () at memory.c:4221
> > #3  0x080941b8 in exec_command () at main.c:751
> > #4  0x08093fe6 in main_loop () at main.c:699
> > #5  0x081db622 in current_interp_command_loop ()
> > #6  0x081dbf01 in captured_command_loop ()
> > #7  0x081db0a4 in catch_errors ()
> > #8  0x081dce07 in captured_main ()
> > #9  0x081db0a4 in catch_errors ()
> > #10 0x081dce49 in gdb_main ()
> > #11 0x081dce99 in gdb_main_entry ()
> > #12 0x08116668 in gdb_main_loop (argc=2, argv=0xffe8d494) at
> > gdb_interface.c:75
> > #13 0x08093ce0 in main (argc=3, argv=0xffe8d494) at main.c:603
> > (gdb) p opt
> > $1 = 0x1000<Address 0x1000 out of bounds>
> > (gdb)
> >
> > I thought it might be just an x86 issue, but it also fails
> > the same way on RHEL5, RHEL6 and Fedora x86_64 kernels.
>
> I'm looking into it, that's the change I made to the fprintf() that uses
> node_start_paddr. The patch would be a fix for the problem Petr reported
> if it did not include the expansion to 64-bits in the fprintf() that's
> the whole last piece of the patch. The fix is pretty obvious now to the
> original patch:
>
> LONGLONG_HEX takes a ulonglong *
> LONG_HEX takes a ulong
>
> Changing the last modified MKSTR() to take &node_start_paddr should
> resolve it. Patch attached.
>
> --
> David.

Right -- our mails crossed...

But I don't want the output format to change, especially on 64-bit
machines.  For example on x86_64, it currently looks like these 
examples, where the 3 fields are centered:

    MEM_MAP       START_PADDR  START_MAPNR
ffff8100006e6000       0            0

    MEM_MAP       START_PADDR  START_MAPNR
ffffea0004280000   130000000     1245184

    MEM_MAP       START_PADDR  START_MAPNR
ffffea0000000038      1000          1


With your patch they looks like this:

crash> kmem -n
...
    MEM_MAP          START_PADDR    START_MAPNR
ffff8100006e6000             0                0

    MEM_MAP          START_PADDR    START_MAPNR
ffffea0004280000         130000000         1245184

    MEM_MAP          START_PADDR    START_MAPNR
ffffea0000000038           1000               1


And even on 32-bit, it seems to vary.  Here's 3 without 
the patch:

MEM_MAP   START_PADDR  START_MAPNR
c9000000       0            0

MEM_MAP   START_PADDR  START_MAPNR
c188a020      1000          1

MEM_MAP   START_PADDR  START_MAPNR
f5d2c200     10000          16

And with it applied:

MEM_MAP      START_PADDR    START_MAPNR
c9000000             0                0

MEM_MAP      START_PADDR    START_MAPNR
c188a020           1000               1

MEM_MAP      START_PADDR    START_MAPNR
f5d2c200           10000              16

It should be possible to incorporate the variable-size
adjustment without changing the display.

Dave













More information about the Crash-utility mailing list