[Crash-utility] is_page_ptr vs. x86_64_kvtop
Dave Anderson
anderson at redhat.com
Fri Mar 15 14:07:20 UTC 2013
----- Original Message -----
> Hi Dave, et al.,
>
> I have this little problem. I am trying to get a lustre file system
> extension working again. It used to work, but does no more.
> It first calls is_page_ptr(kvaddr, &kpaddr) to convert a virtual
> address into a physical address, and then calls:
>
> > readmem(kpaddr, PHYSADDR, buf, used,
> > "trace page data", RETURN_ON_ERROR)
>
> to fetch the bytes. Updating the release to SLES-11 SP2 causes
> this to now fail.
So are you saying that it works with an earlier kernel version?
> In my debugging of crash/gdb, this:
>
> > is_page_ptr (addr=18446719884937843744, phys=0x7fffffffd370) at memory.c:11448
> > 11448 if (IS_SPARSEMEM()) {
> > (gdb) p/x addr
> > $8 = 0xffffea001cdad420
>
> is about to fail. However, this:
>
> > crash> gdb x/4xg 0xffffea001cdad420
>
> works just fine. I've stepped through x_command until it gets to
> x86_64_kvtop() where I'm finding the logic a little twisty.
> But it pretty clearly does not rely on section_mem_map_addr() stuff.
>
> So, here's my point: this is confusing. What should I look for
> to determine why "is_page_ptr()" is saying 0xffffea001cdad420
> is invalid while "x86_64_kvtop()" is saying that it is and its
> physical address is 0x87afad420?
>
> > 878 return(readmem(addr, memtype, buf, len,
> > (gdb) s
> > readmem (addr=0xffffea001cdad420, memtype=0x1, buffer=0x5d85d10,
> > size=0x8,
> > type=0x945f0a "gdb_readmem_callback", error_handle=0x2) at
> > memory.c:1991
> >
> > 0xffffea001cdad420: PML4 DIRECTORY: ffffffff81623000
> > PAGE DIRECTORY: 87fff7067
> > PUD: 87fff7000 => 87fff6067
> > PMD: 87fff6730 => 800000087ae001e3
> > PAGE: 87ae00000 (2MB)
> > PTE PHYSICAL FLAGS
> > 800000087ae001e3 87ae00000
> > (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
> > (gdb) p physpage
> > $34 = 0x87afad420
>
> 0xffffea001cdad420: 0x0200000000000000 0xffffffff00000001
> 0xffffea001cdad430: 0x0000000000000000 0x0000000000000000
>
> Help, please? Thank you!
It is translating the vmemmap'ed kernel address to a physical address
by walking the page tables, and finding it in a 2MB big-page.
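
For reference, the arithmetic behind that 2MB translation can be sketched in C. This is only an illustration of the x86_64 big-page layout, not crash's own x86_64_kvtop() code; the function and macro names are made up for the example. It takes the PMD entry and the virtual address from the debug output quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 51:21 of a PSE (2MB) PMD entry hold the physical base of the big page. */
#define PMD_2MB_BASE_MASK    0x000FFFFFFFE00000ULL
/* The low 21 bits of the virtual address are the offset inside the 2MB page. */
#define PAGE_2MB_OFFSET_MASK 0x00000000001FFFFFULL
/* PSE flag (bit 7): the PMD entry maps a big page directly. */
#define PMD_PSE_FLAG         0x80ULL

/* Hypothetical helper: combine a 2MB PMD entry with a virtual address. */
static uint64_t pmd_2mb_to_phys(uint64_t pmd_entry, uint64_t vaddr)
{
    /* Only meaningful when the entry really is a big-page mapping. */
    assert(pmd_entry & PMD_PSE_FLAG);
    return (pmd_entry & PMD_2MB_BASE_MASK) | (vaddr & PAGE_2MB_OFFSET_MASK);
}
```

Calling pmd_2mb_to_phys(0x800000087ae001e3, 0xffffea001cdad420) masks the entry down to the 87ae00000 page base and adds the 0x1ad420 offset, which gives the 0x87afad420 that gdb printed for physpage.
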
If you skip the is_page_ptr() qualifier, does this work, and
if so, does it look like a legitimate page structure?:
crash> struct page ffffea001cdad420
But the sparsemem stuff doesn't seem to be accepting it as a vmemmap
page struct address. Does "kmem -p" include physical address 0x87afad420?
For example, on my system, the last physical page mapped in the
vmemmap is 21ffff000:
crash> kmem -p | tail
ffffea00087ffd80 21fff6000 0 0 0 0
ffffea00087ffdc0 21fff7000 0 0 0 0
ffffea00087ffe00 21fff8000 0 0 0 0
ffffea00087ffe40 21fff9000 0 0 0 0
ffffea00087ffe80 21fffa000 0 0 0 0
ffffea00087ffec0 21fffb000 0 0 0 0
ffffea00087fff00 21fffc000 0 0 0 0
ffffea00087fff40 21fffd000 0 0 0 0
ffffea00087fff80 21fffe000 0 0 0 0
ffffea00087fffc0 21ffff000 0 0 0 0
crash>
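
The relationship between the two columns in that listing can be sketched as follows. This is a rough illustration, not what crash does internally. It assumes the usual x86_64 vmemmap base of 0xffffea0000000000 and 4KB pages, and it takes sizeof(struct page) as a parameter because that size varies between kernels. In the listing above the page struct addresses advance by 0x40 per 0x1000 of physical memory, so 0x40 is the right size for my kernel only. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: standard x86_64 vmemmap base (4-level paging, no KASLR). */
#define VMEMMAP_START 0xffffea0000000000ULL
/* Assumption: 4KB base pages. */
#define PAGE_SHIFT    12

/* Hypothetical helper: physical address described by a vmemmap page struct. */
static uint64_t vmemmap_page_to_phys(uint64_t page_addr, uint64_t struct_page_size)
{
    assert(struct_page_size != 0);
    assert(page_addr >= VMEMMAP_START);
    /* Index of the page struct in the vmemmap array is the page frame number. */
    uint64_t pfn = (page_addr - VMEMMAP_START) / struct_page_size;
    return pfn << PAGE_SHIFT;
}

/* Hypothetical helper: nonzero if page_addr sits on a page struct boundary. */
static int vmemmap_page_aligned(uint64_t page_addr, uint64_t struct_page_size)
{
    assert(struct_page_size != 0);
    return ((page_addr - VMEMMAP_START) % struct_page_size) == 0;
}
```

With a size of 0x40, vmemmap_page_to_phys(0xffffea00087fffc0, 0x40) comes out as 21ffff000, matching the last line of the listing. If vmemmap_page_aligned() returns 0 for an address, either the size passed in is wrong for that kernel or the address is not the start of a page struct.
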
Anyway, the first thing that needs to be done is to verify that
SECTION_SIZE_BITS and MAX_PHYSMEM_BITS are being set up
correctly. The upstream kernel currently has:
# define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
# define MAX_PHYSADDR_BITS 44
# define MAX_PHYSMEM_BITS 46
And crash has these, where SECTION_SIZE_BITS is stable, but MAX_PHYSMEM_BITS
can be any one of three possible values, depending upon the kernel version:
#define _SECTION_SIZE_BITS 27
#define _MAX_PHYSMEM_BITS 40
#define _MAX_PHYSMEM_BITS_2_6_26 44
#define _MAX_PHYSMEM_BITS_2_6_31 46
And in x86_64_init() there is a segment that tries to pick the correct value.
So for example, on my 3.7.9 kernel, I see:
crash> help -m | grep -e section -e physmem
section_size_bits: 27
max_physmem_bits: 46
sections_per_root: 128
crash>
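
To show why those two values matter, here is a minimal sketch of the sparsemem arithmetic as the kernel defines it: the section count is 1 << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS), and a physical address falls in section (paddr >> SECTION_SIZE_BITS). It is not crash's implementation. The function names are invented for the example, and the 128 sections per root is simply the value shown in the "help -m" output above.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SIZE_BITS 27   /* 128MB per section, as in the listing above */
#define SECTIONS_PER_ROOT 128  /* taken from the "help -m" output above */

/* Number of memory sections implied by a given MAX_PHYSMEM_BITS value. */
static uint64_t nr_mem_sections(unsigned max_physmem_bits)
{
    assert(max_physmem_bits > SECTION_SIZE_BITS && max_physmem_bits < 64);
    return 1ULL << (max_physmem_bits - SECTION_SIZE_BITS);
}

/* Section number that contains a physical address. */
static uint64_t phys_to_section_nr(uint64_t paddr)
{
    return paddr >> SECTION_SIZE_BITS;
}

/* Index of the mem_section root that holds a given section. */
static uint64_t section_nr_to_root(uint64_t section_nr)
{
    return section_nr / SECTIONS_PER_ROOT;
}

/* Nonzero if the address lies below the limit implied by max_physmem_bits. */
static int section_in_range(uint64_t paddr, unsigned max_physmem_bits)
{
    return phys_to_section_nr(paddr) < nr_mem_sections(max_physmem_bits);
}
```

The three MAX_PHYSMEM_BITS candidates of 40, 44 and 46 give 8192, 131072 and 524288 sections respectively. Physical address 0x87afad420 lands in section 271, which is in root 2. A wrong MAX_PHYSMEM_BITS changes the section count, so an address can be judged out of range or looked up against the wrong table size.
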
Take a look at your SLES-11 SP2 kernel sources and determine what
values are being used, and compare them to what crash set them up
to be.
Dave