[Crash-utility] is_page_ptr vs. x86_64_kvtop

Dave Anderson anderson at redhat.com
Fri Mar 15 14:07:20 UTC 2013



----- Original Message -----
> Hi Dave, et al.,
> 
> I have this little problem.  I am trying to get a lustre file system
> extension working again.  It used to work, but does no more.
> It first calls is_page_ptr(kvaddr, &kpaddr) to convert a virtual
> address into a physical address, and then calls:
> 
> >    readmem(kpaddr, PHYSADDR, buf, used,
> > 	   "trace page data", RETURN_ON_ERROR)
> 
> to fetch the bytes.  Updating the release to SLES-11 SP2 causes
> this to now fail.

So are you saying that it works with an earlier kernel version?  

> In my debugging of crash/gdb, this:
> 
> >   is_page_ptr (addr=18446719884937843744, phys=0x7fffffffd370) at memory.c:11448
> >   11448           if (IS_SPARSEMEM()) {
> >   (gdb) p/x addr
> >   $8 = 0xffffea001cdad420
> 
> is about to fail.  However, this:
> 
> > crash> gdb x/4xg 0xffffea001cdad420
> 
> works just fine.  I've stepped through x_command until it gets to
> x86_64_kvtop() where I'm finding the logic a little twisty.
> But it pretty clearly does not rely on section_mem_map_addr() stuff.
> 
> So, here's my point: this is confusing.  What should I look for
> to determine why "is_page_ptr()" is saying 0xffffea001cdad420
> is invalid while "x86_64_kvtop()" is saying that it is and its
> physical address is 0x87afad420?
> 
> > 878             return(readmem(addr, memtype, buf, len,
> > (gdb) s
> > readmem (addr=0xffffea001cdad420, memtype=0x1, buffer=0x5d85d10, size=0x8,
> >     type=0x945f0a "gdb_readmem_callback", error_handle=0x2) at memory.c:1991
> > 
> > 0xffffea001cdad420:     PML4 DIRECTORY: ffffffff81623000
> > PAGE DIRECTORY: 87fff7067
> >    PUD: 87fff7000 => 87fff6067
> >    PMD: 87fff6730 => 800000087ae001e3
> >   PAGE: 87ae00000  (2MB)
> >       PTE         PHYSICAL   FLAGS
> > 800000087ae001e3  87ae00000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
> > (gdb) p physpage
> > $34 = 0x87afad420
> 
> 0xffffea001cdad420:     0x0200000000000000      0xffffffff00000001
> 0xffffea001cdad430:     0x0000000000000000      0x0000000000000000
> 
> Help, please?  Thank you!

x86_64_kvtop() is translating the vmemmap'ed kernel address to a physical
address by walking the page tables, and finding it in a 2MB large page.
If you skip the is_page_ptr() qualifier, does this work, and 
if so, does it look like a legitimate page structure?:

 crash> struct page ffffea001cdad420
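
As a rough check of that translation, the arithmetic for a 2MB page can
be sketched in C.  The helper name and the mask macros below are
illustrative only and are not crash's own code; the inputs are the PAGE
base and the virtual address from the walk quoted above.

```c
#include <stdint.h>

/* Illustrative constants (not taken from crash): a 2MB large page
 * leaves the low 21 bits of the virtual address as the page offset. */
#define LARGE_PAGE_SHIFT_2MB 21
#define LARGE_PAGE_MASK_2MB  ((UINT64_C(1) << LARGE_PAGE_SHIFT_2MB) - 1)

/* Hypothetical helper: combine the 2MB page base found by the page
 * table walk with the offset bits of the kernel virtual address. */
uint64_t large_page_paddr(uint64_t page_base, uint64_t kvaddr)
{
    return (page_base & ~LARGE_PAGE_MASK_2MB) |
           (kvaddr & LARGE_PAGE_MASK_2MB);
}
```

Calling large_page_paddr(0x87ae00000, 0xffffea001cdad420) yields
0x87afad420, which matches the physpage value printed by gdb.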

But the sparsemem stuff doesn't seem to be accepting it as a vmemmap
page struct address.  Does "kmem -p" include physical address 0x87afad420?
For example, on my system, the last physical page mapped in the
vmemmap is 21ffff000:

 crash> kmem -p | tail
 ffffea00087ffd80 21fff6000                0        0  0 0
 ffffea00087ffdc0 21fff7000                0        0  0 0
 ffffea00087ffe00 21fff8000                0        0  0 0
 ffffea00087ffe40 21fff9000                0        0  0 0
 ffffea00087ffe80 21fffa000                0        0  0 0
 ffffea00087ffec0 21fffb000                0        0  0 0
 ffffea00087fff00 21fffc000                0        0  0 0
 ffffea00087fff40 21fffd000                0        0  0 0
 ffffea00087fff80 21fffe000                0        0  0 0
 ffffea00087fffc0 21ffff000                0        0  0 0
 crash> 

Anyway, the first thing that needs to be done is to verify that
the SECTION_SIZE_BITS and MAX_PHYSMEM_BITS are being set up
correctly.  The upstream kernel currently has:

 # define SECTION_SIZE_BITS      27 /* matt - 128 is convenient right now */
 # define MAX_PHYSADDR_BITS      44
 # define MAX_PHYSMEM_BITS       46

And crash has these, where SECTION_SIZE_BITS is stable, but MAX_PHYSMEM_BITS
can be one of three possible values, depending upon kernel version:

 #define _SECTION_SIZE_BITS        27
 #define _MAX_PHYSMEM_BITS         40
 #define _MAX_PHYSMEM_BITS_2_6_26  44
 #define _MAX_PHYSMEM_BITS_2_6_31  46

And in x86_64_init() there is a segment that tries to pick the correct value.
So for example, on my 3.7.9 kernel, I see:

 crash> help -m | grep -e section -e physmem
   section_size_bits: 27
    max_physmem_bits: 46
   sections_per_root: 128
 crash>
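
The selection described above might look roughly like the following
sketch.  The version thresholds are an assumption inferred only from the
macro names, and the helpers (kver, pick_max_physmem_bits,
nr_mem_sections) are hypothetical, so check the actual segment in
x86_64_init() rather than relying on this.

```c
#include <stdint.h>

/* Values quoted above from crash's x86_64 definitions. */
#define _SECTION_SIZE_BITS        27
#define _MAX_PHYSMEM_BITS         40
#define _MAX_PHYSMEM_BITS_2_6_26  44
#define _MAX_PHYSMEM_BITS_2_6_31  46

/* Hypothetical helper: pack a kernel version so it can be compared. */
unsigned long kver(unsigned v, unsigned p, unsigned s)
{
    return ((unsigned long)v << 16) | ((unsigned long)p << 8) | s;
}

/* ASSUMPTION: each macro applies from the kernel version in its name
 * onward; this is a guess from the names, not crash's actual code. */
int pick_max_physmem_bits(unsigned long version)
{
    if (version >= kver(2, 6, 31))
        return _MAX_PHYSMEM_BITS_2_6_31;
    if (version >= kver(2, 6, 26))
        return _MAX_PHYSMEM_BITS_2_6_26;
    return _MAX_PHYSMEM_BITS;
}

/* Number of memory sections that fit under max_physmem_bits. */
uint64_t nr_mem_sections(int max_physmem_bits, int section_size_bits)
{
    return UINT64_C(1) << (max_physmem_bits - section_size_bits);
}
```

With section_size_bits at 27, a max_physmem_bits of 46 allows 2^19
sections, while 40 allows only 2^13, so the two settings imply very
different section counts.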

Take a look at your SLES-11 SP2 kernel sources and determine what
values are being used, and compare them to what crash set them up
to be.

Dave



 



