[Crash-utility] Request for ppc64 help from IBM

Dave Anderson anderson at redhat.com
Fri Dec 11 16:39:03 UTC 2009


Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframes,
the ppc64 architecture started using a virtual memmap scheme
for the arrays of page structures that describe each physical
page of memory.

In RHEL5, the page structures in the memmap array were unity-mapped
(i.e., the virtual address is the physical address OR'd with
c000000000000000), as "kmem -n" shows below in the sparsemem data
breakdown under MEM_MAP:
  
  crash> kmem -n
  ... [ snip ] ...
  NR      SECTION        CODED_MEM_MAP        MEM_MAP       PFN
   0  c000000000750000  c000000000760000  c000000000760000  0               
   1  c000000000750008  c000000000760000  c000000000763800  256             
   2  c000000000750010  c000000000760000  c000000000767000  512             
   3  c000000000750018  c000000000760000  c00000000076a800  768             
   4  c000000000750020  c000000000760000  c00000000076e000  1024            
   5  c000000000750028  c000000000760000  c000000000771800  1280            
   6  c000000000750030  c000000000760000  c000000000775000  1536            
   7  c000000000750038  c000000000760000  c000000000778800  1792            
   8  c000000000750040  c000000000760000  c00000000077c000  2048            
   9  c000000000750048  c000000000760000  c00000000077f800  2304            
  10  c000000000750050  c000000000760000  c000000000783000  2560            
  11  c000000000750058  c000000000760000  c000000000786800  2816            
  12  c000000000750060  c000000000760000  c00000000078a000  3072            
  ...

The same mapping is also visible in the memmap page structure
listing displayed by "kmem -p":
  
  crash> kmem -p
        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
  c000000000760000        0                0        0  1 400
  c000000000760038    10000                0        0  1 400
  c000000000760070    20000                0        0  1 400
  c0000000007600a8    30000                0        0  1 400
  c0000000007600e0    40000                0        0  1 400
  c000000000760118    50000                0        0  1 400
  c000000000760150    60000                0        0  1 400
  c000000000760188    70000                0        0  1 400
  c0000000007601c0    80000                0        0  1 400
  c0000000007601f8    90000                0        0  1 400
  ...
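
The RHEL5 addresses above follow simple arithmetic, which the
following Python sketch reproduces. The constants are read off the
two listings (the 0x38 stride between PAGE entries, the 0x10000 step
in PHYSICAL, the 256-PFN step between sections). The function names
are hypothetical and are not part of the crash utility.

```python
# Constants inferred from the RHEL5 "kmem -n" and "kmem -p" listings.
# The helper names are illustrative only, not crash utility functions.
KERNEL_VIRTUAL_BASE = 0xC000000000000000  # unity-map bits OR'd into the address
MEM_MAP_BASE = 0xC000000000760000         # MEM_MAP of section 0
PAGE_STRUCT_SIZE = 0x38                   # stride between PAGE addresses
PAGE_SIZE = 0x10000                       # step between PHYSICAL values
PAGES_PER_SECTION = 256                   # PFN step between sections


def page_struct_vaddr(pfn):
    """Virtual address of the page structure describing page frame pfn."""
    return MEM_MAP_BASE + pfn * PAGE_STRUCT_SIZE


def section_mem_map(nr):
    """MEM_MAP value for sparsemem section nr."""
    return page_struct_vaddr(nr * PAGES_PER_SECTION)


def unity_vtop(vaddr):
    """Undo the unity mapping by clearing the c000000000000000 bits."""
    return vaddr & ~KERNEL_VIRTUAL_BASE


if __name__ == "__main__":
    print(hex(section_mem_map(1)))                       # 0xc000000000763800
    print(hex(page_struct_vaddr(0x90000 // PAGE_SIZE)))  # 0xc0000000007601f8
    print(hex(unity_vtop(MEM_MAP_BASE)))                 # 0x760000
```

Because the mapping is a fixed bit pattern, translating one of these
page structure addresses needs no page table walk at all.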

In RHEL6 (2.6.31-38.el6) the memmap page array is apparently
virtually mapped, using a heretofore-unseen virtual address
range starting at f000000000000000:
  
  crash> kmem -n
  ... [ snip ] ...
  NR      SECTION        CODED_MEM_MAP        MEM_MAP       PFN
   0  c000000002160000  f000000000000000  f000000000000000  0               
   1  c000000002160020  f000000000000000  f000000000006800  256             
   2  c000000002160040  f000000000000000  f00000000000d000  512             
   3  c000000002160060  f000000000000000  f000000000013800  768             
   4  c000000002160080  f000000000000000  f00000000001a000  1024            
   5  c0000000021600a0  f000000000000000  f000000000020800  1280            
   6  c0000000021600c0  f000000000000000  f000000000027000  1536            
   7  c0000000021600e0  f000000000000000  f00000000002d800  1792            
   8  c000000002160100  f000000000000000  f000000000034000  2048            
   9  c000000002160120  f000000000000000  f00000000003a800  2304            
  10  c000000002160140  f000000000000000  f000000000041000  2560            
  ... [ snip ] ...
  crash> kmem -p
        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
  f000000000000000        0                0        0  0 0
  f000000000000068    10000                0        0  0 0
  f0000000000000d0    20000                0        0  0 0
  f000000000000138    30000                0        0  0 0
  f0000000000001a0    40000                0        0  0 0
  f000000000000208    50000                0 -4611686016392006416  0 0
  f000000000000270    60000                0        0  0 0
  f0000000000002d8    70000                0        0  0 0
  f000000000000340    80000                0        0  0 0
  f0000000000003a8    90000                0 -4611686016730798344  0 0
  f000000000000410    a0000                0        0  0 0
  f000000000000478    b0000                0        0  0 0
  f0000000000004e0    c0000                0        0  0 c0000000651534e0
  f000000000000548    d0000                0        0  0 0
  ...
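
The RHEL6 PAGE addresses are still a plain linear function of the
page frame number; only the base and the page structure size differ.
The Python sketch below reproduces them from constants read off the
listings above (0x68 stride, 0x10000 step in PHYSICAL). It also shows
the negative INDEX values as unsigned 64-bit hex, where they resemble
c000000000000000-based addresses rather than plausible page indexes.
The function names are hypothetical.

```python
# Constants inferred from the RHEL6 "kmem -n" and "kmem -p" listings.
# The helper names are illustrative only, not crash utility functions.
VMEMMAP_BASE = 0xF000000000000000  # MEM_MAP of section 0
PAGE_STRUCT_SIZE = 0x68            # stride between PAGE addresses
PAGE_SIZE = 0x10000                # step between PHYSICAL values
PAGES_PER_SECTION = 256            # PFN step between sections


def vmemmap_page_vaddr(pfn):
    """Virtual address of the page structure describing page frame pfn."""
    return VMEMMAP_BASE + pfn * PAGE_STRUCT_SIZE


def vmemmap_page_for_phys(paddr):
    """Page structure address for the page containing physical address paddr."""
    return vmemmap_page_vaddr(paddr // PAGE_SIZE)


def as_unsigned64(value):
    """Reinterpret a signed 64-bit value as its unsigned bit pattern."""
    return value & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    print(hex(vmemmap_page_vaddr(1 * PAGES_PER_SECTION)))  # 0xf000000000006800
    print(hex(vmemmap_page_for_phys(0xD0000)))             # 0xf000000000000548
    print(hex(as_unsigned64(-4611686016392006416)))        # 0xc0000000795174f0
```

So the addresses themselves are computed correctly; it is the
contents read through them that are wrong.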

But as can be seen in the "kmem -p" output, and when using other
commands that actually read the data in the page structure, the
data read is either bogus, or the readmem() of the address fails
the virtual address translation and reports that the page is not
mapped.

Because the page structures' virtual addresses are not unity-mapped,
each page address gets translated via a page table walk-through in
the same manner as vmalloc()'d addresses.  In the ppc64 architecture,
the vmalloc range starts at d000000000000000:

  crash> mach
  ...
  KERNEL VIRTUAL BASE: c000000000000000
  KERNEL VMALLOC BASE: d000000000000000
  ...
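
To make the three address ranges in this post concrete, here is a
minimal Python sketch that sorts an address by its top four bits.
Classifying on the top nibble is an assumption of this sketch,
suggested only by the three bases seen here (c, d and f); the table
and function names are hypothetical and do not come from ppc64.c.

```python
# Region labels keyed by the top four bits of a 64-bit virtual address.
# The three keys come from the bases quoted in this post; classifying by
# the top nibble is an assumption made for illustration.
REGIONS = {
    0xC: "kernel unity-mapped",  # KERNEL VIRTUAL BASE: c000000000000000
    0xD: "vmalloc",              # KERNEL VMALLOC BASE: d000000000000000
    0xF: "virtual memmap",       # RHEL6 MEM_MAP base:  f000000000000000
}


def region_of(vaddr):
    """Return a label for the range that vaddr falls in."""
    return REGIONS.get(vaddr >> 60, "other")


if __name__ == "__main__":
    for addr in (0xC000000000760038, 0xD000000000000000, 0xF000000000000208):
        print(hex(addr), region_of(addr))
```

An f000000000000000-based page structure address lands in neither of
the two ranges reported by "mach".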

Since the ppc64 virtual-to-physical address translation of
these f000000000000000-based addresses either returns a
bogus physical address or fails entirely, this in turn causes
bizarre errors in crash commands that actually read the contents
of page structures, such as "kmem -s", where SLUB data is
stored in the page structure.

So my speculation (guess?) is that the ppc64_vtop() function in
ppc64.c needs updating to translate these addresses properly.

Since the ppc64 stuff in the crash utility was written by, and
has been maintained by, IBM (and since I am ppc64-challenged),
can you guys take a look at what needs to be done?

Thanks,
  Dave





