[Crash-utility] crash seek error, failed to read vmcore file

Dave Anderson anderson at redhat.com
Thu Apr 22 13:31:48 UTC 2010


----- "Pavan Naregundi" <pavan at linux.vnet.ibm.com> wrote:

> > 
> > In any case, unfortunately, there's nothing that can be done from the
> > crash utility's perspective.
> >   
> > Dave
> 
> Thank you Dave.
> 
> Our SLES11 does not have the patch you mentioned, but at the same
> time the system is not AMS-enabled, and CONFIG_CMM is not set in the
> config file either.
> 
> This system also has only the /proc/device-tree/memory@0 directory.

I don't have access to the original "problem" ppc64 machine,
but I'm logged into another ppc64 here, where the memory
advertised in /proc/device-tree is as expected.  It has
these memory@xxx/reg files, showing a set of contiguous
memory chunks.  The first one is 128MB, followed by a series
of 16MB chunks:

         memory@0: 0000000000000000 0000000008000000
   memory@8000000: 0000000008000000 0000000001000000
   memory@9000000: 0000000009000000 0000000001000000
   memory@a000000: 000000000a000000 0000000001000000
   memory@b000000: 000000000b000000 0000000001000000
   memory@c000000: 000000000c000000 0000000001000000
   memory@d000000: 000000000d000000 0000000001000000
   memory@e000000: 000000000e000000 0000000001000000
   memory@f000000: 000000000f000000 0000000001000000
  memory@10000000: 0000000010000000 0000000001000000
  memory@11000000: 0000000011000000 0000000001000000
  memory@12000000: 0000000012000000 0000000001000000
  memory@13000000: 0000000013000000 0000000001000000
  memory@14000000: 0000000014000000 0000000001000000
  memory@15000000: 0000000015000000 0000000001000000
  memory@16000000: 0000000016000000 0000000001000000
  memory@17000000: 0000000017000000 0000000001000000
  memory@18000000: 0000000018000000 0000000001000000
  memory@19000000: 0000000019000000 0000000001000000
  memory@1a000000: 000000001a000000 0000000001000000
  memory@1b000000: 000000001b000000 0000000001000000
  memory@1c000000: 000000001c000000 0000000001000000
  memory@1d000000: 000000001d000000 0000000001000000
  memory@1e000000: 000000001e000000 0000000001000000
  memory@1f000000: 000000001f000000 0000000001000000
  memory@20000000: 0000000020000000 0000000001000000
  memory@21000000: 0000000021000000 0000000001000000
  memory@22000000: 0000000022000000 0000000001000000
  memory@23000000: 0000000023000000 0000000001000000
  memory@24000000: 0000000024000000 0000000001000000
  memory@25000000: 0000000025000000 0000000001000000
  memory@26000000: 0000000026000000 0000000001000000
  memory@27000000: 0000000027000000 0000000001000000
  memory@28000000: 0000000028000000 0000000001000000
  memory@29000000: 0000000029000000 0000000001000000
  memory@2a000000: 000000002a000000 0000000001000000
  memory@2b000000: 000000002b000000 0000000001000000
  memory@2c000000: 000000002c000000 0000000001000000
  memory@2d000000: 000000002d000000 0000000001000000
  memory@2e000000: 000000002e000000 0000000001000000
  memory@2f000000: 000000002f000000 0000000001000000
  memory@30000000: 0000000030000000 0000000001000000
  memory@31000000: 0000000031000000 0000000001000000
  memory@32000000: 0000000032000000 0000000001000000
  memory@33000000: 0000000033000000 0000000001000000
  memory@34000000: 0000000034000000 0000000001000000
  memory@35000000: 0000000035000000 0000000001000000
  memory@36000000: 0000000036000000 0000000001000000
  memory@37000000: 0000000037000000 0000000001000000
  memory@38000000: 0000000038000000 0000000001000000
  memory@39000000: 0000000039000000 0000000001000000
  memory@3a000000: 000000003a000000 0000000001000000
  memory@3b000000: 000000003b000000 0000000001000000
  memory@3c000000: 000000003c000000 0000000001000000
  memory@3d000000: 000000003d000000 0000000001000000
  memory@3e000000: 000000003e000000 0000000001000000
  memory@3f000000: 000000003f000000 0000000001000000
  memory@40000000: 0000000040000000 0000000001000000
  memory@41000000: 0000000041000000 0000000001000000
  memory@42000000: 0000000042000000 0000000001000000
  memory@43000000: 0000000043000000 0000000001000000
  memory@44000000: 0000000044000000 0000000001000000
  memory@45000000: 0000000045000000 0000000001000000
  memory@46000000: 0000000046000000 0000000001000000
  memory@47000000: 0000000047000000 0000000001000000
  memory@48000000: 0000000048000000 0000000001000000
  memory@49000000: 0000000049000000 0000000001000000
  memory@4a000000: 000000004a000000 0000000001000000
  memory@4b000000: 000000004b000000 0000000001000000
  memory@4c000000: 000000004c000000 0000000001000000
  memory@4d000000: 000000004d000000 0000000001000000
  memory@4e000000: 000000004e000000 0000000001000000
  memory@4f000000: 000000004f000000 0000000001000000
  memory@50000000: 0000000050000000 0000000001000000
  memory@51000000: 0000000051000000 0000000001000000
  memory@52000000: 0000000052000000 0000000001000000
  memory@53000000: 0000000053000000 0000000001000000
  memory@54000000: 0000000054000000 0000000001000000
  memory@55000000: 0000000055000000 0000000001000000
  memory@56000000: 0000000056000000 0000000001000000
  memory@57000000: 0000000057000000 0000000001000000
  memory@58000000: 0000000058000000 0000000001000000
  memory@59000000: 0000000059000000 0000000001000000
  memory@5a000000: 000000005a000000 0000000001000000
  memory@5b000000: 000000005b000000 0000000001000000
  memory@5c000000: 000000005c000000 0000000001000000
  memory@5d000000: 000000005d000000 0000000001000000
  memory@5e000000: 000000005e000000 0000000001000000
  memory@5f000000: 000000005f000000 0000000001000000
  memory@60000000: 0000000060000000 0000000001000000
  memory@61000000: 0000000061000000 0000000001000000
  memory@62000000: 0000000062000000 0000000001000000
  memory@63000000: 0000000063000000 0000000001000000
  memory@64000000: 0000000064000000 0000000001000000
  memory@65000000: 0000000065000000 0000000001000000
  memory@66000000: 0000000066000000 0000000001000000
  memory@67000000: 0000000067000000 0000000001000000
  memory@68000000: 0000000068000000 0000000001000000
  memory@69000000: 0000000069000000 0000000001000000
  memory@6a000000: 000000006a000000 0000000001000000
  memory@6b000000: 000000006b000000 0000000001000000
  memory@6c000000: 000000006c000000 0000000001000000
  memory@6d000000: 000000006d000000 0000000001000000
  memory@6e000000: 000000006e000000 0000000001000000
  memory@6f000000: 000000006f000000 0000000001000000
  memory@70000000: 0000000070000000 0000000001000000
  memory@71000000: 0000000071000000 0000000001000000
  memory@72000000: 0000000072000000 0000000001000000
  memory@73000000: 0000000073000000 0000000001000000
  memory@74000000: 0000000074000000 0000000001000000
  memory@75000000: 0000000075000000 0000000001000000
  memory@76000000: 0000000076000000 0000000001000000
  memory@77000000: 0000000077000000 0000000001000000
  memory@78000000: 0000000078000000 0000000001000000
  memory@79000000: 0000000079000000 0000000001000000
  memory@7a000000: 000000007a000000 0000000001000000
  memory@7b000000: 000000007b000000 0000000001000000
  
So the end of physical memory is at 7b000000 + 1000000, which
I can verify by running crash on the live system:
  
  crash> eval 7b000000 + 0x1000000
  hexadecimal: 7c000000  (1984MB)
      decimal: 2080374784  
        octal: 17400000000
       binary: 0000000000000000000000000000000001111100000000000000000000000000
  crash> eval 7c000000 / 64k
  hexadecimal: 7c00  (31KB)
      decimal: 31744  
        octal: 76000
       binary: 0000000000000000000000000000000000000000000000000111110000000000
  crash> p num_physpages
  num_physpages = $9 = 31744
  crash>
  
and which matches what's in the kernel memory zone data:

  crash> kmem -n
  NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA       NODE_ZONES   
    0    31744    c00000007bfdf280  c000000000a53c50  c00000007bfdf280
                                                      c00000007bfe1900
                                                      c00000007bfe3f80
      MEM_MAP       START_PADDR  START_MAPNR
  f000000000000000       0            0     
  
  ZONE  NAME         SIZE       MEM_MAP      START_PADDR  START_MAPNR
    0   DMA         31744  f000000000000000            0            0
    1   Normal          0                 0            0            0
    2   Movable         0                 0            0            0
  
  ...

So everything looks fine.
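
For anyone who wants to repeat that cross-check without a crash session,
the arithmetic can be sketched in a few lines of Python.  This is only an
illustration: the function names are made up for this example, and the
64KB page size is an assumption taken from the "eval 7c000000 / 64k" step
above, so adjust it if your kernel uses a different page size.

```python
# Assumption: 64KB pages, matching the "eval 7c000000 / 64k" step.
PAGE_SIZE = 64 * 1024


def end_of_memory(last_base, last_size):
    """End of physical memory: base of the last chunk plus its size."""
    return last_base + last_size


def expected_physpages(end, page_size=PAGE_SIZE):
    """Page count a kernel should report if memory runs from 0 to 'end'."""
    return end // page_size


# Last chunk from the listing: base 7b000000, size 1000000 (both hex).
end = end_of_memory(0x7b000000, 0x1000000)
print("end of memory: 0x%x (%dMB)" % (end, end >> 20))
print("expected pages: %d" % expected_physpages(end))
```

On the values above this gives 0x7c000000 (1984MB) and 31744 pages, the
same number that num_physpages and "kmem -n" report.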

But if your system has just a single /proc/device-tree/memory@0
directory whose size doesn't match up with what the live kernel is
using, then that's the kernel bug.
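
A rough way to test for that mismatch is sketched below in Python.  It
makes several assumptions that are not confirmed anywhere in this thread:
that each reg property uses two address cells and two size cells (which
is consistent with the 16-hex-digit base/size pairs shown above), that
pages are 64KB, and that you type in num_physpages by hand from a crash
session.  The function names are hypothetical.

```python
import glob
import struct


def decode_reg(blob, addr_cells=2, size_cells=2):
    """Decode a device-tree 'reg' property into (base, size) tuples.

    Cells are 32-bit big-endian words; the cell counts are assumptions.
    """
    ncells = addr_cells + size_cells
    entry = ncells * 4
    if len(blob) % entry:
        raise ValueError("reg length is not a multiple of the entry size")
    ranges = []
    for off in range(0, len(blob), entry):
        words = struct.unpack_from(">%dI" % ncells, blob, off)
        base = 0
        for w in words[:addr_cells]:
            base = (base << 32) | w
        size = 0
        for w in words[addr_cells:]:
            size = (size << 32) | w
        ranges.append((base, size))
    return ranges


def read_memory_ranges(root="/proc/device-tree"):
    """Collect (base, size) tuples from every memory@*/reg file."""
    ranges = []
    for path in sorted(glob.glob(root + "/memory@*/reg")):
        with open(path, "rb") as f:
            ranges.extend(decode_reg(f.read()))
    return ranges


def size_mismatch(ranges, num_physpages, page_size=64 * 1024):
    """Bytes advertised by the device tree minus bytes the kernel uses."""
    advertised = sum(size for _base, size in ranges)
    return advertised - num_physpages * page_size
```

Usage would be something like size_mismatch(read_memory_ranges(), 31744)
on the live system.  A result of zero means the two views agree; anything
else suggests the device tree and the kernel disagree about how much
memory there is.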

> In any case, unfortunately, there's nothing that can be done from the
> crash utility's perspective.

BTW, you can get minimal data from your truncated vmcore using
the --minimal switch that IBM contributed a while back:

  # crash --minimal vmcore vmlinux

It at least offers the log, dis, rd, sym and eval commands, which may
or may not help.  It's actually come in quite handy a few times.

Anyway, if you guys come up with a kernel fix, can you post it here
as well?

Thanks,
  Dave
