[Crash-utility] "crash: cannot allocate any more memory!" for 2.6.27.4

Dave Anderson anderson at redhat.com
Thu Dec 4 14:44:11 UTC 2008


----- "Dheeraj Sangamkar" <dheerajrs at gmail.com> wrote:

> On the same machine, I am unable to live-debug the running kernel.
> Here is the error I get. Please let me know if there is something I
> can do to get crash working.
> 
> /dheeraj/crash/crash-4.0-7.4 # crash -d 255
> /boot/System.map-2.6.27.4-2-default
> /root/dheeraj/linux-2.6.27.4-2.1/vmlinux
> 
> crash 4.0-7.4
> Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public
> License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for
> details.
> 
> crash: diskdump / compressed kdump: dump does not have panic dump
> header
> get_live_memory_source: /dev/mem
> crash: pv_init_ops exists: ARCH_PVOPS
> _text: ffffffff80200000 Kernel code: 200000 -> phys_base: 0
> 
> gdb /root/dheeraj/linux-2.6.27.4-2.1/vmlinux
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> GETBUF(248 -> 0)
> GETBUF(1500 -> 1)
> 
> FREEBUF(1)
> FREEBUF(0)
> <readmem: ffffffff804be660, KVADDR, "kernel_config_data", 32768,
> (ROE), 1ae5040>
> /dev/mem: Operation not permitted
> crash: read(/dev/mem, 4be660, 2464): 4294967295 (ffffffff)
> crash: read error: kernel virtual address: ffffffff804be660 type:
> "kernel_config_data"
> WARNING: cannot read kernel_config_data
> GETBUF(248 -> 0)
> FREEBUF(0)
> GETBUF(16 -> 0)
> <readmem: ffffffff80a14800, KVADDR, "cpu_possible_map", 16, (ROE),
> a49160>
> /dev/mem: Operation not permitted
> crash: read(/dev/mem, a14800, 16): 4294967295 (ffffffff)
> crash: read error: kernel virtual address: ffffffff80a14800 type:
> "cpu_possible_map"
> WARNING: cannot read cpu_possible_map
> <readmem: ffffffff80872710, KVADDR, "cpu_present_map", 16, (ROE),
> a49160>
> /dev/mem: Operation not permitted
> crash: read(/dev/mem, 872710, 16): 4294967295 (ffffffff)
> crash: read error: kernel virtual address: ffffffff80872710 type:
> "cpu_present_map"
> WARNING: cannot read cpu_present_map
> <readmem: ffffffff80846c90, KVADDR, "cpu_online_map", 16, (ROE),
> a49160>
> /dev/mem: Operation not permitted
> crash: read(/dev/mem, 846c90, 16): 4294967295 (ffffffff)
> crash: read error: kernel virtual address: ffffffff80846c90 type:
> "cpu_online_map"
> WARNING: cannot read cpu_online_map
> FREEBUF(0)
> <readmem: ffffffff80a74400, KVADDR, "xtime", 16, (FOE), a0d2f0>
> /dev/mem: Operation not permitted
> crash: read(/dev/mem, a74400, 16): 4294967295 (ffffffff)
> crash: read error: kernel virtual address: ffffffff80a74400 type:
> "xtime"

Live system analysis is impossible if /dev/mem won't allow you
to read physical memory.  Each read attempt above results in an
"Operation not permitted" (EPERM) error.  

The reads fail because of a pain-in-the-ass CONFIG_STRICT_DEVMEM configuration
option that was recently added to upstream kernels.  When the crash
utility makes a read() from /dev/mem, it ends up in the kernel's
drivers/char/mem.c driver, and proceeds like so:

  read_mem() 
    range_is_allowed()
      devmem_is_allowed()

Here is the call in read_mem() that returns the EPERM:

                if (!range_is_allowed(p >> PAGE_SHIFT, count))
                        return -EPERM;

Here are the two versions of range_is_allowed() in drivers/char/mem.c,
which depend upon CONFIG_STRICT_DEVMEM:

  #ifdef CONFIG_STRICT_DEVMEM
  static inline int range_is_allowed(unsigned long pfn, unsigned long size)
  {
          u64 from = ((u64)pfn) << PAGE_SHIFT;
          u64 to = from + size;
          u64 cursor = from;

          while (cursor < to) {
                  if (!devmem_is_allowed(pfn)) {
                          printk(KERN_INFO
                  "Program %s tried to access /dev/mem between %Lx->%Lx.\n",
                                  current->comm, from, to);
                          return 0;
                  }
                  cursor += PAGE_SIZE;
                  pfn++;
          }
          return 1;
  }
  #else
  static inline int range_is_allowed(unsigned long pfn, unsigned long size)
  {
          return 1;
  }
  #endif

And the call to devmem_is_allowed() above ends up here, where only the
first 256 pages (1MB) of physical memory, plus any pages that are not
RAM, are accessible:

  int devmem_is_allowed(unsigned long pagenr)
  {
          if (pagenr <= 256)
                  return 1;
          if (!page_is_ram(pagenr))
                  return 1;
          return 0;
  }

So that makes /dev/mem useless for the crash utility...

Bernhard Walle made a valiant attempt to make this run-time configurable
rather than a compile-time config option:

  http://lkml.org/lkml/2008/11/16/117

But the LKML thread went off into the weeds arguing about whether the
CONFIG_STRICT_DEVMEM imposition itself should be ripped out, rather than
putting lipstick on a pig.  And so it got nowhere...

This same 256-page restriction has been in place in RHEL4 and RHEL5 kernels
as well, because the same guy who managed to push it upstream formerly worked
at Red Hat, and convinced the powers-that-be here that /dev/mem was a security
hole.  As a result, I had to write a whole new physical memory-access device
driver (/dev/crash) that is a read-only "misc" driver that the crash utility
automatically loads when run on a live system.  But it's only available in
the Red Hat RHEL4 and RHEL5 distributions, although it can be ported easily
enough.

So you've got 3 options:

 1. Rebuild your kernel with CONFIG_STRICT_DEVMEM turned off.
 
 2. Port the RHEL /dev/crash driver to your kernel.  I did suggest that
    in another crash-utility thread here:

      https://www.redhat.com/archives/crash-utility/2008-October/msg00057.html 

 3. Write a kprobe module that forces devmem_is_allowed() to return 1 always:

      http://www.redhat.com/archives/crash-utility/2008-March/msg00036.html

If you've got the power, obviously #1 is the best option.  But if you cannot
change kernels, kprobe option #3 is probably easier than porting the RHEL
/dev/crash driver.  Check the Documentation/kprobes.txt file and the
samples/kprobes directory in your kernel build tree, and then use the RHEL
kprobe example I put in the thread referenced in #3 above.  If that won't work
(and I don't know why it wouldn't), then you can always port the RHEL
/dev/crash driver that's attached to the thread referenced in #2 above.

Dave



