[Crash-utility] The problems when running SuSE 12 on VirtualBox

David Mair dmair at suse.com
Thu Nov 19 15:16:15 UTC 2015


On 11/19/2015 12:36 AM, Nan Xiao wrote:
> Hi David & Dave,
> 
> Executing "crash" on a physical machine (not VirtualBox):
> 
>  # crash
> 
> crash 7.1.3
> Copyright (C) 2002-2014  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011  NEC Corporation
> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for details.
> 
> crash: /boot/xen-4.5.gz: original filename unknown
>        Use "-f /boot/xen-4.5.gz" on command line to prevent this message.
> 
> WARNING: machine type mismatch:
>          crash utility: X86_64
>          /var/tmp/xen-4.5.gz_VIOmfp: X86
> 
> crash: /boot/symtypes-3.12.49-6-default.gz: original filename unknown
>        Use "-f /boot/symtypes-3.12.49-6-default.gz" on command line to
> prevent this message.
> 
> crash: /boot/symvers-3.12.49-6-default.gz: original filename unknown
>        Use "-f /boot/symvers-3.12.49-6-default.gz" on command line to
> prevent this message.
> 
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> 
> crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
>        renders /dev/mem unusable as a live memory source.
> crash: trying /proc/kcore as an alternative to /dev/mem
> 
>       KERNEL: /boot/vmlinux-3.12.49-6-xen.gz
>    DEBUGINFO: /usr/lib/debug/boot/vmlinux-3.12.49-6-xen.debug
>     DUMPFILE: /proc/kcore
>         CPUS: 128
>         DATE: Thu Nov 19 09:37:49 2015
>       UPTIME: 00:34:57
> LOAD AVERAGE: 1.77, 1.21, 1.02
>        TASKS: 1328
>     NODENAME: dl980-5
>      RELEASE: 3.12.49-6-xen
>      VERSION: #1 SMP Mon Oct 26 16:05:37 UTC 2015 (11560c3)
>      MACHINE: x86_64  (1995 Mhz)
>       MEMORY: 125.9 GB
>          PID: 39777
>      COMMAND: "crash"
>         TASK: ffff881eacaaa100  [THREAD_INFO: ffff881e8ff46000]
>          CPU: 3
>        STATE: TASK_RUNNING (ACTIVE)
> 
> crash>
> 
> I can see that crash uses "/proc/kcore" instead of "/dev/mem", so I
> tried the same thing on VirtualBox:
> 
> # crash /boot/vmlinux-3.12.49-6-xen.gz /proc/kcore
> 
> crash 7.1.3
> Copyright (C) 2002-2014  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011  NEC Corporation
> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for details.
> 
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> 
>       KERNEL: /boot/vmlinux-3.12.49-6-xen.gz
>    DEBUGINFO: /usr/lib/debug/boot/vmlinux-3.12.49-6-xen.debug
>     DUMPFILE: /proc/kcore
>         CPUS: 1
>         DATE: Thu Nov 19 01:53:01 2015
>       UPTIME: 05:42:13
> LOAD AVERAGE: 0.19, 0.06, 0.06
>        TASKS: 239
>     NODENAME: linux-6ev3
>      RELEASE: 3.12.49-6-xen
>      VERSION: #1 SMP Mon Oct 26 16:05:37 UTC 2015 (11560c3)
>      MACHINE: x86_64  (2594 Mhz)
>       MEMORY: 855.2 MB
>          PID: 3106
>      COMMAND: "crash"
>         TASK: ffff88002ec5c040  [THREAD_INFO: ffff88000b3e2000]
>          CPU: 0
>        STATE: TASK_RUNNING (ACTIVE)
> 
> crash>
> 
> It seems OK now.
> 
> So my questions are:
> 
> (1) Is it OK to use "/proc/kcore" instead of "/dev/mem" as a workaround?
> Is there any side-effect?

As I read it, /proc/kcore is the kernel's virtual address space and
/dev/mem is the system's physical address space. It probably isn't wise
to debug the latter on any dom0 (whether nested in another
virtualization or not) except in very acute cases. That may explain the
problem you were having in the first place: either VirtualBox affects
whether /dev/mem is a real view of physical memory or, if it doesn't,
VirtualBox may still affect those cases where, as I understand it, the
kernel has constant physical addresses for some things.
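
To make that difference concrete, here is a minimal sketch in C. It is
not crash's code. It only assumes that /proc/kcore is laid out as an ELF
core file whose load segments map kernel virtual address ranges to file
offsets, while a /dev/mem offset is simply the physical address. The
struct kcore_segment type and both function names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF load segment header. */
struct kcore_segment {
    uint64_t vaddr;   /* first kernel virtual address covered */
    uint64_t offset;  /* file offset where that address is stored */
    uint64_t size;    /* number of bytes covered */
};

/* /dev/mem: the file offset is the physical address itself. */
uint64_t devmem_offset(uint64_t paddr)
{
    return paddr;
}

/*
 * /proc/kcore: find the segment covering vaddr and translate it to a
 * file offset. Returns 0 on success, -1 if no segment covers vaddr.
 */
int kcore_offset(const struct kcore_segment *segs, size_t nsegs,
                 uint64_t vaddr, uint64_t *offset_out)
{
    assert(offset_out != NULL);
    for (size_t i = 0; i < nsegs; i++) {
        if (vaddr >= segs[i].vaddr &&
            vaddr - segs[i].vaddr < segs[i].size) {
            *offset_out = segs[i].offset + (vaddr - segs[i].vaddr);
            return 0;
        }
    }
    return -1;
}
```

So a reader of /proc/kcore never needs a physical address at all, which
is why it sidesteps the question of what "physical" means under a
hypervisor.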

> (2) Executing "crash -d8" on the physical machine makes the crash utility
> dump core. Using gdb to debug it:
> 
> # gdb /usr/bin/crash core-crash-11-0-0-40072-1447945769
> GNU gdb (GDB; SUSE Linux Enterprise 12) 7.9.1
> Copyright (C) 2015 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://bugs.opensuse.org/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/bin/crash...Reading symbols from
> /usr/lib/debug/usr/bin/crash.debug...done.
> done.
> [New LWP 40072]
> Core was generated by `crash -d8'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f5119001fd0 in get_cie_encoding (cie=0x7f5119004cd8) at
> ../../../libgcc/unwind-dw2-fde.c:272
> 272     ../../../libgcc/unwind-dw2-fde.c: No such file or directory.
> (gdb)
> (gdb) bt
> #0  0x00007f5119001fd0 in get_cie_encoding (cie=0x7f5119004cd8) at
> ../../../libgcc/unwind-dw2-fde.c:272
> #1  0x00007f5119002699 in get_fde_encoding (f=0x7f5119006050) at
> ../../../libgcc/unwind-dw2-fde.c:319
> #2  _Unwind_IteratePhdrCallback (info=info@entry=0x7fff463c10e0,
> size=size@entry=64, ptr=ptr@entry=0x7fff463c1160)
>     at ../../../libgcc/unwind-dw2-fde-dip.c:408
> #3  0x00007f51196a3f3c in __GI___dl_iterate_phdr
> (callback=callback@entry=0x7f5119002270 <_Unwind_IteratePhdrCallback>,
>     data=data@entry=0x7fff463c1160) at dl-iteratephdr.c:76
> #4  0x00007f51190035c3 in _Unwind_Find_FDE (pc=0x7f5119001aa7
> <_Unwind_Backtrace+55>, bases=bases@entry=0x7fff463c1498)
>     at ../../../libgcc/unwind-dw2-fde-dip.c:459
> #5  0x00007f5118ffff86 in uw_frame_state_for
> (context=context@entry=0x7fff463c13f0, fs=fs@entry=0x7fff463c1240)
>     at ../../../libgcc/unwind-dw2.c:1241
> #6  0x00007f51190011d0 in uw_init_context_1
> (context=context@entry=0x7fff463c13f0,
> outer_cfa=outer_cfa@entry=0x7fff463c16a0,
>     outer_ra=0x7f511967bf46 <__GI___backtrace+86>) at
> ../../../libgcc/unwind-dw2.c:1562
> #7  0x00007f5119001aa8 in _Unwind_Backtrace (trace=0x7f511967bdd0
> <backtrace_helper>, trace_argument=0x7fff463c16a0)
>     at ../../../libgcc/unwind.inc:283
> #8  0x00007f511967bf46 in __GI___backtrace
> (array=array@entry=0x7fff463c1710, size=size@entry=4) at
> ../sysdeps/x86_64/backtrace.c:109
> #9  0x000000000047add7 in __error (type=type@entry=1,
> fmt=fmt@entry=0x853d38 "read(/dev/mem, %lx, %ld): %ld (%lx)\n") at
> tools.c:52
> #10 0x0000000000490c91 in read_dev_mem (fd=4, bufptr=0x7fff463c1f28,
> cnt=8, addr=0, paddr=1052672) at memory.c:2298
> #11 0x0000000000486398 in readmem (addr=1052672,
> memtype=memtype@entry=4, buffer=buffer@entry=0x7fff463c1f28,
> size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2198
> #12 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #13 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c1fb8, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #14 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #15 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c2048, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #16 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #17 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c20d8, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #18 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #19 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c2168, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #20 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #21 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c21f8, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #22 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> #23 readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c2288, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> #24 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> 
> ......
> 
> The frames below always repeat:
> readmem (addr=1052672, memtype=memtype@entry=4,
> buffer=buffer@entry=0x7fff463c2288, size=size@entry=8,
>     type=type@entry=0x8563e1 "devmem_is_allowed - pfn 257",
> error_handle=error_handle@entry=6) at memory.c:2209
> 0x000000000048704a in devmem_is_restricted () at memory.c:2414
> 
> It seems to be an infinite loop, but I'm not sure.

That reads like a bug in the code that decides whether to switch to
/proc/kcore. It happens in the test of whether /dev/mem is allowed for
debugging, which verifies this statement from the function comment:

 *  On x86 and x86_64, only the first 256 pages of physical memory
 *  are accessible:

/dev/mem is considered restricted if a sizeof(long) read from the start
of physical page 255 (address 1,044,480) succeeds and a sizeof(long)
read from page 257 (address 1,052,672) fails. Note the first argument of
the nested readmem() calls in your backtrace (addr=1052672).

Since the values read by those tests are never checked, only the
success or failure of each read decides whether the memory file is
considered restricted.

However, the actual implementation looks like this simplified view:

readmem(addr, PHYSADDR, buffer, size...)
{
	/* Compute from arguments the memory file position to read from */
	.
	.
	.
	/* Perform the read of the open file descriptor for the memory file,
	   which returns a status code for the read */
	switch (READMEM(fd, buffer, count...))
	{
		.
		.
		.
	case READ_ERROR:
		if (PRINT_ERROR_MESSAGE) {
			if ((pc->flags & DEVMEM)
				&& (kt->flags & PRE_KERNEL_INIT)
				&& devmem_is_restricted()
				&& switch_to_proc_kcore())
			{
				return (readmem(addr, memtype, buffer, size...));
			}
		}
	}
}

devmem_is_restricted() contains two readmem() calls of its own and
there's no protection that I can see against it nesting on the switch()
above.
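
One way to picture a guard against that nesting is the small simulation
below. It is only a sketch under stated assumptions, not a proposed
patch: every name in it (sketch_readmem, sketch_devmem_is_restricted,
raw_read, the probing flag) is hypothetical, the memory source is
simulated, and the "switch" is just a flag that makes later reads
succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

bool probing;            /* true while the restriction check is running */
bool switched_to_kcore;  /* true once the simulated switch has happened */
int readmem_calls;       /* call counter, only here to inspect the sketch */

bool sketch_readmem(uint64_t paddr);

/* Simulated restricted /dev/mem: only the first 256 pages can be read. */
bool raw_read(uint64_t paddr)
{
    return paddr < 256 * SKETCH_PAGE_SIZE;
}

/* Probes pages 255 and 257 through sketch_readmem() itself. */
bool sketch_devmem_is_restricted(void)
{
    assert(!probing); /* the guard below must keep this from nesting */
    probing = true;
    bool low_ok = sketch_readmem(255 * SKETCH_PAGE_SIZE);
    bool high_ok = sketch_readmem(257 * SKETCH_PAGE_SIZE);
    probing = false;
    return low_ok && !high_ok;
}

bool sketch_readmem(uint64_t paddr)
{
    readmem_calls++;
    if (switched_to_kcore || raw_read(paddr))
        return true;
    /* Read failed. Only a call made outside the check may run it. */
    if (!probing && sketch_devmem_is_restricted()) {
        switched_to_kcore = true; /* stands in for switch_to_proc_kcore() */
        return sketch_readmem(paddr);
    }
    return false;
}
```

With the guard in place, a failing read of page 257 costs four calls in
total: the original read, the two probes, and the retry after the
switch. Without the !probing test, the failing probe of page 257 would
re-enter the check, which is the pattern in your backtrace (in this
sketch the assert would catch it).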

FWIW, I need to spend more time on this before I commit to a solution.
You have a support contract anyway, right? Like Dave said, SUSE is the
main source of Xen features in crash and what's needed now isn't really
worth dragging out on the list. If you do have a support contract and
could report the issue there, it would probably help me a bit.

-- 
David.



