[Crash-utility] Can't read stack contents from qemu dump
Dave Anderson
anderson at redhat.com
Wed Apr 4 15:48:24 UTC 2018
----- Original Message -----
>
>
> On 4.04.2018 17:48, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> >> Hello,
> >>
> >> I tried running crash-head (HEAD: 5d172b230cf4) against today's linus'
> >> master on a dump obtained via dump-guest-memory in qemu. And I got the
> >> following when the image is loaded:
> >>
> >> please wait... (determining panic task)
> >> bt: read error: kernel virtual address: fffffe0000007000 type: "stack
> >> contents"
> >>
> >> KERNEL: vmlinux
> >> DUMPFILE: memory-verbatim.img
> >> CPUS: 1
> >> DATE: Wed Apr 4 16:36:47 2018
> >> UPTIME: 00:27:48
> >> LOAD AVERAGE: 31.11, 17.80, 10.43
> >> TASKS: 145
> >> NODENAME: ubuntu-virtual
> >> RELEASE: 4.16.0-rc7-nbor
> >> VERSION: #570 SMP Wed Apr 4 16:03:44 EEST 2018
> >> MACHINE: x86_64 (3392 Mhz)
> >> MEMORY: 4 GB
> >> PANIC: ""
> >> PID: 0
> >> COMMAND: "swapper/0"
> >> TASK: ffffffff82016500 [THREAD_INFO: ffffffff82016500]
> >> CPU: 0
> >> STATE: TASK_RUNNING
> >> WARNING: panic task not found
> >>
> >> crash> bt
> >> PID: 0 TASK: ffffffff82016500 CPU: 0 COMMAND: "swapper/0"
> >> #0 [ffffffff82003dc8] __schedule at ffffffff817ea059
> >> bt: invalid RSP: ffffffff82003dc8 bt->stackbase/stacktop:
> >> ffffffff82000000/ffffffff82002000 cpu: 0
> >>
> >>
> >> So the kernel has been compiled with : gcc (Ubuntu
> >> 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 which has retpoline enabled.
> >>
> >> I have KASLR disabled: # CONFIG_RANDOMIZE_BASE is not set and the kernel
> >> is compiled with CONFIG_FRAME_POINTER=y .
> >>
> >> This scenario used to work around the 4.10 timeline. Am I doing
> >> something wrong, or does crash still need time to catch up with the
> >> latest upstream kernel code?
> >
> > Presumably the latter.
> >
> > If you do a "task -R stack ffffffff82016500", I'm presuming that it
> > shows the stack base address is ffffffff82000000. And then looking at
> > the stackbase/stacktop values, the crash utility is presuming an 8K stack:
> >
> > bt: invalid RSP: ffffffff82003dc8 bt->stackbase/stacktop:
> > ffffffff82000000/ffffffff82002000 cpu: 0
> >
> > But the RSP is ffffffff82003dc8, which puts it beyond the 8K stack size,
> > so I'm presuming that the kernel is actually using 16K stacks. The most
> > recent kernel I have is 4.16.0-0.rc6.git3.1.fc29.x86_64, which uses 16K
> > stacks.
>
> This is correct, indeed the kernel stack size should be 16k. However...
>
> >
> > Here is how the crash utility determines the stack size. The x86_64
> > stacksize starts out with a default size of 2 pages, as set here in
> > x86_64_init(PRE_SYMTAB):
> >
> >         case PRE_SYMTAB:
> >                 ... [ cut ] ...
> >                 machdep->stacksize = machdep->pagesize * 2;
> >                 ...
> >
> > Then later on in task_init(), it gets resized as shown here, where
> > the STACKSIZE() macro is machdep->stacksize:
> >
> >         if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
> >                 error(WARNING, "\nnon-standard stack size: %ld\n",
> >                         len = SIZE(task_union));
> >                 machdep->stacksize = len;
> >         } else if (VALID_SIZE(thread_union) &&
> >             ((len = SIZE(thread_union)) != STACKSIZE()))
> >                 machdep->stacksize = len;
>
> This is not resized at all; instead, VALID_SIZE(thread_union) actually
> fails. I've added the following else to the if statement there:
>
> +       } else {
> +               if (VALID_SIZE(thread_union)) {
> +                       error(WARNING, "WE ARE IN THE ELSE BRANCH: len: %llu thread_union size: %llu STACKSIZE(): %llu\n",
> +                               len, SIZE(thread_union), STACKSIZE());
> +               } else {
> +                       error(WARNING, "thread_union is invalid\n");
> +               }
> +       }
>
> Also doing:
>
> crash> struct thread_union
> struct: invalid data structure reference: thread_union

BTW, that command should fail -- it should be "union thread_union".
But as you've shown below, it's not finding it in the debuginfo.

> So for some reason the thread_union cannot be found by gdb:
>
> help -o | grep thread_union
> thread_union: -1

I can't explain why. It's still declared in "include/linux/sched.h"
in today's linux-git tree:

  union thread_union {
  #ifndef CONFIG_ARCH_TASK_STRUCT_ON_STACK
          struct task_struct task;
  #endif
  #ifndef CONFIG_THREAD_INFO_IN_TASK
          struct thread_info thread_info;
  #endif
          unsigned long stack[THREAD_SIZE/sizeof(long)];
  };

If you run "gdb vmlinux", does it find it? For example:

  (gdb) ptype union thread_union
  Python Exception <type 'exceptions.ImportError'> No module named gdb.types:
  type = union thread_union {
      struct task_struct task;
      unsigned long stack[2048];
  }
  (gdb)
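
For what it's worth, the numbers in this thread can be cross-checked with a
few lines of C. This is only a sketch, not crash source: the macro and
function names are made up for illustration, and it assumes 4K pages and
8-byte longs (as on x86_64). It checks whether the RSP from the bt error
falls inside an 8K stack (the 2-page default) or a 16K stack (the size
implied by stack[2048] in the example ptype output):

```c
#include <stdint.h>

/* Values copied from the "bt: invalid RSP" message in this thread. */
#define STACKBASE 0xffffffff82000000ULL
#define RSP_VALUE 0xffffffff82003dc8ULL

/* Assumptions: 4K pages and 8-byte longs, as on x86_64. */
#define PAGE_BYTES 4096ULL
#define LONG_BYTES 8ULL

/* crash's initial x86_64 default: 2 pages (8K). */
#define DEFAULT_STACKSIZE (PAGE_BYTES * 2)

/* Size implied by "unsigned long stack[2048]" in the ptype output (16K). */
#define UNION_STACKSIZE (2048ULL * LONG_BYTES)

/* Hypothetical helper: byte offset of addr above the stack base. */
uint64_t stack_offset(uint64_t base, uint64_t addr)
{
	return addr - base;
}

/* Hypothetical helper: 1 if addr lies in [base, base + size), else 0. */
int rsp_in_stack(uint64_t base, uint64_t size, uint64_t addr)
{
	return addr >= base && stack_offset(base, addr) < size;
}
```

Calling rsp_in_stack(STACKBASE, DEFAULT_STACKSIZE, RSP_VALUE) and then the
same with UNION_STACKSIZE shows the difference: the RSP is 0x3dc8 bytes above
the stack base, which is past 8K (0x2000) but within 16K (0x4000).
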
Dave