[Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)

Mon Oct 31 23:11:20 UTC 2005

On Mon, 2005-10-31 at 17:24 -0500, Dave Anderson wrote:
> > Is there no simple way to add #if KERNEL_VERSION > 2.6.10
> > in the header file and leave the hardcoded values there?
> 
> 
> THIS_KERNEL_VERSION is based upon crash internal data variables in the
> kernel_table data structure that get initialized in kernel_init(PRE_GDB)
> based upon the contents of the kernel's "system_utsname" data structure
> read from memory or the dumpfile.
> 
> I was mistaken in using the value of "_stext" as the qualifier, though,
> since the __START_KERNEL_map value of 0xffffffff80000000 is still the
> same.  But there must be *some* difference in the symbol list that can
> be used to determine which set of address values to use.  It could even
> be just the *existence* of some new kernel variable introduced as part
> of the change to the new scheme.  Doing an "nm -Bn" on the old and new
> vmlinux files should yield something obvious.
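
[Editor's sketch] The point above is that the choice between the two sets of address values has to be made at run time, either from the version read out of the dump or from the presence of a symbol, rather than with a compile-time #if. A minimal illustration in C follows. Every name in it (PACK_VERSION, parse_release, symbol_exists, "new_scheme_symbol") is hypothetical and chosen for this example; none is taken from the crash sources, and the real distinguishing symbol is not identified in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pack major.minor.patch into one comparable integer. */
#define PACK_VERSION(x, y, z) \
    (((unsigned)(x) << 16) + ((unsigned)(y) << 8) + (unsigned)(z))

/*
 * Parse the leading "x.y.z" of a release string such as "2.6.14-rc5".
 * Returns the packed version, or 0 if the string does not start with x.y.z.
 */
unsigned parse_release(const char *release)
{
    unsigned x, y, z;

    if (sscanf(release, "%u.%u.%u", &x, &y, &z) != 3)
        return 0;
    return PACK_VERSION(x, y, z);
}

/*
 * Hypothetical stand-in for a lookup in the kernel symbol list:
 * returns 1 if name appears in symbols[0..count-1], else 0.
 */
int symbol_exists(const char *const *symbols, size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(symbols[i], name) == 0)
            return 1;
    return 0;
}
```

A caller could then pick the address set with a test such as `parse_release(release) > PACK_VERSION(2, 6, 10)` or `symbol_exists(symbols, count, "new_scheme_symbol")`, where the symbol name is a placeholder for whatever the "nm -Bn" comparison turns up.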

Okay. I will look around :)

> > 
> > bt -t seems to be better.
> > 
> > crash> bt 3144
> > PID: 3144   TASK: ffff81011dd1e100  CPU: 0   COMMAND: "mingetty"
> >  #0 [ffff81011d6b9c68] schedule at ffffffff803b12b3
> >     RIP: 000000377c7b85b2  RSP: 00007fffff87a110  RFLAGS: 00010246
> >     RAX: 0000000000000000  RBX: ffffffff8010dc26  RCX: 00007fffff87a7b0
> >     RDX: 0000000000000001  RSI: 00007fffff87a8c7  RDI: 0000000000000000
> >     RBP: 00007fffff87aca0   R8: 00002aaaaaac9b00   R9: 0000000000000000
> >     R10: 0000000000000001  R11: 0000000000000246  R12: 00007fffff87a900
> >     R13: 0000000000502d20  R14: 0000000000000000  R15: 000000007c92d8c0
> >     ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
> > crash> bt -t 3144
> > PID: 3144   TASK: ffff81011dd1e100  CPU: 0   COMMAND: "mingetty"
> >               START: thread_return (schedule) at ffffffff803b12b3
> >   [ffff81011d6b9d10] do_con_write at ffffffff802689da
> >   [ffff81011d6b9d80] schedule_timeout at ffffffff803b1e4e
> >   [ffff81011d6b9db0] _spin_lock_irqsave at ffffffff803b28ce
> >   [ffff81011d6b9dc0] add_wait_queue at ffffffff8014cf5c
> >   [ffff81011d6b9de0] read_chan at ffffffff8025d1f7
> >   [ffff81011d6b9e48] default_wake_function at ffffffff80130c90
> >   [ffff81011d6b9e78] default_wake_function at ffffffff80130c90
> >   [ffff81011d6b9e90] tty_ldisc_deref at ffffffff802571c4
> >   [ffff81011d6b9ed0] tty_read at ffffffff802575ee
> >   [ffff81011d6b9f10] vfs_read at ffffffff80183a46
> >   [ffff81011d6b9f40] sys_read at ffffffff80183e03
> >   [ffff81011d6b9f80] system_call at ffffffff8010dc26
> >     RIP: 000000377c7b85b2  RSP: 00007fffff87a110  RFLAGS: 00010246
> >     RAX: 0000000000000000  RBX: ffffffff8010dc26  RCX: 00007fffff87a7b0
> >     RDX: 0000000000000001  RSI: 00007fffff87a8c7  RDI: 0000000000000000
> >     RBP: 00007fffff87aca0   R8: 00002aaaaaac9b00   R9: 0000000000000000
> >     R10: 0000000000000001  R11: 0000000000000246  R12: 00007fffff87a900
> >     R13: 0000000000502d20  R14: 0000000000000000  R15: 000000007c92d8c0
> >     ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
> > crash>
> > 
> 
> 
> I still don't understand what happens in x86_64_low_budget_back_trace_cmd()
> that causes the "bt" command to skip from the starting point in schedule()
> to the end, where it dumps the user-mode entry exception frame, unless
> the rsp has been bumped too high by the time it gets to this point:
> 
> 
>         /* 
>          *  Walk the process stack. 
>          */ 
>         for (i = (rsp - bt->stackbase)/sizeof(ulong); 
>              !done && (rsp < bt->stacktop); i++, rsp += sizeof(ulong)) {
> 
> ...and that conceivably may have something to do with the exception
> stack problem.  It's hard to say without being there...
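
[Editor's sketch] The hypothesis above can be illustrated with a simplified version of the quoted loop. This is not the crash implementation: struct stack_view and count_text_addresses are hypothetical names, the "done" flag is omitted, and the sketch assumes rsp is never below stackbase. It counts stack words that fall inside a text address range, and shows that when rsp has already reached stacktop the loop body never runs, so no intermediate frames would be reported.

```c
#include <stddef.h>

/* Hypothetical stand-in for the stack fields the quoted loop uses. */
struct stack_view {
    unsigned long stackbase;      /* address of the first stack word */
    unsigned long stacktop;       /* address just past the last stack word */
    const unsigned long *words;   /* stack contents, words[0] is at stackbase */
};

/*
 * Walk the stack from rsp up to stacktop and count the words whose value
 * lies in [text_start, text_end), i.e. that look like return addresses.
 * Assumes stackbase <= rsp; if rsp >= stacktop nothing is visited.
 */
size_t count_text_addresses(const struct stack_view *bt, unsigned long rsp,
                            unsigned long text_start, unsigned long text_end)
{
    size_t hits = 0;
    size_t i;

    for (i = (rsp - bt->stackbase) / sizeof(unsigned long);
         rsp < bt->stacktop; i++, rsp += sizeof(unsigned long)) {
        unsigned long word = bt->words[i];

        if (word >= text_start && word < text_end)
            hits++;
    }
    return hits;
}
```

Starting the walk at stackbase visits every word; starting it at stacktop returns 0 immediately, which matches the symptom of "bt" jumping straight from schedule() to the user-mode exception frame.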

I added lots of debug code while playing with it and forgot to
clean it up properly. All the problems (stack trace issues,
exception stack issues, etc.) went away after a cleanup.

Thanks,
Badari