[Crash-utility] bt: cannot determine starting stack pointer

Dave Anderson anderson at redhat.com
Tue Feb 14 19:07:23 UTC 2012



----- Original Message -----
> Hi,
> 
> I need the stack traces of the tasks that are on-proc as well as the
> tasks that are not.  "bt" fails for the on-proc tasks, even though there
> is a backup mechanism for finding the stack:  the "stack" field of the
> task structure.  Even if it is a bit out-of-date, it is better than an
> "I dunno" message.  Perhaps augment the stack trace with a "this
> might be slightly out-of-date because the task was running when
> the kernel crashed" message.
> 
> Example:
> 
> crash> foreach bt
> [...]
> PID: 20311  TASK: ffff8803ff654140  CPU: 9   COMMAND: "xtnhc"
> bt: cannot determine starting stack pointer
> [...]
> crash> ps | egrep '^>'
> >     0      0   4  ffff880205f6b0c0  RU   0.0       0      0  [swapper]
> >     0      0   5  ffff880205f77870  RU   0.0       0      0  [swapper]
> >     0      0   7  ffff880205d557f0  RU   0.0       0      0  [swapper]
> >     0      0  10  ffff880205d5c080  RU   0.0       0      0  [swapper]
> >  2982      2  11  ffff8801fd3b07f0  RU   0.0       0      0  [ldlm_cb_00]
> >  2983      2   8  ffff880205548080  RU   0.0       0      0  [ldlm_cb_01]
> > 20250  20245   1  ffff880202deb0c0  RU   0.0   82388   2372  fcntl17
> > 20251  20245   2  ffff88020537b7b0  RU   0.0   82388   2396  fcntl17
> > 20252  20245   3  ffff8801fd3b4770  RU   0.0   82388   2376  fcntl17
> > 20264  20249   0  ffff8801fd444830  RU   0.0       0      0  fcntl17
> > 20290      1   6  ffff8803fe86f7b0  RU   0.0   14044    516  xtnhc
> > 20311  20305   9  ffff8803ff654140  RU   0.0   14044    516  xtnhc
> crash> set ffff8803ff654140
>     PID: 20311
> COMMAND: "xtnhc"
>    TASK: ffff8803ff654140  [THREAD_INFO: ffff8803fd85a000]
>     CPU: 9
>   STATE: TASK_RUNNING (ACTIVE)
> crash> p task->stack
> p: gdb request failed: p task->stack
> crash> task
> PID: 20311  TASK: ffff8803ff654140  CPU: 9   COMMAND: "xtnhc"
> struct task_struct {
>   state = 0,
>   stack = 0xffff8803fd85a000,
> [...]
> crash> bt -S 0xffff8803fd85a000
> PID: 20311  TASK: ffff8803ff654140  CPU: 9   COMMAND: "xtnhc"
>  #0 [ffff8803fd85a000] schedule at ffffffff81297bc5
>  #1 [ffff8803fd85b830] ldlm_resource_get at ffffffffa0269380 [ptlrpc]
>  #2 [ffff8803fd85b900] ldlm_lock_match at ffffffffa0267359 [ptlrpc]
>  #3 [ffff8803fd85ba10] mdc_revalidate_lock at ffffffffa0423a8e [mdc]
>  #4 [ffff8803fd85bac0] mdc_intent_lock at ffffffffa042723f [mdc]
>  #5 [ffff8803fd85bbc0] __ll_inode_revalidate_it at ffffffffa04a79c2 [lustre]
>  #6 [ffff8803fd85bcf0] ll_inode_permission at ffffffffa04a8266 [lustre]
>  #7 [ffff8803fd85bd90] inode_permission at ffffffff810f0a09 
>  #8 [ffff8803fd85bda0] may_open at ffffffff810f14d7
>  #9 [ffff8803fd85bdd0] do_filp_open at ffffffff810f5294
> #10 [ffff8803fd85bf20] do_sys_open at ffffffff810e5850
> #11 [ffff8803fd85bf70] sys_open at ffffffff810e596b
> #12 [ffff8803fd85bf80] system_call_fastpath at ffffffff81002eab
>     RIP: 00007ffff78f2f80  RSP: 00007fffffffd818  RFLAGS: 00010202
>     RAX: 0000000000000002  RBX: ffffffff81002eab  RCX: 00000000006130f0
>     RDX: 00000000000001b6  RSI: 0000000000000000  RDI: 000000000060f960
>     RBP: 0000000000000008   R8: 0000000000000008   R9: 0000000000000001
>     R10: 000000000040a261  R11: 0000000000000246  R12: ffffffff810e596b
>     R13: ffff8803fd85bf78  R14: 0000000000000000  R15: 0000000000000000
>     ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
> crash>

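You could also try "bt -t" or "bt -T".  Both are meant for cases where
the normal back trace fails: "bt -t" displays all text symbols found from
the last known stack location to the top of the stack, and "bt -T" does
the same starting from just above the task_struct or thread_info.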
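
To run one of those options against every active task without typing
each address, here is a minimal sketch in Python.  It is not part of
crash.  It assumes the "ps" column layout shown in the quoted output
(PID, PPID, CPU, TASK, ...) with a leading ">" marking active tasks, and
the helper names active_tasks() and bt_commands() are hypothetical:

```python
import re

# Assumed layout of an active-task line in crash "ps" output:
# >  PID  PPID  CPU  TASK  ST  %MEM  VSZ  RSS  COMM
ACTIVE_PS_LINE = re.compile(r"^>\s*(\d+)\s+(\d+)\s+(\d+)\s+([0-9a-f]+)\s+")


def active_tasks(ps_output):
    """Return (pid, cpu, task_address) for each line that starts with '>'."""
    tasks = []
    for line in ps_output.splitlines():
        m = ACTIVE_PS_LINE.match(line)
        if m:
            pid, _ppid, cpu, task = m.groups()
            tasks.append((int(pid), int(cpu), task))
    return tasks


def bt_commands(ps_output, option="-T"):
    """Build one 'bt <option> <task>' command string per active task."""
    return ["bt %s %s" % (option, task)
            for _pid, _cpu, task in active_tasks(ps_output)]


if __name__ == "__main__":
    # Two lines taken from the "ps" output quoted above.
    sample = (
        ">     0      0   4  ffff880205f6b0c0  RU   0.0       0      0  [swapper]\n"
        "> 20311  20305   9  ffff8803ff654140  RU   0.0   14044    516  xtnhc\n"
    )
    for cmd in bt_commands(sample):
        print(cmd)
```

The printed lines can then be entered at the crash prompt one at a time.
Passing option="-t" instead produces the "bt -t" variant.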

But what kind of dumpfile was this anyway?  I'm wondering why you aren't
getting any stack traces at all for the active tasks.
  
Dave



