LKCD patch (was: Re: [Crash-utility] Increase of NR_CPUS on IA64)

Dave Anderson anderson at redhat.com
Mon Oct 29 19:34:45 UTC 2007


 > Dave Anderson <anderson redhat com> [2007-10-22 15:32]:
 >> Troy Heber wrote:
 >>> On 10/19/07 12:23, Dave Anderson wrote:
 >>>> So my biggest worry would be if this somehow breaks
 >>>> backwards-compatibility, but I'm presuming that you took
 >>>> that into account.  But anyway, I leave this all up
 >>>> to Troy.
 >>> I just did a quick sanity check on a couple of old IA64 LKCD dumps and
 >>> everything seems to work, so I'm happy.
 >>> Troy
 >
 > Troy, thanks for checking this!
 >
 >> Bernhard, can you post a cleaned-up patch for queueing?
 >
 > Here it is (attached). I didn't see any warnings in the crash code
 > with 'make warn' now. I have used your own definition of offsetof()
 > but moved it into the header file.

My biggest worry came true, so I'm going to have to NAK
this patch in its current state.

We have a major customer who uses an older version
of LKCD (the dh_version in the header shows version 2),
so I would not have expected your patch to affect them
at all.  Theirs is the *only* LKCD dumpfile that I test
with each new crash release.  They run both x86 and x86_64.
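
For illustration only, here is a minimal sketch of the general kind of
backwards-compatibility hazard I was worried about: when a per-CPU array
inside a header structure grows, every field after it moves. The struct
names, field names and CPU limits below are hypothetical; they are not
taken from the LKCD headers or from the patch, and this is not a
diagnosis of the failure shown below.

```c
#include <stddef.h>

/* Hypothetical CPU limits before and after a build-time increase. */
#define OLD_NR_CPUS 32
#define NEW_NR_CPUS 64

/* Hypothetical header layout compiled with the smaller limit. */
struct hdr_old {
    unsigned int version;
    unsigned long long per_cpu_ptr[OLD_NR_CPUS];
    unsigned int trailer;   /* any field placed after the array */
};

/* Same fields, compiled with the larger limit. */
struct hdr_new {
    unsigned int version;
    unsigned long long per_cpu_ptr[NEW_NR_CPUS];
    unsigned int trailer;   /* lands at a different offset */
};

/* Offset of the trailing field in the old layout. */
size_t old_trailer_offset(void)
{
    return offsetof(struct hdr_old, trailer);
}

/* Offset of the trailing field in the new layout. */
size_t new_trailer_offset(void)
{
    return offsetof(struct hdr_new, trailer);
}
```

A reader built only against the new layout would look for the trailing
field at the wrong place in a dump written with the old one, unless it
selects the layout based on something like the version field in the
header.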

With crash 4.0-4.7, the backtrace of the x86 panic task shows this:

crash> bt
PID: 12727  TASK: c086c000  CPU: 0   COMMAND: "httpd"
  #0 [c086da80] dump_execute at f5728f42
  #1 [c086da84] do_dump at f572928d
  #2 [c086db2c] die at c010798a
  #3 [c086db44] do_invalid_op at c0107c5a
  #4 [c086dc00] error_code (via invalid_op) at c010750e
     EAX: 0000001d  EBX: c0293cd6  ECX: c0330148  EDX: 0011062b  EBP: c086dc4c
     DS:  0018      ESI: c086dc9c  ES:  0018      EDI: c086c000
     CS:  0010      EIP: c011db63  ERR: ffffffff  EFLAGS: 00010002
  #5 [c086dc3c] panic at c011db63
  #6 [c086dc50] XXXXXXX_nmi_check at c010811b  (company name removed...)
  #7 [c086dc64] do_nmi at c0108254
  #8 [c086dc90] nmi at c0107595
     EAX: 000003dc  EBX: 00000000  ECX: 00000064  EDX: c086dcec  EBP: c086dd10
     DS:  0018      ESI: 000000f0  ES:  0018      EDI: 00000001
     CS:  0010      EIP: c0261440  ERR: 000003dc  EFLAGS: 00000286
  #9 [c086dccc] stext_lock (via prune_icache) at c0261440
 #10 [c086dd14] shrink_icache_memory at c015f7dd
 #11 [c086dd20] do_try_to_free_pages at c013f402
 #12 [c086dd4c] try_to_free_pages at c013f8d2
 #13 [c086dd64] _wrapped_alloc_pages at c01406bd
 #14 [c086dd88] __alloc_pages at c014079d
 #15 [c086dda8] __get_free_pages at c014083e
 #16 [c086ddb0] kmem_cache_grow at c013a77b
 #17 [c086dde8] kmalloc at c013ad8b
 #18 [c086de20] skbmem_grow_bucket at f638cdd5
 #19 [c086de3c] skbmemalloc at f638cfa0
 #20 [c086de58] alloc_skb at c01f5770
 #21 [c086de74] sock_alloc_send_skb at c01f4c15
 #22 [c086de90] unix_stream_sendmsg at c02395c3
 #23 [c086dee0] sock_sendmsg at c01f23c6
 #24 [c086df34] sock_write at c01f25d0
 #25 [c086df7c] sys_write at c0148d06
 #26 [c086dfc0] system_call at c010740c
     EAX: 00000004  EBX: 0000000a  ECX: be1fd8fc  EDX: 00000004
     DS:  002b      ESI: 00000004  ES:  002b      EDI: be1fd8fc
     SS:  002b      ESP: be1fd8a4  EBP: be1fd8d4
     CS:  0023      EIP: 4024f214  ERR: 00000004  EFLAGS: 00000296
crash>

With your patch applied, it shows this:

crash> bt
PID: 12727  TASK: c086c000  CPU: 0   COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
crash>

and in fact, "bt -a" shows the same thing for all
active tasks:

crash> bt -a
PID: 12727  TASK: c086c000  CPU: 0   COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace

PID: 0      TASK: cdccc000  CPU: 1   COMMAND: "swapper"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace

PID: 9959   TASK: ce01a000  CPU: 2   COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace

PID: 0      TASK: cdcde000  CPU: 3   COMMAND: "swapper"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace

PID: 16444  TASK: dc4d8000  CPU: 1   COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace

PID: 5874   TASK: d3920000  CPU: 0   COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
crash>

The backtraces of the non-active tasks are OK.

Any ideas on what's wrong, and how to address this?

Dave





