[Crash-utility] [PATCH V2] take Hardware Error & kernel pointer bug as separate panicmsg

Derek Cheng drc at yahoo-inc.com
Wed Feb 4 17:33:47 UTC 2015


On Tuesday, February 3, 2015 12:53 PM, Dave Anderson <anderson at redhat.com> wrote:
> I'll move the hardware error check to the bottom, and only use it if there
> are no other relevant strings found, and then re-test that configuration.    


How about matching the string "Machine Check Exception:"? Or could I use pattern matching? This is a memory failure at bank 12, which usually indicates that we need to replace the memory in bank 12.

Internally, we have a tool that depends on crash to analyze kernel crashes and find their causes and solutions. I hope the
"[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3" line
can be matched, instead of the less useful
"Kernel panic - not syncing: Fatal machine check on current CPU" that is currently selected:

<0>[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3
<0>[Hardware Error]: RIP !INEXACT! 10:<ffffffff810ace8a> {tick_check_idle+0xca/0xe0}
<0>[Hardware Error]: TSC 52e41ed579869d ADDR 53a92b000 MISC 908424000803e8c 
<0>[Hardware Error]: PROCESSOR 0:306e4 TIME 1423045186 SOCKET 0 APIC 5
<0>[Hardware Error]: Some CPUs didn't answer in synchronization
<0>[Hardware Error]: Machine check: Invalid
<0>Kernel panic - not syncing: Fatal machine check on current CPU
<4>Pid: 0, comm: swapper Tainted: G   M       ---------------    2.6.32-431.23.3.el6.YAHOO.20140804.x86_64 #1
<4>Call Trace:
<4> <#MC>  [<ffffffff8152866c>] ? panic+0xa7/0x16f
<4> [<ffffffff8102880f>] ? mce_panic+0x20f/0x230
<4> [<ffffffff81029c87>] ? do_machine_check+0x7a7/0xaf0
<4> [<ffffffff810ace8a>] ? tick_check_idle+0xca/0xe0
<4> [<ffffffff8152bc9c>] ? machine_check+0x1c/0x30
<4> [<ffffffff810ace8a>] ? tick_check_idle+0xca/0xe0
<4> <<EOE>>  <IRQ>  [<ffffffff8107a51c>] ? irq_enter+0x6c/0x80
<4> [<ffffffff8102b1d3>] ? smp_threshold_interrupt+0x13/0x40
<4> [<ffffffff8100bd13>] ? threshold_interrupt+0x13/0x20
<4> <EOI>  [<ffffffff812e14ee>] ? intel_idle+0xde/0x170
<4> [<ffffffff812e14d1>] ? intel_idle+0xc1/0x170
<4> [<ffffffff814274a7>] ? cpuidle_idle_call+0xa7/0x140
<4> [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
<4> [<ffffffff8152219c>] ? start_secondary+0x2ac/0x2ef
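
The selection order requested above can be sketched as follows. This is only an illustration in Python, not code from crash or from the patch: the pattern list, its ordering, and the name pick_panic_line are assumptions made for the example. It prefers the "Machine Check Exception:" line and falls back to the generic "Kernel panic - not syncing:" line only when no such line is present.

```python
import re

# Hypothetical priority list (an assumption for this sketch):
# earlier patterns win over later ones.
PATTERNS = [
    re.compile(r"\[Hardware Error\]: CPU \d+: Machine Check Exception:"),
    re.compile(r"Kernel panic - not syncing:"),
]


def pick_panic_line(log_lines):
    """Return the first line matching the highest-priority pattern, or None."""
    for pattern in PATTERNS:
        for line in log_lines:
            if pattern.search(line):
                # Drop the "<N>" log-level prefix, if present.
                return re.sub(r"^<\d>", "", line).strip()
    return None


sample = [
    "<0>[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3",
    "<0>[Hardware Error]: Machine check: Invalid",
    "<0>Kernel panic - not syncing: Fatal machine check on current CPU",
]
print(pick_panic_line(sample))
```

With the log excerpt above, this picks the CPU 14 / Bank 12 line. A log that contains no machine check exception line still yields the "Kernel panic" line.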




Thanks,
- Derek



