[Crash-utility] ARM: crash registers might be read from the wrong physical address

Fri Jul 20 08:03:53 UTC 2012

I forgot to mention that the __per_cpu_start symbol is located at an address similar to the one in your example, so the handling of the basic per-cpu area has not changed.

Jan

-----Original Message-----
From: Karlsson, Jan 
Sent: Friday, 20 July 2012 09:49
To: 'Discussion list for crash utility usage, maintenance and development'
Cc: Fänge, Thomas
Subject: RE: [Crash-utility] ARM: crash registers might be read from the wrong physical address

What I see is the following:
crash> p crash_notes
crash_notes = $29 = (note_buf_t *) 0xf662e000
crash> p/x __per_cpu_offset
$31 = {0x39b2000, 0x39ba000, 0x39c2000, 0x39ca000}

0xf662e000 + 0x39b2000 = 0xf9fe0000, which is the address seen in the readmem output.
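
The same arithmetic can be written out for every CPU. The following is only a minimal sketch: the helper name percpu_note_addr and the NR_CPUS_SEEN constant are made up for illustration, the offsets are copied from the __per_cpu_offset output above, and 32-bit addresses are assumed because this is an ARM dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets copied from "p/x __per_cpu_offset" in the session above. */
#define NR_CPUS_SEEN 4   /* hypothetical name: number of entries printed */
static const uint32_t per_cpu_offset[NR_CPUS_SEEN] = {
    0x39b2000u, 0x39ba000u, 0x39c2000u, 0x39ca000u
};

/*
 * Hypothetical helper (not a crash or kernel function): the per-CPU
 * pointer value plus that CPU's offset gives the virtual address of
 * that CPU's copy.
 */
static uint32_t percpu_note_addr(uint32_t crash_notes, size_t cpu)
{
    assert(cpu < NR_CPUS_SEEN);
    return crash_notes + per_cpu_offset[cpu];
}
```

With crash_notes = 0xf662e000, CPU 0 gives 0xf9fe0000 (the readmem address), and CPUs 1 to 3 give 0xf9fe8000, 0xf9ff0000 and 0xf9ff8000.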

These are the interesting lines I see in the kernel source code (both newer and older kernels):

note_buf_t *crash_notes;

  crash_notes = alloc_percpu(note_buf_t);

I do not really understand this in detail, but it seems that alloc_percpu uses "chunks" and may allocate new chunks if there is not enough memory in the currently available chunks. So what might have happened is that in the older cases there is space in the first(??) chunk, while in the newer case a new chunk has to be allocated.
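
The guess above can be illustrated with a toy model. This is emphatically not the kernel's per-cpu allocator: every name here (toy_pool, toy_chunk, toy_alloc) and every address is invented, and alignment and freeing are ignored. It only shows how the same request can land in an existing chunk on one system and in a freshly created chunk, at a very different address, on another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CHUNKS 4

/* Toy chunk: a made-up address range with a simple bump pointer. */
struct toy_chunk {
    uint32_t base;   /* invented base address */
    uint32_t size;   /* total bytes in the chunk */
    uint32_t used;   /* bytes already handed out */
};

struct toy_pool {
    struct toy_chunk chunk[TOY_MAX_CHUNKS];
    size_t nr_chunks;      /* chunks that currently exist */
    uint32_t next_base;    /* invented address for the next new chunk */
    uint32_t chunk_size;   /* size given to each new chunk */
};

/* Returns an address, or 0 if the toy pool cannot satisfy the request. */
static uint32_t toy_alloc(struct toy_pool *p, uint32_t len)
{
    /* First try the chunks that already exist. */
    for (size_t i = 0; i < p->nr_chunks; i++) {
        struct toy_chunk *c = &p->chunk[i];
        if (c->size - c->used >= len) {
            uint32_t addr = c->base + c->used;
            c->used += len;
            return addr;
        }
    }

    /* No room anywhere: create a new chunk at a different base. */
    if (p->nr_chunks == TOY_MAX_CHUNKS || len > p->chunk_size)
        return 0;
    struct toy_chunk *c = &p->chunk[p->nr_chunks++];
    c->base = p->next_base;
    c->size = p->chunk_size;
    c->used = len;
    p->next_base += p->chunk_size;
    return c->base;
}
```

In this model, whether an allocation ends up near the first chunk or far away depends only on how much was allocated before it, which would fit the observation that older and newer kernels behave differently.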

Jan

Jan Karlsson
Senior Software Engineer
MIB

Sony Mobile Communications
Tel: +46703062174
sonymobile.com

-----Original Message-----
From: crash-utility-bounces at redhat.com [mailto:crash-utility-bounces at redhat.com] On Behalf Of Dave Anderson
Sent: Thursday, 19 July 2012 14:42
To: Discussion list for crash utility usage, maintenance and development
Cc: Fänge, Thomas
Subject: Re: [Crash-utility] ARM: crash registers might be read from the wrong physical address

----- Original Message -----
> These are the same lines in my case.
> 
> <readmem: c0d2af6c, KVADDR, "crash_notes", 4, (ROE), 85ba84c>
> <read_kdump: addr: c0d2af6c paddr: 80f2af6c cnt: 4>
> <readmem: f9fe0000, KVADDR, "note_buf_t", 560, (ROE), 85bac40>  <--- !!
> <readmem: c0004000, KVADDR, "pgd page", 16384, (FOE), 914e8d0>
> 
> I have never seen this problem before, so the behavior you see is 
> exactly what I have seen before. However, with a fairly new kernel I 
> did not get the correct crash_notes. The investigation led to the 
> patch for the problem described in my previous mail.
> 
> I have not investigated whether there is any patch in newer kernels 
> that changes this behavior and, if so, where it comes from (it could 
> be a patch by us). However, as the algorithm for reading crash_notes 
> is wrong, since it depends on a variable that is not yet initialized, 
> I think it should be corrected anyhow. I have tested my patch with 
> both newer and older kernels and it works as intended.

OK, good.  And so apparently the per-cpu region has been moved up into vmalloc space.  I'll queue the change into crash-6.0.9.

For curiosity's sake, can you show me the per-cpu symbol list?  In my sample ARM kernel, it's located in the unity-mapped region just below the .text section, and can be seen like this:

 crash> sym -l
 ... [ cut ] ...
 c004e000 (d) .data..percpu
 c004e000 (D) __per_cpu_load
 c004e000 (D) __per_cpu_start
 c004e000 (D) cpu_data
 c004e040 (d) percpu_clockevent
 c004e098 (D) current_kprobe
 c004e09c (D) kprobe_ctlblk
 c004e130 (d) bp_on_reg
 c004e170 (d) wp_on_reg
 c004e1b0 (D) mmu_gathers
 c004e1c0 (D) current_mm
 c004e1e0 (D) kstat
 ... [ cut ] ...
 c004f0b4 (d) xmit_recursion
 c004f0b8 (d) rt_cache_stat
 c004f100 (d) runqueues
 c004f620 (d) gcwq_nr_running
 c004f640 (d) cfd_data
 c004f660 (d) call_single_queue
 c004f6a0 (d) csd_data
 c004f6c0 (D) softnet_data
 c004f7a0 (D) __per_cpu_end
 c0050000 (t) .text
 ... 

Your newer kernel must move it up to ~fxxxxxxx?

Thanks,
  Dave
