<html> <head> <style></style></head> <body class='hmmessage'><div dir='ltr'><br> Hi Dave:<BR> <BR> thank you very much for your detail answer, this really helpful. please see my inline words. thanks.<BR><div id="SkyDrivePlaceholder"></div><div>> Date: Thu, 17 Jan 2013 14:17:36 -0500<br>> From: anderson@redhat.com<br>> To: crash-utility@redhat.com<br>> Subject: Re: [Crash-utility] questions about crash utility<br><br>> The fact that crash gets as far as it does at least means that the<br>> ELF header you've created was deemed acceptable as an ARM vmcore.<br>> However, the error messages re: "cpu_present_mask indicates..." and<br>> "cannot determine base kernel version" indicate that the data<br>> that was read from the vmcore was clearly not the correct data.<br>> <br>> The "cpu_present_mask" value that it read contained too<br>> many bits -- presuming that the 32-bit ARM processor is<br>> still limited to only 4 cpus. (looks like upstream that<br>> CONFIG_NR_CPUS is still 2 in the arch/arm/configs files.)<br>> <br>> But more indicative of the wrong data being read is the second<br>> "cannot determine base kernel version" message, which was generated<br>> after it read the kernel's "init_uts_ns" uts_namespace structure.<br>> After reading it, it sees that the "release" string contains<br>> non-ASCII data, whereas it should contain the kernel version:<br>> <br>> crash> p init_uts_ns<br>> init_uts_ns = $3 = {<br>> kref = {<br>> refcount = {<br>> counter = 2<br>> }<br>> }, <br>> name = {<br>> sysname = "Linux\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", <br>> nodename = "phenom-01.lab.bos.redhat.com\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", <br>> release = 
"2.6.32-313.el6.x86_64\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", <br>> version = "#1 SMP Thu Sep 27 16:25:19 EDT 2012\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", <br>> machine = "x86_64\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", <br>> domainname = "(none)\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"<br>> }<br>> }<br>> crash><br>> <br>> So it appears that you're reading data from the wrong<br>> locations in the dumpfile. You should be able to verify <br>> that by bringing up the crash session with the --minimal<br>> flag like this:<br>> <br>> $ crash --minimal vmlinux vmcore<br>> <br>> That will bypass most of the initialization, including all<br>> readmem() calls of the vmcore. 
Then do this:<br>> <br>> crash> rd linux_banner 20<br>> ffffffff818000a0: 65762078756e694c 2e33206e6f697372 Linux version 3.<br>> ffffffff818000b0: 63662e312d312e35 365f3638782e3731 5.1-1.fc17.x86_6<br>> ffffffff818000c0: 626b636f6d282034 69756240646c6975 4 (mockbuild@bui<br>> ffffffff818000d0: 2e33322d6d76646c 6465662e32786870 ldvm-23.phx2.fed<br>> ffffffff818000e0: 656a6f727061726f 202967726f2e7463 oraproject.org) <br>> ffffffff818000f0: 7265762063636728 372e34206e6f6973 (gcc version 4.7<br>> ffffffff81800100: 303231303220302e 6465522820373035 .0 20120507 (Red<br>> ffffffff81800110: 372e342074614820 47282029352d302e Hat 4.7.0-5) (G<br>> ffffffff81800120: 3123202920294343 75685420504d5320 CC) ) #1 SMP Thu<br>> ffffffff81800130: 3120392067754120 2033343a30353a37 Aug 9 17:50:43 <br>> crash> rd -a linux_banner<br>> ffffffff818000a0: Linux version 3.5.1-1.fc17.x86_64 (mockbuild@buildvm-23.phx2<br>> ffffffff818000dc: .fedoraproject.org) (gcc version 4.7.0 20120507 (Red Hat 4.7<br>> ffffffff81800118: .0-5) (GCC) ) #1 SMP Thu Aug 9 17:50:43 UTC 2012<br>> crash><br>> <br>> I'm guessing that you will not see a string starting with "Linux version"<br>> with your dumpfile as shown above.<br>> <br>> If that's the case, then it's clear that the readmem() function is ultimately<br>> reading from the wrong vmcore file offset. <br>> <br>> Here's what you can try doing. 
Taking the linux_banner example above, <br>> you can check where in the dumpfile it's reading from by setting the debug<br>> flag, before doing a simple read -- like this example on an ARM dumpfile:<br>> <br>> crash> set debug 8<br>> debug: 8<br>> crash> rd linux_banner<br>> <addr: c033ea10 count: 1 flag: 488 (KVADDR)><br>> <readmem: c033ea10, KVADDR, "32-bit KVADDR", 4, (FOE), ff94f048><br>> <read_kdump: addr: c033ea10 paddr: 33ea10 cnt: 4><br>> read_netdump: addr: c033ea10 paddr: 33ea10 cnt: 4 offset: 33f088<br>> c033ea10: 756e694c Linu<br>> crash><br>> <br>> The linux_banner is at virtual address c033ea10 (addr). First it gets translated<br>> into physical address 33ea10 (paddr). Then that paddr is translated into the<br>> vmcore file offset of 33f088. It lseeks to vmcore file offset 33f088 and<br>> reads 4 bytes, which contain "756e694c", or the first 4 bytes of the<br>> "Linux version ..." string.<br>> <br>> Note that if I subtract the physical address from vmcore file offset<br>> I get this:<br>> <br>> crash> eval 33f088 - 33ea10<br>> hexadecimal: 678 <br>> decimal: 1656 <br>> octal: 3170<br>> binary: 00000000000000000000011001111000<br>> crash><br>> <br>> which would put physical address 0 at a vmcore file offset of 0x678, and<br>> therefore implying that that the ELF header comprises the first 0x678 bytes.<br>> And looking at the vmcore, that can be verified:<br>> </div><div> </div><div>Yes, you are right. Here is the result I get:</div><div>crash> set debug 8<br>debug: 8<br>crash> rd linux_banner<br><addr: c065a071 count: 1 flag: 488 (KVADDR)><br><readmem: c065a071, KVADDR, "32-bit KVADDR", 4, (FOE), ffdf297c><br><read_kdump: addr: c065a071 paddr: 85a071 cnt: 4><br>read_netdump: addr: c065a071 paddr: 85a071 cnt: 4 offset: 65a0e5<br>c065a071: 03e59130 0...<br></div><div> The virtual address is 0xc065a071, the physical address is 0x85a071, and the file offset is 0x65a0e5. 
</div><div> My ELF header is 116 bytes long, so 0x65a0e5 - 116 = 0x65a071, which is 0x00200000 below the physical address 0x85a071.</div><div> </div><div><br>> $ readelf -a vmcore<br>> ELF Header:<br>> Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 <br>> Class: ELF32<br>> Data: 2's complement, little endian<br>> Version: 1 (current)<br>> OS/ABI: UNIX - System V<br>> ABI Version: 0<br>> Type: CORE (Core file)<br>> Machine: ARM<br>> Version: 0x1<br>> Entry point address: 0x0<br>> Start of program headers: 52 (bytes into file)<br>> Start of section headers: 0 (bytes into file)<br>> Flags: 0x0<br>> Size of this header: 52 (bytes)<br>> Size of program headers: 32 (bytes)<br>> Number of program headers: 3<br>> Size of section headers: 0 (bytes)<br>> Number of section headers: 0<br>> Section header string table index: 0<br>> <br>> There are no sections in this file.<br>> <br>> There are no sections to group in this file.<br>> <br>> Program Headers:<br>> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align<br>> NOTE 0x000094 0x00000000 0x004e345c 0x005e4 0x005e4 0<br>> LOAD 0x000678 0xc0000000 0x00000000 0x5600000 0x5600000 RWE 0<br>> LOAD 0x5600678 0xc5700000 0x05700000 0x100000 0x100000 RWE 0<br>> ...<br>> <br>> Note that the "Offset" value of the first PT_LOAD segment has a file offset<br>> value of 0x678. <br>> </div><div> </div><div>Here is the result I get:</div><div>Program Headers:<br> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align<br> NOTE 0x000000 0x00000000 0x00000000 0x00000 0x00000 0<br> LOAD 0x000074 0xc0000000 0x00200000 0x2fe00000 0x2fe00000 RWE 0</div><div> </div><div> So the problem is that I did not understand the meaning of the ELF header fields accurately. 
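<div>To make the arithmetic above concrete, here is a minimal Python sketch of how the numbers relate. It assumes that crash computes the file offset as p_offset + (paddr - p_paddr) within a PT_LOAD segment and translates unity-mapped kernel addresses as vaddr - 0xc0000000 + phys_base. Both assumptions are consistent with the two debug outputs above but are not taken from the crash source, and the function names are hypothetical.</div>
<pre>
```python
# Sketch only: the formulas are inferred from the debug output in this
# thread, and the function names are made up for illustration.
PAGE_OFFSET = 0xC0000000   # p_vaddr of the PT_LOAD segment
PHYS_BASE = 0x00200000     # CONFIG_PHYS_OFFSET on this platform
HEADER_SIZE = 116          # length of the ELF header reported above (0x74)


def kvaddr_to_paddr(vaddr, phys_base):
    """Linear translation assumed for unity-mapped kernel addresses."""
    return vaddr - PAGE_OFFSET + phys_base


def paddr_to_file_offset(paddr, p_offset, p_paddr):
    """Assumed file offset of paddr inside a segment starting at p_paddr."""
    return p_offset + (paddr - p_paddr)


paddr = kvaddr_to_paddr(0xC065A071, PHYS_BASE)

# Header as first written: p_offset = 0x74, p_paddr = 0x00200000.
seen = paddr_to_file_offset(paddr, HEADER_SIZE, PHYS_BASE)

# The raw dump starts at physical address 0, so the byte for paddr
# is assumed to sit at HEADER_SIZE + paddr in the file.
actual = HEADER_SIZE + paddr

# Header after the change: p_offset = 0x74 + 0x00200000.
fixed = paddr_to_file_offset(paddr, HEADER_SIZE + PHYS_BASE, PHYS_BASE)

print(hex(paddr), hex(seen), hex(actual), hex(fixed))
```
</pre>
<div>With the first header the sketch reproduces the offset 0x65a0e5 that crash reported, while the data for that physical address would sit at 0x85a0e5 in a dump that starts at physical address 0. Adding 0x00200000 to p_offset makes the two agree.</div>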
If I modify the code as below, everything works for me:</div><div> </div><div>offset += sizeof(struct elf_phdr);<br>phdr->p_offset = offset + 0x00200000; /* skip the dumped memory below PHYS_OFFSET */<br>phdr->p_vaddr = 0xc0000000;<br>phdr->p_paddr = 0x00200000;<br>phdr->p_filesz = phdr->p_memsz = MEMSIZE - 0x00200000;<br></div><div> </div><div> Although my modification makes the crash utility work well, I want to know whether I am doing the right thing.</div><div> 1. On our platform, DDR starts at physical address 0x0.</div><div> 2. When compiling the Linux kernel, our platform sets CONFIG_PHYS_OFFSET=0x00200000 in the .config file.</div><div> 3. When the kernel crashes, the entire DDR content is dumped, from address 0x0 to 768MB, but kernel data actually starts at 0x00200000.</div><div> </div><div> My questions are:</div><div> 1. Is my ELF header setting correct this time (the offset, paddr, and p_memsz)?</div><div> 2. How does the crash utility translate a virtual address to a physical address before and after it gets the kernel page table? Before it gets the kernel page table, does it calculate (virtual_addr - p_vaddr + p_paddr)? After it gets the kernel page table, does it walk the page table to find the real physical address?</div><div> 3. My real purpose is to get the ftrace content from the dump file with the crash utility, but the trace command does not seem to be meant for this case. Do I need to compile the "trace" extension of the crash utility? Is there a guide to follow?</div><div><br>> Another thing to do is to verify that your phys_base of 0x20000000<br>> is being properly seen. In the --minimal session, you can verify that<br>> by doing this:<br>> <br>> crash> help -m | grep phys_base<br>> <br>> Trying the above should yield some clues into the problem you're encountering.<br>> <br>> Dave<br>> <br>> <br>> <br>> <br>> <br>> --<br>> Crash-utility mailing list<br>> Crash-utility@redhat.com<br>> https://www.redhat.com/mailman/listinfo/crash-utility<br></div> </div></body> </html>