[Crash-utility] Re: Error when analysing dump on 2.6.21.4 kernel
Dave Anderson
anderson at redhat.com
Thu Nov 15 15:18:34 UTC 2007
Dave Anderson wrote:
> Sachin P. Sant wrote:
>
>> Ankita Garg wrote:
>>
>>> Hi,
>>>
>>> Am working on backporting relocatable kernel support for x86_64 from
>>> 2.6.22.1 kernel to 2.6.21.4. kdump is working fine. But when opening the
>>> vmcore file with crash, I get the following error:
>>
>>
>> I had a discussion with Ankita about this problem. This is what I
>> think is happening.
>>
>> This x86-64 kernel has CONFIG_NUMA off with SPARSEMEM support.
>>
>> The failure occurs at line 11738 in memory.c [this is with the
>> latest crash]:
>>
>> crash: invalid structure member offset: pglist_data_node_mem_map
>> FILE: memory.c LINE: 11738 FUNCTION: dump_memory_nodes()
>>
>> Looking at the crash source, here is the code in question:
>>
>> 11728 if (IS_SPARSEMEM()) {
>> 11729         zone_mem_map = 0;
>> 11730         zone_start_mapnr = 0;
>> 11731         if (zone_size) {
>> 11732                 phys = PTOB(zone_start_pfn);
>> 11733                 zone_start_mapnr = phys/PAGESIZE();
>> 11734         }
>> 11735
>> 11736 } else if (!(vt->flags & NODES) &&
>> 11737     INVALID_MEMBER(zone_zone_mem_map)) {
>> 11738         readmem(pgdat+OFFSET(pglist_data_node_mem_map),
>>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> 11739             KVADDR, &zone_mem_map, sizeof(void *),
>> 11740             "contig_page_data mem_map", FAULT_ON_ERROR);
>> 11741         if (zone_size)
>> 11742                 zone_mem_map += cum_zone_size * SIZE(page);
>>
>>
>> The code is trying to read the pglist_data_node_mem_map value, which does
>> not exist [since CONFIG_NUMA is off]. It should have entered the
>> if (IS_SPARSEMEM()) condition [line 11728], since SPARSEMEM is enabled for
>> this kernel. The SPARSEMEM flag is set by this code in memory.c:
>>
>>         if (kernel_symbol_exists("mem_map")) {
>>                 get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
>>                 vt->flags |= FLATMEM;
>>         } else if (kernel_symbol_exists("mem_section"))
>>                 vt->flags |= SPARSEMEM;
>>         else
>>                 vt->flags |= DISCONTIGMEM;
>>
>> But what I found is that the SPARSEMEM flag is not set; instead FLATMEM is
>> set, because the mem_map symbol exists in this particular kernel [the
>> mem_section kernel symbol is also present in this kernel]:
>>
>> [crash-4.0-4.8]# cat /boot/System.map | grep mem_map
>> ffffffff8072dab0 B mem_map
>>
>> [crash-4.0-4.8]# cat /boot/System.map | grep mem_section
>> ffffffff8072e800 B mem_section
>>
>> From the kernel source, mm/memory.c: mem_map is defined if
>> CONFIG_NEED_MULTIPLE_NODES is not defined, which is the case here.
>> I am not an mm expert, so I can't tell what to make of this situation,
>> where both the mem_map and mem_section kernel symbols exist. Anyone?
>>
>> Anyway, as for the crash problem, this could be fixed by rearranging the
>> above code as follows:
>>
>> -       if (kernel_symbol_exists("mem_map")) {
>> +       if (kernel_symbol_exists("mem_section"))
>> +               vt->flags |= SPARSEMEM;
>> +       else if (kernel_symbol_exists("mem_map")) {
>>                 get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
>>                 vt->flags |= FLATMEM;
>> -       } else if (kernel_symbol_exists("mem_section"))
>> -               vt->flags |= SPARSEMEM;
>> -       else
>> +       } else
>>                 vt->flags |= DISCONTIGMEM;
>>
>>
>> But since I am not very sure about the mm code, there might be a better
>> way to fix this.
>>
>> Thanks
>> -Sachin
>
>
> The crash patch above looks fine to me -- I'll give it a test run.
Sachin,
Your patch tested fine on my stable of sample dumpfiles -- queued for
the next release.
Thanks,
Dave
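
For anyone reading this thread later, the effect of the reordering can be
sketched outside of crash. This is only an illustration under stated
assumptions, not the crash source: symbol_exists() is a hypothetical stand-in
for kernel_symbol_exists() that searches a caller-supplied list, the flag
values are made up for the example (the real ones are defined in crash's
headers), and detect_old()/detect_new() are hypothetical names for the two
orderings discussed above.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative flag values only; not the ones crash actually uses. */
enum { FLATMEM = 1, SPARSEMEM = 2, DISCONTIGMEM = 4 };

/* Hypothetical stand-in for kernel_symbol_exists(): linear search of a list. */
static int symbol_exists(const char *const *syms, size_t n, const char *name)
{
        for (size_t i = 0; i < n; i++)
                if (strcmp(syms[i], name) == 0)
                        return 1;
        return 0;
}

/* Original ordering: mem_map is tested first, so it wins when both exist. */
int detect_old(const char *const *syms, size_t n)
{
        if (symbol_exists(syms, n, "mem_map"))
                return FLATMEM;
        else if (symbol_exists(syms, n, "mem_section"))
                return SPARSEMEM;
        else
                return DISCONTIGMEM;
}

/* Patched ordering: mem_section is tested first, so it takes precedence. */
int detect_new(const char *const *syms, size_t n)
{
        if (symbol_exists(syms, n, "mem_section"))
                return SPARSEMEM;
        else if (symbol_exists(syms, n, "mem_map"))
                return FLATMEM;
        else
                return DISCONTIGMEM;
}
```

Passing a list that contains both "mem_map" and "mem_section", as in the
System.map output quoted above, makes detect_old() return FLATMEM and
detect_new() return SPARSEMEM. A list with only "mem_map" still yields FLATMEM
under either ordering, so the reordering only changes the case where both
symbols are present.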