[Crash-utility] Fix in bt for ARM64

Dave Anderson anderson at prospeed.net
Fri May 22 12:44:09 UTC 2015


> Hi Dave
>
> I have had time to look more closely at the problem with symbols in modules
> that I reported some time ago. I now know why the problem occurs and have
> some idea of how to solve it. However, my knowledge in this area is limited,
> so you will really have to review my code change.
>
> The module that causes the problem when it is loaded contains a symbol that
> is "per_cpu". Its address is higher than any symbol in any module, so the
> search in value_search_module in symbols.c is terminated too early.

OK good, that makes sense.  I'm out of the office for the holiday weekend
and so I won't be able to check this out fully until next Tuesday.

Can you please send me the output of "sym -M" both before, and then after
you do a "mod -S"?  Send them offline.

Thanks,
  Dave


>
> The test below in the code from value_search_module solves the problem in
> my example.
>
>     *  splast will contain the last module symbol encountered.
>     *  Note: "__insmod_"-type symbols will be set in splast only
>     *  when they have unique values.
>     */
>     splast = NULL;
>     for ( ; sp <= sp_end; sp++) {
>
>       if (IN_MODULE_PERCPU(sp->value,lm) &&
>           !IN_MODULE_PERCPU(value,lm)) continue;       // ADDED
>
>       if (value == sp->value) {
>         if (MODULE_END(sp) || MODULE_INIT_END(sp))
>           break;
>
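To make the failure mode described above concrete, here is a toy model of
the scan. It is not the actual crash code: struct toy_sym, struct toy_mod,
toy_value_search() and all addresses are made up for illustration, and the
skip_percpu flag is only a stand-in for the IN_MODULE_PERCPU() test in the
proposed change. It assumes each module's symbols are scanned in order and
that a scan stops at the first symbol above the looked-up value.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ulong;

/* Hypothetical, simplified stand-ins for crash's syment/load_module. */
struct toy_sym {
    const char *name;
    ulong value;
    int percpu;                 /* 1 if the value lies in a per-cpu area */
};

struct toy_mod {
    const char *name;
    const struct toy_sym *syms; /* scanned in this order */
    size_t nsyms;
};

/* Made-up addresses: the per-cpu symbol sits above every module. */
const struct toy_sym third_syms[] = {
    { "third_init",       0x1000UL, 0 },
    { "__this_module",    0x1800UL, 0 },
    { "third_percpu_var", 0x9000UL, 1 },
};

const struct toy_sym wlan_syms[] = {
    { "dhd_sched_dpc",    0x4000UL, 0 },
    { "dhd_bus_dpc",      0x5000UL, 0 },
    { "__this_module",    0x6000UL, 0 },
};

const struct toy_mod toy_mods[] = {
    { "third", third_syms, 3 },
    { "wlan",  wlan_syms,  3 },
};

/*
 * Return the closest symbol at or below value, scanning modules in order.
 * A module's scan ends at the first symbol above value; if an earlier
 * symbol of that module was seen, it is returned as the match.
 */
const struct toy_sym *
toy_value_search(const struct toy_mod *mods, size_t nmods, ulong value,
                 int skip_percpu)
{
    assert(mods != NULL);

    for (size_t i = 0; i < nmods; i++) {
        const struct toy_sym *splast = NULL;

        for (size_t j = 0; j < mods[i].nsyms; j++) {
            const struct toy_sym *sp = &mods[i].syms[j];

            if (skip_percpu && sp->percpu)
                continue;       /* models the ADDED test */
            if (sp->value > value) {
                if (splast)
                    return splast;
                break;          /* value lies below this module */
            }
            splast = sp;
        }
    }
    return NULL;
}
```

In this model, looking up 0x4010 (a wlan text address) with skip_percpu set
to 0 returns the "third" module's __this_module, because the per-cpu symbol
is the first one above the value. With skip_percpu set to 1 the scan moves
on to wlan and returns dhd_sched_dpc.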
> Jan
>
> Jan Karlsson
> Senior Software Engineer
> System Assurance
>
> Sony Mobile Communications
> Tel: +46 703 062 174
> jan.karlsson at sonymobile.com
>
> sonymobile.com
>
>
>
> -----Original Message-----
> From: crash-utility-bounces at redhat.com
> [mailto:crash-utility-bounces at redhat.com] On Behalf Of Dave Anderson
> Sent: 12 May 2015 15:28
> To: Discussion list for crash utility usage, maintenance and development
> Subject: Re: [Crash-utility] Fix in bt for ARM64
>
>
>
> ----- Original Message -----
>> Thanks Dave,
>>
>> I did not look closely enough at this issue and you are quite right that
>> my fix has no effect. After some more testing I have found that the
>> problem has to do with one of the modules.
>>
>> In the core file I am investigating, the kernel has six modules, where the
>> wlan module (where the strange names occur) is the last one. If I have
>> loaded the third module in the mod list, with "mod -s" or "mod -S",
>> then I get the strange printouts in bt. If that module is not loaded,
>> regardless of whether other modules are loaded, then the bt printout
>> is correct.
>>
>> I fully understand that this will be very difficult to investigate
>> without the possibility to run and debug the example. I have tried to
>> do that but have not found anything useful so far. Any hints on what to
>> look for? I will also try to understand whether there is anything specific
>> about the module itself.
>
> I looked at the data that you sent me offline, but nothing stands out as a
> smoking gun.
>
> If you take "bt" out of the picture and just run "sym
> <wlan-module-address>", you should see the same symptom, where before the
> third module is loaded, it finds the correct symbol name of the
> <wlan-module-address> OK, but fails to do so after the third module is
> loaded.  cmd_sym() will call value_search(), which should call
> value_search_module(), and in that function you will be able to see it
> cycling through the modules, and at least get an idea as to why the wlan
> module addresses are being prematurely found in another module's symbol
> list.
>
> Dave
>
>
>>
>> By the way, I will be out of the office until next Monday, mainly due to
>> national holidays here in Sweden.
>>
>> Jan Karlsson
>> Senior Software Engineer
>> System Assurance
>>
>> Sony Mobile Communications
>> Tel: +46 703 062 174
>> jan.karlsson at sonymobile.com
>>
>> sonymobile.com
>>
>>
>>
>> -----Original Message-----
>> From: crash-utility-bounces at redhat.com
>> [mailto:crash-utility-bounces at redhat.com] On Behalf Of Dave Anderson
>> Sent: 11 May 2015 17:49
>> To: Discussion list for crash utility usage, maintenance and
>> development
>> Subject: Re: [Crash-utility] Fix in bt for ARM64
>>
>>
>>
>> ----- Original Message -----
>> >
>> > Hi Dave
>> >
>> > I found an ARM64 problem for bt when a function belongs to a module.
>> >
>> > Printout before fix given below:
>> >
>> > #16 [ffffffc0be96f8d0] __this_module at ffffffbffc15a2f8 [wlan]
>> > #17 [ffffffc0be96f9b0] __this_module at ffffffbffc161b18 [wlan]
>> > #18 [ffffffc0be96f9c0] __this_module at ffffffbffc16033c [wlan]
>> > #19 [ffffffc0be96fa10] __this_module at ffffffbffc1630f8 [wlan]
>> > #20 [ffffffc0be96fab0] __this_module at ffffffbffc156ff8 [wlan]
>> > #21 [ffffffc0be96faf0] __this_module at ffffffbffc15aa58 [wlan]
>> > #22 [ffffffc0be96fb20] __this_module at ffffffbffc15bfc8 [wlan]
>> > #23 [ffffffc0be96fb60] __this_module at ffffffbffc115fac [wlan]
>> > #24 [ffffffc0be96fb90] tasklet_action at ffffffc000223738
>> > #25 [ffffffc0be96fbb0] __do_softirq at ffffffc000222e94
>> >
>> > Printout after fix:
>> >
>> > #16 [ffffffc0be96f8d0] dhd_bus_rx_frame at ffffffbffc15a2f8 [wlan]
>> > #17 [ffffffc0be96f9b0] dhd_update_flow_prio_map at ffffffbffc161b18 [wlan]
>> > #18 [ffffffc0be96f9c0] dhd_update_flow_prio_map at ffffffbffc16033c [wlan]
>> > #19 [ffffffc0be96fa10] dhd_prot_process_ctrlbuf at ffffffbffc1630f8 [wlan]
>> > #20 [ffffffc0be96fab0] dhd_bus_ringbell at ffffffbffc156ff8 [wlan]
>> > #21 [ffffffc0be96faf0] dhd_bus_console_in at ffffffbffc15aa58 [wlan]
>> > #22 [ffffffc0be96fb20] dhd_bus_dpc at ffffffbffc15bfc8 [wlan]
>> > #23 [ffffffc0be96fb60] dhd_sched_dpc at ffffffbffc115fac [wlan]
>> > #24 [ffffffc0be96fb90] tasklet_action at ffffffc000223738
>> > #25 [ffffffc0be96fbb0] __do_softirq at ffffffc000222e94
>> >
>> > From arm64.c:
>> >
>> > static int
>> > arm64_print_stackframe_entry(struct bt_info *bt, int level,
>> >                              struct arm64_stackframe *frame)
>> > {
>> >         char *name, *name_plus_offset;
>> >         ulong symbol_offset;
>> >         struct syment *sp;
>> >         struct load_module *lm;
>> >         char buf[BUFSIZE];
>> >
>> >         name = closest_symbol(frame->pc);
>> >         name_plus_offset = NULL;
>> >
>> >         if (bt->flags & BT_SYMBOL_OFFSET) {
>> >                 /* ADDED */
>> >                 if (module_symbol(frame->pc, NULL, &lm, NULL, 0))
>> >                         sp = value_search_module(frame->pc, &symbol_offset);
>> >                 else
>> >                 /* END ADDED */
>> >                         sp = value_search(frame->pc, &symbol_offset);
>> >
>>
>> Hi Jan,
>>
>> I don't dispute that this is something to be fixed, but at the same
>> time I don't quite understand (1) why it's happening, and (2) how your
>> fix addresses it.
>>
>> The value_search() function does this if it's a module address:
>>
>> struct syment *
>> value_search(ulong value, ulong *offset)
>> {
>> ...
>>         if (IS_VMALLOC_ADDR(value))
>>                 goto check_modules;
>>
>> ...
>> check_modules:
>>         sp = value_search_module(value, offset);
>>
>>         return sp;
>> }
>>
>> And even if IS_VMALLOC_ADDR() above fails, it should just fail to find
>> it in the base kernel symbols, and fall through to the
>> value_search_module() call.
>>
>> Does something different happen in your case?
>>
>> I also note that in all cases "__this_module" is in the "(d)" section
>> of each module, and typically is the last/highest symbol value of the
>> module.  So I'm confused as to how it's getting picked up as the
>> closest value to all of the different text addresses in the wlan module?
>>
>> What does "sym -m wlan" look like?
>>
>> Thanks,
>>   Dave
>>
>>
>>
>> >
>> > You probably also want to prevent calling module_symbol a second
>> > time later in the function.
>> >
>> > Jan
>> >
>> > Jan Karlsson
>> > Senior Software Engineer
>> > System Assurance
>> >
>> > Sony Mobile Communications
>> > Tel: +46 703 062 174
>> > jan.karlsson at sonymobile.com
>> >
>> > sonymobile.com
>> >
>> > --
>> > Crash-utility mailing list
>> > Crash-utility at redhat.com
>> > https://www.redhat.com/mailman/listinfo/crash-utility