[Crash-utility] [PATCH 0/4] To support module percpu symbol

Dave Anderson anderson at redhat.com
Wed Nov 10 21:00:18 UTC 2010


----- "Toshikazu Nakayama" <nakayama.ts at ncos.nec.co.jp> wrote:

> Hi Dave,
>
> ... 
> > And I haven't even bothered to look into how it would affect
> > the operation of all of the other architectures...
> >
> > Perhaps the patch can be re-done so that the code can handle it on a
> > per-module basis without inadvertently affecting everything else?
> >
> > In any case, there's no way I can accept the patch as it is 
> > currently designed.
> 
> I handled module symbol in kernel scope (target is not System.map).
> 
> In kernel, percpu data can't access directly with mapped virtual address.
> Related kernel macros are sometimes arch depends or looks compiler depend.
> (I can not understand enough...)
> Since percpu data area is entirely dynamic allocation,
> additionally some arch locate part of them at VMALLOC area.
> They are not straight-mapped so some arch are probably overlapped.
 
Right -- and because of that overlap, I still believe that it is possible 
that cmd_p() may still (inadvertently) do the wrong thing -- in the same
way that the "p in_suspend" attempt failed using your second patch. 

In other words, the code path taken for a symbol that is just beyond the 
kernel's " __per_cpu_end" would be found OK by the symbol_search() call 
in line 5669 -- but it would also be misconstrued as a module percpu
symbol by per_cpu_symbol_search() in line 5670:

   5669         if ((sp = symbol_search(args[optind])) && !args[optind+1]) {
   5670                 if ((percpu_sp = per_cpu_symbol_search(args[optind])) &&
   5671                     display_per_cpu_info(percpu_sp))
   5672                         return;
   5673                 sprintf(buf2, "%s = ", args[optind]);
   5674                 leader = strlen(buf2);
   5675                 if (module_symbol(sp->value, NULL, NULL, NULL, *gdb_output_radix))
   5676                         do_load_module_filter = TRUE;
   5677         } else if ((percpu_sp = per_cpu_symbol_search(args[optind])) &&
   5678                    display_per_cpu_info(percpu_sp))
   5679                 return;
   5680         else if (st->flags & LOAD_MODULE_SYMS)
   5681                 do_load_module_filter = TRUE;

because your IN_MODULE_PCPU() macro would only check whether the symbol value was
located in a module percpu offset range.

To fix that, perhaps the syment structures for all module symbols could
have an indication that they are module-only symbols?  There are a couple
of unused fields in the structure -- perhaps "pad1" could be changed to be
a "flags" field that could have a "MODULE_SYMBOL" bit set in it:

  struct syment {
          ulong value;
          char *name;
          struct syment *val_hash_next;
          struct syment *name_hash_next;
          char type;
          unsigned char cnt;
          unsigned char pad1;
          unsigned char pad2;
  };

If that were true, then get_mod_percpu_sym_owner() could reject
any syment that doesn't have the MODULE_SYMBOL flag set, and then 
there should be no problem with "symbol overlap".  What do you 
think about doing something like that?

Also, I don't quite understand why it was necessary in your patch to
modify cmd_p() like this:
  
  --- a/symbols.c
  +++ b/symbols.c
  @@ -5636,7 +5636,10 @@ cmd_p(void)
                  leader = strlen(buf2);
                  if (module_symbol(sp->value, NULL, NULL, NULL, *gdb_output_radix))
                          do_load_module_filter = TRUE;
  -       } else if (st->flags & LOAD_MODULE_SYMS)
  +       } else if ((percpu_sp = per_cpu_symbol_search(args[optind])) &&
  +                  display_per_cpu_info(percpu_sp))
  +               return;
  +       else if (st->flags & LOAD_MODULE_SYMS)
                  do_load_module_filter = TRUE;
  
          if (leader || do_load_module_filter)

If the symbol could not be found by symbol_search() in line 5669,
then how could it be found later by the second per_cpu_symbol_search() 
added above?  

And for for that matter, in the very few modules that have percpu sections 
in my test machines/dumpfiles, I don't see any actual per-cpu symbols listed 
by "sym -[mM]".  How do you actually "see" the name of a module's percpu symbol?

> Anyway with previous approach,
> there is no possible solution in my idea at least,
> but I might get really right way from your proposal.
> 
> Retake attached patches with per module space.
> 
> I use module.percpu which guarantee straight-mapping in every module space.
> Also its size get from bfd_section_size() for legacy percpu implementation
> although latest kernel gives percpu_size.
> - I am sorry but I don't have i386 target environs now, not tested about i386.
>   (I think that module.percpu can work well even if it is VMALLOC area.) 
>
> But there was still more problem at gdb interface.
> 
> When p &(module percpu) is not resolved by syment,
> gdb_pass_through() require pre-registration about module sections.
> The result gave wrong symbol value of module percpu.
>
> I append percpu section in gdb request from add_symbol_file_kallsyms()
> although percpu is not in kallsyms...
> Because multiple GNU_ADD_SYMBOL_FILE requests for one module did not
> work well.
> I am not very well around these implementations.
> Can I resolve it with other better way?

Not really.  But I don't see a problem with the way that you did it -- which
seems to work just fine.  The only thing that needs to be changed is where
add_symbol_file_percpu() calls RESIZEBUF() -- it needs to have "req" passed
as an argument instead of "req->buf", like this:

static long
add_symbol_file_percpu(struct load_module *lm, struct gnu_request *req, long buflen)
{
        char pbuf[BUFSIZE];
        int i;
        char *secname;
        long len;

        len = strlen(req->buf);
        for (i = 0; i < lm->mod_sections; i++) {
                secname = lm->mod_section_data[i].name;
                if ((lm->mod_section_data[i].flags & SEC_FOUND) &&
                    (STREQ(secname, ".data.percpu") ||
                     STREQ(secname, ".data..percpu"))) {
                        sprintf(pbuf, " -s %s 0x%lx", secname, lm->mod_percpu);
                        while ((len + strlen(pbuf)) >= buflen) {
                                RESIZEBUF(req->buf, buflen, buflen * 2);
                                buflen *= 2;
                        }
                        strcat(req->buf, pbuf);
                        len += strlen(pbuf);
                }
        }
        return buflen;
}

The way you had it written, if RESIZEBUF() were to be called, it would never
be seen by gdb, because req->buf would still point to the old buffer.
 
Dave




More information about the Crash-utility mailing list