[Crash-utility] [PATCH] arm64: exclude mapping symbols in modules

Fri Oct 7 15:02:01 UTC 2016

----- Original Message -----
> Dave,
> 
> > 
> > Now, this sample patch doesn't deal with branch instructions other than "bl",
> > so perhaps it could just check whether the last argument in the instruction
> > line is a translatable address.
> > 
> > On the other hand, for the PLT veneer issue, it would have to:
> > 
> >  (1) make sure it's a "bl", and
> 
> and other variants of "bl"

Specifically what other variants?  Do you mean any instruction that begins
with "b."?
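
To make that question concrete, here is a minimal sketch of a mnemonic
check covering "b", "bl", and the conditional "b.<cond>" forms. The name
is_direct_branch() is hypothetical and not part of crash. It deliberately
rejects "br" and "blr", whose operand is a register rather than an address,
and it does not consider the compare/test-and-branch instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not in crash): return true if the mnemonic is a
 * direct branch whose last operand is a target address, i.e. "b", "bl",
 * or a conditional branch printed as "b.<cond>" (b.eq, b.ne, ...).
 */
bool is_direct_branch(const char *mnemonic)
{
	assert(mnemonic != NULL);

	if (strcmp(mnemonic, "b") == 0 || strcmp(mnemonic, "bl") == 0)
		return true;

	/* Conditional branches: "b." followed by a condition code. */
	return strncmp(mnemonic, "b.", 2) == 0 && mnemonic[2] != '\0';
}
```

Something like this could replace the STREQ(argv[argc-2], "bl") test if
the other variants turn out to matter.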

> 
> >  (2) instead of blindly doing a translation of the PLT veneer label address,
> >      it would first have to check whether it points to a 12-byte chunk of
> >      kernel address construction, and if so, translate the reconstructed
> >      address.
> 
> Actually, a veneer always consists of 4 instructions:
>     mov  x16, #imm16
>     movk x16, #imm16, lsl #16
>     movk x16, #imm16, lsl #32
>     br   x16

Right, I meant that the target address is constructed in the first 12 bytes.

I'm not at all familiar with arm64 assembly.  It seems that each of the
instructions consumes 4 bytes, but unlike the other architectures, I cannot
find any documentation as to how the instruction type, the target register,
the immediate value, etc., actually get encoded into the 32-bit instruction.
The documentation shows the assembly mnemonics themselves, but not how the
instruction is actually laid out in memory.  Maybe I'm looking in the wrong
place.

Taking the simplest of examples, here's a mov immediate instruction:

  crash> dis 0xfffffe00000fbc84 2
  0xfffffe00000fbc84 <select_task_rq_fair+528>:   mov     x7, #0xffffffffffffffff         // #-1
  0xfffffe00000fbc88 <select_task_rq_fair+532>:   add     x0, x0, x26
  crash>

And here's the encoding:

  crash> rd -32 0xfffffe00000fbc84
  fffffe00000fbc84:  92800007                              ....
  crash> 

Presumably the 7 is the register field, but how does it get -1 out of the rest
of the instruction?

Anyway, without some basic understanding, I'm not touching this.  I was kind
of hoping you could whip up the function...  ;-)

> It would be safe to identify any veneers with this type of sequence,
> but I'm wondering if there is any other trick for directly checking
> whether the label address falls within the PLT section of a module.

I have no idea.

> (On arm64, this section is dynamically allocated on module loading,
> and so it's not trivial.)
> 
> >  
> > So I'm thinking something along these lines, say, where "value" may or may
> > not be modified by your new function:
> > 
> >        if (IS_MODULE_VADDR(vaddr)) {
> >                p1 = &inbuf[strlen(inbuf)-1];
> >                strcpy(buf1, inbuf);
> >                argc = parse_line(buf1, argv);
> >                if (STREQ(argv[argc-2], "bl") &&
> >                    extract_hex(argv[argc-1], &value, NULLCHAR, TRUE)) {
> > +                      value = PLT_veneer_to_kvaddr(value);
> >                        sprintf(p1, " <%s>\n",
> >                                value_to_symstr(value, buf2, output_radix));
> >                }
> >        }
> 
> Looks nice.
> 
> > However, another thing to consider is what "dis" shows if the "mod" command
> > has already loaded the debuginfo data.  In that case, I'm guessing that gdb
> > would translate the address of the PLT veneer location?
> 
> Given that the output from the "bt" command shows "testmod_init", which is
> the module_init function of my sample module, I assume that the debug
> data has already been loaded in my case.

No, definitely not.  When a crash session is initiated, it kicks off the
gdb session with "gdb vmlinux", and so the embedded gdb has no clue about
the existence of any kernel modules.  The kernel data itself may contain
basic symbol information that was exported by the modules if the kernel was
configured with CONFIG_KALLSYMS, and if so, the "bt" command can translate
module symbols.  On the other hand, the "dis" command issues a disassembly
request to the embedded gdb module, which has no clue about module symbols
unless the debuginfo data of the modules is added.  To do that, you have to
enter either "mod -S" to load the debuginfo of all modules, or "mod -s <module>"
to load the debuginfo data of an individual module.  The "mod [-sS]" command
runs a gdb "add-symbol-file" command behind the scenes for each module, and
therefore requires that the kernel's debuginfo package is available on the 
host system.  

Anyway, that being the case, I'm still wondering whether the gdb output would
simply show the veneer address after the debuginfo data is loaded with the mod
command.  I presume that it would, since that's what it's supposed to do.
This veneer translation would simply be a nice-to-have feature.

> > The sample KASLR vmcore you gave me doesn't have any modules, so I don't know.
> 
> I can give you my sample vmcore.
> Please tell me a location where I can push the image.

Do you have debuginfo objects for the modules?  I really need to see the
before-and-after-mod-command behavior.  I'll send you a link to a location
offline where you can upload the vmlinux, vmcore, and module debuginfo
objects.

Thanks,
  Dave