instruction-analysis API(s)
Masami Hiramatsu
mhiramat at redhat.com
Tue Mar 10 19:57:11 UTC 2009
Hi Jim,
Jim Keniston wrote:
> Quoting Masami Hiramatsu <mhiramat at redhat.com>:
>
>> Hi Jim and Sriker,
>>
>> Here, I almost rewrote my patch.
>>
>> Changelog:
>> - rewrite decoding logic based on Intel' manual.
>> - supoort insn_get_sib(),insn_get_displacement()
>> and insn_get_immediate() too.
>> - support 3 bytes opcode and 64bit immediate.
>> - introduce some bitmaps.
>>
>> Thank you,
>
> Well, I didn't do much of a code review -- it looks like you addressed
> all my concerns -- but as I mentioned on IRC, I hacked together a test
> rig whereby you can disassemble a designated elf file (e.g., vmlinux,
> libc, libm) and then compare insn_get_length()'s results with objdump's
> results. The comment in distill.awk shows how to use objdump, awk, and
> test_get_len together.
Thank you for review and test!
> I also hacked up insn_x86.h and insn_x86.c to work in user space. Most
> of that is accomplished via insn_x86_user.h, but it certainly isn't
> necessary to do it that way. In particular, __u8, __s8, __u16, etc. are
> versions of u8, s8, u16, etc. that can be used in both kernel and user
> code, so maybe we should switch to those.
>
> I tested with vmlinux, libc, and libm on both an i686 system and an
> x86_64 system. I found and fixed a few bugs. Here are the ones that
> come to mind (all fixed):
> - shrd/shld, which we discussed
> - missing support for weird nops with modrm bytes (0f 1f ...).
> - neglected to include the REX prefix in prefixes.nbytes
> - missing static decl in an inline function in insn_x86.h
Thank you for fixing it.
BTW, it might have to support vm86 mode(especially, for user code).
> There are some other cases where insn_get_length() doesn't match up with
> the disassembly, but I don't consider them bugs:
> - 0x9b is an instruction (fwait), but the disassembler treats it as a
> prefix. For example 9b df ... can be disassembled as
> fstsw ... // wait, then store status word
> or
> fwait // wait
> fnstsw ... // store status word without waiting
> Perhaps it's relevant to investigate whether a single-step of 9b df ...
> would execute just the fwait or the whole fstsw. Anyway, this explains
> the "failures" of finit and fstsw that I mentioned to you. I also saw
> this with fstcw and fclex.
FYI, there is a single wait/fwait instruction described at Intel software
developers manual vol.2B p.399.
> - Illegal instruction sequences, such as an x86_64 instruction that
> starts with 0x40, or a misplaced 0x65 prefix. Typically, we see these
> when disassembling data. I just filtered out (via egrep) instructions
> whose disassembly starts with "rex" or includes "(bad)".
Sure, I think insn_* should return -EINVAL or set insn.invalid = 1
if we found those invalid ops. E.g. kernel use BUG() macro, it adds
some raw numbers after ud2, in that case, those raw numbers might
be decoded as an illegal instruction.
> We could address the above by filtering them out in distill.awk or
> test_get_len.c. I think we're clean otherwise.
>
> There's a little more housecleaning to do -- e.g., adding Hitachi (?)
> copyright to IBM copyright, discarding insn_field_exists() and
> insn_extract_reg(), putting this all in git somewhere. But not tonight.
>
> Pull all the attached files into a directory and have a go -- e.g.,
> $ make
> $ objdump -d vmlinux | awk -f distill.awk | ./test_get_len [x86_64]
>
> Jim
>
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division
e-mail: mhiramat at redhat.com
More information about the utrace-devel
mailing list