[Crash-utility] Faster iteration on list of struct.field

Dominique Martinet asmadeus at codewreck.org
Thu Feb 6 21:06:56 UTC 2020


Dave Anderson wrote on Thu, Feb 06, 2020:
> Right, the time-consumer is the call into gdb to get the structure member details.
> 
> I'm not following what you mean by "making 'datatype_member' static ...", but
> off the top of my head, I was thinking of adding a "tmp_offset" and "tmp_size"
> fields to the global offset_table and size_table structures, and adding a new
> pc->curcmd_flags bit.  Then, if a command that wants to support such a concept, 
> it could:
> 
>   (1) check the new pc->curcmd_flags bit, which will always be 0 the first time
>       the function gets called by the exec_args_input_file() loop.
>   (2) if the new bit is 0, then set OFFSET(tmp_offset) and SIZE(tmp_size)
>   (3) set the new flag in pc->curcmd_flags, and continue...
> 
> Then during the second and subsequent times through the loop, pc->curcmd_flags
> will still be set/unchanged, because restore_sanity() is not called from the 
> exec_args_input_flags() loop.
> 
> But that scheme falls down if a user requests a comma-separated list of
> multiple members (argc_members would be > 1).  Although, it might be better
> if the "struct -r' option rejects multiple-member arguments, especially given
> that the output would be pretty much unreadable.

I would tend to agree with that, struct -r with multiple members might
be somewhat parsable but if someone can do that they can dump the whole
struct and parse it anyway, so let's go with only one number.

On the good news though, this whole caching isn't going to be
immediately needed. I just finished the first part of this (allowing
struct -r with a member), and struct -r is already infinitely faster
than struct; so getting the offset wasn't the slow part:
 - with a small 100-elements file, I'm already going down from 12s to
near-instant on this old laptop.
 - I didn't wait for 1000-elements to finish normally but it's just
about one second with -r, which is acceptable enough for me.

Caching might make it another order of magnitude faster but for now I'm
happy to wait a couple of minutes for my 100k elements list, it's better
than not finishing in half an hour :)


Following up with patch, with a couple of remarks:
 - I had to change member_to_datatype() to use datatype_info() directly
instead of MEMBER_OFFSET(), to fill dm->member_size. I'm not sure if
this will have any side effects, but things like 'struct foo.a,b' still
work at least. You might have a better idea of what to check.
 - I'm only passing ANON_MEMBER_QUERY to member_to_datatype() in the
non-raw case. I'm not quite sure why we couldn't get the member size if
it's an anon union/stuct, but I'm not sure how one would name an
anonymous field in the first place here? Anyway, one would get invalid
data structure reference message there if they do. It might be better
to always pass the argument and then check for SHOW_RAW_DATA set with
dm->member_size still being 0 after call to give another more
appropriate error if you think people might hit that.

Anyway, thanks for the advices.
-- 
Dominique





More information about the Crash-utility mailing list