[Crash-utility] Faster iteration on list of struct.field

Dave Anderson anderson at redhat.com
Fri Feb 7 18:41:25 UTC 2020



----- Original Message -----
> Dave Anderson wrote on Thu, Feb 06, 2020:
> > Right, the time-consumer is the call into gdb to get the structure member
> > details.
> > 
> > I'm not following what you mean by "making 'datatype_member' static ...",
> > but
> > off the top of my head, I was thinking of adding a "tmp_offset" and
> > "tmp_size"
> > fields to the global offset_table and size_table structures, and adding a
> > new
> > pc->curcmd_flags bit.  Then, if a command that wants to support such a
> > concept,
> > it could:
> > 
> >   (1) check the new pc->curcmd_flags bit, which will always be 0 the first
> >   time
> >       the function gets called by the exec_args_input_file() loop.
> >   (2) if the new bit is 0, then set OFFSET(tmp_offset) and SIZE(tmp_size)
> >   (3) set the new flag in pc->curcmd_flags, and continue...
> > 
> > Then during the second and subsequent times through the loop,
> > pc->curcmd_flags
> > will still be set/unchanged, because restore_sanity() is not called from
> > the
> > exec_args_input_flags() loop.
> > 
> > But that scheme falls down if a user requests a comma-separated list of
> > multiple members (argc_members would be > 1).  Although, it might be better
> > if the "struct -r' option rejects multiple-member arguments, especially
> > given
> > that the output would be pretty much unreadable.
> 
> I would tend to agree with that, struct -r with multiple members might
> be somewhat parsable but if someone can do that they can dump the whole
> struct and parse it anyway, so let's go with only one number.
> 
> On the good news though, this whole caching isn't going to be
> immediately needed. I just finished the first part of this (allowing
> struct -r with a member), and struct -r is already infinitely faster
> than struct; so getting the offset wasn't the slow part:
>  - with a small 100-elements file, I'm already going down from 12s to
> near-instant on this old laptop.
>  - I didn't wait for 1000-elements to finish normally but it's just
> about one second with -r, which is acceptable enough for me.
> 
> Caching might make it another order of magnitude faster but for now I'm
> happy to wait a couple of minutes for my 100k elements list, it's better
> than not finishing in half an hour :)

Ok, so it must be the gdb-assisted print_struct() and parsing that's 
time-consuming, and not the gdb datatype query.

> 
> Following up with patch, with a couple of remarks:
>  - I had to change member_to_datatype() to use datatype_info() directly
> instead of MEMBER_OFFSET(), to fill dm->member_size. I'm not sure if
> this will have any side effects, but things like 'struct foo.a,b' still
> work at least. You might have a better idea of what to check.

Hmmm, I'd prefer to keep the member_datatype() behavior as it is, and
not have datatype_info() re-initialize the incoming dm.  (except for
the setting of dm->member).  Maybe have a different flag for gathering
the size as you have, and keep the original functionality the same?  
Or alternatively, leave the call to member_datatype() as-is, and if 
do_datatype_addr() sees SHOW_RAW_DATA, additionally call MEMBER_SIZE()?

>  - I'm only passing ANON_MEMBER_QUERY to member_to_datatype() in the
> non-raw case. 

I think you mean just the opposite...

                               if (!member_to_datatype(memberlist[i], dm,
-                                                       ANON_MEMBER_QUERY))
+                                                       (flags & SHOW_RAW_DATA) ? ANON_MEMBER_QUERY : 0))
                                        error(FATAL, "invalid data structure reference: %s.%s\n",
                                              dm->name, memberlist[i]);

The use of ANON_MEMBER_QUERY is just there for a fall-back option if the
MEMBER_OFFSET() call fails.  

>                I'm not quite sure why we couldn't get the member size if
> it's an anon union/stuct, but I'm not sure how one would name an
> anonymous field in the first place here? Anyway, one would get invalid
> data structure reference message there if they do. It might be better
> to always pass the argument and then check for SHOW_RAW_DATA set with
> dm->member_size still being 0 after call to give another more
> appropriate error if you think people might hit that.

All of the ANON_xxx macros were added for getting information for members
that are declared inside of anonymous structures within a structure, where
where the generic datatype_info() call fails.  In those cases, the request
gets directed to a gdb print command within anon_member_offset().  There is
no support for getting the size of such a member, so MEMBER_SIZE() would
fail.  So I don't think this feature would work for those types of members,
and would need some kind of ANON_MEMBER_SIZE() and accompanying 
anon_member_size() functionality.

Dave

> Anyway, thanks for the advices.
> --
> Dominique
> 
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> 




More information about the Crash-utility mailing list