[Crash-utility] Extensions ABI compatibility

lijiang lijiang at redhat.com
Thu Aug 27 10:00:10 UTC 2020


在 2020年08月26日 09:57, HAGIO KAZUHITO(萩尾 一仁) 写道:
> -----Original Message-----
>> Hi, Mathias
>>
>> Thank you for discussing this topic and describing its backgrounds.
>>
>> 在 2020年08月19日 23:45, Mathias Krause 写道:
>>> Hi list,
>>>
>>> during development of the PaX module patch[1], Lianbo pointed out an
>>> issue about adding new members to struct offset_table. They should be
>>> added to the end to ensure binary compatibility with already compiled
>>> extensions, as these won't be aware of the new members and, even worse,
>>> might be using wrong offsets within the offset_table if we didn't. As
>>> this seems to be rather fragile, here's a proposal to make it less so.
>>>
>>> To not blindly make already compiled extensions misbehave by using the
>>> wrong offsets when new members get added in the middle of shared types,
>>> we need to have some validation step, preferably during extension
>>> registration to catch these.
>>>
>> Sounds good to me.
> 
> hmm, Lianbo, are you Red Hat ready to recompile and release crash extension
> packages (almost) every time releasing a crash package?  As far as I know,

Not yet, but, that could be worth trying to make some changes. 

> at least Red Hat has not done so, so your crash release work will have to
> be changed.  And you will have to specify the versions of extension modules
> corresponding to a crash package for users somehow or unite those packages.
> 
Good understanding. That could make a confusion to specify the versions of extension
modules for users, still not sure if it's doable.

BTW: Is that possible to unite those packages?

It is also inconvenient to maintain the crash extensions tarball, if kernel changes
some symbols, which could affect the crash extensions, and also need to update the
changes accordingly. Otherwise, the crash extensions don't work. For example:  

Recently, I notice that the crash-gcore-command can not work and report the
following error: 

crash> bt -t|grep -i task
PID: 3968   TASK: ffff9e512e97af80  CPU: 86  COMMAND: "bash"
crash> gcore 3968
 
gcore: invalid structure member offset: thread_struct_io_bitmap_max
       FILE: libgcore/gcore_x86.c  LINE: 846  FUNCTION: ioperm_active()
 
[./crash] error trace: 7f31fca56108 => 7f31fca593b7 => 53a4e1 => 53a463
 
  53a463: OFFSET_verify.part.27+51
  53a4e1: OFFSET_verify+49
 
gcore: invalid structure member offset: thread_struct_io_bitmap_max
       FILE: libgcore/gcore_x86.c  LINE: 846  FUNCTION: ioperm_active()
 
Failed.

The above failure is caused by the kernel commit: 577d5cd7e585 ("x86/ioperm:
Move iobitmap data into a struct").

That is to say, even if we don't make any changes, the crash extension tarball
will need to be updated frequently, especially for the Fedora(btw: we are planning
to add the crash extension packages to Fedora.), because the Fedora kernel is
closely related to the latest kernel changes.

> I think this way would cost distributors some effort to change their release
> operations like above.  Users also will lose some convenience, even if we
> adopt a warning, usually they will not be able to use an extension emitting
> it, though they can ignore it in almost all cases..
> 
> I wonder if there are merits enough to warn and cost them for now.
> 
It really need to weight the advantage and disadvantage. And that is what we would
like to discuss further.

> But the implementation is nice, and might be useful as debugging information
> in case something happens.  Is it acceptable to move it into CRASHDEBUG(1)
> and "help -e"?
> 
Before making any changes, the warning could be better than nothing.

Thanks.
Lianbo

> Thanks,
> Kazu
> 
>>
>>> If we want to keep binary compatibility, we need to add members to the
>>> end. That's a rule maintainers can enforce but as we all make mistakes,
>>
>> Recently, we are considering putting the crash guidelines on the wiki in
>> order to help developers avoid making mistakes.
>>
>>> code can help us here. Changing a shared data structure has, at least,
>>> implications on its size. So we can use this as a base to build checks
>>> on. If the size of a data structure used during compilation of an
>>> extension differs from the size of that same data structure as seen
>>> within 'crash', we have an ABI issue. The response should be, at least,
>>> a warning to inform the user to recompile its extension to ensure ABI
>>> compatibility. Another option would be to decline loading the extension
>>> but that might be inconvenient for users, if we indeed made sure to add
>>> members only at the end of the involved data structures.
>>>
>>> These checks can be implemented "transparently" as sketched in the
>>> attached patch. It changes register_extension() to be a macro that
>>> stores the sizes of critical types in a data structure that gets passed
>>> to _register_extension() which is implemented in 'crash' and will do the
>>> size checks against its version of the involved types.
>>>
>>> The patch also takes care about backward compatibility by providing a
>>> stub register_extension() function that passes a NULL structure, as all
>>> current extension modules refer to register_extension() instead of the
>>> newly introduced _register_extension().
>>>
>>> Open TODOs:
>>> 1/ Compile a list of all types we want to protect and add checks for
>>>    these.
>>> 2/ Put these types into a 'shared.h' header file and use that
>>>    exclusively within extensions -- maybe even enforce it.
>>>
>>> Do we want to follow that approach or look for alternatives?
>>>
>>
>> Let's wait for more comments or suggestions.
>>
>> Thanks.
>> Lianbo
>>
>>> Regards,
>>> Mathias
>>>
>>> [1]: https://www.redhat.com/archives/crash-utility/2020-August/msg00024.html
>>>
>>
>> --
>> Crash-utility mailing list
>> Crash-utility at redhat.com
>> https://www.redhat.com/mailman/listinfo/crash-utility




More information about the Crash-utility mailing list