[Crash-utility] [PATCH] extension: Fix crash segfaults when loading same extension with different names twice

Liu Tao ltao at redhat.com
Wed Mar 24 06:51:13 UTC 2021


Hello Kazu,

RTLD_DEEPBIND is not a good way. I'd like to recall the patch.
After my research, I find making functions and global variables static
within extensions can avoid such problems.

Taking extensions/trace.so as example, trace_fini and trace_init are
non-static functions, if we load trace.so first, everything works fine, but
when we rename trace.so to trace2.so and load it again, we are expecting
trace_init within trace2.so to be called, but since we have RTLD_GLOBAL,
actually it's the trace_init within trace.so to be called. It causes
problems because currently some global variables within trace.so are not in
their initial status.

As I quoting dlopen manual:
Symbol references in the shared object are resolved using (in order):
symbols in the link map of objects loaded for the main program and its
dependencies; symbols in shared objects (and their dependencies) that were
previously opened with dlopen() using the  RTLD_GLOBAL flag; and
definitions in the shared object itself (and any dependencies that were
loaded for that object).

When using readelf --dyn-syms trace.so, for non-static trace_fini and
trace_init, I got:
Symbol table '.dynsym' contains 66 entries:
    ....
    64: 0000000000007e65    22 FUNC    GLOBAL DEFAULT   11 trace_fini
    65: 0000000000007e3d    40 FUNC    GLOBAL DEFAULT   11 trace_init

Since I'm not an expert to linkers and loaders, I guess the process is:

1. when loading trace.so
2. ld.so sees trace_init and trace_fini in .dynsym
3. ld.so searches the 2 symbols in main program, RTLD_GLOBAL shared
objects, and trace.so.
4. ld.so finds them in trace.so, resolves their address.
5. everything works fine.

6. when loading trace2.so
7. ld.so sees trace_init and trace_fini in .dynsym
8. ld.so searches the 2 symbols in main program, RTLD_GLOBAL shared
objects, and trace2.so.
9. ld.so finds them in RTLD_GLOBAL shared objects(trace.so), resolves their
address.
10. trace_init in trace.so is called again, but global variables are not in
initial status, fails.

If I make trace_fini and trace_init static, readelf --dyn-syms trace.so
will no longer give the 2 symbols in .dynsym, and ld.so will no longer
resolve them to the wrong place, then it works fine.

I think this cannot be a code patch, but a document for future extension
developers, if you don't want to export a symbol to subsequent extensions,
making it static.

On Mon, Mar 22, 2021 at 1:35 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab at nec.com>
wrote:

> Hi Tao Liu,
>
> -----Original Message-----
> > If a same extension(Eg: extensions/trace.so) with two different names
> are loaded by
> > "extend" command twice, it sometimes segfaults crash.
> >
> > It's because crash uses RTLD_NOW|RTLD_GLOBAL flags of dlopen to load an
> extension.
> > RTDL_GLOBAL will make symbols defined by this shared object available for
> > symbol resolution of subsequently loaded shared objects. So symbols with
> the same
> > name will be exported from the former to the latter. In this case, when
> 2 extensions
> > only differ from file names, the subsequently loaded extension will have
> unexpected
> > initial values for global varibles.
>
> Thanks for the report.
>
> > This patch adds RTLD_DEEPBIND flag to dlopen, making extensions using its
> > own symbols preference to symbols with the same name contained by others.
>
> This looks a big API change for crash extension modules.
>
> As far as I've tested, getopt() in an extension module does not work well
> with this patch:
>
> # make extensions
>
> crash> extend extensions/echo.so
> ./extensions/echo.so: shared object loaded
> crash> echo test
> test
> crash> echo test
>
> crash> echo test test2
> test2
> crash> echo test test2
>
> crash> echo test
>
> crash> echo test test2
>
> crash> echo test test2 test3
> test3
>
> Can we fix this?  And probably all other modules using getopt() imitates
> this echo.c, they will also need to be fixed to adopt the patch.
> Also I'm concerned that there might be another regression.
>
> Do we need to fix the issue at these costs?  or is there any better way?
>
> Thanks,
> Kazu
>
> >
> > Signed-off-by: Tao Liu <ltao at redhat.com>
> > ---
> >  extensions.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/extensions.c b/extensions.c
> > index d23b1e3..e07f9a9 100644
> > --- a/extensions.c
> > +++ b/extensions.c
> > @@ -317,7 +317,7 @@ load_extension(char *lib)
> >          *  _init() function before dlopen() returns below.
> >       */
> >       pc->curext = ext;
> > -     ext->handle = dlopen(ext->filename, RTLD_NOW|RTLD_GLOBAL);
> > +     ext->handle = dlopen(ext->filename,
> RTLD_NOW|RTLD_GLOBAL|RTLD_DEEPBIND);
> >
> >       if (!ext->handle) {
> >               strcpy(buf, dlerror());
> > --
> > 2.29.2
> >
> > --
> > Crash-utility mailing list
> > Crash-utility at redhat.com
> > https://listman.redhat.com/mailman/listinfo/crash-utility
>
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20210324/d59c1653/attachment.htm>


More information about the Crash-utility mailing list