[Crash-utility] Corrupted tee in crash gdb

Andi Kleen ak at linux.intel.com
Tue Dec 22 18:05:03 UTC 2020


Hi,

I have several crash dump files that reliably crash gdb inside crash.

This happens with compressed dumps and recent kernels.

There appears to be an error during symbol processing (usually when
looking up runqueue, but if you disable that lookup it fails on
something else), and gdb then crashes because the tee structure it
uses to print error messages is corrupted. The tee's fputs vector
(to_fputs) is NULL, so gdb jumps to address zero while trying to
print the error message. So somehow crash doesn't set this up properly.
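
To illustrate the failure mode, here is a minimal standalone sketch in C.
It is not gdb code: the sketch_* types and functions are hypothetical,
simplified stand-ins for gdb's ui_file and tee_file. It only shows how a
NULL function pointer in an output vector can be checked before the call
instead of being jumped through, assuming a fallback to stderr is
acceptable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for gdb's ui_file/tee_file.
   The real gdb 7.6 structures carry more fields and a magic number. */
struct sketch_ui_file;
typedef void (*sketch_fputs_fn)(const char *buf, struct sketch_ui_file *file);

struct sketch_ui_file {
    sketch_fputs_fn to_fputs;  /* output method; NULL if never set up */
    char captured[128];        /* last text written, kept for demonstration */
};

struct sketch_tee {
    struct sketch_ui_file *one;
    struct sketch_ui_file *two;
};

/* Example output method: records the text in the stream itself. */
static void sketch_capture_fputs(const char *buf, struct sketch_ui_file *file)
{
    snprintf(file->captured, sizeof file->captured, "%s", buf);
}

/* Write to one stream. Returns 1 if the stream's own method was used,
   0 if the method was missing and the text went to stderr instead.
   Calling a NULL to_fputs directly would jump to address zero. */
static int sketch_guarded_fputs(const char *buf, struct sketch_ui_file *file)
{
    if (file == NULL || file->to_fputs == NULL) {
        fputs(buf, stderr);
        return 0;
    }
    file->to_fputs(buf, file);
    return 1;
}

/* Duplicate the text to both halves of the tee. Returns how many
   halves were reached through their own method (0, 1 or 2). */
static int sketch_tee_fputs(const char *buf, struct sketch_tee *tee)
{
    assert(tee != NULL);
    return sketch_guarded_fputs(buf, tee->one)
         + sketch_guarded_fputs(buf, tee->two);
}
```

With one half of the tee left uninitialized, sketch_tee_fputs() still
delivers the message to the other half and reports that only one stream
was reached through its own method. It does not explain why the vector
is NULL in the first place.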

Using -minimal works, but that's too limiting.

I'm using the gdb patch below to work around it. It makes the tee output
functions return early, which suppresses gdb's error output.
That makes crash work well enough to look at most things.

It's probably not the correct fix, but at least it works for me.


diff -urp gdb-7.6-orig/gdb/ui-file.c gdb-7.6/gdb/ui-file.c
--- gdb-7.6-orig/gdb/ui-file.c	2020-12-22 09:48:27.532409801 -0800
+++ gdb-7.6/gdb/ui-file.c	2020-12-17 13:10:07.806799729 -0800
@@ -740,6 +740,8 @@ tee_file_flush (struct ui_file *file)
 {
   struct tee_file *tee = ui_file_data (file);
 
+  return;
+
   if (tee->magic != &tee_file_magic)
     internal_error (__FILE__, __LINE__,
 		    _("tee_file_flush: bad magic number"));
@@ -752,6 +754,8 @@ tee_file_write (struct ui_file *file, co
 {
   struct tee_file *tee = ui_file_data (file);
 
+  return;
+
   if (tee->magic != &tee_file_magic)
     internal_error (__FILE__, __LINE__,
 		    _("tee_file_write: bad magic number"));
@@ -764,9 +768,16 @@ tee_file_fputs (const char *linebuffer,
 {
   struct tee_file *tee = ui_file_data (file);
 
+  return;
+
   if (tee->magic != &tee_file_magic)
     internal_error (__FILE__, __LINE__,
 		    _("tee_file_fputs: bad magic number"));
+  if (!tee->one->to_fputs) {
+	  fputs(linebuffer, stdout);
+	  return;
+  }
+ 
   tee->one->to_fputs (linebuffer, tee->one);
   tee->two->to_fputs (linebuffer, tee->two);
 }




WARNING: kernel relocated [336MB]: patching 167186 gdb minimal_symbol values

please wait... (patching 167186 gdb minimal_symbol values) [Detaching after vfork from child process 1824671]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
... up

#10 0x000000000066e75a in c_parse_internal () at c-exp.y:442
442                                 error (_("%s is not an ObjC Class"),

(gdb) p *tee
$2 = {ssize_t (int, int, size_t, unsigned int)} 0x7ffff7cc4a30 <tee>


(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00000000007b3f09 in tee_file_fputs (
    linebuffer=0x23425f0 "No symbol \"runqueue\" in current context.\n",
    file=<optimized out>) at ui-file.c:770
#2  0x00000000007b071b in fputs_maybe_filtered (
    linebuffer=linebuffer@entry=0x23425f0 "No symbol \"runqueue\" in current context.\n",
    stream=stream@entry=0x16c3ff0, filter=1) at utils.c:2091
#3  0x00000000007b08c1 in vfprintf_maybe_filtered (stream=0x16c3ff0,
    format=format@entry=0xa8f375 "%s\n",
    args=args@entry=0x7ffffffe8828, filter=1, filter=1)
    at utils.c:2332
#4  0x00000000007b0a2c in vfprintf_filtered (args=0x7ffffffe8828,
    format=0xa8f375 "%s\n", stream=<optimized out>) at utils.c:2392
#5  fprintf_filtered (stream=<optimized out>,
    format=format@entry=0xa8f375 "%s\n") at utils.c:2392
#6  0x00000000006f84f8 in throw_exception (exception=...)
    at exceptions.c:234
#7  0x00000000006f874b in throw_it (reason=reason@entry=RETURN_ERROR,
    error=error@entry=GENERIC_ERROR, fmt=<optimized out>,
    ap=ap@entry=0x7ffffffe8978) at exceptions.c:434
#8  0x00000000006f8956 in throw_verror (error=error@entry=GENERIC_ERROR,
    fmt=<optimized out>, ap=ap@entry=0x7ffffffe8978) at exceptions.c:440
#9  0x00000000007af3e7 in error (string=<optimized out>) at utils.c:717
#10 0x000000000066e75a in c_parse_internal () at c-exp.y:442
#11 0x000000000066eab7 in c_parse () at c-exp.y:3064
#12 0x0000000000723091 in parse_exp_in_context (
    stringptr=stringptr@entry=0x7ffffffea708,
#13 0x00000000007232b8 in parse_exp_1 (
    stringptr=stringptr@entry=0x7ffffffea758, pc=pc@entry=0,
    block=block@entry=0x0, comma=comma@entry=0) at parse.c:1136
#14 0x00000000007232f9 in parse_expression (string=<optimized out>)
    at parse.c:1279
#15 0x00000000006cf083 in gdb_get_datatype (req=0xdf8b20 <shared_bufs>)
    at symtab.c:5361
#16 gdb_command_funnel (req=0xdf8b20 <shared_bufs>) at symtab.c:5210
#17 0x0000000000518d4f in gdb_interface (req=0xdf8b20 <shared_bufs>)
    at gdb_interface.c:397
#18 0x00000000005676dc in datatype_info (name=0x7ffffffec200 "runqueue",
    member=0x0, dm=0x7ffffffeb000) at symbols.c:5635
#19 0x000000000056abe5 in arg_to_datatype (s=0x7ffffffec200 "runqueue",
    dm=0x7ffffffeb000, flags=2) at symbols.c:6872
#20 0x000000000056f928 in get_array_length (s=0x90dcd6 "runqueue.cpu",
    two_dim=0x0, entry_size=0) at symbols.c:8481
#21 0x00000000004f1092 in kernel_init () at kernel.c:338
#22 0x00000000004658a8 in main_loop () at main.c:781
#23 0x00000000006fa433 in captured_command_loop (data=data@entry=0x0)
    at main.c:258
#24 0x00000000006f8c4a in catch_errors (
    func=func@entry=0x6fa420 <captured_command_loop>,
    func_args=func_args@entry=0x0, errstring=errstring@entry=0x9617cf "",
    mask=mask@entry=6) at exceptions.c:557
#25 0x00000000006fb406 in captured_main (data=data@entry=0x7fffffffd180)
    at main.c:1064



