[Crash-utility] WARNING: malloc/free mismatch

Dave Anderson anderson at redhat.com
Thu Apr 18 18:05:33 UTC 2019



----- Original Message -----
> On Thu, 2019-04-18 at 11:09 -0400, Dave Anderson wrote:
> > 
> > ----- Original Message -----
> > > On Thu, 2019-04-18 at 15:02 +0100, Pierguido Lambri wrote:
> > > > Hello,
> > > > 
> > > > Today while I was looking into a vmcore, I got suddenly the
> > > > message
> > > > in $SUBJECT.
> > > > It started after I did a search into the process stack pages
> > > > (search
> > > > -t)
> > > > and for each command I run afterwards I kept getting that
> > > > message.
> > > > For example:
> > > > 
> > > > $ retrace-server-interact 603967269 crash
> > > > ...
> > > > crash> search -t ffff88040a0d5280
> > > > 
> > > > search: invalid list entry: 0
> > > > 
> > > > search: invalid list entry: 0
> > > > 
> > > > search: invalid list entry: 0
> > > > PID: 606    TASK: ffff88082d226eb0  CPU: 5   COMMAND:
> > > > "xfsaild/dm-0"
> > > > ffff88083ff5b948: ffff88040a0d5280
> > > > ffff88083ff5b990: ffff88040a0d5280
> > > > ffff88083ff5baa8: ffff88040a0d5280
> > > > ffff88083ff5baf0: ffff88040a0d5280
> > > > ffff88083ff5bcf0: ffff88040a0d5280
> > > > ffff88083ff5bd38: ffff88040a0d5280
> > > > ffff88083ff5bd98: ffff88040a0d5280
> > > > 
> > > > 
> > > > WARNING: malloc/free mismatch (29/32)
> > > > 
> > > > crash> ps -m | grep UN
> > > > [ 0 00:00:00.146] [UN]  PID: 1811   TASK: ffff880c17bd1fa0  CPU:
> > > > 1   COMMAND: "cp"
> > > > WARNING: malloc/free mismatch (29/32)
> > > > 
> > > > I guess this comes from a possible corrupted vmcore (I just got
> > > > it
> > > > from this vmcore),
> > > > but I wonder why every new command keeps returning the same
> > > > message.
> > > > 
> > > > Thanks,
> > > > 
> > > > Pier
> > > > 
> > > > --
> > > > Crash-utility mailing list
> > > > Crash-utility at redhat.com
> > > > https://www.redhat.com/mailman/listinfo/crash-utility
> > > 
> > > FWIW, I just pulled this up after plambri pinged me.  This is the
> > > backtrace that is being hit though I've not dug in more:
> > > 
> > > Breakpoint 3, do_list (ld=0x7ffffffea6c0) at tools.c:3820
> > > 3820                                    error(INFO, "\ninvalid list
> > > entry:
> > > 0\n");
> > > (gdb) list
> > > 3815                            return -1;
> > > 3816                    }
> > > 3817
> > > 3818                    if (next == 0) {
> > > 3819                            if (ld->flags & LIST_HEAD_FORMAT) {
> > > 3820                                    error(INFO, "\ninvalid list
> > > entry:
> > > 0\n");
> > > 3821                                    if (close_hq_on_return)
> > > 3822                                            hq_close();
> > > 3823                                    return -1;
> > > 3824                            }
> > > (gdb) bt
> > > #0  do_list (ld=0x7ffffffea6c0) at tools.c:3820
> > > #1  0x000000000047ec82 in dump_vmap_area (vi=0x7ffffffed0d0) at
> > > memory.c:8724
> > > #2  dump_vmlist (vi=0x7ffffffed0d0) at memory.c:8590
> > > #3  0x000000000047f3eb in last_vmalloc_address () at memory.c:16792
> > > #4  0x0000000000515e6b in x86_64_get_kvaddr_ranges
> > > (vrp=0x7fffffffd340) at x86_64.c:8706
> > > #5  0x000000000049c6ae in cmd_search () at memory.c:13988
> > > #6  0x0000000000465f9c in exec_command () at main.c:879
> > > #7  0x00000000004661ca in main_loop () at main.c:826
> > > #8  0x00000000006b21a3 in captured_command_loop (data=<value
> > > optimized out>) t main.c:258
> > > #9  0x00000000006b0a8b in catch_errors (func=0x6b2190
> > > <captured_command_loop>, func_args=0x0, errstring=0x90c106 "",
> > > mask=6) at exceptions.c:557
> > > #10 0x00000000006b3076 in captured_main (data=<value optimized
> > > out>) at main.c:1064
> > > #11 0x00000000006b0a8b in catch_errors (func=0x6b22b0
> > > <captured_main>, func_args=0x7fffffffe2e0, errstring=0x90c106 "",
> > > mask=6) at exceptions.c:557
> > > #12 0x00000000006b1fa4 in gdb_main (args=<value optimized out>) at
> > > main.c:1079
> > > #13 0x00000000006b1fde in gdb_main_entry (argc=<value optimized
> > > out>, argv=<value optimized out>) at main.c:1099
> > > #14 0x0000000000467030 in main (argc=3, argv=0x7fffffffe458) at
> > > main.c:707
> > 
> > Hmmm, the vmap_area list is a list_head type list, so there should
> > never be
> > a NULL "next" pointer.
> > 
> > I'm guessing that "kmem -v" also fails?  The last vmap_area entry
> > should point back to
> > the global "vmap_area_list" list header, for example:
> > 
> >   crash> kmem -v | tail
> >   ffff96e7ecaaca80  ffff96e54c89c400  ffffffffc0e54000 -
> > ffffffffc0e5a000    24576
> >   ffff96e757ffe380  ffff96e4be98f3c0  ffffffffc0e5d000 -
> > ffffffffc0e6d000    65536
> >   ffff96e467b33400  ffff96e6a3ae1a00  ffffffffc0e6d000 -
> > ffffffffc0e73000    24576
> >   ffff96e85cf4e600  ffff96e752c52b40  ffffffffc0e77000 -
> > ffffffffc0e7c000    20480
> >   ffff96e85cf4e380  ffff96e5506c6c00  ffffffffc0e7c000 -
> > ffffffffc0e81000    20480
> >   ffff96e802baa500  ffff96e5506c69c0  ffffffffc0e81000 -
> > ffffffffc0e86000    20480
> >   ffff96e802baac00  ffff96e5506c6cc0  ffffffffc0e86000 -
> > ffffffffc0e8c000    24576
> >   ffff96e574196f80  ffff96e55ffd6c80  ffffffffc0e90000 -
> > ffffffffc0e95000    20480
> >   ffff96e574196680  ffff96e55ffd6880  ffffffffc0e95000 -
> > ffffffffc0e9a000    20480
> >   ffff96e87c222800  ffff96e5496ca680  ffffffffc0e9a000 -
> > ffffffffc0ea4000    40960
> >   crash> vmap_area ffff96e87c222800
> >   struct vmap_area {
> >     va_start = 18446744072651120640,
> >     va_end = 18446744072651161600,
> >     flags = 4,
> >     rb_node = {
> >       __rb_parent_color = 18446628510972342169,
> >       rb_right = 0x0,
> >       rb_left = 0xffff96e574196698
> >     },
> >     list = {
> >       next = 0xffffffffae69af90,
> >       prev = 0xffff96e5741966b0
> >     },
> >     purge_list = {
> >       next = 0x0,
> >       prev = 0xdead000000000200
> >     },
> >     vm = 0xffff96e5496ca680,
> >     callback_head = {
> >       next = 0x0,
> >       func = 0xffff96e71d51aa00
> >     }
> >   }
> >   crash> sym 0xffffffffae69af90
> >   ffffffffae69af90 (D) vmap_area_list
> >   crash>
> >   
> > Dave
> > 
> > 
> > 
> 
> Yeah kmem -v fails as well:
> crash> kmem -v
> 
> kmem: invalid list entry: 0
> WARNING: malloc/free mismatch (29/30)
> crash>
> 
> 
> There's no indicating of an error when crash loads though - only after
> running these commands.  Do you think this a damaged vmcore that is not
> obvious?

I don't know it's damaged or if it's a symptom of the kernel crash.  Is the
kernel crash happening while the vmlist is being modified?

It's not obvious because it would only be seen when dump_vmlist() is called.
When dump_vmlist() calls dump_vmap_area(), and do_list() returns -1 back to
dump_vmap_area(), it gets used as a loop-ending index, and then causes 
presumably bogus values to get returned:

   do_vmap_area(struct meminfo *vi)
   {
   ...
  
          ld->end = symbol_value("vmap_area_list");
          cnt = do_list(ld);

          for (i = 0; i < cnt; i++) {
   ...   
  
          if (vi->flags & GET_HIGHEST)
                  vi->retval = start+size;

          if (vi->flags & GET_VMLIST_COUNT)
                  vi->retval = count;

          if (vi->flags & VMLIST_VERIFY)
                  vi->retval = verified;
  }

Mabye dump_vmap_area() should do a error(FATAL, ...) if cnt is -1?  Although, 
that would kill all search command attempts.  It's hard to say.

Dave



> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> 




More information about the Crash-utility mailing list