[Crash-utility] Re: Fix for source line numbers for x86_64 modules

Wright, John (Telco Linux, Fort Collins) john.wright at hp.com
Wed Sep 23 18:41:21 UTC 2009


On Mon, Sep 21, 2009 at 04:24:58PM -0600, Bob Montgomery wrote:
> On Thu, 2009-09-17 at 12:55 +0000, Dave Anderson wrote:
> > ----- "Bob Montgomery" <bob.montgomery at hp.com> wrote:
> > 
> > > This patch allows the dis -l command to show real source line numbers
> > > for module code, instead of this sort of thing:
> > > 
> 
> > I haven't tested this patch or taken the time to understand it, 
> > but I'm taking it on good faith that it solves this bug, which
> > has been an elephant in the room since (I believe) the 2.6.21
> > timeframe. 
> 
> It will be nice to get some independent verification.
> 
> In the meantime, we've found a couple of other things.
> 
> 1)  There is a class of module routine whose info does not
> show up where this version of gdb checks, and those routines
> will still not show line numbers.  They are:
> 
>    a) routines declared with __devexit
>    b) routines declared with __devinit
>    c) routines in module_exit() MACROs
>    d) routines in module_init() MACROs
> 
> In our particular 2.6.29-based test kernel, we load 70 modules.
> Before this patch, we had 5507 routines that would not show
> line numbers (reported with "include/linux/cpumask.h: 612" instead
> of the real line).  
> 
> After the patch, we have only 89 routines that won't show 
> line numbers correctly for the above reasons.  For example:
> ...
> ffffffffa006bc22 (t) e1000_exit_module  include/linux/cpumask.h: 612
> ffffffffa006bc45 (t) e1000_remove  include/linux/cpumask.h: 612
> ffffffffa0078188 (t) sha1_generic_mod_fini  include/linux/cpumask.h: 612
> ffffffffa008af54 (t) bnx2_cleanup  include/linux/cpumask.h: 612
> ffffffffa008af66 (t) bnx2_remove_one  include/linux/cpumask.h: 612
> ffffffffa008afbe (t) bnx2_init_one  include/linux/cpumask.h: 612
> ffffffffa00b3210 (t) cciss_init_one  include/linux/cpumask.h: 612
> ...
> 
> 
> 2)  John took a look at a new version of gdb, and it appears
> to have some redesign in the symbol stuff that corrects 
> perhaps all of this.  I'll let him summarize what he found.
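
As a rough way to reproduce the before/after counts quoted above, something like the following could be used. It is only a sketch: it assumes the per-symbol listing (in the "address (type) symbol  file: line" form shown above) has already been saved to a text file, and the function name and file name are made up for illustration.

```shell
#!/bin/bash
# Count the routines whose reported source location is the bogus
# "include/linux/cpumask.h: 612" placeholder rather than a real line.
#
# Assumption: $1 is a saved listing with one symbol per line, e.g.
#   ffffffffa00b3210 (t) cciss_init_one  include/linux/cpumask.h: 612
count_bogus_lines() {
    # -F: match the location as a fixed string, -c: print the match count
    grep -cF -- 'include/linux/cpumask.h: 612' "$1"
}

# Example (hypothetical file name):
#   count_bogus_lines symlines.txt
```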

Sorry for the delay.  I'm testing this by running gdb directly on the
running kernel, using /proc/kcore as the core file.  To load symbols
from a module, I'm using this script:

  #!/bin/bash
  #
  # Usage: ./gdbline module image
  #
  # Outputs an add-symbol-file line suitable for pasting into gdb to examine
  # a loaded module.
  #
  # Bail out if the module is not loaded (no sections directory).
  cd "/sys/module/$1/sections" || exit 1
  echo -n "add-symbol-file $2 $(/bin/cat .text)"
  
  # Hidden (dot) sections first, then the rest; .text was handled above.
  for section in .[a-z]* *; do
      if [ "$section" != ".text" ]; then
          echo -n " -s $section $(/bin/cat "$section")"
      fi
  done
  echo

(This is based on [1], but emits a single-line command, since a
multiline command corrupted the heap until some CVS commits made today [2].)
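
A possible variant, sketched here as a shell function, leaves out .strtab and .symtab. The separate debug .ko used in the sessions below has neither section, so gdb only warns about them. This assumes dropping them does not affect line lookups, which I have not verified. The function name and the optional third argument (an alternate sysfs root, handy for trying it without a loaded module) are made up for illustration.

```shell
#!/bin/bash
# Usage: gdbline_filtered MODULE IMAGE [SYSFS_ROOT]
#
# Prints an add-symbol-file line like the gdbline script, but skips
# .strtab and .symtab (assumption: gdb does not need them here).
gdbline_filtered() {
    local module=$1 image=$2 root=${3:-/sys/module}
    local dir="$root/$module/sections"
    local path name line

    [ -r "$dir/.text" ] || { echo "cannot read $dir/.text" >&2; return 1; }
    line="add-symbol-file $image $(cat "$dir/.text")"

    for path in "$dir"/.[a-z]* "$dir"/*; do
        [ -f "$path" ] || continue          # skip globs that matched nothing
        name=${path##*/}                    # strip the directory part
        case $name in
            .text|.strtab|.symtab) continue ;;
        esac
        line="$line -s $name $(cat "$path")"
    done
    echo "$line"
}
```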

So, with that out of the way, this is what I get for cciss with gdb
6.8-3 on Debian/amd64:

  $ sudo gdb /usr/lib/debug/vmlinux-2.6.29-clim-3-amd64 /proc/kcore
  [sudo] password for jswright: 
  GNU gdb 6.8-debian
  Copyright (C) 2008 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-linux-gnu"...
  
  warning: core file may not match specified executable file.
  Core was generated by `root=/dev/cciss/c0d0p1 ro console=ttyS1,115200n8 mem=4G crashkernel=128M@16M qui'.
  [New process 0]
  #0  0x0000000000000000 in ?? ()
  (gdb) add-symbol-file /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko 0xffffffffa007b000 -s .bss 0xffffffffa008feb0 -s .data 0xffffffffa0086f10 -s .devexit.text 0xffffffffa0082502 -s .devinit.text 0xffffffffa0081210 -s .exit.text 0xffffffffa00828cc -s .gnu.linkonce.this_module 0xffffffffa008fcb0 -s .init.text 0xffffffffa0006000 -s .note.gnu.build-id 0xffffffffa008292c -s .parainstructions 0xffffffffa0084b18 -s .rodata 0xffffffffa0082950 -s .rodata.str1.1 0xffffffffa0083280 -s .smp_locks 0xffffffffa0084bc8 -s .strtab 0xffffffffa0086208 -s .symtab 0xffffffffa0084c60 -s __bug_table 0xffffffffa0084bd8
  add symbol table from file "/usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko" at
  	.text_addr = 0xffffffffa007b000
  	.bss_addr = 0xffffffffa008feb0
  	.data_addr = 0xffffffffa0086f10
  	.devexit.text_addr = 0xffffffffa0082502
  	.devinit.text_addr = 0xffffffffa0081210
  	.exit.text_addr = 0xffffffffa00828cc
  	.gnu.linkonce.this_module_addr = 0xffffffffa008fcb0
  	.init.text_addr = 0xffffffffa0006000
  	.note.gnu.build-id_addr = 0xffffffffa008292c
  	.parainstructions_addr = 0xffffffffa0084b18
  	.rodata_addr = 0xffffffffa0082950
  	.rodata.str1.1_addr = 0xffffffffa0083280
  	.smp_locks_addr = 0xffffffffa0084bc8
  	.strtab_addr = 0xffffffffa0086208
  	.symtab_addr = 0xffffffffa0084c60
  	__bug_table_addr = 0xffffffffa0084bd8
  (y or n) y
  Reading symbols from /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko...warning: section .strtab not found in /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko
  warning: section .symtab not found in /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko
  done.
  (gdb) info line cciss_init_one
  No line number information available for address 
    0xffffffffa0081210 <cciss_init_one>

With gdb 6.8.50.20090628-4 on Debian/amd64:

  # gdb /usr/lib/debug/vmlinux-2.6.29-clim-3-amd64 /proc/kcore
  GNU gdb (GDB) 6.8.50.20090628-cvs-debian
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  
  warning: core file may not match specified executable file.
  Core was generated by `root=/dev/cciss/c0d0p1 ro console=ttyS1,115200n8 mem=4G crashkernel=128M@16M qui'.
  #0  0x0000000000000000 in ?? ()
  (gdb) add-symbol-file /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko 0xffffffffa007b000 -s .bss 0xffffffffa008feb0 -s .data 0xffffffffa0086f10 -s .devexit.text 0xffffffffa0082502 -s .devinit.text 0xffffffffa0081210 -s .exit.text 0xffffffffa00828cc -s .gnu.linkonce.this_module 0xffffffffa008fcb0 -s .init.text 0xffffffffa0006000 -s .note.gnu.build-id 0xffffffffa008292c -s .parainstructions 0xffffffffa0084b18 -s .rodata 0xffffffffa0082950 -s .rodata.str1.1 0xffffffffa0083280 -s .smp_locks 0xffffffffa0084bc8 -s .strtab 0xffffffffa0086208 -s .symtab 0xffffffffa0084c60 -s __bug_table 0xffffffffa0084bd8
  add symbol table from file "/usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko" at
  	.text_addr = 0xffffffffa007b000
  	.bss_addr = 0xffffffffa008feb0
  	.data_addr = 0xffffffffa0086f10
  	.devexit.text_addr = 0xffffffffa0082502
  	.devinit.text_addr = 0xffffffffa0081210
  	.exit.text_addr = 0xffffffffa00828cc
  	.gnu.linkonce.this_module_addr = 0xffffffffa008fcb0
  	.init.text_addr = 0xffffffffa0006000
  	.note.gnu.build-id_addr = 0xffffffffa008292c
  	.parainstructions_addr = 0xffffffffa0084b18
  	.rodata_addr = 0xffffffffa0082950
  	.rodata.str1.1_addr = 0xffffffffa0083280
  	.smp_locks_addr = 0xffffffffa0084bc8
  	.strtab_addr = 0xffffffffa0086208
  	.symtab_addr = 0xffffffffa0084c60
  	__bug_table_addr = 0xffffffffa0084bd8
  (y or n) y
  Reading symbols from /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko...warning: section .strtab not found in /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko
  warning: section .symtab not found in /usr/lib/debug/lib/modules/2.6.29-clim-3-amd64/kernel/drivers/block/cciss.ko
  done.
  (gdb) info line cciss_init_one
  Line 3597 of "drivers/block/cciss.c"
     starts at address 0xffffffffa0081210 <cciss_init_one>
     and ends at 0xffffffffa0081227 <cciss_init_one+23>.
  (gdb) dir /usr/src/linux-source-2.6.29-clim
  Source directories searched: /usr/src/linux-source-2.6.29-clim:$cdir:$cwd
  (gdb) info line drivers/block/cciss.c:3597
  Line 3597 of "drivers/block/cciss.c"
     starts at address 0xffffffffa0081210 <cciss_init_one>
     and ends at 0xffffffffa0081227 <cciss_init_one+23>.
  (gdb) l cciss_init_one
  3592	 *  stealing all these major device numbers.
  3593	 *  returns the number of block devices registered.
  3594	 */
  3595	static int __devinit cciss_init_one(struct pci_dev *pdev,
  3596					    const struct pci_device_id *ent)
  3597	{
  3598		int i;
  3599		int j = 0;
  3600		int rc;
  3601		int dac, return_code;

I haven't systematically tried this on every module symbol, but every
one I have tried works in the 6.8.50.20090628-4 version and not in
6.8-4.  This changeset [3] seems to be the one that fixes things, but
it's not a trivial backport to gdb-6.1.  If I actually understood the
problem better, maybe I could just pull the relevant bits out of that
patch...
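
A systematic check over every text symbol in a module could be scripted roughly as follows. This is an untested sketch: the helper names are hypothetical, it assumes the usual three-column "address type name" output of nm, and the paths in the usage comment are placeholders.

```shell
#!/bin/bash
# text_symbols: read `nm` output on stdin and print the names of
# text symbols (type t or T).
text_symbols() {
    awk '$2 ~ /^[tT]$/ { print $3 }'
}

# info_line_cmds: turn symbol names on stdin into gdb "info line"
# commands, one per line, suitable for a gdb command file.
info_line_cmds() {
    local sym
    while read -r sym; do
        if [ -n "$sym" ]; then
            echo "info line $sym"
        fi
    done
}

# Possible usage (placeholder paths; batch mode answers the
# add-symbol-file confirmation prompt automatically):
#
#   { ./gdbline cciss /path/to/cciss.ko
#     nm --defined-only /path/to/cciss.ko | text_symbols | info_line_cmds
#   } > cmds.gdb
#   gdb -batch -x cmds.gdb /path/to/vmlinux /proc/kcore 2>&1 |
#       grep -c 'No line number information'
```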

 [1]: http://lkml.indiana.edu/hypermail/linux/kernel/0406.0/0802.html
 [2]: http://sourceware.org/bugzilla/show_bug.cgi?id=10684
 [3]: http://sourceware.org/git/gitweb.cgi?p=gdb.git;a=commitdiff;h=b5830bae4e60b65f37560ce1ede15e8036c20f2d

-- 
+----------------------------------------------------------+
| John Wright <john.wright at hp.com>                         |
| HP Mission Critical OS Enablement & Solution Test (MOST) |
+----------------------------------------------------------+



