[Crash-utility] crash: invalid structure member offset

Koornstra, Reinoud koornstra at hp.com
Thu Aug 12 21:03:27 UTC 2010


> -----Original Message-----
> From: crash-utility-bounces at redhat.com [mailto:crash-utility-bounces at redhat.com] On Behalf Of Dave Anderson
> Sent: Thursday, August 12, 2010 12:18 PM
> To: Discussion list for crash utility usage, maintenance and development
> Subject: Re: [Crash-utility] crash: invalid structure member offset
> 
> 
> ----- "Reinoud Koornstra" <koornstra at hp.com> wrote:
> 
> > Thanks,
> >
> > Using crash 5.0.6 worked nicely.
> > However, I can't really look at a lot because of a bad EIP code.
> >
> > [  726.601381] 802.1Q VLAN Support v1.8 Ben Greear <greearb at candelatech.com>
> > [  726.601384] All bugs added by David S. Miller <davem at redhat.com>
> > [  726.646757] BUG: unable to handle kernel NULL pointer dereference at 00000000
> > [  726.732410] IP: [<00000000>]
> > [  726.766933] *pdpt = 0000000000431001 *pde = 0000000000000000
> > [  726.766937] Oops: 0010 [#1] SMP
> > [  726.790844] Modules linked in: 8021q iptable_filter ip_tables
> > x_tables ip_gre af_packet i2c_dev i2c_qs i2c_algo_bit i2c_core garp
> > stp llc ixgbe inet_lro psmouse serio_raw intel_agp shpchp iTCO_wdt
> > pci_hotplug iTCO_vendor_support agpgart ext3 jbd mbcache sd_mod
> > crc_t10dif sg ata_piix ata_generic ahci libata scsi_mod ehci_hcd
> > uhci_hcd usbcore [last unloaded: 8021q]
> > [  726.790844]
> > [  726.790844] Pid: 4, comm: ksoftirqd/0 Tainted: P          (2.6.27)
> > [  726.790844] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> > [  726.790844] EIP is at 0x0
> > [  726.790844] EAX: e7f4c498 EBX: 00000000 ECX: 77470000 EDX: e7f4c498
> > [  726.790844] ESI: 4bd1d300 EDI: 00000007 EBP: f784df88 ESP: f784df78
> > [  726.790844]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [  726.790844] Process ksoftirqd/0 (pid: 4, ti=f784c000 task=f783a5b0 task.ti=f784c000)
> > [  726.790844] Stack: 40168080 00000001 403daaa0 4042c500 f784df90 401681bf f784dfb0 4012fe92
> > [  726.790844]        0000000a 00000000 40429340 00000246 00000000 40130120 f784dfbc 4012ff55
> > [  726.790844]        4042c500 f784dfcc 40130182 fffffffc 00000000 f784dfe0 4013e707 4013e6c0
> > [  726.790844] Call Trace:
> > [  726.790844]  [<40168080>] ? __rcu_process_callbacks+0x70/0x190
> > [  726.790844]  [<401681bf>] ? rcu_process_callbacks+0x1f/0x40
> > [  726.790844]  [<4012fe92>] ? __do_softirq+0x82/0x100
> > [  726.790844]  [<40130120>] ? ksoftirqd+0x0/0xe0
> > [  726.790844]  [<4012ff55>] ? do_softirq+0x45/0x50
> > [  726.790844]  [<40130182>] ? ksoftirqd+0x62/0xe0
> > [  726.790844]  [<4013e707>] ? kthread+0x47/0x80
> > [  726.790844]  [<4013e6c0>] ? kthread+0x0/0x80
> > [  726.790844]  [<4010494f>] ? kernel_thread_helper+0x7/0x10
> > [  726.790844]  =======================
> > [  726.790844] Code:  Bad EIP value.
> > [  726.790844] EIP: [<00000000>] 0x0 SS:ESP 0068:f784df78
> >
> > So now I can't figure out the piece of code where this dereferencing
> > occurred. :(
> 
> Yeah, I don't know why the exception frame wasn't displayed in the
> bt output below, but I think it may have been confusion due to the
> kernel text region starting at 0x40000000 (instead of the typical
> 3G/1G user/kernel virtual address split).  I'm guessing your kernel
> is configured as 1G/3G user/kernel?

That's right, the kernel is configured as 1G/3G user/kernel.
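
To make the split concrete, here is a minimal sketch of the arithmetic, assuming a 32-bit virtual address space and that the kernel occupies everything above the user portion. The helper names (page_offset, is_kernel_address) are made up for illustration and are not crash or kernel APIs. It shows why a return address such as 401681bf from the call trace is a kernel address with a 1G/3G split but would look like a user address to a tool that assumes 3G/1G.

```python
GIB = 1 << 30
ADDRESS_SPACE = 1 << 32  # 32-bit virtual address space (assumption: i386)


def page_offset(user_gib: int) -> int:
    """Lowest kernel virtual address when user space gets `user_gib` GiB."""
    return user_gib * GIB


def is_kernel_address(addr: int, user_gib: int) -> bool:
    """True if `addr` lies in the kernel part of the address space."""
    return page_offset(user_gib) <= addr < ADDRESS_SPACE


rcu_return = 0x401681BF  # rcu_process_callbacks+0x1f from the call trace

print(hex(page_offset(3)))               # typical 3G/1G user/kernel split
print(hex(page_offset(1)))               # 1G/3G user/kernel split used here
print(is_kernel_address(rcu_return, 3))  # judged with the typical split
print(is_kernel_address(rcu_return, 1))  # judged with this kernel's split
```

With the typical split the boundary is 0xc0000000, with this config it is 0x40000000, so the same address is classified differently depending on which split is assumed.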

> (I've never seen that before...)

It's a weird config indeed. I'll try rewriting some stuff so it consumes far less memory, so that a normal user/kernel split can be used.
Nevertheless, why the pointer became NULL remains unsolved for the moment. :-)
Would the user/kernel split also be an issue on 64-bit?

Reinoud.

> 
> Anyway, somehow the EIP got zeroed out, and it took a fault trying
> to handle that.  That can happen if a kernel function corrupts its
> own stack by incorrectly writing to its own local stack variables,
> and in so doing writes a zero into the return address saved on the
> stack.  Then when the function returns, that zero is loaded into the
> EIP, and you'd see something like the above.
> 
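
The mechanism described above can be illustrated with a toy model. This is only a sketch: it models a stack frame as a Python bytearray with an assumed layout (local buffer, then saved EBP, then saved return address), and the function names are hypothetical. It is not kernel code and does not reproduce the actual bug.

```python
import struct

WORD = 4  # 32-bit little-endian words, as on i386


def make_frame(buf_size: int, return_address: int) -> bytearray:
    """Toy frame layout: [local buffer][saved EBP][saved return address]."""
    frame = bytearray(buf_size)
    frame += struct.pack("<I", 0xF784DF88)      # saved EBP (value taken from the oops)
    frame += struct.pack("<I", return_address)  # saved return address
    return frame


def buggy_clear(frame: bytearray, count: int) -> None:
    """Zero `count` bytes from the start of the frame, ignoring the buffer size."""
    for i in range(count):
        frame[i] = 0


def saved_return_address(frame: bytearray) -> int:
    """Read back the last word of the frame, where the return address lives."""
    return struct.unpack("<I", frame[-WORD:])[0]


frame = make_frame(16, 0x401681BF)
buggy_clear(frame, 16)                    # stays inside the 16-byte buffer
print(hex(saved_return_address(frame)))   # return address intact
buggy_clear(frame, 24)                    # runs 8 bytes past the buffer
print(hex(saved_return_address(frame)))   # return address is now zero
```

In the second call the write runs over the saved EBP and the return address, so a "ret" from such a frame would load zero into the EIP.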
> The exception frame in the log shows that the ESP is f784df78,
> and looking at the trace data below, it looks like
> rcu_process_callbacks() may have ended up calling something that
> led to the EIP corruption.
> 
> Just a guess though...
> 
> Dave
> 
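
As a cross-check of the oops against the bt output, here is a small sketch that walks the raw "Stack:" words upward from the ESP and reports the stack slot holding each address listed in the "Call Trace". The data is copied from the log; the function name locate_trace_entries is hypothetical, and the only assumption is that the dump starts at the ESP with 4-byte words.

```python
ESP = 0xF784DF78  # from "EIP: [<00000000>] 0x0 SS:ESP 0068:f784df78"

# Raw words from the "Stack:" lines of the oops, lowest address first.
STACK_WORDS = """
40168080 00000001 403daaa0 4042c500 f784df90 401681bf f784dfb0 4012fe92
0000000a 00000000 40429340 00000246 00000000 40130120 f784dfbc 4012ff55
4042c500 f784dfcc 40130182 fffffffc 00000000 f784dfe0 4013e707 4013e6c0
""".split()

# Addresses from the "Call Trace" section that also appear in the dump.
CALL_TRACE = {
    0x40168080: "__rcu_process_callbacks+0x70",
    0x401681BF: "rcu_process_callbacks+0x1f",
    0x4012FE92: "__do_softirq+0x82",
    0x40130120: "ksoftirqd+0x0",
    0x4012FF55: "do_softirq+0x45",
    0x40130182: "ksoftirqd+0x62",
    0x4013E707: "kthread+0x47",
    0x4013E6C0: "kthread+0x0",
}


def locate_trace_entries(esp, words, trace):
    """Return (slot address, value, symbol) for each stack word found in the trace."""
    hits = []
    for index, word in enumerate(words):
        value = int(word, 16)
        if value in trace:
            hits.append((esp + 4 * index, value, trace[value]))
    return hits


for slot, value, symbol in locate_trace_entries(ESP, STACK_WORDS, CALL_TRACE):
    print(f"{slot:08x}: {value:08x}  {symbol}")
```

The slots found for rcu_process_callbacks, __do_softirq, do_softirq and kthread (f784df8c, f784df94, f784dfb4, f784dfd0) line up with the frame addresses in the bt output, and the word at the ESP itself falls inside __rcu_process_callbacks, which fits the guess that the bad jump came from somewhere under rcu_process_callbacks().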
> >
> > crash>  bt
> > PID: 4      TASK: f783a5b0  CPU: 0   COMMAND: "ksoftirqd/0"
> >  #0 [f784de88] crash_kexec at 401534a8
> >  #1 [f784df28] __slab_free at 4019677f
> >  #2 [f784df8c] rcu_process_callbacks at 401681ba
> >  #3 [f784df94] __do_softirq at 4012fe90
> >  #4 [f784dfb4] do_softirq at 4012ff50
> >  #5 [f784dfd0] kthread at 4013e705
> >  #6 [f784dfe4] kernel_thread_helper at 4010494d
> >
> > Thanks,
> >
> > Reinoud.
> >
> >
> > > -----Original Message-----
> > > From: crash-utility-bounces at redhat.com [mailto:crash-utility-bounces at redhat.com] On Behalf Of Dave Anderson
> > > Sent: Thursday, August 12, 2010 6:14 AM
> > > To: Discussion list for crash utility usage, maintenance and development
> > > Subject: Re: [Crash-utility] crash: invalid structure member offset
> > >
> > >
> > > ----- "Reinoud Koornstra" <koornstra at hp.com> wrote:
> > >
> > > > Hi Everyone,
> > > >
> > > > I am trying to read a core file into crash, but I'm out of luck,
> > > > as you can see below.
> > > > Is the core file corrupt? It is a vmcore file from a 32-bit kernel
> > > > that was compiled with PAE; could that have corrupted things?
> > > > Any hints here?
> > > > Thanks,
> > > >
> > > > Reinoud.
> > > >
> > > > $ crash System.map-2.6.27 ./vmlinux-2.6.27 ./vmcore
> > > >
> > > > crash 4.0-3.7
> > >
> > > I don't know if the vmcore is corrupt, but PAE wouldn't be an issue.
> > >
> > > However, you are running a version of crash that was released almost
> > > 4 years ago (13-Oct-2006) against a two-year-old kernel that was
> > > released 15-Oct-2008.  That's pretty much a guarantee of failure.
> > >
> > > Try updating to version 5.0.6 and see what happens.
> > >
> > > And BTW, if the vmlinux file is the exact same kernel as the
> > > one that generated the vmcore file, you don't need a System.map
> > > argument.
> > >
> > > Dave
> > >
> > >
> > > > Copyright   2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
> > > > Copyright   2004, 2005, 2006  IBM Corporation
> > > > Copyright   1999-2006  Hewlett-Packard Co
> > > > Copyright   2005  Fujitsu Limited
> > > > Copyright   2005  NEC Corporation
> > > > Copyright   1999, 2002  Silicon Graphics, Inc.
> > > > Copyright   1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> > > > This program is free software, covered by the GNU General Public License,
> > > > and you are welcome to change it and/or distribute copies of it under
> > > > certain conditions.  Enter "help copying" to see the conditions.
> > > > This program has absolutely no warranty.  Enter "help warranty" for details.
> > > >
> > > > GNU gdb 6.1
> > > > Copyright 2004 Free Software Foundation, Inc.
> > > > GDB is free software, covered by the GNU General Public License, and you are
> > > > welcome to change it and/or distribute copies of it under certain conditions.
> > > > Type "show copying" to see the conditions.
> > > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > > This GDB was configured as "i686-pc-linux-gnu"...
> > > >
> > > > please wait... (gathering kmem slab cache data)
> > > >
> > > > crash: invalid structure member offset: kmem_cache_s_c_num
> > > >        FILE: memory.c  LINE: 6891  FUNCTION: kmem_cache_init()
> > > >
> > > > [/usr/bin/crash] error trace: 80827a9 => 8095398 => 80aa7ef => 8131e88
> > > > /usr/bin/nm: /usr/bin/crash: no symbols
> > > > /usr/bin/nm: /usr/bin/crash: no symbols
> > > > /usr/bin/nm: /usr/bin/crash: no symbols
> > > > /usr/bin/nm: /usr/bin/crash: no symbols
> > > >
> > > > WARNING: Because this kernel was compiled with gcc version 4.1.2, certain
> > > >          commands or command options may fail unless crash is invoked with
> > > >          the  "--readnow" command line option.
> > >
> > > --
> > > Crash-utility mailing list
> > > Crash-utility at redhat.com
> > > https://www.redhat.com/mailman/listinfo/crash-utility



