[Crash-utility] Re: crash 4.0-8.9 w/ 2.6.30-rc6

Dave Anderson anderson at redhat.com
Wed May 27 13:23:44 UTC 2009


----- "Mike Snitzer" <snitzer at redhat.com> wrote:

> On Wed, May 27 2009 at  8:37am -0400,
> Dave Anderson <anderson at redhat.com> wrote:
> 
> > 
> > ----- "Mike Snitzer" <snitzer at redhat.com> wrote:
> > 
> > > Hi Dave,
> > > 
> > > crash is failing with the following when I try to throw a 2.6.30-rc6
> > > vmcore at it:
> > > 
> > > crash: invalid structure size: x8664_pda
> > >        FILE: x86_64.c  LINE: 584  FUNCTION: x86_64_cpu_pda_init()
> > > 
> > > [/usr/bin/crash] error trace: 449c7f => 4ce815 => 4d00cf => 50936d
> > > 
> > >   50936d: SIZE_verify+168
> > >   4d00cf: (undetermined)
> > >   4ce815: x86_64_init+3205
> > >   449c7f: main_loop+152
> > > 
> > > I can dig deeper but your help would be very much appreciated.
> > > 
> > > Mike
> > 
> > The venerable "been-there-since-the-beginning-of-x86_64" x8664_pda
> > data structure no longer exists.  It was a per-cpu array of a fundamental
> > data structure that things like "current", the per-cpu magic number, the
> > cpu number, the current kernel stack pointer, the per-cpu IRQ stack pointer,
> > etc. all came from:
> > 
> > /* Per processor datastructure. %gs points to it while the kernel runs */
> > struct x8664_pda {
> >         struct task_struct *pcurrent;   /* Current process */
> >         unsigned long data_offset;      /* Per cpu data offset from linker address */
> >         unsigned long kernelstack;  /* top of kernel stack for current */
> >         unsigned long oldrsp;       /* user rsp for system call */
> > #if DEBUG_STKSZ > EXCEPTION_STKSZ
> >         unsigned long debugstack;   /* #DB/#BP stack. */
> > #endif
> >         int irqcount;               /* Irq nesting counter. Starts with -1 */
> >         int cpunumber;              /* Logical CPU number */
> >         char *irqstackptr;      /* top of irqstack */
> >         int nodenumber;             /* number of current node */
> >         unsigned int __softirq_pending;
> >         unsigned int __nmi_count;       /* number of NMI on this CPUs */
> >         int mmu_state;
> >         struct mm_struct *active_mm;
> >         unsigned apic_timer_irqs;
> > } ____cacheline_aligned_in_smp;
> > 
> > There have been upstream rumblings about replacing it with a more efficient
> > per-cpu implementation for some time now, but I haven't studied how the new
> > scheme works yet.  It will be a major re-work for the crash utility, so you're
> > pretty much out of luck for now.  (Try "gdb vmlinux vmcore" for basic info)
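
To make the "per-cpu array" description above concrete, here is a rough
user-space sketch of the lookup pattern: one PDA slot per CPU, indexed by
CPU number.  It is not kernel or crash source.  The names pda_sketch,
cpu_pda, pda_init and pda_for_cpu, the NR_CPUS value and the data_offset
values are all made up for illustration; only the field names come from
the struct quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4   /* hypothetical CPU count for this sketch */

struct task_struct;  /* left opaque; only a pointer to it is stored */

/* Trimmed-down stand-in for struct x8664_pda (a subset of its fields). */
struct pda_sketch {
        struct task_struct *pcurrent;   /* Current process */
        unsigned long data_offset;      /* Per cpu data offset */
        unsigned long kernelstack;      /* top of kernel stack */
        int cpunumber;                  /* Logical CPU number */
        char *irqstackptr;              /* top of irqstack */
};

/* One slot per CPU: this is the "per-cpu array" being described. */
static struct pda_sketch cpu_pda[NR_CPUS];

/* Fill each slot with its own CPU number and a made-up data offset. */
static void pda_init(void)
{
        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
                cpu_pda[cpu].cpunumber = cpu;
                cpu_pda[cpu].data_offset = (unsigned long)cpu * 0x1000UL;
        }
}

/* Return the slot for a CPU, or NULL if the CPU number is out of range. */
static const struct pda_sketch *pda_for_cpu(int cpu)
{
        if (cpu < 0 || cpu >= NR_CPUS)
                return NULL;
        /* Sanity check: the slot index should match the stored CPU number. */
        assert(cpu_pda[cpu].cpunumber == cpu);
        return &cpu_pda[cpu];
}
```

After calling pda_init(), pda_for_cpu(2)->cpunumber gives back 2 and
pda_for_cpu(NR_CPUS) returns NULL.  Everything per-cpu hangs off one such
structure, which is why its removal breaks so much at initialization time.
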
> 
> Ah OK.  I was just looking to get a stack trace.  Unfortunately gdb
> isn't playing nice either:
> 
> (gdb) bt
> #0  kstat_irqs_cpu (irq=<value optimized out>, cpu=2) at kernel/irq/handle.c:555
> Cannot access memory at address 0xffff88007e5e7d50

Mike,

Try the "--minimal" option that the IBM guys put into 4.0-7.1:

         - Implementation of a "--minimal" command line option, which brings 
           up a crash session that is restricted to the "log", "dis", "rd", 
           "sym", "eval" and "exit" commands.  This option may provide a way to 
           extract some minimal/quick information from a corrupted or truncated 
           dumpfile, or in situations where one of the several kernel subsystem 
           initialization routines, which are not called, would abort the
           crash session.  (sharyath at in.ibm.com, sachinp at in.ibm.com)

So just enter this:

 $ crash --minimal vmlinux vmcore

And you should at least get the kernel trace info with the "log" command.

Dave



