[Crash-utility] cannot load slab info from 3.11 dump because of invalid pointer in kmem_cache

Dave Anderson anderson at redhat.com
Thu Aug 22 19:57:59 UTC 2013



----- Original Message -----
> 
> 
> ----- Original Message -----
> > Hi, Dave
> > 
> > On Wed, Aug 21, 2013 at 12:16 PM, Dave Anderson <anderson at redhat.com>
> > wrote:
> > >
> > > ----- Original Message -----
> > >> Hi,
> > >>
> > >> It is not clear whether this is a 3.11 issue or just general memory
> > >> corruption, but I cannot load slab information from any of my 3.11
> > >> dumps. The slab info contains an invalid pointer, and "crash" drops
> > >> all slab information.
> > >
> > > Did all of your 3.11 dumps fail in a similar manner?
> > 
> > Initially I saw the same issue at least on 3 different crashes. So I
> > thought that it might be 3.11 specific.
> > 
> > But with a new dump that I just got, I no longer see the "invalid
> > kernel virtual address" message. Instead, when I run "kmem -S" I
> > get the following message:
> > 
> > ======================================================
> > kmem: invalid structure member offset: kmem_cache_s_lists
> >       FILE: memory.c  LINE: 8955  FUNCTION: do_slab_chain_percpu_v2()
> > 
> > [/usr/local/google/home/anatol/sources/opensource/crash/crash] error trace: 493f1d => 47eca0 => 517642 => 460b22
> > CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
> > 
> >   460b22: OFFSET_verify.part.28+71
> >   517642: OFFSET_verify+50
> >   47eca0: do_slab_chain_percpu_v2+96
> >   493f1d: dump_kmem_cache_percpu_v2+2205
> > 
> > kmem: invalid structure member offset: kmem_cache_s_lists
> >       FILE: memory.c  LINE: 8955  FUNCTION: do_slab_chain_percpu_v2()
> > ====================================================
> > 
> > 
> > 
> > 
> > I have no idea what it means. I'll try to find more time later
> > this week and look at the problem more deeply.
> 
> Any "invalid structure member offset" error means that
> the upstream kernel structures have changed.
> 
> A quick glance at the current upstream kernel shows that
> the kmem_cache.nodelists pointer has been renamed
> as part of the slab/slub/slob unification rework.
> 
> In the 3.6 CONFIG_SLAB dumpfile I have, the kmem_cache
> structure looks like this:
> 
>   crash> kmem_cache
>   struct kmem_cache {
>       unsigned int batchcount;
>       unsigned int limit;
>       unsigned int shared;
>       unsigned int size;
>       u32 reciprocal_buffer_size;
>       unsigned int flags;
>       unsigned int num;
>       unsigned int gfporder;
>       gfp_t allocflags;
>       size_t colour;
>       unsigned int colour_off;
>       struct kmem_cache *slabp_cache;
>       unsigned int slab_size;
>       unsigned int dflags;
>       void (*ctor)(void *);
>       const char *name;
>       struct list_head list;
>       int refcount;
>       int object_size;
>       int align;
>       struct kmem_list3 **nodelists;
>       struct array_cache *array[4096];
>   }
>   SIZE: 32896
>   crash>
> 
> where the "nodelists" pointer points to the end of the
> array_cache array[], where there are per-node array_cache
> pointers located following the per-cpu array_cache pointers.
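
To make that layout concrete, here is a minimal stand-alone sketch in C. It is not kernel or crash code: the struct name, the helper names (sketch_init, nodelists_slot) and the small NR_CPUS / MAX_NUMNODES values are assumptions chosen for illustration. It only shows how a "nodelists"-style pointer can be aimed at the slots of array[] that follow the per-cpu entries.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; real kernels use their configured values. */
#define NR_CPUS      8
#define MAX_NUMNODES 4

struct array_cache;   /* opaque here; contents do not matter for the layout */
struct kmem_list3;    /* opaque here */

/* Hypothetical, trimmed-down stand-in for the tail of struct kmem_cache. */
struct kmem_cache_sketch {
    struct kmem_list3 **nodelists;
    struct array_cache *array[NR_CPUS + MAX_NUMNODES];
};

/*
 * Point nodelists at the first slot after the per-cpu pointers.
 * nr_cpu_ids is the number of per-cpu slots actually in use.
 */
void sketch_init(struct kmem_cache_sketch *c, unsigned int nr_cpu_ids)
{
    assert(nr_cpu_ids >= 1 && nr_cpu_ids <= NR_CPUS);
    c->nodelists = (struct kmem_list3 **)&c->array[nr_cpu_ids];
}

/* Index within array[] at which the per-node pointers begin. */
size_t nodelists_slot(const struct kmem_cache_sketch *c)
{
    return (size_t)((struct array_cache *const *)c->nodelists - c->array);
}
```

With nr_cpu_ids set to 2, for example, the per-node pointers start at array[2], so a tool walking the per-node lists has to follow the pointer rather than assume a fixed offset.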
> 
> Upstream, the kmem_list3 structure looks to have been
> absorbed into the CONFIG_SLAB part of the generic
> "kmem_cache_node" structure, and its pointer name above
> has been changed from "nodelists" to "node":
>   
>   struct kmem_cache {
>   
>   ... [ cut ] ...
>   
>   /* 6) per-cpu/per-node data, touched during every alloc/free */
>           /*
>            * We put array[] at the end of kmem_cache, because we want to size
>            * this array to nr_cpu_ids slots instead of NR_CPUS
>            * (see kmem_cache_init())
>            * We still use [NR_CPUS] and not [1] or [0] because cache_cache
>            * is statically defined, so we reserve the max number of cpus.
>            *
>            * We also need to guarantee that the list is able to accomodate a
>            * pointer for each node since "nodelists" uses the remainder of
>            * available pointers.
>            */
>           struct kmem_cache_node **node;
>           struct array_cache *array[NR_CPUS + MAX_NUMNODES];
>           /*
>            * Do not add fields after array[]
>            */
>   };
> 
> So hopefully a few more bait-and-switch name changes
> similar to the patches you've been posting can handle
> the changes.
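
The kind of name switch described above can be sketched as a lookup that tries the old member name first and falls back to the new one. This is a self-contained illustration in C, not crash's actual implementation: struct member_desc, member_offset() and first_valid_offset() are hypothetical names, and the offsets used with them are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one structure member as read from debuginfo. */
struct member_desc {
    const char *name;
    long offset;
};

/* Return the offset of the named member, or -1 if the structure lacks it. */
long member_offset(const struct member_desc *tbl, size_t n, const char *name)
{
    assert(tbl != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) == 0)
            return tbl[i].offset;
    }
    return -1;
}

/*
 * Try each candidate name in order and return the first offset found,
 * or -1 if none of the names exists in this kernel's structure.
 */
long first_valid_offset(const struct member_desc *tbl, size_t n,
                        const char *const *candidates, size_t ncand)
{
    for (size_t i = 0; i < ncand; i++) {
        long off = member_offset(tbl, n, candidates[i]);
        if (off >= 0)
            return off;
    }
    return -1;
}
```

Called with the candidate list { "nodelists", "node" }, the same code path would resolve the member on both the older and the newer structure layout, and a result of -1 corresponds to the "invalid structure member offset" case.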

And after addressing the above, I saw this today on LKML, which will
wreak havoc with the CONFIG_SLAB support in crash:

  [PATCH 00/16] slab: overload struct slab over struct page to reduce memory usage
  https://lkml.org/lkml/2013/8/22/137

Dave



