[Crash-utility] Re: Question about fixing another crash annoyance...t

Tue Sep 29 15:19:49 UTC 2009

----- "Bob Montgomery" <bob.montgomery at hp.com> wrote:

> Dave,
> 
> Please pardon the direct question, I'm attempting to cash in on my "dis
> -l" goodwill :-)
> 
> The latest problem I'm working on:
> 
> We occasionally get dumps that wake up in crash with:
> 
> ...
> please wait... (gathering kmem slab cache data)
> crash-4.0.9-fix: page excluded: kernel virtual address:
> ffff88022457a000
> type: "kmem_cache_s buffer"
> 
> crash-4.0.9-fix: unable to initialize kmem slab cache subsystem
> ...
> 
> These are partial dumps with only kernel pages included.
> 
> This problem comes about because readmem fails to read one
> of the kmem_cache structs in the list, for example:
> 
> crash-4.0.9-fix> struct kmem_cache 0xffff880224579cc0
> struct kmem_cache struct: page excluded: kernel virtual address:
> ffff88022457a000  type: "gdb_readmem_callback"
> Cannot access memory at address 0xffff880224579cc0
> 
> This struct starts toward the end of a page (0xffff880224579cc0)
> and extends into the next page (0xffff88022457a000) which has
> been excluded from the dump because it isn't a kernel page.
> 
> That is pretty scary if I assume some bug in the kernel is
> giving pages back to user land that still hold parts of kernel
> structs.  But that's not what's happening.
> 
> crash-4.0.9-fix> struct -o kmem_cache
> struct kmem_cache {
>     [0x0] struct array_cache *array[32];
> ...
>   [0x158] struct list_head next;
>   [0x168] struct kmem_list3 *nodelists[64];
> }
> SIZE: 0x368
> 
> Crash thinks the struct is 0x368 bytes long, so the apparent
> end of the struct lies in the next page (...a000 instead of
> ...9000):
> 
> crash-4.0.9-fix> p/x 0xffff880224579cc0+0x368
> $3 = 0xffff88022457a028
> 
> But the clever kernel folks did this in slab.c:
> 
>         /*
>          * We put nodelists[] at the end of kmem_cache, because we want to size
>          * this array to nr_node_ids slots instead of MAX_NUMNODES 
>          * (see kmem_cache_init())
>          * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
>          * is statically defined, so we reserve the max number of nodes.
>          */
>         struct kmem_list3 *nodelists[MAX_NUMNODES];
> 
> So that means crash needs to curtail the read of kmem_cache
> to the actual size of the nodelists array, instead of the
> declared size.
> 
> I still need to find out whether the actual size is set
> once for all instances, or per structure.
> 
> This should affect partial dumps with kernels that use slab.c.

I never noticed that before -- the buffer_size of the global "cache_cache" 
kmem_cache structure gets downsized here in kmem_cache_init() in 2.6.22 
and later:

        /*
         * struct kmem_cache size depends on nr_node_ids, which
         * can be less than MAX_NUMNODES.
         */
        cache_cache.buffer_size = offsetof(struct kmem_cache, nodelists) +
                                 nr_node_ids * sizeof(struct kmem_list3 *);
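
To make the arithmetic concrete, here is a minimal C sketch (not crash
code) that recomputes the trimmed size with the same expression and checks
whether a read of a given length spills into the following page.  The
helper names and the 4096-byte page size are assumptions made for
illustration; the address and offsets are the ones quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: size of a kmem_cache whose nodelists[] holds only
 * nr_node_ids pointers, mirroring the kmem_cache_init() expression.
 */
uint64_t trimmed_kmem_cache_size(uint64_t nodelists_offset,
                                 uint64_t nr_node_ids,
                                 uint64_t pointer_size)
{
    return nodelists_offset + nr_node_ids * pointer_size;
}

/*
 * Hypothetical helper: true if the byte range [addr, addr + len) touches
 * more than one page.  page_size must be a power of two.
 */
bool read_crosses_page(uint64_t addr, uint64_t len, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (len == 0)
        return false;

    uint64_t first_page = addr & ~(page_size - 1);
    uint64_t last_page = (addr + len - 1) & ~(page_size - 1);
    return first_page != last_page;
}
```

With the values from the report, a read of the declared 0x368 bytes at
0xffff880224579cc0 crosses into the ...a000 page, while a size trimmed to a
single node (0x168 + 8 = 0x170 bytes) stays inside the ...9000 page.  The
single-node figure is only an example; the thread does not state the actual
nr_node_ids of that system.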

So the fix would be to first determine the cache_cache.buffer_size value,
and use that to initialize the size_table.kmem_cache_s value used by the 
"SIZE(kmem_cache_s)" macro.  Secondly, "vt->kmem_cache_len_nodes", which 
is also based upon the same MAX_NUMNODES array index value, needs to be
downsized as well.  It looks like if the kernel "nr_node_ids" exists as a
symbol (instead of a #define), then it should be used.
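
The selection logic could look roughly like the following.  This is only a
sketch under stated assumptions: "struct slab_probe" and its fields are
hypothetical stand-ins for values that would be gathered from the dump
(they are not crash's real data structures), and the sanity checks are a
guess at reasonable fallbacks rather than a tested fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs gathered from the dump; not crash's real types. */
struct slab_probe {
    uint64_t declared_size;     /* sizeof(struct kmem_cache) per debuginfo */
    uint64_t nodelists_offset;  /* offsetof(struct kmem_cache, nodelists) */
    uint64_t declared_nodes;    /* declared nodelists[] length (MAX_NUMNODES) */
    bool     have_buffer_size;  /* cache_cache.buffer_size was readable */
    uint64_t buffer_size;       /* value of cache_cache.buffer_size */
    bool     have_nr_node_ids;  /* nr_node_ids exists as a symbol */
    uint64_t nr_node_ids;       /* its value, if it does */
};

/*
 * Size to use when reading a kmem_cache: prefer cache_cache.buffer_size
 * when it is plausible (it covers the fixed part of the struct and is no
 * larger than the declared size), otherwise keep the declared size.
 */
uint64_t effective_kmem_cache_size(const struct slab_probe *p)
{
    if (p->have_buffer_size &&
        p->buffer_size > p->nodelists_offset &&
        p->buffer_size <= p->declared_size)
        return p->buffer_size;
    return p->declared_size;
}

/*
 * Node count to use: prefer the nr_node_ids symbol when it exists and is
 * within the declared array length, otherwise keep the declared length.
 */
uint64_t effective_node_count(const struct slab_probe *p)
{
    if (p->have_nr_node_ids &&
        p->nr_node_ids >= 1 &&
        p->nr_node_ids <= p->declared_nodes)
        return p->nr_node_ids;
    return p->declared_nodes;
}
```

Falling back to the declared values keeps the existing behaviour for
kernels where buffer_size is not downsized or nr_node_ids is a #define.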

> Any other structs in the kernel like this that crash already
> deals with?

None that I'm aware of...

Dave