[Crash-utility] crash-5.0: zero-size memory-allocation

Tue Jan 12 14:54:13 UTC 2010

----- "ville mattila" <ville.mattila at stonesoft.com> wrote:

> Hello,
> 
> We have a custom kernel based on 2.6.27.39. This kernel
> has a 2/2 memory split. We have one crash dump that can be
> opened successfully with crash 4.0-8.8 but not with crash 5.0.
> The crash was caused by a double free of a memory block, so there
> might be some memory corruption in the cache data area.
> 
> Unfortunately I cannot pinpoint the exact version where this
> starts to happen because I could not find older crash releases.
> 
> Here is some debug info.
> 
> The tail of crash -d 10 output
> ...
> NOTE: page_hash_table does not exist in this kernel
> please wait... (gathering kmem slab cache data)<readmem: 8075801c, KVADDR, "cache_chain", 4, (FOE), ffb944f8>
>     addr: 8075801c  paddr: 75801c  cnt: 4
> GETBUF(128 -> 0)
> FREEBUF(0)
> GETBUF(204 -> 0)
> <readmem: 8067f1c0, KVADDR, "kmem_cache buffer", 204, (FOE), 8520f00>
>     addr: 8067f1c0  paddr: 67f1c0  cnt: 204
>   GETBUF(128 -> 1)
>   FREEBUF(1)
>   GETBUF(128 -> 1)
>   FREEBUF(1)
> 
> kmem_cache_downsize: SIZE(kmem_cache_s): 204 cache_cache.buffer_size: 0
> kmem_cache_downsize: nr_node_ids: 1
> FREEBUF(0)
> 
> crash: zero-size memory allocation! (called from 80b7b7b)
> 
> addr2line -e crash 80b7b7b
> /workarea/build/packages/crash/crash-5.0.0-32bit/memory.c:7439
> 
> I'm happy to test patches.

Nice bug report!

Here's what's happening:

It's related to this patch that went into 4.1.0:

        - Fix for a potential failure to initialize the kmem slab cache 
          subsystem on 2.6.22 and later CONFIG_SLAB kernels if the dumpfile
          has pages excluded by the makedumpfile facility.  Without the patch, 
          the following error message would be displayed during initialization: 
          "crash: page excluded: kernel virtual address: <address> type: 
          kmem_cache_s buffer", followed by "crash: unable to initialize kmem 
          slab cache subsystem".
          (anderson at redhat.com)

The patch was put in place due to this definition of the kmem_cache data structure:

  struct kmem_cache {
  /* 1) per-cpu data, touched during every alloc/free */
          struct array_cache *array[NR_CPUS];
  /* 2) Cache tunables. Protected by cache_chain_mutex */
          unsigned int batchcount;
          unsigned int limit;

  ... [ snip ] ...

           * We put nodelists[] at the end of kmem_cache, because we want to size
           * this array to nr_node_ids slots instead of MAX_NUMNODES
           * (see kmem_cache_init())
           * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
           * is statically defined, so we reserve the max number of nodes.
           */
          struct kmem_list3 *nodelists[MAX_NUMNODES];
          /*
           * Do not add fields after nodelists[]
           */
  };

Every kmem_cache structure in the kernel *except* the head "cache_cache"
structure has its nodelists[] array downsized to "nr_node_ids" entries.
The actual size of the downsized kmem_cache structures is stored in the
head "cache_cache.buffer_size" field.
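
To make the arithmetic concrete, here is a minimal sketch of how the
downsized size relates to the full one.  The struct below is a simplified
stand-in, not the real kernel definition, and the NR_CPUS and MAX_NUMNODES
values and the downsized_kmem_cache_size() helper are made up for
illustration.  It assumes the downsized size is the offset of nodelists[]
plus one pointer per node id:

```c
#include <stddef.h>

/* Hypothetical values; a real kernel takes these from its config. */
#define NR_CPUS       4
#define MAX_NUMNODES 64

struct array_cache;   /* opaque here */
struct kmem_list3;    /* opaque here */

/* Simplified stand-in for struct kmem_cache; most fields are omitted. */
struct kmem_cache_sketch {
        struct array_cache *array[NR_CPUS];
        unsigned int batchcount;
        unsigned int limit;
        unsigned int buffer_size;
        struct kmem_list3 *nodelists[MAX_NUMNODES];  /* must stay last */
};

/*
 * Size of a kmem_cache whose nodelists[] array has been trimmed to
 * nr_node_ids slots (assumed formula, see the note above).
 */
static size_t downsized_kmem_cache_size(unsigned int nr_node_ids)
{
        return offsetof(struct kmem_cache_sketch, nodelists)
                + (size_t)nr_node_ids * sizeof(struct kmem_list3 *);
}
```

With nr_node_ids equal to MAX_NUMNODES the helper returns the full
sizeof() of the structure; with nr_node_ids of 1 it returns something
much smaller, which is the size crash should be reading.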

But when the crash utility queries gdb for the size of a kmem_cache
structure, it gets the "full" size as declared in the vmlinux debuginfo
data.  So whenever crash read a kmem_cache structure, it used the "full"
size instead of the downsized size.  That kind of over-sized read could
extend into the next page, and there was a reported case where it extended
into a page that had been excluded by makedumpfile.  Hence the
kmem_cache_downsize() function that was added to memory.c.
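
The over-sized read problem comes down to a page-boundary check.  A rough
sketch follows; read_crosses_page() is a hypothetical helper, not a
function in crash, and the page size is passed in rather than taken from
the dumpfile:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a read of len bytes starting at addr touches more
 * than one page.  Hypothetical helper for illustration only.
 */
static bool read_crosses_page(uint64_t addr, uint64_t len, uint64_t page_size)
{
        if (len == 0)
                return false;

        /* Compare the page of the first byte with the page of the last. */
        return (addr / page_size) != ((addr + len - 1) / page_size);
}
```

A structure that sits near the end of a page can fit within that page at
its downsized length, yet spill into the following page when read at the
full debuginfo length.  If makedumpfile excluded that following page, the
read fails.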

Anyway, your debug output shows:

  kmem_cache_downsize: SIZE(kmem_cache_s): 204 cache_cache.buffer_size: 0
  kmem_cache_downsize: nr_node_ids: 1

In vm_init() there was an initial STRUCT_SIZE_INIT(kmem_cache_s, ...)
that set the size to 204 bytes.  But then kmem_cache_downsize() was
called to downsize to whatever cache_cache.buffer_size contains:

        ...

        buffer_size = UINT(cache_buf +
                MEMBER_OFFSET("kmem_cache", "buffer_size"));

        if (buffer_size < SIZE(kmem_cache_s)) {
                ASSIGN_SIZE(kmem_cache_s) = buffer_size;

                if (kernel_symbol_exists("nr_node_ids")) {
                        get_symbol_data("nr_node_ids", sizeof(int),
                                &nr_node_ids);
                        vt->kmem_cache_len_nodes = nr_node_ids;

                } else
                        vt->kmem_cache_len_nodes = 1;

                if (CRASHDEBUG(1)) {
                        fprintf(fp,
                            "\nkmem_cache_downsize: SIZE(kmem_cache_s): %ld "
                            "cache_cache.buffer_size: %d\n",
                                STRUCT_SIZE("kmem_cache"), buffer_size);
                        fprintf(fp,
                            "kmem_cache_downsize: nr_node_ids: %ld\n",
                                vt->kmem_cache_len_nodes);
                }
        }

But your kernel shows cache_cache.buffer_size set to zero -- and the 
ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure 
size from 204 to zero.  Later on, that size was used to allocate a 
kmem_cache buffer, which failed when GETBUF() was called with a size of zero.

I guess a check could be made above for a zero cache_cache.buffer_size,
but why would that ever be?
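
For what it's worth, such a check might look roughly like the following.
This is only a sketch of the idea, not a tested patch against memory.c;
choose_kmem_cache_size() is a made-up standalone helper so the logic can
be shown without the crash macros:

```c
/*
 * Decide which size to use for kmem_cache reads.
 *
 * full_size:   size reported by the debuginfo (204 in the report above)
 * buffer_size: value read from cache_cache.buffer_size
 *
 * Hypothetical helper: keep the full size unless buffer_size is a
 * plausible, strictly smaller, non-zero value.
 */
static long choose_kmem_cache_size(long full_size, unsigned int buffer_size)
{
        if (buffer_size == 0)
                return full_size;       /* bogus value, do not downsize */

        if ((long)buffer_size >= full_size)
                return full_size;       /* nothing to downsize */

        return (long)buffer_size;
}
```

With a zero buffer_size the full 204-byte size would be kept, so the
later GETBUF() would not be asked for zero bytes.  That only papers over
the symptom, though; it does not explain the zero.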

Try this:

  # crash --no_kmem_cache vmlinux vmcore

which will allow you to get past the kmem_cache initialization.

Then enter:

  crash> p cache_cache

Does the "buffer_size" member really show zero?

BTW, you can work around the problem by commenting out the call
to kmem_cache_downsize() in vm_init().  (And if you're using 
makedumpfile with excluded pages, hope that the problem I described
above doesn't occur...)

Dave