[Crash-utility] [PATCH]: nr_node_ids

Dave Anderson anderson at redhat.com
Fri Sep 7 18:57:13 UTC 2012



----- Original Message -----
> On Fri, Sep 7, 2012 at 4:28 PM, Dave Anderson <anderson at redhat.com>
> wrote:
> >
> >
> > ----- Original Message -----
> >> Hi all,
> >>
> >> I'm wondering about the use of the kernel 'nr_node_ids' variable in
> >> memory.c. In kmem_cache_downsize(), vt->kmem_cache_len_nodes defaults
> >> to 1 when 'nr_node_ids' isn't present. But in vm_init() an error
> >> message is printed in the same case. The reason I'm asking is that I'm
> >> getting that error
> >>
> >>   "unable to initialize kmem slab cache subsystem"
> >>
> >> on a 3.4 kernel. Having vm_init() default to
> >>
> >>   vt->kmem_cache_len_nodes=1
> >>
> >> as well seems to bring up the slab subsystem, although I'm getting a
> >> couple of
> >>
> >>   "kmem: vm_area_struct: full list: slab: <nn1>  bad next pointer: <nn2>"
> >>
> >> mixed into my kmem -S output. I have no idea if it's related.
> >
> > Hi Per,
> 
> Hello =o)
> 
> >
> > I don't have any recent sample kernels that have the configuration that your
> > kernel is running, so I can't confidently answer/test this.  I presume that
> > your kernel does not configure CONFIG_NODES_SHIFT (or set it to 0), so
> > that nr_node_ids becomes a #define instead of a variable.  And to get it
> 
> Indeed, that's exactly what happened.
> 
> > to work, I'm also presuming that you changed the "else" clause in vm_init()
> > to something like this:
> >
> >                if (MEMBER_TYPE("kmem_cache", "nodelists") == TYPE_CODE_PTR) {
> >                         int nr_node_ids;
> >                         /*
> >                          * nodelists now a pointer to an outside array
> >                          */
> >                         vt->flags |= NODELISTS_IS_PTR;
> >                         if (kernel_symbol_exists("nr_node_ids")) {
> >                                 get_symbol_data("nr_node_ids", sizeof(int),
> >                                         &nr_node_ids);
> >                                 vt->kmem_cache_len_nodes = nr_node_ids;
> >                         } else {
> > -                               error(INFO, "nr_node_ids: symbol does not exist\n");
> > -                               error(INFO, "unable to initialize kmem slab cache subsystem\n\n");
> > -                               vt->flags |= KMEM_CACHE_UNAVAIL;
> > +                               vt->kmem_cache_len_nodes = 1;
> >                         }
> 
> Again, indeed, that's more or less to the character what I changed it to.
> 
> >
> > That looks reasonable to me.
> >
> 
> Ok, because that was the main purpose of my first mail, understanding
> whether there was a reason why the 'nr_node_ids'-has-been-turned-into-a-macro-case
> was treated as an error in this context. So, you agree we could change it?

Yep -- it's queued for crash-6.1.0.
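
For illustration only, the fallback boils down to "use the symbol's value
if the kernel exports it, otherwise assume a single node".  A minimal
standalone sketch of that logic follows; the sym_entry table,
lookup_symbol() and node_count_or_default() are hypothetical stand-ins
and not crash's real kernel_symbol_exists()/get_symbol_data() interfaces:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol-table entry; crash resolves symbols from the kernel itself. */
struct sym_entry {
    const char *name;
    int value;
};

/* Return the entry called 'name', or NULL if the table does not contain it. */
static const struct sym_entry *lookup_symbol(const struct sym_entry *table,
                                             size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/*
 * Mirror of the patched "else" clause: when nr_node_ids is not a variable
 * (for example because it was compiled as a #define), fall back to 1 node
 * instead of treating the missing symbol as an error.
 */
static int node_count_or_default(const struct sym_entry *table, size_t count)
{
    const struct sym_entry *sym = lookup_symbol(table, count, "nr_node_ids");

    return sym ? sym->value : 1;
}
```

With a table that contains "nr_node_ids" the stored value is returned;
with a table that lacks it the result is 1, which matches what
kmem_cache_downsize() already does.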

> 
> > As far as the "kmem -S" output, are you running it on a live system?
> >
> 
> Nope, dead as a doornail. Are these messages to be expected then?

Not really.  You could follow the vm_area_struct full list in question
and verify that something's out of whack, starting from the (single)
kmem_cache->nodelists.slabs_full linked list.  The list should either
point back to itself (empty) or be a simple list_head linked list
that leads to a slab with a next value of "nn2".  It would also be
interesting to know what the "nn2" value was.  In other words, was it
a bogus address entirely, or maybe an address in a page that wasn't
captured in the dump (which shouldn't happen...)?

The message comes from this check in verify_slab_v2():

        list_head = (struct kernel_list_head *)(slab_buf + OFFSET(slab_list));
        if (!IS_KVADDR((ulong)list_head->next) ||
            !accessible((ulong)list_head->next)) {
                error(INFO, "%s: %s list: slab: %lx  bad next pointer: %lx\n",
                        si->curname, list, si->slab, (ulong)list_head->next);
                errcnt++;
        }
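
If you want to reason about the manual walk outside of crash, here is a
rough sketch of the same idea: follow the next pointers from the list
head, stop when the list returns to the head, and report the first next
pointer that is not a valid address.  Everything in it is an assumption
made for the example: list_node stands in for the kernel's list_head,
and in_pool() is a simplified substitute for the IS_KVADDR() and
accessible() checks that just tests whether the pointer lands inside a
known array of nodes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* Simplified validity check: does 'p' point into pool[0..count-1]? */
static int in_pool(const struct list_node *p,
                   const struct list_node *pool, size_t count)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)pool;

    return addr >= base && addr < base + count * sizeof(*pool);
}

/*
 * Walk the circular list that starts at 'head'.
 * Returns the number of entries (0 for an empty list whose head points
 * back to itself), -1 if a bad next pointer is found (stored in *bad),
 * or -2 if the list loops without ever returning to 'head'.
 */
static long walk_list(const struct list_node *head,
                      const struct list_node *pool, size_t count,
                      const struct list_node **bad)
{
    const struct list_node *cur = head;
    long entries = 0;
    size_t steps;

    for (steps = 0; steps <= count; steps++) {
        const struct list_node *next = cur->next;

        if (!in_pool(next, pool, count)) {
            *bad = next;        /* the "bad next pointer" value */
            return -1;
        }
        if (next == head)
            return entries;     /* came back around: list is intact */
        entries++;
        cur = next;
    }
    return -2;                  /* more steps than nodes: corrupt cycle */
}
```

A head whose next points to itself yields 0, a healthy list yields its
entry count, and a broken link yields -1 with the offending value in
*bad, which is the value worth comparing against the dump's contents.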

> Oh, and sorry for putting "[PATCH]" in the title when there wasn't
> one. It was by accident.
> 
> /Per

No problem...

Thanks,
  Dave



