[Crash-utility] HEADS UP -- problem with kernels built with gcc-4.6.0

Thu May 5 21:18:04 UTC 2011

This is a heads-up for those of you who are working with kernels
that were compiled with the new gcc-4.6.0.

I had thought that gcc-4.6.0 was painful only as far as compiling
the crash utility was concerned, where there were a bunch of new
"error: variable <variable> set but not used [-Werror=unused-but-set-variable]"
messages that I fixed in crash-5.1.2 and -5.1.3.  And you may be aware
that those for-the-most-part useless warnings recently caused an LKML
shitstorm with respect to building kernels.

But it's worse than that -- there is a problem with crash's embedded gdb
determining the member offsets of the (large) pglist_data structure if
the kernel was compiled with gcc-4.6.0.  This is not specific to the
gdb-7.0 version that is built into crash; it happens with all gdb
versions as far as I can tell, certainly with gdb-7.2-48.el6
and gdb-7.2.50.20110328-31.fc15.

The problem is most clearly seen with "struct -o pglist_data", which
dumps the structure, showing the offset of each member. 

For comparison, here is the output from a (good) 2.6.38-rc4 kernel
that was compiled with gcc-4.5.1:

  crash> help -k | grep gcc_version
     gcc_version: 4.5.1
  crash> struct -o pglist_data
  struct pglist_data {
        [0x0] struct zone node_zones[4];
     [0x1c00] struct zonelist node_zonelists[2];
    [0x13e40] int nr_zones;
    [0x13e44] spinlock_t node_size_lock;
    [0x13e48] long unsigned int node_start_pfn;
    [0x13e50] long unsigned int node_present_pages;
    [0x13e58] long unsigned int node_spanned_pages;
    [0x13e60] int node_id;
    [0x13e68] wait_queue_head_t kswapd_wait;
    [0x13e80] struct task_struct *kswapd;
    [0x13e88] int kswapd_max_order;
    [0x13e8c] enum zone_type classzone_idx;
  }
  SIZE: 0x13f00
  crash>

While here is the output from a 2.6.38.2-9.fc15 kernel that
was compiled with gcc-4.6.0:

  crash> help -k | grep gcc_version
     gcc_version: 4.6.0
  crash> struct -o pglist_data
  struct pglist_data {
        [0x0] struct zone node_zones[4];
     [0x1c00] struct zonelist node_zonelists[2];
        [0x0] int nr_zones;
        [0x0] spinlock_t node_size_lock;
        [0x0] long unsigned int node_start_pfn;
        [0x0] long unsigned int node_present_pages;
        [0x0] long unsigned int node_spanned_pages;
        [0x0] int node_id;
        [0x0] wait_queue_head_t kswapd_wait;
        [0x0] struct task_struct *kswapd;
        [0x0] int kswapd_max_order;
        [0x0] enum zone_type classzone_idx;
  }
  SIZE: 0x13f00
  crash>

It's interesting that it gets the size correct, but the member offset
values beyond the node_zonelists[] array are returned as 0.  
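
A quick way to spot this kind of breakage in saved "struct -o" output is to
check that the offsets never go backwards, since member offsets in a C struct
do not decrease in declaration order.  The following is only a minimal sketch:
the function names are made up for illustration, the sample is an excerpt of
the bad output above, and the regular expression assumes the flat
"[0xNNN] declaration;" layout shown in this post.

```python
import re

# Matches lines such as "    [0x13e40] int nr_zones;" from "struct -o" output.
MEMBER_RE = re.compile(r"^\s*\[0x([0-9a-fA-F]+)\]\s+(.*?);\s*$")


def parse_struct_offsets(text):
    """Return (offset, declaration) pairs in the order they are displayed."""
    members = []
    for line in text.splitlines():
        match = MEMBER_RE.match(line)
        if match:
            members.append((int(match.group(1), 16), match.group(2)))
    return members


def suspicious_members(members):
    """Return declarations whose offset is lower than an earlier member's.

    Offsets should be non-decreasing, so a drop (for example back to 0)
    suggests that the debuginfo was not interpreted correctly.
    """
    flagged = []
    highest = 0
    for offset, decl in members:
        if offset < highest:
            flagged.append(decl)
        else:
            highest = offset
    return flagged


# Excerpt of the gcc-4.6.0 output shown above.
SAMPLE = """\
struct pglist_data {
      [0x0] struct zone node_zones[4];
   [0x1c00] struct zonelist node_zonelists[2];
      [0x0] int nr_zones;
      [0x0] spinlock_t node_size_lock;
}
"""

if __name__ == "__main__":
    for decl in suspicious_members(parse_struct_offsets(SAMPLE)):
        print("suspicious offset:", decl)
```

The check flags only members that follow a larger offset, so the legitimate
0x0 of node_zones[] is not reported.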

Taking the crash utility out of the picture, the problem can be seen 
by simply running "gdb vmlinux".

For example, with the good kernel from the first example above:

  $ gdb vmlinux-2.6.38-rc4
  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
  Copyright (C) 2010 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  Reading symbols from /root/vmlinux-2.6.38-rc4...done.
  (gdb) ptype struct pglist_data
  type = struct pglist_data {
      struct zone node_zones[4];
      struct zonelist node_zonelists[2];
      int nr_zones;
      spinlock_t node_size_lock;
      long unsigned int node_start_pfn;
      long unsigned int node_present_pages;
      long unsigned int node_spanned_pages;
      int node_id;
      wait_queue_head_t kswapd_wait;
      struct task_struct *kswapd;
      int kswapd_max_order;
      enum zone_type classzone_idx;
  }
  (gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
  $1 = (struct zonelist *) 0x1c00
  (gdb) p &((struct pglist_data *)(0x0)).nr_zones
  $2 = (int *) 0x13e40
  (gdb) p &((struct pglist_data *)(0x0)).node_size_lock
  $3 = (spinlock_t *) 0x13e44
  (gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
  $4 = (long unsigned int *) 0x13e48
  (gdb) p &((struct pglist_data *)(0x0)).node_present_pages
  $5 = (long unsigned int *) 0x13e50
  (gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
  $6 = (long unsigned int *) 0x13e58
  (gdb) p &((struct pglist_data *)(0x0)).node_id
  $7 = (int *) 0x13e60
  (gdb) p &((struct pglist_data *)(0x0)).kswapd
  $8 = (struct task_struct **) 0x13e80
  (gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
  $9 = (int *) 0x13e88
  (gdb) p &((struct pglist_data *)(0x0)).classzone_idx
  $10 = (enum zone_type *) 0x13e8c
  (gdb) 

And then with the kernel compiled with gcc-4.6.0:

  # gdb vmlinux-2.6.38.2-9.fc15
  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
  Copyright (C) 2010 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  Reading symbols from /root/vmlinux-2.6.38.2-9.fc15...done.
  (gdb) ptype struct pglist_data
  type = struct pglist_data {
      struct zone node_zones[4];
      struct zonelist node_zonelists[2];
      int nr_zones;
      spinlock_t node_size_lock;
      long unsigned int node_start_pfn;
      long unsigned int node_present_pages;
      long unsigned int node_spanned_pages;
      int node_id;
      wait_queue_head_t kswapd_wait;
      struct task_struct *kswapd;
      int kswapd_max_order;
      enum zone_type classzone_idx;
  }
  (gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
  $1 = (struct zonelist *) 0x1c00
  (gdb) p &((struct pglist_data *)(0x0)).nr_zones
  $2 = (int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).node_size_lock
  $3 = (spinlock_t *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
  $4 = (long unsigned int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).node_present_pages
  $5 = (long unsigned int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
  $6 = (long unsigned int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).node_id
  $7 = (int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).kswapd_wait
  $8 = (wait_queue_head_t *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).kswapd
  $9 = (struct task_struct **) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
  $10 = (int *) 0x0
  (gdb) p &((struct pglist_data *)(0x0)).classzone_idx
  $11 = (enum zone_type *) 0x0
  (gdb) 
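
Typing those print commands by hand gets tedious, so here is a hedged sketch of
how the same check could be scripted.  It only builds the argument list for a
non-interactive gdb run (using gdb's -batch and -ex options) and parses
"$N = (type) 0x..." result lines like the ones above.  The function names are
hypothetical, and the parser assumes that every print command succeeds and
yields exactly one result line.

```python
import re


def gdb_offset_command(vmlinux, struct_name, members):
    """Build an argv list that prints each member's address off a null base."""
    argv = ["gdb", "-batch"]
    for member in members:
        argv += ["-ex", "p &((struct %s *)(0x0)).%s" % (struct_name, member)]
    argv.append(vmlinux)
    return argv


# Matches result lines such as "$2 = (int *) 0x13e40".
RESULT_RE = re.compile(r"^\s*\$(\d+) = \((.+)\) (0x[0-9a-fA-F]+)\s*$")


def parse_gdb_results(output, members):
    """Pair the printed addresses with the member names, in order."""
    values = []
    for line in output.splitlines():
        match = RESULT_RE.match(line)
        if match:
            values.append(int(match.group(3), 16))
    return dict(zip(members, values))
```

The list returned by gdb_offset_command() can be handed to something like
subprocess.run(argv, capture_output=True, text=True), and the captured stdout
passed to parse_gdb_results().  Any member other than the first that comes
back as 0 is then a candidate for this problem.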

Anyway, given that the pglist_data structure is crucial to the
crash utility, the bogus offset data leads to errors such as
the wrong MEMORY value shown here on a 4GB system:

  crash> sys
        KERNEL: vmlinux-2.6.38.2-9.fc15.gz
      DUMPFILE: vmcore.compressed
          CPUS: 12
          DATE: Thu May  5 16:01:44 2011
        UPTIME: 00:02:45
  LOAD AVERAGE: 1.26, 0.51, 0.20
         TASKS: 171
      NODENAME: amd-toonie2-02.lab.bos.redhat.com
       RELEASE: 2.6.38.2-9.fc15.x86_64
       VERSION: #1 SMP Wed Mar 30 16:55:57 UTC 2011
       MACHINE: x86_64  (2400 Mhz)
        MEMORY: 680 KB
         PANIC: ""
  crash> 

Bogus node data is also displayed by "kmem -n":

  crash> kmem -n
  NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA       NODE_ZONES   
   170     170     ffff88003ffec000        ----        ffff88003ffec000
                                                      ffff88003ffec700
                                                      ffff88003ffece00
                                                      ffff88003ffed500
      MEM_MAP       START_PADDR  START_MAPNR
  ffffea0000002530     aa000         170    

  ZONE  NAME         SIZE       MEM_MAP      START_PADDR  START_MAPNR
    0   DMA          4080  ffffea0000000380        10000           16
    1   DMA32      258048  ffffea0000038000      1000000         4096
    2   Normal          0                 0            0            0
    3   Movable         0                 0            0            0

  ...
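
The bogus numbers above look consistent with one another: the node line shows
170 for NODE, SIZE and START_MAPNR, and the same 170 appears to explain the
"680 KB" MEMORY value and the aa000 START_PADDR.  The arithmetic below is only
a sanity check of that reading.  It assumes the 4 KB x86_64 base page size, and
the last step additionally assumes a 56-byte struct page and a vmemmap base of
0xffffea0000000000, neither of which is stated in the output.  How crash
actually derives these fields has not been checked against its sources.

```python
PAGE_SIZE = 4096  # x86_64 base page size, in bytes

# Value displayed for NODE, SIZE and START_MAPNR in the "kmem -n" node line.
bogus_value = 170

# 170 pages of 4 KB each, expressed in KB: matches "MEMORY: 680 KB".
memory_kb = bogus_value * PAGE_SIZE // 1024

# Page frame 170 as a physical address: matches START_PADDR "aa000".
start_paddr = bogus_value * PAGE_SIZE

# ASSUMPTION: 56 bytes per struct page and this vmemmap base address.
STRUCT_PAGE_SIZE = 56
VMEMMAP_BASE = 0xFFFFEA0000000000
mem_map = VMEMMAP_BASE + bogus_value * STRUCT_PAGE_SIZE

if __name__ == "__main__":
    print(memory_kb, hex(start_paddr), hex(mem_map))
```

With those assumptions the last value comes out as ffffea0000002530, the
MEM_MAP shown on the node line, which fits the picture of every member being
read from the same wrong location.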

And on a system configured with CONFIG_SLUB, "kmem -s" fails miserably:

  crash> kmem -s
  CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
  kmem: page_to_nid: cannot determine node for pages: ffffea0000cb2bc0
  kmem: page_to_nid: cannot determine node for pages: ffffea0000cece18
  ffff880037bd0300 UDPLITEv6                984          0         0      0    32k
  kmem: page_to_nid: cannot determine node for pages: ffffea0000cc8640
  ffff880037bd0100 tw_sock_TCPv6            280          0         0      0     8k
  kmem: page_to_nid: cannot determine node for pages: ffffea0000cd70c0
  kmem: page_to_nid: cannot determine node for pages: ffffea0000c6bb90
  kmem: page_to_nid: cannot determine node for pages: ffffea0000ca5bf0
  ffff880037a7f100 dm_raid1_read_record    1064          0         0      0    32k
  ffff880037a7f000 kcopyd_job               368          0         0      0     8k
  ffff880037a7ef00 dm_uevent               2608          0         0      0    32k
  ffff880037a7ee00 dm_rq_target_io          400          0         0      0     8k
  kmem: page_to_nid: cannot determine node for pages: ffffea0000c304e0
  ...

And there may be other problems that I'm not aware of that are associated
with the pglist_data data structure members specifically -- and perhaps with 
other data structures as well?

I filed a bugzilla against gdb, although it may well be a bug in
the debuginfo data created by gcc-4.6.0.  We'll see what happens...

In the meantime, I do have a workaround kludge for pglist_data members that
will be included in the upcoming crash-5.1.5 release.

Annoyed to no end,
  Dave