[Crash-utility] problems running crash on recent rawhide live kernels

Dave Anderson anderson at redhat.com
Tue Dec 4 18:23:51 UTC 2007


Hariharan S Reddy wrote:
> Hi Dave,
> 
> I also came across the same problem,
> 
> tt->refresh_hash_table is set to refresh_hlist_task_table_v2(),
> 
> after executing the function the tt->current is set to 0 (zero).

As Jeff mentioned, this must have something to do with the new
pid_namespace changes (with which I have no familiarity), although
it still seems that refresh_hlist_task_table_v2() should continue
to work.

To gather the set of tasks, refresh_hlist_task_table_v2() walks
the pid_hash[] array.  Each entry in the array is an hlist_head,
which points to either an hlist_node or NULL.  If it contains a
pointer to an hlist_node, it's a pointer to the hlist_node embedded
in a pid_link (at least it *was* until 2.6.24):

   struct pid_link {
       struct hlist_node node;
       struct pid *pid;
   };

and from the "pid" structure pointer -- which points into a task
structure's task_struct.pids member -- the task can be readily
determined by subtracting the task_struct.pids member offset.
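
The walk described above can be sketched in C roughly as follows.  This is
only an illustration under stated assumptions: mock_task and gather_tasks()
are hypothetical names, the structures are simplified stand-ins rather than
the real kernel definitions, and the code runs on ordinary host pointers
instead of reading a dumpfile the way crash does.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration, NOT the real kernel layouts. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

struct pid_link {
    struct hlist_node node;
    void *pid;              /* per the description: points at the task's pids member */
};

/* Hypothetical task structure; only the pids member matters here. */
struct mock_task {
    int pid_nr;
    char comm[16];
    struct pid_link pids;
};

/*
 * Walk every bucket of the hash array, follow each hlist chain, and
 * recover the owning task by subtracting the offset of the pids member
 * from the pid pointer.  Returns the number of tasks stored in out[].
 */
size_t gather_tasks(struct hlist_head *hash, size_t nbuckets,
                    struct mock_task **out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < nbuckets; i++) {
        for (struct hlist_node *hn = hash[i].first; hn; hn = hn->next) {
            /* The hlist_node is embedded in a pid_link. */
            struct pid_link *link = (struct pid_link *)
                ((char *)hn - offsetof(struct pid_link, node));
            char *pid = link->pid;

            if (n < max)
                out[n++] = (struct mock_task *)
                    (pid - offsetof(struct mock_task, pids));
        }
    }
    return n;
}
```

Empty buckets (first == NULL) are skipped by the inner loop condition, so
only the non-NULL entries contribute tasks.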

So, for example, on a 2.6.18 system, here's a non-NULL pid_hash
entry at index 141:

   crash> p pid_hash[141]
   $6 = {
     first = 0xffff810037da51e8
   }

It's a pointer to the embedded hlist_node in a pid_link:

   crash> pid_link 0xffff810037da51e8
   struct pid_link {
     node = {
       next = 0x0,
       pprev = 0xffff810009e53468
     },
     pid = 0xffff810037d0d9c0
   }

and from the pid pointer, the task can be determined by
subtracting 352 (the task_struct.pids offset):

   crash> eval 0xffff810037d0d9c0 - 352
   hexadecimal: ffff810037d0d860
       decimal: 18446604436669257824  (-139637040293792)
         octal: 1777774020006764154140
        binary: 1111111111111111100000010000000000110111110100001101100001100000
   crash> ps ffff810037d0d860
      PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
        13      1   3  ffff810037d0d860  IN   0.0       0      0  [watchdog/3]
   crash>
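
The same subtraction can be written as plain 64-bit integer arithmetic,
which is closer to how a dump analyzer has to treat kernel virtual
addresses (they are values read from the dump, not host pointers).  The
helper name task_from_pid() is hypothetical, and the offset is simply
passed in by the caller; 352 is the value from the 2.6.18 example above.

```c
#include <stdint.h>

/*
 * Given the kernel virtual address held in pid_link.pid and the offset
 * of task_struct.pids, return the address of the containing task_struct.
 */
uint64_t task_from_pid(uint64_t pid_addr, uint64_t pids_offset)
{
    return pid_addr - pids_offset;
}
```

Calling task_from_pid(0xffff810037d0d9c0ULL, 352) yields
0xffff810037d0d860, matching the eval output and the TASK column shown
by ps.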

Note that the next pointer above could point to another task
in the chain, and the pprev pointer of the first entry points
back to pid_hash[141]:

   crash> p &pid_hash[141]
   $7 = (struct hlist_head *) 0xffff810009e53468
   crash>
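
That pprev relationship gives a cheap sanity check on a chain: the first
node's pprev should hold the address of the bucket's first field, and each
later node's pprev should hold the address of the previous node's next
field.  A minimal sketch, using simplified stand-in structures and the
hypothetical names mock_hlist_add_head() and chain_is_consistent():

```c
#include <stddef.h>

/* Simplified stand-ins for illustration, NOT the real kernel headers. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Insert n at the front of the chain, keeping the pprev links correct. */
void mock_hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Return 1 if every pprev in the chain points back at its predecessor. */
int chain_is_consistent(struct hlist_head *h)
{
    struct hlist_node **expect = &h->first;

    for (struct hlist_node *n = h->first; n; n = n->next) {
        if (n->pprev != expect)
            return 0;
        expect = &n->next;
    }
    return 1;
}
```

Because first is the only member of hlist_head in this sketch, &h->first
and the address of the bucket itself are the same value, which is why the
pprev above equals &pid_hash[141].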

Now, I don't see why this won't work in 2.6.24, even with the
pid_namespace changes.  I hacked up a crash binary to gather only
the swapper task (pid 0) so I could see what's different about the
pid_hash array.  For example, here's a non-NULL entry at
index 112:

   crash> p pid_hash[112]
   $11 = {
     first = 0xffff810018899b88
   }

And here's what it looks like as a pid_link:

   crash> pid_link 0xffff810018899b88
   struct pid_link {
     node = {
       next = 0x0,
       pprev = 0xffff81000106ed80
     },
     pid = 0xcccccccccccccccc
   }

Note that the pprev pointer is correct for that index:

   crash> p &pid_hash[112]
   $12 = (struct hlist_head *) 0xffff81000106ed80
   crash>

But the pid structure pointer is 0xcccccccccccccccc.
And that is the case for every non-NULL entry in the
pid_hash array.
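
A task-gathering loop could at least detect this situation before trying
to use the value as an address.  The sketch below checks whether all eight
bytes of a 64-bit value are identical, as in 0xcccccccccccccccc.  The
helper name is_repeated_byte() is hypothetical, and the check says nothing
about what the pattern means; it only flags values that are unlikely to be
valid pointers.

```c
#include <stdint.h>

/*
 * Return 1 if every byte of v equals its low byte.  Note that 0 also
 * matches, so callers should test for NULL separately.
 */
int is_repeated_byte(uint64_t v)
{
    /* Replicate the low byte into all eight byte positions and compare. */
    return (v & 0xff) * 0x0101010101010101ULL == v;
}
```

With the values seen above, is_repeated_byte(0xcccccccccccccccc) is true,
while a normal pid pointer such as 0xffff810037d0d9c0 is not flagged.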

Does anybody have an idea what the significance of the
0xcccccccccccccccc is?  Or what is wrong with gathering
the tasks in that manner?

Anyway, crash is pretty much dead in the water for 2.6.24
until a proper task-gathering function is written.

Dave





