[Crash-utility] can't get hba info

Mon Oct 25 21:47:10 UTC 2010

----- "Mike Miller (OS Dev)" <Mike.Miller at hp.com> wrote:

> Hello,
> I'm trying to debug an issue using crash. I've moved the vmcore file
> over to a system other than the system from which the dump was
> collected. I've installed a variety of debug rpms including the
> kernel-debuginfo, kernel-debug-debuginfo, glib-debuginfo, and
> kernel-debuginfo-common. I've also copied the lib/modules from the
> test system to mine. 
> 
> When I run the command "p hba" it returns:
> 
> 	Crash > p hba
> 	Hba = $2 = 778764288 (please ignore the uppercase, stupid mail client)
> 
> I expect to see something like:
> 
> 	Hba = $1 = {0x102170b0000, 0x0, 0x0,..., 0x0}
> 
> Why is that? 

I'm not sure without more info...

> Do I need to be running crash on the test system itself?

No.

> I thought all relevant info would be contained in the vmcore file and
> I could attempt the analysis on any system?.

True.  

(although if the debuginfo data you need is not contained in the vmlinux file,
you may need to load it from the relevant module's debuginfo file.)

Anyway, I wonder if your kernel has another "hba" symbol that's being
selected by the embedded gdb module?  

The only sample vmlinux/vmcore I have on hand that knows about an "hba"
symbol is this one:

  crash> whatis hba
  ctlr_info_t *hba[32];
  crash>

But I see that that kernel has two of them, so they must be statically defined:

  crash> sym hba
  ffffffff82700620 (b) hba  
  ffffffff82700aa0 (b) hba  
  crash>

When I run the command, the embedded gdb module picks the second one
at ffffffff82700aa0:

  crash> p hba
  hba = $4 = 
   {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
    0x0, 0x0, 0x0, 0x0}
  crash> p &hba
  $11 = (ctlr_info_t *(*)[32]) 0xffffffff82700aa0
  crash>

So if I want to look at the first one at ffffffff82700620, I'd do this:

  crash> p *(ctlr_info_t *(*)[32]) 0xffffffff82700620
  $14 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
   0x0, 0x0, 0x0, 0x0, 0x0}
  crash> 
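
For anyone unsure what that cast is doing, here is a minimal Python sketch
of the same idea using ctypes: take a raw address, treat it as a pointer to
an array of 32 pointer-sized slots, and dereference it. It is only an
illustration, not how crash or gdb work internally. The names HbaArray and
view are made up for the example, the slots are modeled as 64-bit integers
on the assumption of an x86_64 dump, and the value stored in slot 0 is just
the sample from the original question.

```python
import ctypes

# Hypothetical stand-in for `ctlr_info_t *hba[32]`: 32 pointer-sized slots,
# modeled as 64-bit unsigned integers (assumes a 64-bit kernel).
HbaArray = ctypes.c_uint64 * 32

hba = HbaArray()            # zero-initialized, like an unused static array
hba[0] = 0x102170B0000      # sample value taken from the original question

# All we keep is the raw address, as `sym hba` would report it.
addr = ctypes.addressof(hba)

# Equivalent in spirit to:  p *(ctlr_info_t *(*)[32]) <addr>
# Cast the address to "pointer to array of 32", then dereference it.
ptr = ctypes.cast(ctypes.c_void_p(addr), ctypes.POINTER(HbaArray))
view = ptr.contents

print([hex(v) for v in view])
```

The point is that the address alone carries no type; the cast supplies it,
which is why the same cast works on either of the two hba addresses.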

But since your example displays hba as "778764288", perhaps gdb is
resolving a different "hba" symbol, one that isn't a ctlr_info_t array
at all?  So enter this:

  crash> whatis hba

And confirm whether it's the data structure instance you're referring to.
And if it's some other hba, then do this to get the possible addresses:

  crash> sym hba

And use casting on the relevant address to create the output you want
(like I did above to access the ctlr_info_t array at the other address).
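
If there are several candidate addresses, the whatis/sym/cast steps above
can be mechanized. The following is a rough Python sketch, not part of
crash: cast_commands is a hypothetical helper that takes the text printed
by "whatis" and "sym" (in the format shown earlier in this message) and
builds one "p" command per address. It assumes the declaration names the
symbol exactly once and that each sym line looks like
"<hex address> (<type>) <name>".

```python
import re


def cast_commands(whatis_output, sym_output, name):
    """Build one crash `p` command per address that `sym <name>` listed."""
    # Turn the declaration into a pointer-to-object cast, e.g.
    # "ctlr_info_t *hba[32];" -> "ctlr_info_t *(*)[32]"
    decl = whatis_output.strip().rstrip(";")
    pattern = r"\b" + re.escape(name) + r"\b"
    cast_type, count = re.subn(pattern, "(*)", decl, count=1)
    if count != 1:
        raise ValueError("symbol name not found in declaration: " + decl)

    commands = []
    for line in sym_output.splitlines():
        fields = line.split()
        # Keep only lines shaped like "<hex address> (<type>) <name>";
        # prompts and blank lines are skipped.
        if (len(fields) >= 3 and fields[2] == name
                and re.fullmatch(r"[0-9a-fA-F]+", fields[0])):
            commands.append("p *(%s) 0x%s" % (cast_type, fields[0]))
    return commands


whatis_text = "ctlr_info_t *hba[32];"
sym_text = "ffffffff82700620 (b) hba  \nffffffff82700aa0 (b) hba  \n"
for cmd in cast_commands(whatis_text, sym_text, "hba"):
    print(cmd)
```

Each printed line can be pasted at the crash prompt. The substitution trick
is only meant for simple declarations like the one above; anything more
elaborate is easier to cast by hand.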

Also, if the data structure definition is in module code (i.e.,
not declared in a kernel header), then that module's debuginfo data would
need to be loaded with the "mod" command.

> The crash version command returns version = $3 = v0.25941.

Huh?  (Sorry -- I don't know what you're talking about...)

> My running kernel is 2.6.18-prep. /etc/issue reports rhel5.5.
> The kernel on the test system is 2.6.18-164.el5. That is also the
> version of the debug packages I installed as well as the vmlinux file
> I'm using. Any problems mixing and matching like that? I have not
> rebooted my system since installing the debug packages but that
> doesn't seem necessary. Thanks in advance.

If you install the complete kernel-debuginfo/kernel-debuginfo-common
package pair that goes along with the vmlinux/vmcore pair that you're
working with, then there should be no problem loading all module
debuginfo data by just entering "mod -S".

The kernel-debug-debuginfo package is for the "debug" kernel, which
is a completely separate kernel from the base 2.6.18-164.el5 kernel.
The glib-debuginfo package is also unnecessary...

Hope this helps,
  Dave