[Crash-utility] System.map problems with my 32-bit systems

Patrick Lengel plengel at sourcefire.com
Thu Aug 29 21:57:13 UTC 2013


Dave,

Thank you so much for the advice to use the "--reloc" command-line
option.  That was exactly what I needed.  My system now works with
the command:

crash --reloc=15m -S System.map vmlinux vmcore

I looked into my Linux kernel configuration file and saw the following two
lines:

CONFIG_PHYSICAL_START=0x1000000
CONFIG_PHYSICAL_ALIGN=0x100000

The difference between the two values is 15 MBytes, which is where the 15m
in the command above comes from.  The change log documentation that you
pointed me to states that I can change these values within my Linux kernel
configuration file to make the START value less than or equal to the ALIGN
value.  As I start researching how system behavior will change as a result
of this update, can you think of any negative consequences that I should be
aware of before I make the change?
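
For reference, the arithmetic behind the 15m value can be sketched in shell.
This is only an illustration: the reloc_mb helper name is hypothetical, and
it assumes the --reloc size is simply CONFIG_PHYSICAL_START minus
CONFIG_PHYSICAL_ALIGN, using the two values from my configuration file.

```shell
#!/bin/sh
# Hypothetical helper (not part of crash): print START minus ALIGN
# in whole megabytes. Both arguments are hex values as they appear
# in the kernel configuration file.
reloc_mb() {
    echo $(( ($1 - $2) / 1048576 ))
}

# CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN from my config file.
reloc_mb 0x1000000 0x100000
```

With these values it prints 15, which matches the 15m passed to --reloc.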

Thank you again,
Patrick


On Thu, Aug 29, 2013 at 4:49 PM, Dave Anderson <anderson at redhat.com> wrote:

>
>
> ----- Original Message -----
> > Hello Crash Utility Community,
> >
> > I am hoping that someone in the Crash Analysis community can provide some
> > assistance with a problem that I am having analyzing vmcore files gathered
> > from our 32-bit machines. I am working to add kexec to our systems so that
> > we can run the crash utility (version 7.0.1) on our appliances and I am
> > having trouble with our 32-bit systems. Fortunately my 64-bit systems are
> > working fine so I know that I can make the technology work. I believe that
> > the crash analysis tool does not like the System.map file and I am trying to
> > get to the root cause of this problem.
>
> If the vmlinux file that you're using matches the vmcore, then please don't
> use any -S or System.map argument -- just enter: "crash vmlinux vmcore"
>
> System.map files are only required if the symbol values in the vmlinux
> file are different from those in the running kernel.  It doesn't sound
> like that's the case in your environment.
>
> Secondly, if the session doesn't start that way, please provide the
> debug output generated by entering:
>
>  $ crash -d8 vmlinux vmcore
>
> > My problem originally manifests itself when I try to decode the vmcore file.
> > After intentionally creating an oops panic event I upload the vmcore file to
> > my build machine and run crash on that system. While the vmcore file is
> > generated on an appliance I run the crash analysis program on the build system
> > that produced the Linux kernel since the appliances are meant to be deployed
> > into the field and will not be accessible for running crash analysis events.
> >
> > build# crash -S System.map vmlinux vmcore
> > crash 7.0.1
> > ...
> > crash: read error: kernel virtual address: c1363c5c type: "cpu_possible_mask"
> >
> > So I then tried to find what this symbol is within the map:
> >
> > build# crash --minimal -S System.map vmlinux vmcore
> > ...
> > crash> sym cpu_possible_mask
> > c1363c5c (R) cpu_possible_mask
> > crash>
>
> When you were in the "minimal" session, were you able to "rd" the
> cpu_possible_mask address?  i.e.
>
>   crash> rd c1363c5c
>
> or what did this show:
>
>   crash> rd linux_banner 10
>
> > From this I can only see that the addresses match up. So I then decided to
> > run the crash utility on the appliance itself to see what happens. I copied
> > the crash utility to the appliance and the uncompressed kernel image to the
> > appliance as well. The appliance boots from a "bzImage" file and the crash
> > utility can't use the bzImage file for processing so I needed to manually
> > copy the uncompressed kernel image to the box.
>
> Right, crash is only interested in the vmlinux ELF file from which the
> bzImage file was generated.
>
> > I then run the following commands on our appliance for data gathering
> > purposes:
> >
> > root@appliance:/var/crash# crash -S /boot/System.map vmlinux-2.6.32.24-sf.pentM-37
> > ...
> > WARNING: cannot read linux_banner string
> > crash: /boot/System.map and /dev/mem do not match!
> >
> > root@appliance:/var/crash# ls -l /boot/System.map
> > lrwxrwxrwx 1 root root 32 Aug 28 22:32 /boot/System.map -> System.map-2.6.32.24-sf.pentM-37
> > root@appliance:/var/crash#
> >
> > root@appliance:/var/crash# cat /proc/version
> > Linux version 2.6.32.24sf.pentM-37 (build@ajax) (gcc version 4.7.1 (GCC) ) #1 PREEMPT Mon Aug 26 22:26:34 UTC 2013
> > root@appliance:/var/crash#
> >
> > So from everything I can see the Linux kernel and the System.map file are in
> > version agreement but the crash utility disagrees with me. The crash utility
> > is the judge so something is wrong. My goal is to find out how I can get the
> > information that is needed to determine the problem.
>
> OK, while running on the appliance itself, again, try running without
> the System.map argument.  It will presumably still fail as shown above.
> On that appliance, what is the output from these commands:
>
>  $ cat  /proc/kallsyms | grep cpu_possible_mask
>  $ nm -Bn /usr/lib/debug/lib/modules/3.9.10-100.fc17.x86_64/vmlinux | grep cpu_possible_mask
>  $ grep cpu_possible_mask /boot/System.map
>
> If they are not the same, it is possible you may need to use the
> "--reloc <size>" command line argument.  That is required for 32-bit x86
> kernels that are configured as described here:
>
>  http://people.redhat.com/anderson/crash.changelog.html#4_0_4_5
>
> Dave
>