[Crash-utility] Question for ARM developers/users w/respect to makedumpfile [PROBLEM SOLVED]

Dave Anderson anderson at redhat.com
Tue Mar 5 15:32:26 UTC 2013



With respect to this long-running thread:

 [Crash-utility] Question for ARM developers/users w/respect to makedumpfile
 https://www.redhat.com/archives/crash-utility/2013-January/msg00049.html

and the kludge/patch that went into crash-6.1.3:

    - Workaround for the "crash --osrelease dumpfile" option to be able
      to work with malformed ARM compressed kdump headers.  ARM compressed
      kdumps that indicate header version 3 may contain a malformed
      kdump_sub_header structure with offset_vmcoreinfo and size_vmcoreinfo
      fields offset by 4 bytes, and the actual vmcoreinfo data is not
      preceded by its ELF note header and its "VMCOREINFO" string.  This
      workaround finds the vmcoreinfo data and patches the stored header's
      offset_vmcoreinfo and size_vmcoreinfo values.  Without the patch, the
      "--osrelease dumpfile" command line option fails with the message
      "crash: compressed kdump: cannot lseek dump vmcoreinfo", followed by
      "unknown".
      (anderson at redhat.com)

Luc Chouinard has come to the rescue and figured out what was going on.

It is not a matter of a "malformed" ARM compressed kdump header, but rather
a result of using a 32-bit x86 crash binary created with "make target=ARM"
to analyze ARM compressed kdumps.

The ARM guys on the list can confirm this, but Luc debugged this issue
with a natively-built crash utility, and determined:

  Seems like the off_t members of the kdump_sub_header struct are being
  aligned on 8 byte boundaries. Which would explain the problem you are
  pointing out. This can be a normal compiler behavior for the arm processor,
  which will generate exceptions for unaligned memory accesses. This is 
  something we had to chase down and fix for some of our cross platform 
  code base (i386 and arm). The Arm experts may have to confirm, but I'd 
  think that all 'long long' members of structs may cause problem when 
  cross interpreted by a ARM crash. 

Now AFAICT, I believe that the only "cross interpreted" items that would 
come into play would be the kdump_sub_header:

struct kdump_sub_header {
        unsigned long   phys_base;
        int             dump_level;         /* header_version 1 and later */
        int             split;              /* header_version 2 and later */
        unsigned long   start_pfn;          /* header_version 2 and later */
        unsigned long   end_pfn;            /* header_version 2 and later */
        off_t           offset_vmcoreinfo;  /* header_version 3 and later */
        unsigned long   size_vmcoreinfo;    /* header_version 3 and later */
        off_t           offset_note;        /* header_version 4 and later */
        unsigned long   size_note;          /* header_version 4 and later */
        off_t           offset_eraseinfo;   /* header_version 5 and later */
        unsigned long   size_eraseinfo;     /* header_version 5 and later */
};

The header is originally created on the crashing ARM host, and written
by an ARM makedumpfile binary into the dumpfile header.  But when crash
is built on an x86/x86_64 host with "make target=ARM", the resultant
binary is a 32-bit x86 binary:

$ make target=ARM
...
$ file crash
crash: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x0459d34779357928839cd4d05f04517a441dc555, not stripped

And when compiled as an x86 binary, the structure's offsets would be:

struct kdump_sub_header {
[0]     unsigned long   phys_base;
[4]     int             dump_level;         /* header_version 1 and later */
[8]     int             split;              /* header_version 2 and later */
[12]    unsigned long   start_pfn;          /* header_version 2 and later */
[16]    unsigned long   end_pfn;            /* header_version 2 and later */
[20]    off_t           offset_vmcoreinfo;  /* header_version 3 and later */
[28]    unsigned long   size_vmcoreinfo;    /* header_version 3 and later */
[32]    off_t           offset_note;        /* header_version 4 and later */
[40]    unsigned long   size_note;          /* header_version 4 and later */
[44]    off_t           offset_eraseinfo;   /* header_version 5 and later */
[52]    unsigned long   size_eraseinfo;     /* header_version 5 and later */
};

But when compiled on an ARM processor, each 64-bit "off_t" would be pushed
up to an 8-byte boundary:

struct kdump_sub_header {
[0]     unsigned long   phys_base;
[4]     int             dump_level;         /* header_version 1 and later */
[8]     int             split;              /* header_version 2 and later */
[12]    unsigned long   start_pfn;          /* header_version 2 and later */
[16]    unsigned long   end_pfn;            /* header_version 2 and later */
[24]    off_t           offset_vmcoreinfo;  /* header_version 3 and later */
[32]    unsigned long   size_vmcoreinfo;    /* header_version 3 and later */
[40]    off_t           offset_note;        /* header_version 4 and later */
[48]    unsigned long   size_note;          /* header_version 4 and later */
[56]    off_t           offset_eraseinfo;   /* header_version 5 and later */
[64]    unsigned long   size_eraseinfo;     /* header_version 5 and later */
};
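
For anyone who wants to check the two layouts without an ARM toolchain,
here is a minimal sketch that recomputes the member offsets by hand.  It
is not code from crash or makedumpfile; the names ksh_size and ksh_layout
are made up for illustration.  It assumes 4-byte int/unsigned long, an
8-byte off_t, and that 8-byte members are aligned to 4 bytes on i386 and
to 8 bytes on ARM (EABI), which is my understanding of the two ABIs:

```c
#include <assert.h>
#include <stddef.h>

/* Number of members in struct kdump_sub_header (header_version 5). */
#define KSH_NMEMBERS 11

/* Assumed member sizes on a 32-bit target with a 64-bit off_t. */
static const size_t ksh_size[KSH_NMEMBERS] = {
    4, 4, 4, 4, 4,  /* phys_base, dump_level, split, start_pfn, end_pfn */
    8, 4,           /* offset_vmcoreinfo, size_vmcoreinfo */
    8, 4,           /* offset_note, size_note */
    8, 4            /* offset_eraseinfo, size_eraseinfo */
};

/*
 * Hypothetical helper: fill off[] with each member's offset, given the
 * alignment applied to 8-byte members (assumed 4 for i386, 8 for ARM).
 * 4-byte members are assumed to be 4-byte aligned in both cases.
 */
static void ksh_layout(size_t align64, size_t off[KSH_NMEMBERS])
{
    size_t pos = 0;

    assert(align64 == 4 || align64 == 8);
    for (int i = 0; i < KSH_NMEMBERS; i++) {
        size_t align = (ksh_size[i] == 8) ? align64 : 4;

        pos = (pos + align - 1) / align * align;  /* round up to alignment */
        off[i] = pos;
        pos += ksh_size[i];
    }
}
```

Calling ksh_layout(4, off) should give the x86 offsets listed above, and
ksh_layout(8, off) the ARM ones; the first difference is at
offset_vmcoreinfo (index 5), which moves from 20 to 24, matching the
4-byte shift described in the crash-6.1.3 changelog entry.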

So the "kludge" for "crash --osrelease" and "crash --log" will have 
to continue, but only if the crash binary was built with "make target=ARM".

Again, many thanks to Luc for tracking this issue down.

Dave




